How Easy is it to Cheat? Solving Programming Exercises Automatically with AI - UTU Research Information System

O2 Other publication

How Easy is it to Cheat? Solving Programming Exercises Automatically with AI

Authors: Rytilahti, Juuso; Kaila, Erkki

Established conference name: International CDIO Conference

Publication year: 2024

Web address: https://cdio.esprit.tn:9080/documents/1717971995327-218_b_Final.pdf

Abstract

Large Language Models (LLMs) and other AI tools have become a permanent part of the workflow in several areas, and their role continues to grow. While these tools can be useful in studying, they also make cheating possible by generating answers automatically. In this paper, we explore the capabilities of LLM tools to automatically solve coding exercises. The context is a large introductory programming course. The exercises are primarily coding tasks in which students solve a given problem by writing Python code. The answers are automatically assessed by a dedicated learning management system (LMS), and students get immediate feedback when submitting their solutions. We used ChatGPT, a popular and freely available LLM tool, with two different approaches. In the first, we took the role of an amateur programmer with no experience or understanding of coding at all. In the second, we assumed that the person having the exercises solved had some understanding of programming and could hence fix the solutions based on the feedback. We tested both the GPT-3.5 and GPT-4 based versions of ChatGPT. Our results indicate that an LLM can indeed be quite effective in solving the coding exercises. Depending on the version of the tool, the approach selected, and the prompt used, the simulated ChatGPT-assisted student achieved between 63.4% and 86.2% of the total course points by answering only the programming questions with the help of ChatGPT. Considering the programming exercises alone, the simulated student answered 107 (75.9%) to 139 (98.6%) of the course’s 141 programming exercises fully correctly. The tools were also able to pass three different exams of the course. This means that even students with no previous programming experience can successfully complete programming courses by using freely available tools.
We also discuss the features of exercises and other tasks that can make automatic solving easier or more difficult, and the effect of obfuscation on their pedagogical value. Finally, we discuss the possible future implications of AI tools for similar courses and provide some suggestions for course design.
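
As a quick sanity check on the exercise figures quoted in the abstract, the following minimal sketch recomputes the two percentages from the reported counts (107 and 139 fully solved exercises out of 141). The function name `share_solved` and the constant `TOTAL_EXERCISES` are hypothetical names chosen for illustration; they do not come from the paper.

```python
# Reported in the abstract: the course has 141 programming exercises.
TOTAL_EXERCISES = 141


def share_solved(solved: int, total: int = TOTAL_EXERCISES) -> float:
    """Return the percentage of exercises solved, rounded to one decimal."""
    return round(100 * solved / total, 1)


# Lower and upper bounds of fully solved exercises, as reported in the abstract.
low = share_solved(107)
high = share_solved(139)
print(f"{low}% to {high}%")  # prints "75.9% to 98.6%"
```

Both values agree with the percentages given in the abstract. The 63.4% to 86.2% range of total course points cannot be recomputed the same way, because the abstract does not state the underlying point totals.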

Funding information in the publication:
The author(s) received no financial support for this work.