A4 Peer-reviewed article in conference proceedings

Can GPT-4 Enhance Teaching? A Pilot Study on AI-Driven Analysis of Student Course Feedback




Authors: Weerakoon, Oshani; Puhtila, Panu; Mäkilä, Tuomas; Kaila, Erkki

Editors: Tatti, Nikolaj; Kasurinen, Jussi; Päivärinta, Tero

Established conference name: Annual Doctoral Symposium of Computer Science

Publication year: 2026

Journal: CEUR Workshop Proceedings

Title of the collection: Proceedings of the Annual Doctoral Symposium of Computer Science 2025 (TKTP 2025), Helsinki, Finland, June 2025

Article number: paper01

Volume: 4181

eISSN: 1613-0073

Openness of the publication at the time of recording: Openly available

Openness of the publication channel: Fully open publication channel

URL: https://ceur-ws.org/Vol-4181/paper01.pdf

Self-archived copy's URL: https://research.utu.fi/converis/portal/detail/Publication/508257582

Self-archived copy's license: CC BY

Version of the self-archived publication: Publisher's version


Abstract

In this pilot study, we explored the use of generative AI—specifically GPT-4—to evaluate student feedback in a bilingual software engineering course offered at the University of Turku. Our aim was twofold: to examine whether ChatGPT can meaningfully evaluate student course feedback and propose suitable enhancements, and to compare its evaluations with those made by a course teacher. We collected voluntary feedback from 18 consenting students across three course instances in 2023 and 2024, resulting in a total of 390 feedback entries. These responses were first translated into English and then anonymized. Using structured questionnaires aligned with defined pedagogical goals, we then analyzed the responses through a dual evaluation process: (1) AI-based assessment using a custom JavaScript application integrating GPT-4 and GPT-4o-mini, and (2) manual evaluation by the teacher. Both followed a standardized Likert-scale format with brief textual comments, and all evaluations were consolidated into thirty-six manually maintained recording sheets. Evaluation results were visualized using heat maps across five key themes derived from the pedagogical goals. Our comparative analysis showed general alignment between the two evaluators, with key differences in the perceived content clarity and video quality of the course. We further extended our discussion to examine GPT’s applicability and limitations as a feedback evaluator. In particular, we identified its potential to quickly assess structured student feedback in courses with high participation, where manual evaluation may be time-consuming for course teachers. These findings collectively provide insights into using generative AI in course feedback analysis to enhance teaching within software engineering curricula.


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




Funding information in the publication
This work has been supported by FAST, the Finnish Software Engineering Doctoral Research Network, funded by the Ministry of Education and Culture, Finland.

