A4 Peer-reviewed article in conference proceedings

Automated Emotion Annotation of Finnish Parliamentary Speeches Using GPT-4




Authors: Tarkka, Otto; Koljonen, Jaakko; Korhonen, Markus; Laine, Juuso; Martiskainen, Kristian; Elo, Kimmo; Laippala, Veronika

Editors: Fišer, Darja; Eskevich, Maria; Bordon, David

Established conference name: ParlaCLARIN Workshop

Publication year: 2024

Journal: LREC Proceedings

Book title: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024): ParlaCLARIN IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora

First page: 70

Last page: 76

eISBN: 978-2-493814-24-1

eISSN: 2522-2686

Web address: https://aclanthology.org/2024.parlaclarin-1.11.pdf

Self-archived copy's web address: https://research.utu.fi/converis/portal/detail/Publication/457172276


Abstract

Annotating datasets can often be prohibitively expensive and laborious. Emotion annotation specifically has been shown to be a difficult task in which even trained annotators rarely reach high agreement. With the introduction of ChatGPT, GPT-4 and other Large Language Models (LLMs), however, a new line of research has emerged that explores the possibilities of automated data annotation. In this paper, we apply GPT-4 to the task of annotating a dataset, which is subsequently used to train a BERT model for emotion analysis of Finnish parliamentary speeches. In our experiment, GPT-4 performs on par with trained annotators, and the annotations it produces can be used to train a classifier that reaches a micro F1 of 0.690. We compare this model to two other models that are trained on machine-translated datasets and find that the model trained on GPT-4-annotated data outperforms them. Our paper offers new insight into the possibilities that LLMs have to offer for the analysis of parliamentary corpora.
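
For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of how a micro-averaged F1 score is computed from pooled true positives, false positives and false negatives. It is not the authors' evaluation code: the function name `micro_f1`, the emotion labels and the toy data are all hypothetical, and the sketch assumes each speech carries a set of one or more labels.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over per-item label sets (hypothetical helper)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)   # labels predicted and present in the gold set
        fp += len(p - g)   # labels predicted but absent from the gold set
        fn += len(g - p)   # gold labels the prediction missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data with made-up emotion labels, for illustration only.
gold = [{"joy"}, {"anger"}, {"neutral"}, {"fear", "anger"}]
pred = [{"joy"}, {"neutral"}, {"neutral"}, {"anger"}]
score = micro_f1(gold, pred)
print(round(score, 3))
```

Because the counts are pooled across all labels before precision and recall are taken, frequent labels weigh more heavily in the micro average than rare ones.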


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




Funding information in the publication
This research was funded by the Research Council of Finland [grant number 353569].


Last updated on 2025-11-03 at 07:53