Automated Emotion Annotation of Finnish Parliamentary Speeches Using GPT-4




Tarkka, Otto; Koljonen, Jaakko; Korhonen, Markus; Laine, Juuso; Martiskainen, Kristian; Elo, Kimmo; Laippala, Veronika

Fišer, Darja; Eskevich, Maria; Bordon, David

ParlaCLARIN Workshop

2024

LREC Proceedings

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) : ParlaCLARIN IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora

Pages 70-76

ISBN: 978-2-493814-24-1

ISSN: 2522-2686

https://aclanthology.org/2024.parlaclarin-1.11.pdf

https://research.utu.fi/converis/portal/detail/Publication/457172276



Annotating datasets is often prohibitively expensive and laborious. Emotion annotation in particular has been shown to be a difficult task in which even trained annotators rarely reach high agreement. With the introduction of ChatGPT, GPT-4, and other large language models (LLMs), however, a new line of research has emerged that explores the possibilities of automated data annotation. In this paper, we apply GPT-4 to the task of annotating a dataset, which is subsequently used to train a BERT model for emotion analysis of Finnish parliamentary speeches. In our experiment, GPT-4 performs on par with trained annotators, and the annotations it produces can be used to train a classifier that reaches a micro F1 of 0.690. We compare this model to two other models trained on machine-translated datasets and find that the model trained on GPT-4-annotated data outperforms both. Our paper offers new insight into the possibilities that LLMs offer for the analysis of parliamentary corpora.


This research was funded by the Research Council of Finland [grant number 353569].


Last updated on 2025-11-03 at 07:53