Automated Emotion Annotation of Finnish Parliamentary Speeches Using GPT-4

Authors: Tarkka, Otto; Koljonen, Jaakko; Korhonen, Markus; Laine, Juuso; Martiskainen, Kristian; Elo, Kimmo; Laippala, Veronika

Editors: Fišer, Darja; Eskevich, Maria; Bordon, David

Conference: ParlaCLARIN Workshop

Year: 2024

Series: LREC Proceedings

Book title: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024): ParlaCLARIN IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora

Pages: 70-76

ISBN: 978-2-493814-24-1

ISSN: 2522-2686

URL: https://aclanthology.org/2024.parlaclarin-1.11.pdf

Research portal record: https://research.utu.fi/converis/portal/detail/Publication/457172276

Annotating datasets can often be prohibitively expensive and laborious. Emotion annotation specifically has been shown to be a difficult task in which even trained annotators rarely reach high agreement. With the introduction of ChatGPT, GPT-4 and other Large Language Models (LLMs), however, a new line of research has emerged that explores the possibilities of automated data annotation. In this paper, we apply GPT-4 to the task of annotating a dataset, which is subsequently used to train a BERT model for emotion analysis of Finnish parliamentary speeches. In our experiment, GPT-4 performs on par with trained annotators, and the annotations it produces can be used to train a classifier that reaches a micro F1 of 0.690. We compare this model to two other models that are trained on machine-translated datasets and find that the model trained on GPT-4-annotated data outperforms them. Our paper offers new insight into the possibilities that LLMs have to offer for the analysis of parliamentary corpora.
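For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a micro-averaged F1 score is computed from gold and predicted labels. It is not the authors' evaluation code: the function name `micro_f1`, the label names, and the toy data are all hypothetical, and each item is assumed to carry a set of labels so that both single-label and multi-label setups are covered.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1: pool true/false positives and false negatives
    over all items and labels, then compute a single F1 from the totals."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)   # labels both gold and predicted
        fp += len(p - g)   # predicted but not in gold
        fn += len(g - p)   # in gold but not predicted
    if tp == 0:
        return 0.0         # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data; the emotion labels are illustrative only.
gold = [{"joy"}, {"anger"}, {"neutral"}, {"joy"}]
pred = [{"joy"}, {"neutral"}, {"neutral"}, {"anger"}]
score = micro_f1(gold, pred)
```

Because counts are pooled before averaging, frequent labels weigh more heavily in micro F1 than rare ones; in a strictly single-label setting the score coincides with accuracy.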

2024.parlaclarin-1.11_CC-BY-NC.pdf

Funding: This research was funded by the Research Council of Finland [grant number 353569].