Automated Emotion Annotation of Finnish Parliamentary Speeches Using GPT-4
Authors: Tarkka, Otto; Koljonen, Jaakko; Korhonen, Markus; Laine, Juuso; Martiskainen, Kristian; Elo, Kimmo; Laippala, Veronika
Editors: Fišer, Darja; Eskevich, Maria; Bordon, David
Conference: ParlaCLARIN Workshop
Year: 2024
Series: LREC Proceedings
Book title: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) : ParlaCLARIN IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora
First page: 70
Last page: 76
ISBN: 978-2-493814-24-1
ISSN: 2522-2686
URL: https://aclanthology.org/2024.parlaclarin-1.11.pdf
Portal record: https://research.utu.fi/converis/portal/detail/Publication/457172276
Annotating datasets is often prohibitively expensive and laborious. Emotion annotation in particular has been shown to be a difficult task in which even trained annotators rarely reach high agreement. With the introduction of ChatGPT, GPT-4 and other Large Language Models (LLMs), however, a new line of research has emerged that explores the possibilities of automated data annotation. In this paper, we apply GPT-4 to the task of annotating a dataset, which is subsequently used to train a BERT model for emotion analysis of Finnish parliamentary speeches. In our experiment, GPT-4 performs on par with trained annotators, and the annotations it produces can be used to train a classifier that reaches a micro F1 of 0.690. We compare this model to two other models trained on machine-translated datasets and find that the model trained on GPT-4-annotated data outperforms them. Our paper offers new insight into the possibilities that LLMs offer for the analysis of parliamentary corpora.
This research was funded by the Research Council of Finland [grant number 353569].