Amanda Myntti
amanda.a.myntti@utu.fi
Publications
- An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT) (2025)
  Annual Meeting of the Association for Computational Linguistics
  (A4 Refereed article in a conference publication)
- Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation (2025)
  Proceedings of the Second Conference on Language Modeling, COLM 2025
  Myntti, Amanda; Henriksson, Erik; Laippala, Veronika; Pyysalo, Sampo
  (D3 Article in a professional conference publication)
- Building Question-Answer Data Using Web Register Identification (2024)
  LREC Proceedings
  (A4 Refereed article in a conference publication)
- From Discrete to Continuous Classes: A Situational Analysis of Multilingual Web Registers with LLM Annotations (2024)
  Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
  Henriksson, Erik; Myntti, Amanda; Hellström, Saara; Erten-Johansson, Selcen; Eskelinen, Anni; Repo, Liina; Laippala, Veronika
  (A4 Refereed article in a conference publication)
- Intersecting Register and Genre: Understanding the Contents of Web-Crawled Corpora (2024)
  Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
  Myntti, Amanda; Repo, Liina; Freyermuth, Elian; Kanner, Antti; Laippala, Veronika; Henriksson, Erik
  (A4 Refereed article in a conference publication)
- Explaining Classes through Stable Word Attributions (2022)
  Annual Meeting of the Association for Computational Linguistics
  (A4 Refereed article in a conference publication)