Veronika Laippala
mavela@utu.fi | +358 29 450 3330 | +358 50 328 9739 | Arcanuminkuja 1, Turku
Areas of expertise
Computational linguistics; text linguistics; corpus linguistics; digital discourse analysis.
Biography
I am a linguist who likes computers. My main research topics are language variation across communicative situations and the development of automatic tools that help us make better use of large, web-crawled corpora.
My ongoing projects include "A piece of news, an opinion or something else? Different texts and their detection from the multilingual Internet", funded by the Emil Aaltonen Foundation, and "Massively multilingual modeling of registers in web-scale data", funded by the Academy of Finland.
For more information, please have a look at our lab website at https://turkunlp.github.io/
Publications
- The Topical Landscape of Web Registers: Exploring the Interplay of Registers and Topicality in a Web-scale Corpus (2024)
  Linguistics across Disciplinary Borders: The March of Data
  Skantsi Valtteri, Laippala Veronika, Kyröläinen Aki
  (A3 Refereed book chapter or chapter in a compilation book)
- Towards Automatic Register Classification in Unrestricted Databases of Historical English (2024)
  Linguistics across Disciplinary Borders: The March of Data
  Repo Liina, Hashimoto Brett, Liimatta Aatu, Saario Lassi, Säily Tanja, Tiihonen Iiro, Tolonen Mikko, Laippala Veronika
  (A3 Refereed book chapter or chapter in a compilation book)
- Analyzing the unrestricted web: The Finnish corpus of online registers (2023)
  Nordic Journal of Linguistics
  (A1 Refereed original research article in a scientific journal)
- FinGPT: Large Generative Models for a Small Language (2023)
  Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
  Luukkonen Risto, Komulainen Ville, Luoma Jouni, Eskelinen Anni, Kanerva Jenna, Kupari Hanna-Mari, Ginter Filip, Laippala Veronika, Muennighoff Niklas, Piktus Aleksandra, Wang Thomas, Tazi Nouamane, Scao Le Teven, Wolf Thomas, Suominen Osma, Sairanen Samuli, Merioksa Mikko, Heinonen Jyrki, Vahtola Aija, Antao Samuel, Pyysalo Sampo
  (A4 Refereed article in a conference publication)
- Helsingin kielilukion vierailu uusiin Arcanumin tiloihin (2023)
  Leala-tutkimuskeskuksen blogi
  (D1 Article in a professional journal)
- In search of founding era registers: automatic modeling of registers from the corpus of Founding Era American English (2023)
  Digital Scholarship in the Humanities
  (A1 Refereed original research article in a scientific journal)
- Predictive keywords: Using machine learning to explain document characteristics (2023)
  Frontiers in Artificial Intelligence
  (A1 Refereed original research article in a scientific journal)
- Toxicity Detection in Finnish Using Machine Translation (2023)
  NEALT proceedings series
  (A4 Refereed article in a conference publication)
- Etäyhteyksistä paluu normaaliin arkeen: yliopistovierailu kampuksella (2022)
  Leala-tutkimuskeskuksen blogi
  (D1 Article in a professional journal)
- Explaining Classes through Stable Word Attributions (2022)
  Annual Meeting of the Association for Computational Linguistics
  (A4 Refereed article in a conference publication)
- Register identification from the unrestricted open Web using the Corpus of Online Registers of English (2022)
  Language Resources and Evaluation
  (A1 Refereed original research article in a scientific journal)
- Selkosten Proust taipuu moneen - Iijoki-korpus ja digitaalisen tekstilouhinnan mahdollisuudet (2022)
  Kalle Päätalo tutkijoiden silmin
  Karkulehto Sanna, Laippala Veronika, Launis Kati, Märsynaho Jaana, Saviniemi Maija, Sääskilahti Minna
  (A3 Refereed book chapter or chapter in a compilation book)
- Towards better structured and less noisy Web data: Oscar with Register annotations (2022)
  International Conference on Computational Linguistics
  (A4 Refereed article in a conference publication)
- Beyond the English web: Zero-shot cross-lingual and lightweight monolingual classification of registers (2021)
  Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
  Repo Liina, Skantsi Valtteri, Rönnqvist Samuel, Hellström Saara, Oinonen Miika, Salmela Anna, Biber Douglas, Egbert Jesse, Pyysalo Sampo, Laippala Veronika
  (A4 Refereed article in a conference publication)
- Exploring the role of lexis and grammar for the stable identification of register in an unrestricted corpus of web documents (2021)
  Language Resources and Evaluation
  (A1 Refereed original research article in a scientific journal)
- Multilingual and Zero-Shot is Closing in on Monolingual Web Register Classification (2021)
  Linköping Electronic Conference Proceedings
  (A4 Refereed article in a conference publication)
- A broad-coverage corpus for Finnish named entity recognition (2020)
  12th International Conference on Language Resources and Evaluation
  Luoma Jouni, Oinonen Miika, Pyykönen Maria, Laippala Veronika, Pyysalo Sampo
  (A4 Refereed article in a conference publication)
- Affectivity in the #jesuisCharlie Twitter discussion (2020)
  Pragmatics
  (A1 Refereed original research article in a scientific journal)
- Commenting on poverty online: A corpus-assisted discourse study of the Suomi24 forum (2020)
  SKY Journal of Linguistics
  (A1 Refereed original research article in a scientific journal)
- From Web Crawl to Clean Register-Annotated Corpora (2020)
  Proceedings of the 12th Web as Corpus Workshop
  Laippala Veronika, Rönnqvist Samuel, Hellström Saara, Luotolahti Juhani, Repo Liina, Salmela Anna, Skantsi Valtteri, Pyysalo Sampo
  (A4 Refereed article in a conference publication)



