A4 Peer-reviewed article in conference proceedings

Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages




Authors: Olstad Anne Marte Haug, Smolander Anna, Strömbergsson Sofia, Ylinen Sari, Lehtonen Minna, Kurimo Mikko, Getman Yaroslav, Grósz Tamás, Cao Xinwei, Svendsen Torbjørn, Salvi Giampiero

Editors: Calzolari Nicoletta, Kan Min-Yen, Hoste Veronique, Lenci Alessandro, Sakti Sakriani, Xue Nianwen

Established conference name: Language Resources and Evaluation

Publication year: 2024

Journal: LREC Proceedings

Proceedings title: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

First page: 3529

Last page: 3537

eISBN: 978-2-493814-10-4

ISSN: 2522-2686

Web address: https://aclanthology.org/2024.lrec-main.313.pdf

Self-archived copy: https://research.utu.fi/converis/portal/detail/Publication/404724456


Abstract

This paper reports on the experience of collecting a number of corpora of Nordic languages spoken by children. The aim of the data collection is to provide annotated data for developing and evaluating computer-assisted pronunciation assessment systems, both for non-native children learning a Nordic language (L2) and for L1 children with speech sound disorder (SSD). The paper presents the challenges encountered in recording and annotating data for Finnish, Swedish and Norwegian, as well as the ethical considerations related to making this data publicly available. We hope that sharing this experience will encourage others to collect similar data for other languages. Of the different data collections, we were able to make the Norwegian corpus publicly available, in the hope that it will serve as a reference in pronunciation assessment research.


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.





Last updated on 2025-01-22 at 14:08