A3 Peer-reviewed part of a book or other edited volume

Donate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation




Authors: Lindén Krister, Jauhiainen Tommi, Lennes Mietta, Kurimo Mikko, Rossi Aleksi, Kurki Tommi, Pitkänen Olli

Editors: Darja Fišer and Andreas Witt

Place of publication: Berlin & Boston

Publication year: 2022

Title of edited volume: CLARIN: The Infrastructure for Language Resources

First page: 481

Last page: 510

ISBN: 978-3-11-076734-6

eISBN: 978-3-11-076737-7

DOI: https://doi.org/10.1515/9783110767377-019

Web address: https://doi.org/10.1515/9783110767377-019

Address of the self-archived copy: https://research.utu.fi/converis/portal/detail/Publication/176591290


Abstract

The Donate Speech campaign aimed to collect 10,000 hours of ordinary, casual Finnish speech to be used for studying language as well as for developing technology and services that can be readily used in the languages spoken in Finland. In this project, particular attention has been devoted to allowing for both academic and commercial use of the material. Even though this ambitious target currently seems likely to evade us, the Donate Speech campaign has managed to amass an extensive resource of more than 4,000 hours of Finnish colloquial speech comprising more than 220,000 speech recordings by more than 25,000 speakers from all over Finland in just a few months.

Keywords: speech resources, colloquial speech, large-scale data collection, academic and commercial use


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.





Last updated on 2024-11-26 at 14:13