Refereed article in conference proceedings (A4)

Finnish Paraphrase Corpus




Authors: Kanerva Jenna, Ginter Filip, Chang Li-Hsin, Rastas Iiro, Skantsi Valtteri, Kilpeläinen Jemina, Kupari Hanna-Mari, Saarni Jenna, Sevón Maija, Tarkka Otto

Established conference name: Nordic Conference on Computational Linguistics

Publication year: 2021

Journal: Linköping Electronic Conference Proceedings

Book title: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021)

Series name: Linköping Electronic Conference Proceedings

Number in series: 178

ISBN: 978-91-7929-614-8

ISSN: 1650-3686

Web address: https://ep.liu.se/en/conference-article.aspx?series=ecp&issue=178&Article_No=29

Self-archived copy's address: https://research.utu.fi/converis/portal/Publication/53727016


Abstract

In this paper, we introduce the first fully manually annotated paraphrase corpus for Finnish containing 53,572 paraphrase pairs harvested from alternative subtitles and news headings. Out of all paraphrase pairs in our corpus, 98% are manually classified to be paraphrases at least in their given context, if not in all contexts. Additionally, we establish a manual candidate selection method and demonstrate its feasibility in high quality paraphrase selection in terms of both cost and quality.


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




Last updated on 2021-06-24 at 09:33