Refereed article in conference proceedings (A4)
Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771–1910
Authors: Aleksi Vesanto, Asko Nivala, Heli Rantala, Tapio Salakoski, Hannu Salmi, Filip Ginter
Established conference name: Workshop on Processing Historical Language
Location: Gothenburg
Publication year: 2017
Book title: Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language
Series name: NEALT Proceedings Series
Number in series: 133
Volume: 32
First page: 54
Last page: 58
ISBN: 978-91-7685-503-4
ISSN: 1650-3686
Web address: http://www.ep.liu.se/ecp/133/010/ecp17133010.pdf
Self-archived copy's address: https://research.utu.fi/converis/portal/detail/Publication/20562472
We present the results of text reuse detection, based on the corpus of scanned and OCR-recognized Finnish newspapers and journals from 1771 to 1910. Our study draws on BLAST, software created for comparing and aligning biological sequences. We show different types of text reuse in this corpus, and also present a comparison to the software Passim, developed at Northeastern University in Boston, for text reuse detection.
Downloadable publication: This is an electronic reprint of the original article.