A4 Peer-reviewed article in conference proceedings
Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771–1910
Authors: Aleksi Vesanto, Asko Nivala, Heli Rantala, Tapio Salakoski, Hannu Salmi, Filip Ginter
Editors: Gerlof Bouma, Yvonne Adesam
Established conference name: Workshop on Processing Historical Language
Place of publication: Gothenburg
Publication year: 2017
Title of the proceedings: Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language
Series name: NEALT Proceedings Series
Number in series: 133
Volume: 32
First page: 54
Last page: 58
ISBN: 978-91-7685-503-4
ISSN: 1650-3686
Web address: http://www.ep.liu.se/ecp/133/010/ecp17133010.pdf
Self-archived copy address: https://research.utu.fi/converis/portal/detail/Publication/20562472
We present the results of text reuse detection based on a corpus of scanned and OCR-recognized Finnish newspapers and journals from 1771 to 1910. Our study draws on BLAST, software created for comparing and aligning biological sequences. We show different types of text reuse in this corpus and also present a comparison with Passim, software for text reuse detection developed at Northeastern University in Boston.
Downloadable publication: This is an electronic reprint of the original article.