Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771–1910
: Aleksi Vesanto, Asko Nivala, Heli Rantala, Tapio Salakoski, Hannu Salmi, Filip Ginter
: Gerlof Bouma, Yvonne Adesam
: Workshop on Processing Historical Language
: Gothenburg
: 2017
: Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language
: NEALT Proceedings Series
: 133
: 32
: 54
: 58
: 978-91-7685-503-4
: 1650-3686
: http://www.ep.liu.se/ecp/133/010/ecp17133010.pdf
: https://research.utu.fi/converis/portal/detail/Publication/20562472
We present the results of text reuse detection on a corpus of scanned and
OCR-recognized Finnish newspapers and journals from 1771 to 1910. Our study
draws on BLAST, software created for comparing and aligning biological
sequences. We show different types of text reuse in this corpus and also
present a comparison to Passim, software for text reuse detection developed
at Northeastern University in Boston.