Refereed article in conference proceedings (A4)
Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771–1910
List of Authors: Aleksi Vesanto, Asko Nivala, Heli Rantala, Tapio Salakoski, Hannu Salmi, Filip Ginter
Conference name: Workshop on Processing Historical Language
Place: Gothenburg
Publication year: 2017
Book title: Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language
Title of series: NEALT Proceedings Series
Number in series: 133
Volume number: 32
ISBN: 978-91-7685-503-4
ISSN: 1650-3686
URL: http://www.ep.liu.se/ecp/133/010/ecp17133010.pdf
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/20562472
We present the results of text reuse detection on a corpus of scanned and OCR-recognized Finnish newspapers and journals from 1771 to 1910. Our study draws on BLAST, software created for comparing and aligning biological sequences. We show different types of text reuse in this corpus and also present a comparison with Passim, software for text reuse detection developed at Northeastern University in Boston.
This is an electronic reprint of the original article.