A4 Refereed article in a conference publication
Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771–1910
Authors: Aleksi Vesanto, Asko Nivala, Heli Rantala, Tapio Salakoski, Hannu Salmi, Filip Ginter
Editors: Gerlof Bouma, Yvonne Adesam
Conference name: Workshop on Processing Historical Language
Publishing place: Gothenburg
Publication year: 2017
Book title: Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language
Series title: NEALT Proceedings Series
Number in series: 133
Volume: 32
First page: 54
Last page: 58
ISBN: 978-91-7685-503-4
ISSN: 1650-3686
Web address: http://www.ep.liu.se/ecp/133/010/ecp17133010.pdf
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/20562472
We present the results of text reuse detection, based on the corpus of scanned and OCR-recognized Finnish newspapers and journals from 1771 to 1910. Our study draws on BLAST, software created for comparing and aligning biological sequences. We show different types of text reuse in this corpus, and also present a comparison to the software Passim, developed at Northeastern University in Boston, for text reuse detection.
Downloadable publication: This is an electronic reprint of the original article.