Extracting Geographical References from Finnish Literature: Fully Automated Processing of Plain-Text Corpora




Kiiskinen Harri, Nivala Asko, Westerlund Jasmine, Saarelainen Juhana

2023

Journal of Computational Literary Studies

JCLS

2

1

1

20

DOI: https://doi.org/10.48694/jcls.3584

https://research.utu.fi/converis/portal/detail/Publication/380977362



In the Atlas of Finnish Literature 1870-1940 project, we extract geographical information from a Finnish-language corpus of literary texts published between 1870 and 1940. The texts are converted from plain text to TEI/XML and then processed with named entity recognition and linking tools. The results are presented in a web-based environment. This article describes the technical structure of the analysis chain, the tools used, and the metaprocesses used to manage the research dataset.

