Scaling up bibliographic data science
Authors: Tolonen M., Marjanen J., Roivainen H., Lahti L.
Editors: Costanza Navarretta, Manex Agirrezabal, Bente Maegaard
Conference: Digital Humanities in the Nordic Countries
Publisher: CEUR-WS
Year: 2019
Series: CEUR Workshop Proceedings
Book title: Proceedings of the Digital Humanities in the Nordic Countries 4th Conference
Volume: 2364
First page: 450
Last page: 456
ISSN: 1613-0073
URL: http://ceur-ws.org/Vol-2364/41_paper.pdf
Research portal record: https://research.utu.fi/converis/portal/detail/Publication/41240748
Bibliographic data science is an emerging research paradigm in digital 
humanities. It aims to quantify trends in knowledge production
systematically, combining large-scale analysis of bibliographic
metadata collections with the methods of modern data science. Compared
with earlier related attempts in book history and the sociology of
literature, advances in data processing and quality control now make it
possible, for the first time, to scale the analysis up to millions of
print products while still paying attention to data quality,
representativity and completeness. This provides a new quantitative
method that can support the analysis of classical research questions in
intellectual history. Here, we discuss the methodological challenges
that we have encountered in such studies and how solutions to them can
be scaled up through collaborative research efforts.