Filip Ginter
figint@utu.fi
Office: 4th floor, 451A
ORCID iD: https://orcid.org/0000-0002-5484-6103
natural language processing; human language technology; machine learning; deep learning; resource development
human language technology, natural language processing, machine learning applied to human language, both methodological and resource creation research
I am a researcher at the Department of Computing, University of Turku. My research is in the area of natural language processing. I belong to the TurkuNLP (turkunlp.org) research group.
I was born in 1978 in Ostrava, Czech Republic (then Czechoslovakia). In 2001, I received an M.Sc. (tech) in computer science from the computer science department of VSB - Technical University Ostrava. My major subject was artificial intelligence. I received a PhD in computer science in 2007. The title of my thesis is Towards Information Extraction in the Biomedical Domain: Methods and Resources.
I have been a professor of language technology since 2022 and the deputy director of the Department of Computing since 2021.
My primary field of research is language technology / natural language processing. In my post-PhD career, I have focused on developing NLP tools and resources, primarily for Finnish but later also for numerous other languages via the Universal Dependencies project. My work is heavy on resource development, both in terms of data and machine learning pipelines. Open science and open resources play an important role in my research: much of it is carried out in the open on GitHub and, as a rule, all resources are openly available for unrestricted use. I work collaboratively, especially with my younger colleagues, rather than striving for deeper, primary-author inquiries.
I have been teaching actively since early in my PhD studies. I independently prepared my first advanced-level NLP course in 2004, and since ca. 2008 I have taught at least one course every year, and substantially more during my bioinformatics lecturer appointment. As a lecturer in the bioinformatics MSc degree programme, I lectured international students in two cities. In 2016, I was tasked with developing and coordinating the introduction of a new 20 ECTS study module on natural language processing. This module is, with modifications, still in use and is shared between the departments of Languages and Computing, both in terms of teaching and in terms of students. In 2019-2020 and 2020-2021 I also co-lectured, upon invitation, two courses in natural language processing at the Arcada University of Applied Sciences in Helsinki.
- WikiBERT Models: Deep Transfer Learning for Many Languages (2021)
  Linköping Electronic Conference Proceedings
  (A4 Peer-reviewed article in conference proceedings)
- Assisting nurses in care documentation: from automated sentence classification to coherent document structures with subject headings (2020)
  Journal of Biomedical Semantics
  (A1 Peer-reviewed original article in a scientific journal)
- Classifying online corporate reputation with machine learning: a study in the banking domain (2020)
  Internet Research
  (A1 Peer-reviewed original article in a scientific journal)
- Dependency parsing of biomedical text with BERT (2020)
  BMC Bioinformatics
  (A1 Peer-reviewed original article in a scientific journal)
- Entity-pair embeddings for improving relation extraction in the biomedical domain (2020)
  European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
  (A4 Peer-reviewed article in conference proceedings)
- Manuscripts, Qualitative Analysis and Features on Vectors: An Attempt for a Synthesis of Conventional and Computational Methods in the Attribution of Late Medieval Anti-Heretical Treatises (2020)
  Digital Histories. Emergent Approaches within the New Digital History
  Reima Välimäki, Aleksi Vesanto, Anni Hella, Adam Poznański, Filip Ginter
  (A3 Peer-reviewed part of a book or other compilation)
- Supporting the use of standardized nursing terminologies with automatic subject heading prediction: a comparison of sentence-level text classification methods (2020)
  Journal of the American Medical Informatics Association
  (A1 Peer-reviewed original article in a scientific journal)
- The FISKMO project: Resources and tools for Finnish-Swedish machine translation and cross-linguistic research (2020)
  Proceedings of the 12th Language Resources and Evaluation Conference
  Jörg Tiedemann, Tommi Nieminen, Mikko Aulamo, Jenna Kanerva, Akseli Leino, Filip Ginter, Niko Papula
  (A4 Peer-reviewed article in conference proceedings)
- The reuse of texts in Finnish newspapers and journals, 1771–1920: A digital humanities perspective (2020)
  Historical Methods: A Journal of Quantitative and Interdisciplinary History
  (A1 Peer-reviewed original article in a scientific journal)
- Turku Enhanced Parser Pipeline: From Raw Text to Enhanced Graphs in the IWPT 2020 Shared Task (2020)
  Annual Meeting of the Association for Computational Linguistics
  (A4 Peer-reviewed article in conference proceedings)
- Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection (2020)
  Proceedings of the 12th Language Resources and Evaluation Conference
  Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman
  (A4 Peer-reviewed article in conference proceedings)
- Is Multilingual BERT Fluent in Language Generation? (2019)
  Linköping Electronic Conference Proceedings
  (A4 Peer-reviewed article in conference proceedings)
- Neural Dependency Parsing of Biomedical Text: TurkuNLP entry in the CRAFT Structural Annotation Task (2019)
  Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
  Thang Minh Ngo, Jenna Kanerva, Filip Ginter, Sampo Pyysalo
  (A4 Peer-reviewed article in conference proceedings)
- Parse me if you can: Artificial treebanks for parsing experiments on elliptical constructions (2019)
  LREC 2018 - 11th International Conference on Language Resources and Evaluation
  Droganova K., Zeman D., Kanerva J., Ginter F.
  (A4 Peer-reviewed article in conference proceedings)
- Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing (2019)
  Joakim Nivre, Filip Ginter, Stephan Oepen, Jörg Tiedemann
  (C2 Editorial work for a scholarly edited volume)
- Reconsidering Authorship in the Ciceronian Corpus through Computational Authorship Attribution (2019)
  Ciceroniana On Line
  (A1 Peer-reviewed original article in a scientific journal)
- Tekstien pitkä elämä: Ajassa liikkuvat tekstit suomalaisessa sanomalehdistössä 1771-1920 (2019)
  Ennen ja nyt : Historian tietosanomat
  (A1 Peer-reviewed original article in a scientific journal)
- Tekstien uudelleenkäyttö suomalaisessa sanoma- ja aikakauslehdistössä 1771–1920. Digitaalisten ihmistieteiden näkökulma (2019)
  Historiallinen Aikakauskirja
  (A1 Peer-reviewed original article in a scientific journal)
- Template-free Data-to-Text Generation of Finnish Sports News (2019)
  Linköping Electronic Conference Proceedings
  (A4 Peer-reviewed article in conference proceedings)
- The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens (2019)
  Genome Biology
  (A1 Peer-reviewed original article in a scientific journal)



