Filip Ginter
figint@utu.fi | Office: 4th floor, 451A | ORCID: https://orcid.org/0000-0002-5484-6103
natural language processing; human language technology; machine learning; deep learning; resource development
human language technology, natural language processing, machine learning applied to human language, both methodological and resource creation research
I am a researcher at the Department of Computing, University of Turku. My research is in the area of natural language processing. I belong to the TurkuNLP (turkunlp.org) research group.
I was born in 1978 in Ostrava, Czech Republic (Czechoslovakia back then). In 2001, I received an M.Sc. (tech) in computer science from the computer science department of VSB - Technical University Ostrava. My major subject was artificial intelligence. I gained a PhD in computer science in 2007. The title of my thesis is "Towards Information Extraction in the Biomedical Domain: Methods and Resources".
As of 2022, I am a professor of language technology and as of 2021 the deputy director of the Department of Computing.
My primary field of research is language technology / natural language processing. In my post-PhD career, I have focused on the development of NLP tools and resources, primarily for Finnish but later also for numerous other languages via the Universal Dependencies project. My work is heavy on resource development, both in terms of data and machine learning pipelines. Open science and resources play an important role in my research: much of it is carried out in the open on GitHub, and as a rule all resources are openly available for unrestricted use. I work collaboratively, especially with my younger colleagues, rather than striving for deeper, primary-author inquiries.
I have been actively teaching since early in my PhD studies. I independently prepared my first advanced-level NLP course in 2004, and since ca. 2008 I have been teaching at least one course every year, substantially more during my bioinformatics lecturer appointment. While a lecturer in the bioinformatics MSc degree programme, I was lecturing international students in two cities. In 2016, I was tasked with developing and coordinating the introduction of a new 20 ECTS study module on natural language processing. This module is, with modifications, still in use and shared between the departments of Languages and Computing, both in terms of teaching and in terms of students. In 2019-2020 and 2020-2021 I was also co-lecturing, upon invitation, two courses in natural language processing at the Arcada University of Applied Sciences in Helsinki.
- Detecting Sequential Genre Change in Eighteenth-Century Texts (2022). CEUR Workshop Proceedings. (A4 Peer-reviewed article in conference proceedings)
- Explainable Publication Year Prediction of Eighteenth Century Texts with the BERT Model (2022). Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change. Rastas Iiro, Ryan Yann, Tiihonen Iiro, Qaraei Mohammedreza, Repo Liina, Babbar Rohit, Mäkelä Eetu, Tolonen Mikko, Ginter Filip. (A4 Peer-reviewed article in conference proceedings)
- Explaining Classes through Stable Word Attributions (2022). Annual Meeting of the Association for Computational Linguistics. (A4 Peer-reviewed article in conference proceedings)
- GEMv2: Multilingual NLG Benchmarking in a Single Line of Code (2022). Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Gehrmann Sebastian, Bhattacharjee Abhik, Mahendiran Abinaya, Wang Alex, Papangelis Alexandros, Madaan Aman, McMillan-Major Angelina, Shvets Anna, Upadhyay Ashish, Bohnet Bernd, Yao Bingsheng, Wilie Bryan, Bhagavatula Chandra, You Chaobin, Thomson Craig, Garbacea Cristina, Wang Dakuo, Deutsch Daniel, Xiong Deyi, Jin Di, Gkatzia Dimitra, Radev Dragomir, Clark Elizabeth, Durmus Esin, Ladhak Faisal, Ginter Filip, Winata Genta Indra, Strobelt Hendrik, Hayashi Hiroaki, Novikova Jekaterina, Kanerva Jenna, Chim Jenny, Zhou Jiawei, Clive Jordan, Maynez Joshua, Sedoc João, Juraska Juraj, Dhole Kaustubh, Chandu Khyathi Raghavi, Perez-Beltrachini Laura, Ribeiro Leonardo F.R., Tunstall Lewis, Zhang Li, Pushkarna Mahima, Creutz Mathias, White Michael, Kale Mihir Sanjay, Eddine Moussa Kamal, Daheim Nico, Subramani Nishant, Dusek Ondrej, Liang Paul Pu, Ammanamanchi Pawan Sasanka, Zhu Qi, Puduppully Ratish, Kriz Reno, Shahriyar Rifat, Cardenas Ronald, Mahamood Saad, Osei Salomey, Cahyawijaya Samuel, Štajner Sanja, Montella Sebastien, Jolly Shailza, Mille Simon, Hasan Tahmid, Shen Tianhao, Adewumi Tosin, Raunak Vikas, Raheja Vipul, Nikolaev Vitaly, Tsai Vivian, Jernite Yacine, Xu Ying, Sang Yisi, Liu Yixin, Hou Yufang. (A4 Peer-reviewed article in conference proceedings)
- Neural Network and Random Forest Models in Protein Function Prediction (2022). IEEE/ACM Transactions on Computational Biology and Bioinformatics. (A1 Peer-reviewed original article in a scientific journal)
- Out-of-Domain Evaluation of Finnish Dependency Parsing (2022). LREC Proceedings. (A4 Peer-reviewed article in conference proceedings)
- Paimen, piika ja emäntä. Arvot ja ammatit suomalaisessa näytelmäelokuvassa 1907–2017 (2022). Lähikuva. (A1 Peer-reviewed original article in a scientific journal)
- Textual Paraphrase Dataset for Deep Language Modelling (2022). European Language Grid: A Language Technology Platform for Multilingual Europe. Kanerva Jenna, Ginter Filip, Chang Li-Hsin, Skantsi Valtteri, Kilpeläinen Jemina, Kupari Hanna-Mari, Piirto Aurora, Saarni Jenna, Sevón Maija, Tarkka Otto. (A3 Peer-reviewed chapter in a book or other edited volume)
- Towards Automatic Short Answer Assessment for Finnish as a Paraphrase Retrieval Task (2022). Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022). Chang Li-Hsin, Kanerva Jenna, Ginter Filip. (A4 Peer-reviewed article in conference proceedings)
- Deep learning for sentence clustering in essay grading support (2021). Proceedings of the 14th International Conference on Educational Data Mining (EDM 2021). Chang Li-Hsin, Rastas Iiro, Pyysalo Sampo, Ginter Filip. (D3 Article in professional conference proceedings)
- Fine-grained Named Entity Annotation for Finnish (2021). Linköping Electronic Conference Proceedings. (A4 Peer-reviewed article in conference proceedings)
- Finnish Paraphrase Corpus (2021). Linköping Electronic Conference Proceedings. (A4 Peer-reviewed article in conference proceedings)
- Quantitative Evaluation of Alternative Translations in a Corpus of Highly Dissimilar Finnish Paraphrases (2021). Proceedings for the First Workshop on Modelling Translation: Translatology in the Digital Age. Chang Li-Hsin, Pyysalo Sampo, Kanerva Jenna, Ginter Filip. (A4 Peer-reviewed article in conference proceedings)
- Universal Lemmatizer: A sequence-to-sequence model for lemmatizing Universal Dependencies treebanks (2021). Natural Language Engineering. (A1 Peer-reviewed original article in a scientific journal)
- WikiBERT Models: Deep Transfer Learning for Many Languages (2021). Linköping Electronic Conference Proceedings. (A4 Peer-reviewed article in conference proceedings)
- Assisting nurses in care documentation: from automated sentence classification to coherent document structures with subject headings (2020). Journal of Biomedical Semantics. (A1 Peer-reviewed original article in a scientific journal)
- Classifying online corporate reputation with machine learning: a study in the banking domain (2020). Internet Research. (A1 Peer-reviewed original article in a scientific journal)
- Dependency parsing of biomedical text with BERT (2020). BMC Bioinformatics. (A1 Peer-reviewed original article in a scientific journal)
- Entity-pair embeddings for improving relation extraction in the biomedical domain (2020). European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. (A4 Peer-reviewed article in conference proceedings)
- Manuscripts, Qualitative Analysis and Features on Vectors: An Attempt for a Synthesis of Conventional and Computational Methods in the Attribution of Late Medieval Anti-Heretical Treatises (2020). Digital Histories. Emergent Approaches within the New Digital History. Reima Välimäki, Aleksi Vesanto, Anni Hella, Adam Poznański, Filip Ginter. (A3 Peer-reviewed chapter in a book or other edited volume)



