Sampo Pyysalo
sampo.pyysalo@utu.fi
ORCID identifier: https://orcid.org/0000-0002-6279-5000
natural language processing; machine learning; scientific text mining
I am a researcher in the TurkuNLP group (https://turkunlp.org/) and Research Fellow at the Department of Computing, University of Turku. My work focuses on machine learning for natural language processing, with particular application domains including scientific text mining, Finnish language technology, and large language models.
After defending my PhD thesis in computer science at the University of Turku, I held researcher positions at the University of Tokyo, University of Manchester and University of Cambridge before returning to the University of Turku in 2019.
The primary focus of my research is natural language processing using machine learning approaches, with recent emphasis on deep learning methods and large language models. I have worked on scientific text mining as an application area for nearly 20 years, focusing specifically on the English biomedical literature, and in recent years I have also addressed a variety of tasks in the processing of Finnish text as well as multi- and cross-lingual applications. My work covers the full range of natural language processing development, from initial task design through running manual annotation efforts and developing annotation tools and machine learning methods to building practical applications and organizing community challenges.
My current teaching focuses on the natural language processing study module shared between the departments of Languages and Computing, with courses ranging from introductory courses to a course on deep learning for natural language processing.
- Large-scale event extraction from literature with multi-level gene normalization (2013)
  PLoS ONE
  (A1 Refereed original research article in a scientific journal)
- Matrix representations, linear transformations, and kernels for disambiguation in natural language (2009)
  Machine Learning
  (A1 Refereed original research article in a scientific journal)
- A Graph Kernel for Protein-Protein Interaction Extraction (2008)
  Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing (BioNLP 2008)
  Airola A, Pyysalo S, Björne J, Pahikkala T, Ginter F, Salakoski T
  (A4 Refereed article in a conference publication)
- Comparative analysis of five protein-protein interaction corpora (2008)
  BMC Bioinformatics
  (A1 Refereed original research article in a scientific journal)
- Machine Learning to Automate the Assignment of Diagnosis Codes to Free-text Radiology Reports: a Method Description (2008)
  Proceedings of the ICML/UAI workshop on Machine Learning in health care applications
  Suominen H, Ginter F, Pyysalo S, Airola A, Pahikkala T, Salanterä S, Salakoski T
  (A4 Refereed article in a conference publication)
- Regularized Least-Squares for parse ranking (2005)
  Lecture Notes in Computer Science
  (A4 Refereed article in a conference publication)



