Jouni Luoma
jouni.a.luoma@utu.fi ORCID identifier: https://orcid.org/0000-0001-9286-1868
NLP; NER
TurkuNLP
After working for several years in managerial positions at IT companies, I found a passion for AI/ML. It started with completing MOOCs in my free time and continued with a study leave for full-time studies at the University of Turku. I am now a doctoral student and part-time project researcher in the TurkuNLP group at the University of Turku.
I am researching potential ways to create better training data and training strategies for transformer-based deep learning models, with primary emphasis on the NER task and large-scale application to English biomedical scientific text.
The research topics include: data augmentation for NER to improve the classification performance of NER models; combining existing but separate NER resources to train models that predict multiple named entity types simultaneously (biomedical resources often cover only one entity type at a time); and using large-scale data under distant or weak supervision, drawing on noisy, automatically classified data to create new data sets or enhance existing ones and to improve classifier performance.
The research will be carried out as part of a four-year project funded by the Academy of Finland, in collaboration with researchers at the University of Copenhagen who develop the largest biomedical text mining resource, the STRING database. The primary practical outcome of the work is to apply the methods planned in this research efficiently, in terms of both tagging performance and run-time performance, to improve this key biomedical resource.
No current teaching activities.
- RegulaTome: a corpus of typed, directed, and signed relations between biomedical entities in the scientific literature (2024). Database: The Journal of Biological Databases and Curation. (A1 Peer-reviewed original article in a scientific journal)
- FinGPT: Large Generative Models for a Small Language (2023). Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Luukkonen Risto, Komulainen Ville, Luoma Jouni, Eskelinen Anni, Kanerva Jenna, Kupari Hanna-Mari, Ginter Filip, Laippala Veronika, Muennighoff Niklas, Piktus Aleksandra, Wang Thomas, Tazi Nouamane, Scao Le Teven, Wolf Thomas, Suominen Osma, Sairanen Samuli, Merioksa Mikko, Heinonen Jyrki, Vahtola Aija, Antao Samuel, Pyysalo Sampo. (A4 Peer-reviewed article in conference proceedings)
- (2023). Database: The Journal of Biological Databases and Curation. (A1 Peer-reviewed original article in a scientific journal)
- S1000: a better taxonomic name corpus for biomedical information extraction (2023). Bioinformatics. (A1 Peer-reviewed original article in a scientific journal)
- Fine-grained Named Entity Annotation for Finnish (2021). Linköping Electronic Conference Proceedings; Proceedings of COLING: International Conference on Computational Linguistics. (A4 Peer-reviewed article in conference proceedings)
- Overview of DrugProt BioCreative VII track: quality evaluation and large scale text mining of drug-gene/protein relations (2021). Proceedings of the BioCreative VII Challenge Evaluation Workshop. Miranda Antonio, Mehryary Farrokh, Luoma Jouni, Pyysalo Sampo, Valencia Alfonso, Krallinger Martin. (B3 Non-peer-reviewed article in conference proceedings)
- A broad-coverage corpus for Finnish named entity recognition (2020). 12th International Conference on Language Resources and Evaluation. Jouni Luoma, Miika Oinonen, Maria Pyykönen, Veronika Laippala, Sampo Pyysalo. (A4 Peer-reviewed article in conference proceedings)
- Exploring Cross-sentence Contexts for Named Entity Recognition with BERT (2020). (A4 Peer-reviewed article in conference proceedings)