A1 Peer-reviewed original article in a scientific journal
Question Answering models for information extraction from perovskite materials science literature
Authors: Sipilä, Matilda; Mehryary, Farrokh; Pyysalo, Sampo; Ginter, Filip; Todorović, Milica
Publisher: Springer Science and Business Media LLC
Publication year: 2025
Journal: Communications materials
Article number: 260
Volume: 6
eISSN: 2662-4443
DOI: https://doi.org/10.1038/s43246-025-00979-w
Open access status of the publication at the time of recording: Openly available
Open access status of the publication channel: Fully open publication channel
Web address: https://doi.org/10.1038/s43246-025-00979-w
Address of the self-archived copy: https://research.utu.fi/converis/portal/detail/Publication/505920997
Scientific text is a promising source of data in materials science, with ongoing research into utilising textual data for materials discovery. In this study, we developed and tested a Question Answering (QA) approach to extract material-property relationships from scientific publications. QA performance was evaluated for information extraction of perovskite bandgaps based on a human query. We observed considerable variation in results across five different large language models fine-tuned for the QA task. The best extraction accuracy was achieved with QA MatSciBERT, and F1-scores improved on the current state of the art. QA also outperformed three of the latest generative large language models on the information extraction task, with the exception of the GPT-4 model. This work demonstrates the QA workflow and paves the way towards further applications. The simplicity and versatility of the QA approach point to its considerable potential for text-driven discoveries in materials research.
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
The research was funded by the Research Council of Finland through grant number 345698. M.S. thanks the University of Turku Graduate School (UTUGS) and Finnish Cultural Foundation (grant number 241085) grants for doctoral research.