A1 Refereed original research article in a scientific journal

Dependency parsing of biomedical text with BERT




Authors: Kanerva Jenna, Ginter Filip, Pyysalo Sampo

Publisher: Biomed Central Ltd.

Publication year: 2020

Journal: BMC Bioinformatics

Article number: 580

Volume: 21

Issue: Suppl 23

Number of pages: 12

ISSN: 1471-2105

eISSN: 1471-2105

DOI: https://doi.org/10.1186/s12859-020-03905-8

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/51380265


Abstract

Background: Syntactic analysis, or parsing, is a key task in natural language processing and a required component for many text mining approaches. In recent years, Universal Dependencies (UD) has emerged as the leading formalism for dependency parsing. While a number of recent tasks centering on UD have substantially advanced the state of the art in multilingual parsing, there has been little study of parsing texts from specialized domains such as biomedicine.
Methods: We explore the application of state-of-the-art neural dependency parsing methods to biomedical text using the recently introduced CRAFT-SA shared task dataset. The CRAFT-SA task broadly follows the UD representation and recent UD task conventions, allowing us to fine-tune the UD-compatible Turku Neural Parser and UDify neural parsers to the task. We further evaluate the effect of transfer learning using a broad selection of BERT models, including several models pre-trained specifically for biomedical text processing.
Results: We find that recently introduced neural parsing technology is capable of generating highly accurate analyses of biomedical text, substantially improving on the best performance reported in the original CRAFT-SA shared task. We also find that initialization using a deep transfer learning model pre-trained on in-domain texts is key to maximizing the performance of the parsing methods.
Keywords: Parsing, Deep learning, CRAFT


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




