A4 Refereed article in a conference publication

Comparison of Word and Character Level Information for Medical Term Identification Using Convolutional Neural Networks and Transformers




Authors: Seneviratne Sandaru, Lenskiy Artem, Nolan Christopher, Daskalaki Eleni, Suominen Hanna

Editors: Michelle Honey, Charlene Ronquillo, Ting-Ting Lee, Lucy Westbrooke

Conference name: International Congress in Nursing Informatics

Publication year: 2021

Journal: Studies in Health Technology and Informatics

Book title: Nurses and Midwives in the Digital Age: Selected Papers, Posters and Panels from the 15th International Congress in Nursing Informatics

Journal name in source: Studies in health technology and informatics

Journal acronym: Stud Health Technol Inform

Series title: Studies in Health Technology and Informatics

Volume: 284

First page: 249

Last page: 253

ISSN: 0926-9630

eISSN: 1879-8365

DOI: https://doi.org/10.3233/SHTI210717

Web address: https://ebooks.iospress.nl/doi/10.3233/SHTI210717

Self-archived copy's web address: https://research.utu.fi/converis/portal/detail/Publication/176265377


Abstract
Complexity and domain-specificity make medical text hard to understand for patients and their next of kin. To simplify such text, this paper explored how word and character level information can be leveraged to identify medical terms when training data is limited. We created a dataset of medical and general terms using the Human Disease Ontology from BioPortal and Wikipedia pages. Our results from 10-fold cross validation indicated that convolutional neural networks (CNNs) and transformers perform competitively. The best F score of 93.9% was achieved by a CNN trained on both word and character level embeddings. Statistical significance tests demonstrated that general word embeddings provide rich word representations for medical term identification. Consequently, focusing on words is favorable for medical term identification if using deep learning architectures.

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.





Last updated on 2024-11-26 at 15:30