A4 Refereed article in a conference publication
Comparison of Word and Character Level Information for Medical Term Identification Using Convolutional Neural Networks and Transformers
Authors: Seneviratne Sandaru, Lenskiy Artem, Nolan Christopher, Daskalaki Eleni, Suominen Hanna
Editors: Michelle Honey, Charlene Ronquillo, Ting-Ting Lee, Lucy Westbrooke
Conference name: International Congress in Nursing Informatics
Publication year: 2021
Journal: Studies in Health Technology and Informatics
Book title: Nurses and Midwives in the Digital Age: Selected Papers, Posters and Panels from the 15th International Congress in Nursing Informatics
Journal name in source: Studies in health technology and informatics
Journal acronym: Stud Health Technol Inform
Series title: Studies in Health Technology and Informatics
Volume: 284
First page: 249
Last page: 253
ISSN: 0926-9630
eISSN: 1879-8365
DOI: https://doi.org/10.3233/SHTI210717
Web address: https://ebooks.iospress.nl/doi/10.3233/SHTI210717
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/176265377
Complexity and domain-specificity make medical text hard to understand for patients and their next of kin. As a step toward simplifying such text, this paper explored how word- and character-level information can be leveraged to identify medical terms when training data is limited. We created a dataset of medical and general terms using the Human Disease Ontology from BioPortal and Wikipedia pages. Our results from 10-fold cross-validation indicated that convolutional neural networks (CNNs) and transformers perform competitively. The best F score of 93.9% was achieved by a CNN trained on both word- and character-level embeddings. Statistical significance tests demonstrated that general word embeddings provide rich word representations for medical term identification. Consequently, focusing on words is favorable for medical term identification when using deep learning architectures.
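The evaluation protocol named in the abstract (10-fold cross validation scored with an F score) can be illustrated with a minimal sketch. The record does not say how the folds were drawn or which F score variant was reported, so the code below assumes a binary F1 on the medical-term class (label 1) and a simple interleaved split; the function names `f1_score` and `k_fold_indices` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Binary F1 for the positive class (assumed: 1 = medical term, 0 = general term)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_items) into k (train, test) index pairs.

    Assumption: items are assigned to folds by interleaving (item i goes to
    fold i % k); the paper's actual splitting strategy is not stated here.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits
```

In use, a model would be trained on each `train` index list, scored with `f1_score` on the matching `test` list, and the ten scores averaged; the training step itself is outside the scope of this sketch.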
Downloadable publication: This is an electronic reprint of the original article.