A4 Refereed article in a conference publication

Clustering Nursing Sentences - Comparing Three Sentence Embedding Methods




Authors: Moen Hans, Suhonen Henry, Salanterä Sanna, Salakoski Tapio, Peltonen Laura-Maria

Editors: Brigitte Séroussi, Patrick Weber, Ferdinand Dhombres, Cyril Grouin, Jan-David Liebe, Sylvia Pelayo, Andrea Pinna, Bastien Rance, Lucia Sacchi, Adrien Ugon, Arriel Benis, Parisis Gallos

Conference name: Medical Informatics Europe

Publication year: 2022

Journal: Medical informatics Europe

Book title: Challenges of Trustable AI and Added-Value on Health

Journal name in source: Studies in health technology and informatics

Journal acronym: Stud Health Technol Inform

Series title: Studies in Health Technology and Informatics

Volume: 294

First page: 854

Last page: 858

ISBN: 978-1-64368-284-6

eISBN: 978-1-64368-285-3

ISSN: 0926-9630

eISSN: 1879-8365

DOI: https://doi.org/10.3233/SHTI220606

Web address: https://ebooks.iospress.nl/doi/10.3233/SHTI220606

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/178641781


Abstract

In health sciences, high-quality text embeddings may augment qualitative data analysis of large amounts of text by enabling, e.g., searching and clustering of health information. This study aimed to evaluate three different sentence-level embedding methods in clustering sentences in nursing narratives from individual patients' hospital care episodes. Two of these embeddings are generated from language models based on the BERT framework, and the third on the Sent2Vec method. These embedding methods were used to cluster sentences from 20 patient care episodes and the results were manually evaluated. Findings suggest that the best clusters were produced by the embeddings from a BERT model fine-tuned for the proxy task of predicting subject headings for nursing text.
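
To make the idea of clustering sentences by their embeddings concrete, the following is a minimal sketch only. The abstract does not state which clustering algorithm or similarity measure the study used, so k-means over cosine similarity is an assumption made here for illustration. The three-dimensional vectors are invented stand-ins for real sentence embeddings (which would come from a BERT or Sent2Vec model), and the function names are hypothetical.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine(u, v):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(a * b for a, b in zip(u, v))


def kmeans_cosine(vectors, k, iterations=20, seed=0):
    """Assign each vector to one of k clusters using cosine similarity.

    This is an illustrative stand-in, not the procedure used in the study.
    """
    rng = random.Random(seed)
    points = [normalize(v) for v in vectors]
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: pick the most similar centroid for each point.
        labels = [
            max(range(k), key=lambda c: cosine(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the normalized mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels


# Invented toy "embeddings": the first two point in one direction,
# the last two in another. Real embeddings have hundreds of dimensions.
toy_embeddings = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.8, 0.2],
]

labels = kmeans_cosine(toy_embeddings, k=2)
print(labels)
```

Sentences whose vectors point in similar directions receive the same label. Which label number a group gets depends on the random initialization, so only the grouping itself is meaningful.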


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.





Last updated on 2024-26-11 at 11:41