A3 Book chapter
Improving layman readability of clinical narratives with unsupervised synonym replacement

List of Authors: Moen H., Peltonen L., Koivumäki M., Suhonen H., Salakoski T., Ginter F., Salanterä S.
Publisher: IOS Press
Place: Amsterdam
Publication year: 2018
Book title: Building Continents of Knowledge in Oceans of Data: The Future of Co-Created eHealth
Journal name in source: Studies in Health Technology and Informatics
Title of series: Studies in Health Technology and Informatics
Volume number: 247
Number of pages: 5
ISBN: 978-1-61499-851-8
eISBN: 978-1-61499-852-5
ISSN: 0926-9630


We report on the development and evaluation of a prototype tool intended to help laymen and patients understand the content of clinical narratives. The tool relies largely on unsupervised machine learning applied to two large corpora of unlabeled text: a clinical corpus and a general-domain corpus. A joint semantic word-space model is built from these corpora and used to extract easier-to-understand alternatives for words that laymen are likely to find difficult. Two domain experts evaluate the tool, and inter-rater agreement is calculated. When the tool suggests ten alternatives for each difficult word, it suggests acceptable lay words for 55.51% of the difficult words. This and future manual evaluations will serve to further improve performance, at which point supervised machine learning will also be used.
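
As a rough illustration of the idea in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that every word has a vector in one shared space, that candidates are ranked by cosine similarity, and that a word counts as "lay" when it is frequent enough in the general-domain corpus. The abstract states none of these details, and the names `suggest_lay_alternatives`, `coverage`, `min_lay_freq`, and all toy vectors and counts are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def suggest_lay_alternatives(word, vectors, lay_frequency, k=10, min_lay_freq=100):
    """Return up to k nearest neighbours of `word` that look like lay words.

    Assumption: a word is treated as a lay word when its count in the
    general-domain corpus is at least `min_lay_freq` (hypothetical threshold).
    """
    target = vectors[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in vectors.items()
        if other != word and lay_frequency.get(other, 0) >= min_lay_freq
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


def coverage(suggestions, acceptable):
    """Share of difficult words with at least one acceptable suggestion.

    Assumption: this is one possible reading of the evaluation figure; the
    abstract does not define the measure in detail.
    """
    hits = sum(
        1 for word, cands in suggestions.items()
        if any(c in acceptable.get(word, set()) for c in cands)
    )
    return hits / len(suggestions) if suggestions else 0.0


# Toy data, invented purely for illustration.
vectors = {
    "dyspnea": [0.9, 0.1, 0.0],
    "breathlessness": [0.8, 0.2, 0.1],
    "apnea": [0.85, 0.15, 0.0],
    "fever": [0.0, 0.9, 0.1],
    "pyrexia": [0.1, 0.95, 0.0],
}
lay_frequency = {"breathlessness": 500, "fever": 9000, "apnea": 20,
                 "dyspnea": 5, "pyrexia": 3}

top = suggest_lay_alternatives("dyspnea", vectors, lay_frequency, k=2)
share = coverage(
    {"dyspnea": top,
     "pyrexia": suggest_lay_alternatives("pyrexia", vectors, lay_frequency, k=2)},
    {"dyspnea": {"breathlessness"}, "pyrexia": {"temperature"}},
)
```

In this toy setup, "apnea" is close to "dyspnea" in the vector space but is filtered out because it is rare in the general-domain counts, which is the kind of behaviour a frequency-based lay-word filter is meant to produce. A real system would learn the vectors from the two corpora rather than hard-code them.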

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.
