A4 Article in conference proceedings
Machine Learning to Automate the Assignment of Diagnosis Codes to Free-text Radiology Reports: a Method Description

List of Authors: Suominen H, Ginter F, Pyysalo S, Airola A, Pahikkala T, Salanterä S, Salakoski T
Publication year: 2008
Book title: Proceedings of the ICML/UAI workshop on Machine Learning in health care applications


We introduce a multi-label classification system for the automated assignment of diagnostic codes to radiology reports. The system is a cascade of text enrichment, feature selection and two classifiers. It was evaluated in the Computational Medicine Center’s 2007 Medical Natural Language Processing Challenge, a task in which 45 different ICD-9-CM codes were present in 94 combinations, and achieved an 87.7% micro-averaged F1-score and third place out of 44 submissions. The text enrichment and feature selection components in particular are shown to contribute to this result. Our study provides insight into the development of applications for real-life usage, which are currently rare.
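
For readers unfamiliar with the evaluation measure named in the abstract, the following is a minimal sketch of how a micro-averaged F1-score can be computed for multi-label code assignment. It is not taken from the paper: the function name `micro_f1`, the three toy reports and the codes assigned to them are assumptions chosen purely for illustration, and the challenge's official scoring script may differ in detail.

```python
from typing import Iterable, Set


def micro_f1(gold: Iterable[Set[str]], predicted: Iterable[Set[str]]) -> float:
    """Micro-averaged F1 over a collection of multi-label documents.

    True positives, false positives and false negatives are pooled over
    all documents and all labels before the score is computed, so
    frequent codes weigh more than rare ones.
    """
    tp = fp = fn = 0
    for gold_codes, pred_codes in zip(gold, predicted):
        tp += len(gold_codes & pred_codes)   # codes correctly assigned
        fp += len(pred_codes - gold_codes)   # codes assigned but not in gold
        fn += len(gold_codes - pred_codes)   # gold codes that were missed
    denominator = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there is nothing to score.
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical toy data: three reports with illustrative code sets.
gold = [{"786.2"}, {"780.6", "786.2"}, {"593.70"}]
pred = [{"786.2"}, {"780.6"}, {"593.70", "599.0"}]

score = micro_f1(gold, pred)
print(f"micro-averaged F1 = {score:.2f}")
```

In this toy case three code assignments are correct, one is spurious and one is missed, so the pooled counts are TP = 3, FP = 1 and FN = 1.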

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.
