A1 Refereed original research article in a scientific journal

Evaluation of machine learning algorithms for improved risk assessment for Down's syndrome




Authors: Aki Koivu, Teemu Korpimäki, Petri Kivelä, Tapio Pahikkala, Mikko Sairanen

Publisher: PERGAMON-ELSEVIER SCIENCE LTD

Publication year: 2018

Journal: Computers in Biology and Medicine

Journal name in source: COMPUTERS IN BIOLOGY AND MEDICINE

Journal acronym: COMPUT BIOL MED

Volume: 98

First page: 1

Last page: 7

Number of pages: 7

ISSN: 0010-4825

eISSN: 1879-0534

DOI: https://doi.org/10.1016/j.compbiomed.2018.05.004


Abstract
Prenatal screening generates a large amount of data that is used to predict the risk of various disorders. Prenatal risk assessment is based on multiple clinical variables, and overall performance depends on how well the risk algorithm is optimized for the population in question. This article evaluates machine learning algorithms as a way to improve the performance of first trimester screening for Down syndrome. Machine learning algorithms offer an adaptive alternative for developing better risk assessment models from the existing clinical variables. Two real-world data sets were used to experiment with multiple classification algorithms. The implemented models were tested with a third real-world data set, and their performance was compared to a predicate method, a commercial risk assessment software. On the test data, the best-performing deep neural network model gave an area under the curve of 0.96 and a detection rate of 78% at a 1% false positive rate. On the same test data, the support vector machine model gave an area under the curve of 0.95 and a detection rate of 61% at a 1% false positive rate. Compared with the predicate method, the best support vector machine model was slightly inferior, but an optimized deep neural network model gave either higher detection rates at the same false positive rate or a similar detection rate at a markedly lower false positive rate. This finding could further improve first trimester screening for Down syndrome by using existing clinical variables and a large training data set derived from a specific population.
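
The two metrics quoted in the abstract, area under the curve and detection rate at a fixed false positive rate, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the toy risk scores are hypothetical, and it assumes only that each pregnancy receives a numeric risk score where a higher score means higher estimated risk.

```python
import math
from typing import Sequence


def auc(affected: Sequence[float], unaffected: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a randomly chosen
    affected case scores higher than a randomly chosen unaffected case
    (ties count as one half)."""
    wins = 0.0
    for a in affected:
        for u in unaffected:
            if a > u:
                wins += 1.0
            elif a == u:
                wins += 0.5
    return wins / (len(affected) * len(unaffected))


def detection_rate_at_fpr(affected: Sequence[float],
                          unaffected: Sequence[float],
                          max_fpr: float) -> float:
    """Fraction of affected cases flagged (score > threshold) when the
    threshold is chosen so that at most `max_fpr` of the unaffected
    cases are flagged."""
    ranked = sorted(unaffected, reverse=True)
    # Number of false positives the budget allows; the small epsilon
    # guards against floating-point results such as 28.999999999999996.
    allowed = math.floor(max_fpr * len(ranked) + 1e-9)
    # Only scores strictly above the (allowed + 1)-th highest unaffected
    # score are flagged, so at most `allowed` unaffected cases are flagged.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(1 for s in affected if s > threshold)
    return flagged / len(affected)


# Hypothetical toy scores, chosen only to make the arithmetic easy to follow.
affected_scores = [0.9, 0.8, 0.4, 0.2]
unaffected_scores = [0.7, 0.5, 0.3, 0.1, 0.05]

print(auc(affected_scores, unaffected_scores))
print(detection_rate_at_fpr(affected_scores, unaffected_scores, 0.2))
```

With real screening data, `max_fpr` would be set to 0.01 to match the 1% false positive rate used in the abstract. Because the threshold is fixed by the unaffected scores alone, two models can be compared at exactly the same false positive budget.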



Last updated on 2024-11-26 at 10:33