A1 Peer-reviewed original article in a scientific journal

Evaluation of machine learning algorithms for improved risk assessment for Down's syndrome




Authors: Aki Koivu, Teemu Korpimäki, Petri Kivelä, Tapio Pahikkala, Mikko Sairanen

Publisher: PERGAMON-ELSEVIER SCIENCE LTD

Publication year: 2018

Journal: Computers in Biology and Medicine

Journal name in source database: COMPUTERS IN BIOLOGY AND MEDICINE

Journal acronym: COMPUT BIOL MED

Volume: 98

First page: 1

Last page: 7

Number of pages: 7

ISSN: 0010-4825

eISSN: 1879-0534

DOI: https://doi.org/10.1016/j.compbiomed.2018.05.004


Abstract
Prenatal screening generates a large amount of data that is used to predict the risk of various disorders. Prenatal risk assessment is based on multiple clinical variables, and overall performance depends on how well the risk algorithm is optimized for the population in question. This article evaluates machine learning algorithms for improving the performance of first-trimester screening for Down syndrome. Machine learning algorithms offer an adaptive alternative for developing better risk assessment models using the existing clinical variables. Two real-world data sets were used to experiment with multiple classification algorithms. The implemented models were tested with a third real-world data set, and their performance was compared to a predicate method, a commercial risk assessment software package. The best-performing deep neural network model gave an area under the curve of 0.96 and a detection rate of 78% at a 1% false positive rate on the test data. The support vector machine model gave an area under the curve of 0.95 and a detection rate of 61% at a 1% false positive rate on the same test data. Compared with the predicate method, the best support vector machine model was slightly inferior, but an optimized deep neural network model gave higher detection rates at the same false positive rate, or a similar detection rate at a markedly lower false positive rate. This finding could further improve first-trimester screening for Down syndrome by using existing clinical variables and a large training data set derived from a specific population.
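
The two figures of merit reported in the abstract, area under the ROC curve and detection rate at a fixed false positive rate, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names (roc_points, auc, detection_rate_at_fpr) are hypothetical, and the labels and scores are made-up toy values rather than data from the study. The sketch assumes that "detection rate at a 1% false positive rate" means the highest true positive rate reachable by any score threshold whose false positive rate does not exceed 1%.

```python
def roc_points(labels, scores):
    """Sweep the decision threshold from high to low and return (fpr, tpr) lists.

    labels: 1 for an affected pregnancy, 0 for an unaffected one (toy convention).
    scores: model risk scores, higher meaning higher predicted risk.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Handle tied scores together so they yield a single ROC point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


def detection_rate_at_fpr(fpr, tpr, max_fpr):
    """Highest detection rate (TPR) among thresholds with FPR <= max_fpr."""
    return max(t for f, t in zip(fpr, tpr) if f <= max_fpr)


# Toy data, invented for illustration only (3 affected, 5 unaffected cases).
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

fpr, tpr = roc_points(labels, scores)
print("AUC:", auc(fpr, tpr))
print("Detection rate at 1% FPR:", detection_rate_at_fpr(fpr, tpr, 0.01))
```

With only a handful of unaffected cases, as in this toy set, a 1% false positive rate permits no false positives at all. Estimating the detection rate at such a low false positive rate with any precision requires a large test set.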



Last updated on 2024-11-26 at 10:33