Refereed original article or data article in a scientific journal (A1)
Tournament leave-pair-out cross-validation for receiver operating characteristic analysis
Authors: Montoya Perez I., Airola A., Boström P., Jambor I., Pahikkala T.
Publisher: SAGE Publications Ltd
Publication year: 2019
Journal: Statistical Methods in Medical Research
Journal name in the database: Statistical Methods in Medical Research
Volume: 28
Issue: 10-11
Number of pages: 17
ISSN: 0962-2802
eISSN: 1477-0334
DOI: http://dx.doi.org/10.1177/0962280218795190
Web address: http://journals.sagepub.com/doi/pdf/10.1177/0962280218795190
Self-archived copy address: https://research.utu.fi/converis/portal/detail/Publication/36131645
Receiver operating characteristic (ROC) analysis is widely used for evaluating diagnostic systems. Recent studies have shown that estimating the area under the ROC curve (AUC) with standard cross-validation methods suffers from a large bias. Leave-pair-out cross-validation has been shown to correct this bias. However, while leave-pair-out produces an almost unbiased estimate of AUC, it does not provide the ranking of the data that is needed for plotting and analyzing the ROC curve. In this study, we propose a new method called tournament leave-pair-out cross-validation. This method extends leave-pair-out by creating a tournament from pair comparisons to produce a ranking for the data. Tournament leave-pair-out preserves the advantage of leave-pair-out for estimating AUC, while also making ROC analysis possible. Using both synthetic and real-world data, we show that tournament leave-pair-out is as reliable as leave-pair-out for AUC estimation, and we confirm the bias of leave-one-out cross-validation on low-dimensional data. As a case study on ROC analysis, we also evaluate how reliably sensitivity and specificity can be estimated from tournament leave-pair-out ROC curves.
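The abstract describes the method only in outline, so the following Python sketch is one plausible reading of it and not the authors' implementation. It assumes that every pair of examples is held out in turn, that a model is trained on the remaining data, and that the held-out example with the higher predicted score "wins" the comparison; each example's number of wins is then used as its ranking score. The learner `fit_centroid_scorer`, the half-win tie rule, the minimum class size check, and the toy data are all hypothetical choices made for illustration.

```python
from itertools import combinations


def fit_centroid_scorer(X, y):
    """Hypothetical stand-in learner (not from the article).

    Scores a point by its projection onto the difference of the class means.
    """
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    dim = len(X[0])
    mu_pos = [sum(x[d] for x in pos) / len(pos) for d in range(dim)]
    mu_neg = [sum(x[d] for x in neg) / len(neg) for d in range(dim)]
    w = [p - n for p, n in zip(mu_pos, mu_neg)]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))


def tournament_lpo_scores(X, y, fit=fit_centroid_scorer):
    """Return one tournament score (number of pairwise wins) per example."""
    # Assumption: at least 3 examples per class, so that every training set
    # still contains both classes after a pair has been held out.
    if y.count(1) < 3 or y.count(0) < 3:
        raise ValueError("need at least 3 examples of each class")
    n = len(X)
    wins = [0.0] * n
    for i, j in combinations(range(n), 2):
        keep = [k for k in range(n) if k != i and k != j]
        score = fit([X[k] for k in keep], [y[k] for k in keep])
        s_i, s_j = score(X[i]), score(X[j])
        if s_i > s_j:
            wins[i] += 1.0
        elif s_j > s_i:
            wins[j] += 1.0
        else:
            # Assumed tie rule: split the win between the two examples.
            wins[i] += 0.5
            wins[j] += 0.5
    return wins


def auc_from_scores(scores, y):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == 0]
    correct = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return correct / (len(pos) * len(neg))


# Toy, perfectly separable one-dimensional data (illustrative only).
X = [[0.0], [0.2], [0.4], [1.0], [1.2], [1.4]]
y = [0, 0, 0, 1, 1, 1]
wins = tournament_lpo_scores(X, y)
auc = auc_from_scores(wins, y)
print(wins, auc)
```

Because every example receives a single score, the scores can be thresholded to trace out an ROC curve and to read off sensitivity and specificity, which is the kind of analysis that pairwise leave-pair-out comparisons alone do not support.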
Downloadable publication: This is an electronic reprint of the original article.