A1 Refereed original research article in a scientific journal

Tournament leave-pair-out cross-validation for receiver operating characteristic analysis




Authors: Montoya Perez I., Airola A., Boström P., Jambor I., Pahikkala T.

Publisher: SAGE Publications Ltd

Publication year: 2019

Journal: Statistical Methods in Medical Research

Journal name in source: Statistical Methods in Medical Research

Volume: 28

Issue: 10-11

First page: 2975

Last page: 2991

Number of pages: 17

ISSN: 0962-2802

eISSN: 1477-0334

DOI: https://doi.org/10.1177/0962280218795190

Web address: http://journals.sagepub.com/doi/pdf/10.1177/0962280218795190

Self-archived copy's web address: https://research.utu.fi/converis/portal/detail/Publication/36131645


Abstract

Receiver operating characteristic analysis is widely used for evaluating diagnostic systems. Recent studies have shown that estimating the area under the receiver operating characteristic curve with standard cross-validation methods suffers from a large bias. Leave-pair-out cross-validation has been shown to correct this bias. However, while leave-pair-out produces an almost unbiased estimate of the area under the receiver operating characteristic curve, it does not provide the ranking of the data that is needed for plotting and analyzing the curve itself. In this study, we propose a new method called tournament leave-pair-out cross-validation. This method extends leave-pair-out by creating a tournament from the pair comparisons to produce a ranking of the data. Tournament leave-pair-out preserves the advantage of leave-pair-out for estimating the area under the receiver operating characteristic curve, while also allowing receiver operating characteristic analyses to be performed. Using both synthetic and real-world data, we show that tournament leave-pair-out is as reliable as leave-pair-out for estimating the area under the receiver operating characteristic curve, and we confirm the bias of leave-one-out cross-validation on low-dimensional data. As a case study on receiver operating characteristic analysis, we also evaluate how reliably sensitivity and specificity can be estimated from tournament leave-pair-out receiver operating characteristic curves.
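
The tournament idea described in the abstract can be sketched roughly as follows. This is only an illustration based on the abstract, not the authors' implementation: the function names (`ridge_fit_predict`, `tlpo_scores`, `auc_from_scores`) are hypothetical, the ridge-regression learner is a stand-in for whatever classifier is being evaluated, and counting wins (with half a win for a tie) is an assumed way of turning the pair comparisons into a ranking.

```python
import itertools
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    # Stand-in learner (an assumption): ridge regression without intercept.
    d = X_train.shape[1]
    w = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d),
                        X_train.T @ y_train)
    return X_test @ w


def tlpo_scores(X, y, fit_predict=ridge_fit_predict):
    # Hold out every pair of points, train on the rest, and let the
    # held-out point with the higher prediction "win" the comparison.
    n = len(y)
    wins = np.zeros(n)
    for i, j in itertools.combinations(range(n), 2):
        mask = np.ones(n, dtype=bool)
        mask[[i, j]] = False
        p_i, p_j = fit_predict(X[mask], y[mask], X[[i, j]])
        if p_i > p_j:
            wins[i] += 1.0
        elif p_j > p_i:
            wins[j] += 1.0
        else:
            # Assumed tie rule: split the point between both players.
            wins[i] += 0.5
            wins[j] += 0.5
    return wins  # tournament scores; sorting them gives a ranking


def auc_from_scores(scores, y):
    # Fraction of (positive, negative) pairs ranked correctly,
    # counting ties as one half.
    pos = scores[y == 1]
    neg = scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))
```

Because every point receives a single tournament score, the scores can be thresholded like ordinary classifier outputs to trace a receiver operating characteristic curve or to read off sensitivity and specificity. Note that the loop trains one model per pair, so the cost grows quadratically with the number of data points.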







Last updated on 2024-26-11 at 18:52