Dataset Information

ROC Curve Analysis in the Presence of Imperfect Reference Standards.


ABSTRACT: The receiver operating characteristic (ROC) curve is an important tool for evaluating and comparing predictive models when the outcome is binary. If the class membership of every outcome is known, an ROC curve can be constructed for each model, and the curve with the greater area under the curve (AUC) indicates better performance. In practice, however, reference standards are often imperfect: the class membership of some data points is not fully determined. This situation is especially prevalent in high-throughput biomedical data, because obtaining a perfect reference standard for all data points is either too costly or technically impractical. To construct ROC curves for such data, the common practice is either to ignore the uncertainty in the reference or to remove data points with high uncertainty. Both approaches may bias the ROC curves and produce misleading results in method evaluation. Here we present a framework that incorporates membership uncertainty into the construction of the ROC curve, termed the expected ROC or "eROC" curve. We develop an efficient procedure for estimating the eROC curve. The advantages of using eROC are demonstrated on simulated and real data.
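
The abstract does not give the eROC estimator itself, so the following is only a minimal illustrative sketch of how membership uncertainty could enter an ROC construction. It assumes each data point carries a probability of being a true positive (the hypothetical `p_pos` array) and replaces hard counts with probability-weighted counts. The function names `soft_label_roc` and `auc_trapezoid` are invented for this illustration, and this weighting scheme is not claimed to be the authors' procedure.

```python
import numpy as np

def soft_label_roc(scores, p_pos):
    """ROC-like curve built from probability-weighted counts.

    scores : model scores, higher means "more likely positive".
    p_pos  : ASSUMED per-point probability of being a true positive
             (hypothetical input; with values in {0, 1} this reduces
             to the ordinary empirical ROC curve).
    Tied scores are not merged in this sketch.
    """
    scores = np.asarray(scores, dtype=float)
    p_pos = np.asarray(p_pos, dtype=float)

    # Sweep the threshold from the highest score downward.
    order = np.argsort(-scores, kind="stable")
    w_pos = p_pos[order]          # weight of each point as a positive
    w_neg = 1.0 - w_pos           # weight of each point as a negative

    # Weighted true/false positive rates after admitting each point.
    tpr = np.concatenate(([0.0], np.cumsum(w_pos) / w_pos.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(w_neg) / w_neg.sum()))
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

if __name__ == "__main__":
    scores = [0.9, 0.8, 0.3, 0.1]
    uncertain = [0.95, 0.60, 0.40, 0.05]   # made-up membership probabilities
    fpr, tpr = soft_label_roc(scores, uncertain)
    print(auc_trapezoid(fpr, tpr))
```

In this sketch no data point is discarded: a point with an uncertain label contributes partly to the positive class and partly to the negative class, in contrast to the two common practices the abstract criticizes.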

SUBMITTER: Liao P 

PROVIDER: S-EPMC5501420 | biostudies-literature | 2017 Jun

REPOSITORIES: biostudies-literature

Publications

ROC Curve Analysis in the Presence of Imperfect Reference Standards.

Liao Peizhou, Wu Hao, Yu Tianwei

Statistics in biosciences 20160719 1



Similar Datasets

| S-EPMC3596883 | biostudies-literature
| S-EPMC8162996 | biostudies-literature
| S-EPMC2924391 | biostudies-literature
| S-EPMC8148335 | biostudies-literature
| S-EPMC6587928 | biostudies-literature
| S-EPMC3577107 | biostudies-literature
| S-EPMC6668995 | biostudies-literature
| S-EPMC6212326 | biostudies-literature
| S-EPMC9793859 | biostudies-literature
| S-EPMC2211327 | biostudies-literature