Dataset Information

Assessing the accuracy of predictive models for numerical data: Not r nor r2, why not? Then what?

ABSTRACT: Assessing the accuracy of predictive models is critical because predictive models are increasingly used across disciplines and predictive accuracy determines the quality of the resulting predictions. The Pearson product-moment correlation coefficient (r) and the coefficient of determination (r2) are among the most widely used measures for assessing predictive models for numerical data, although they have been argued to be biased, insufficient and misleading. In this study, geometrical graphs were used to illustrate which quantities enter the calculation of r and r2, and simulations were used to demonstrate the behaviour of r and r2 and to compare three accuracy measures under various scenarios. Relevant confusions about r and r2 have been clarified. The calculation of r and r2 is not based on the differences between the predicted and observed values. The existing error measures suffer from various limitations and cannot indicate accuracy. Variance explained by predictive models based on cross-validation (VEcv) is free of these limitations and is a reliable accuracy measure. Legates and McCabe's efficiency (E1) is an alternative accuracy measure. In conclusion, r and r2 do not measure accuracy and are incorrect accuracy measures, whereas VEcv and E1 are recommended for assessing accuracy. Applying these accuracy measures would encourage the development of predictive models with improved accuracy, generating predictions for evidence-informed decision-making.
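The abstract's central point, that r ignores the differences between predicted and observed values while VEcv and E1 do not, can be illustrated with a small sketch. The formulas below are the commonly stated definitions (VEcv as one minus the ratio of squared prediction errors to squared deviations from the observed mean, E1 as the same ratio with absolute values, both expressed as percentages). They are not given in this record, so treat them as assumptions. The function names and the toy data are hypothetical, and `pred` is assumed to hold predictions obtained from cross-validation.

```python
import numpy as np

def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(obs, pred)[0, 1])

def vecv(obs, pred):
    """Variance explained (%), assuming pred comes from cross-validation."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    ss_err = np.sum((obs - pred) ** 2)          # squared prediction errors
    ss_tot = np.sum((obs - obs.mean()) ** 2)    # squared deviations from the mean
    return float((1.0 - ss_err / ss_tot) * 100.0)

def e1(obs, pred):
    """Legates and McCabe's efficiency (%), using absolute differences."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    abs_err = np.sum(np.abs(obs - pred))
    abs_tot = np.sum(np.abs(obs - obs.mean()))
    return float((1.0 - abs_err / abs_tot) * 100.0)

# Toy data (hypothetical): predictions perfectly correlated with the
# observations but far from them in value.
obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
biased_pred = 2.0 * obs + 10.0

print("r    =", pearson_r(obs, biased_pred))
print("VEcv =", vecv(obs, biased_pred))
print("E1   =", e1(obs, biased_pred))
```

With these toy values r is effectively 1 even though every prediction is badly off, while VEcv and E1 are strongly negative, which is consistent with the behaviour the abstract describes.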

SUBMITTER: Li J

PROVIDER: S-EPMC5570302 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature


Publications

Assessing the accuracy of predictive models for numerical data: Not r nor r2, why not? Then what?

Li Jin J

PLoS ONE, 2017-08-24, issue 8


PMID: 28837692

Similar Datasets

Assessing the accuracy of predictive models with interval-censored data.
| S-EPMC8974097 | biostudies-literature

Better null models for assessing predictive accuracy of disease models.
| S-EPMC10162537 | biostudies-literature

Assessing the effect of data integration on predictive ability of cancer survival models.
| S-EPMC7712491 | biostudies-literature

Graphical and numerical diagnostic tools to assess multiple imputation models by posterior predictive checking.
| S-EPMC10285146 | biostudies-literature

Data based predictive models for odor perception.
| S-EPMC7553929 | biostudies-literature

Learning patient-specific predictive models from clinical data.
| S-EPMC2933959 | biostudies-literature

Data describing the accuracy of non-numerical visual features in predicting fMRI responses to numerosity.
| S-EPMC5702870 | biostudies-literature

Numerical models for assessing the risk of leaflet thrombosis post-transcatheter aortic valve-in-valve implantation.
| S-EPMC7813235 | biostudies-literature

Dynamic Predictive Models With Visualized Machine Learning for Assessing Chondrosarcoma Overall Survival.
| S-EPMC9351692 | biostudies-literature

Assessing the capacity of social determinants of health data to augment predictive models identifying patients in need of wraparound social services.
| S-EPMC7647142 | biostudies-literature