Dataset Information

Assessment of performance of survival prediction models for cancer prognosis.

ABSTRACT: BACKGROUND: Cancer survival studies are commonly analyzed using survival-time prediction models for cancer prognosis. A number of different performance metrics are used to ascertain the concordance between the predicted risk score of each patient and the actual survival time, but these metrics can sometimes conflict. Alternatively, patients are sometimes divided into two classes according to a survival-time threshold, and binary classifiers are applied to predict each patient's class. Although this approach has several drawbacks, it does provide natural performance metrics such as positive and negative predictive values to enable unambiguous assessments. METHODS: We compare the survival-time prediction and survival-time threshold approaches to analyzing cancer survival studies. We review and compare common performance metrics for the two approaches. We present new randomization tests and cross-validation methods to enable unambiguous statistical inferences for several performance metrics used with the survival-time prediction approach. We consider five survival prediction models consisting of one clinical model, two gene expression models, and two models from combinations of clinical and gene expression models. RESULTS: A public breast cancer dataset was used to compare several performance metrics using five prediction models. 1) For some prediction models, the hazard ratio from fitting a Cox proportional hazards model was significant, but the two-group comparison was insignificant, and vice versa. 2) The randomization test and cross-validation were generally consistent with the p-values obtained from the standard performance metrics. 3) Binary classifiers depended strongly on how the risk groups were defined; a slight change of the survival threshold for assignment of classes led to very different prediction results.
CONCLUSIONS: 1) Different performance metrics for evaluation of a survival prediction model may give different conclusions about its discriminatory ability. 2) Evaluation using a high-risk versus low-risk group comparison depends on the selected risk-score threshold; a plot of p-values from all possible thresholds can show the sensitivity of the threshold selection. 3) A randomization test of the significance of Somers' rank correlation can be used for further evaluation of performance of a prediction model. 4) The cross-validated power of survival prediction models decreases as the training and test sets become less balanced.
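
The randomization test of Somers' rank correlation mentioned in the conclusions can be illustrated with a minimal sketch. This is not the authors' code: the function names `somers_d` and `randomization_test`, the pair-counting convention (a pair is usable when the shorter survival time is an observed event, and concordant when that patient has the higher risk score), and the two-sided permutation p-value are all assumptions made for illustration.

```python
import random


def somers_d(risk, time, event):
    """Somers' rank correlation for right-censored data (assumed convention).

    A pair (i, j) is usable when patient i has the shorter time and an
    observed event. It is concordant when patient i also has the higher
    risk score, discordant when lower; ties in risk count as neither.
    """
    concordant = discordant = usable = 0
    n = len(risk)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] < risk[j]:
                    discordant += 1
    return (concordant - discordant) / usable if usable else 0.0


def randomization_test(risk, time, event, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Somers' D.

    Risk scores are shuffled across patients, which breaks any link
    between score and outcome while keeping the survival data intact.
    """
    rng = random.Random(seed)
    observed = somers_d(risk, time, event)
    shuffled = list(risk)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(somers_d(shuffled, time, event)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): higher risk always goes with shorter survival.
d, p = randomization_test([5, 4, 3, 2, 1], [1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
print(d, p)
```

Shuffling the risk scores rather than the survival times leaves the censoring pattern untouched, so the set of usable pairs is the same in every permutation.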

SUBMITTER: Chen HC

PROVIDER: S-EPMC3410808 | biostudies-other | 2012

REPOSITORIES: biostudies-other

Publications

Assessment of performance of survival prediction models for cancer prognosis.

Hung-Chia Chen, Ralph L. Kodell, Kuang Fu Cheng, James J. Chen

BMC Medical Research Methodology, 2012-07-23

PMID: 22824262

Similar Datasets

Project description: Background: Several studies have explored the predictive performance of machine learning-based breast cancer risk prediction models and have reached conflicting conclusions. Thus, the performance of the current machine learning-based breast cancer risk prediction models and their benefits and weaknesses need to be evaluated for the future development of feasible and efficient risk prediction models. Objective: The aim of this review was to assess the performance and the clinical feasibility of the currently available machine learning-based breast cancer risk prediction models. Methods: We searched for papers published until June 9, 2021, on machine learning-based breast cancer risk prediction models in PubMed, Embase, and Web of Science. Studies describing the development or validation of models for predicting future breast cancer risk were included. The Prediction Model Risk of Bias Assessment Tool (PROBAST) was used to assess the risk of bias and the clinical applicability of the included studies. The pooled area under the curve (AUC) was calculated using the DerSimonian and Laird random-effects model. Results: A total of 8 studies with 10 data sets were included. Neural network was the most common machine learning method for the development of breast cancer risk prediction models. The pooled AUC of the machine learning-based optimal risk prediction model reported in each study was 0.73 (95% CI 0.66-0.80; approximate 95% prediction interval 0.56-0.96), with a high level of heterogeneity between studies (Q=576.07, I²=98.44%; P<.001). Head-to-head comparison of the 2 types of models trained on the same data set showed that machine learning models had a slight advantage over traditional risk factor-based models in predicting future breast cancer risk. The pooled AUC of the neural network-based risk prediction model was higher than that of the nonneural network-based optimal risk prediction model (0.71 vs 0.68, respectively). Subgroup analysis showed that incorporating imaging features in risk models resulted in a higher pooled AUC than not incorporating them (0.73 vs 0.61; P for heterogeneity=.001). The PROBAST analysis indicated that many machine learning models had a high risk of bias and poorly reported calibration analysis. Conclusions: Our review shows that the current machine learning-based breast cancer risk prediction models have some technical pitfalls and that their clinical feasibility and reliability are unsatisfactory.
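
The DerSimonian and Laird random-effects pooling named in the methods above can be sketched as follows. This is a generic textbook version, not the review's own analysis: the function name `dersimonian_laird`, the use of raw effect estimates with known within-study variances, and the normal-approximation 95% interval are assumptions, and the review may have pooled AUCs on a transformed scale.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects estimator.

    effects   : per-study estimates (for example, AUCs)
    variances : per-study within-study variances (squared standard errors)
    Returns (pooled estimate, 95% CI, tau^2, I^2 in percent).
    """
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2


# Hypothetical inputs: two studies with equal precision.
print(dersimonian_laird([0.6, 0.8], [0.01, 0.01]))
```

When the studies agree closely, Q falls below its degrees of freedom, the between-study variance is truncated to zero, and the result reduces to the fixed-effect inverse-variance average.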

Project description: Background: Cancer prognosis prediction is valuable for patients and clinicians because it allows them to appropriately manage care. A promising direction for improving the performance and interpretation of expression-based predictive models involves the aggregation of gene-level data into biological pathways. While many studies have used pathway-level predictors for cancer survival analysis, a comprehensive comparison of pathway-level and gene-level prognostic models has not been performed. To address this gap, we characterized the performance of penalized Cox proportional hazard models built using either pathway- or gene-level predictors for the cancers profiled in The Cancer Genome Atlas (TCGA) and pathways from the Molecular Signatures Database (MSigDB). Results: When analyzing TCGA data, we found that pathway-level models are more parsimonious, more robust, more computationally efficient and easier to interpret than gene-level models with similar predictive performance. For example, both pathway-level and gene-level models have an average Cox concordance index of ~0.85 for the TCGA glioma cohort; however, the gene-level model has twice as many predictors on average, its predictor composition is less stable across cross-validation folds, and estimation takes 40 times as long as for the pathway-level model. When the complex correlation structure of the data is broken by permutation, the pathway-level model has greater predictive performance while still retaining superior interpretative power, robustness, parsimony and computational efficiency relative to the gene-level models. For example, the average concordance index of the pathway-level model increases to 0.88 while that of the gene-level model falls to 0.56 for the TCGA glioma cohort using survival times simulated from uncorrelated gene expression data. Conclusion: The results of this study show that when the correlations among gene expression values are low, pathway-level analyses can yield better predictive performance, greater interpretative power, more robust models and less computational cost relative to a gene-level model. When correlations among genes are high, a pathway-level analysis provides equivalent predictive power compared to a gene-level analysis while retaining the advantages of interpretability, robustness and computational efficiency.

Project description:Objective: This work aims to systematically identify, describe, and appraise all prognostic models for cervical cancer and provide a reference for clinical practice and future research. Methods: We systematically searched PubMed, EMBASE, and Cochrane library databases up to December 2020 and included studies developing, validating, or updating a prognostic model for cervical cancer. Two reviewers extracted information based on the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modeling Studies checklist and assessed the risk of bias using the Prediction model Risk Of Bias ASsessment Tool. Results: Fifty-six eligible articles were identified, describing the development of 77 prognostic models and 27 external validation efforts. The 77 prognostic models focused on three types of cervical cancer patients at different stages, i.e., patients with early-stage cervical cancer (n = 29; 38%), patients with locally advanced cervical cancer (n = 27; 35%), and all-stage cervical cancer patients (n = 21; 27%). Among the 77 models, the most frequently used predictors were lymph node status (n = 57; 74%), the International Federation of Gynecology and Obstetrics stage (n = 42; 55%), histological types (n = 38; 49%), and tumor size (n = 37; 48%). The number of models that applied internal validation, presented a full equation, and assessed model calibration was 52 (68%), 16 (21%), and 45 (58%), respectively. Twenty-four models were externally validated, among which three were validated twice. None of the models were assessed with an overall low risk of bias. The Prediction Model of Failure in Locally Advanced Cervical Cancer model was externally validated twice, with acceptable performance, and seemed to be the most reliable. 
Conclusions: Methodological details including internal validation, sample size, and handling of missing data need greater emphasis, and external validation is needed to facilitate the application and generalization of models for cervical cancer.
