Dataset Information

Assessing the effect of data integration on predictive ability of cancer survival models.


ABSTRACT: Cancer is the second leading cause of death in the United States. To improve cancer prognosis and survival rates, a better understanding of the multi-level contributory factors associated with cancer survival is needed. However, prior research on cancer survival has focused primarily on individual-level factors because integrated datasets have been scarce. In this study, we sought to examine how data integration impacts the performance of cancer survival prediction models. We linked data from four different sources and evaluated the performance of Cox proportional hazards models for breast, lung, and colorectal cancers under three common data integration scenarios. We showed that adding contextual-level predictors to survival models by linking multiple datasets improved model fit and performance. We also showed that different representations of the same variable or concept have differential impacts on model performance. When building statistical models for cancer outcomes, it is important to consider cross-level predictor interactions.

SUBMITTER: Guo Y 

PROVIDER: S-EPMC7712491 | biostudies-literature | 2020 Mar

REPOSITORIES: biostudies-literature

Publications

Assessing the effect of data integration on predictive ability of cancer survival models.

Guo Y, Bian J, Modave F, Li Q, George TJ, Prosperi M, Shenkman E

Health Informatics Journal, 2019-01-23, issue 1


Similar Datasets

| S-EPMC8974097 | biostudies-literature
| S-EPMC6868770 | biostudies-literature
| S-EPMC5570302 | biostudies-literature
| S-EPMC6407787 | biostudies-literature
| S-EPMC11306523 | biostudies-literature
| S-EPMC11001619 | biostudies-literature
| S-EPMC5961799 | biostudies-literature
| S-EPMC7007312 | biostudies-literature
| S-EPMC7553929 | biostudies-literature
| S-EPMC4056181 | biostudies-literature