
Dataset Information


Integration of Survival and Binary Data for Variable Selection and Prediction: A Bayesian Approach.


ABSTRACT: We consider the problem where the data consist of a survival time and a binary outcome measurement for each individual, together with the corresponding predictors. The goal is to select the common set of predictors that affect both responses, not just one of them. In addition, we develop a survival prediction model based on data integration. This article is motivated by The Cancer Genome Atlas (TCGA) databank, which is currently the largest genomics and transcriptomics database. The data contain cancer survival information along with the cancer stage for each patient. They also contain reverse-phase protein array (RPPA) measurements for each individual, which are the predictors associated with these responses. The biological motivation is to identify the major actionable proteins associated with both survival outcomes and cancer stages. We develop a Bayesian hierarchical model to jointly model the survival time and the classification of the cancer stages. Moreover, to deal with the high dimensionality of the RPPA measurements, we use a shrinkage prior to identify significant proteins. Simulations and TCGA data analysis show that the joint integrated modeling approach improves survival prediction.

SUBMITTER: Maity AK 

PROVIDER: S-EPMC7729996 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature


Publications

Integration of Survival and Binary Data for Variable Selection and Prediction: A Bayesian Approach.

Arnab Kumar Maity, Raymond J. Carroll, Bani K. Mallick

Journal of the Royal Statistical Society. Series C, Applied Statistics. 2019-09-18; 5



Similar Datasets

| S-EPMC6218990 | biostudies-literature
| S-EPMC6222001 | biostudies-literature
| S-EPMC4848399 | biostudies-literature
| S-EPMC5885321 | biostudies-literature
| S-EPMC3031034 | biostudies-other
| S-EPMC4712617 | biostudies-literature
| S-EPMC5940219 | biostudies-literature