
Dataset Information


Heterogeneity Aware Random Forest for Drug Sensitivity Prediction.


ABSTRACT: Samples collected in pharmacogenomics databases typically belong to various cancer types. When designing a drug sensitivity predictive model from such a database, a natural question is whether a model trained on diverse, inter-tumor heterogeneous samples will perform similarly to a predictive model that accounts for the heterogeneity of the samples in both training and prediction. We explore this hypothesis and observe that ensemble model predictions obtained when the cancer type is known outperform predictions made when that information is withheld, even when the sample sizes for the former are considerably lower than the combined sample size. To incorporate heterogeneity into the commonly used ensemble-based predictive model of Random Forests, we propose Heterogeneity Aware Random Forests (HARF), which assign weights to the trees based on the category of the sample. We treat heterogeneity as a latent class allocation problem and present a covariate-free class allocation approach based on the distribution of leaf nodes of the model ensemble. Applications on the CCLE and GDSC databases show that HARF outperforms the traditional Random Forest when the average drug responses of the cancer types differ.
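
The idea of weighting trees by sample category can be illustrated with a minimal sketch. This is not the HARF algorithm from the paper: the weighting rule (inverse mean squared error per category), the hand-built one-split "trees", the toy data, and all function names are assumptions made for illustration only. The sketch also assumes the category of the query sample is known, so it does not cover the latent class allocation step.

```python
from statistics import mean


def make_stump(feature, threshold, left_value, right_value):
    """Return a one-split regression 'tree' as a plain function (hypothetical helper)."""
    def predict(x):
        return left_value if x[feature] <= threshold else right_value
    return predict


def category_tree_weights(trees, X, y, categories):
    """Weight each tree, per category, by its inverse MSE on that category's samples.

    Assumed weighting rule for illustration; the paper's rule may differ.
    """
    weights = {}
    for cat in sorted(set(categories)):
        idx = [i for i, c in enumerate(categories) if c == cat]
        raw = []
        for tree in trees:
            mse = mean((tree(X[i]) - y[i]) ** 2 for i in idx)
            raw.append(1.0 / (mse + 1e-9))  # small constant avoids division by zero
        total = sum(raw)
        weights[cat] = [w / total for w in raw]  # normalize so weights sum to 1
    return weights


def weighted_predict(trees, weights, x, category):
    """Category-aware prediction: weighted average of the tree outputs."""
    return sum(w * tree(x) for w, tree in zip(weights[category], trees))


def uniform_predict(trees, x):
    """Standard Random Forest style prediction: plain average of the tree outputs."""
    return mean(tree(x) for tree in trees)


# Toy data: two categories with different average responses.
X = [[0.1], [0.2], [0.8], [0.9]]
y = [1.0, 1.0, 5.0, 5.0]
categories = ["A", "A", "B", "B"]

# Two hand-built trees; each happens to fit one category well.
trees = [
    make_stump(0, 0.5, 1.0, 2.0),
    make_stump(0, 0.5, 4.0, 5.0),
]

weights = category_tree_weights(trees, X, y, categories)
query = [0.15]
aware = weighted_predict(trees, weights, query, "A")
plain = uniform_predict(trees, query)
print(f"category-aware: {aware:.3f}, uniform: {plain:.3f}")
```

In this toy setting the category-aware average leans almost entirely on the tree that fits category A, while the uniform average is pulled toward the other category's response level. This mirrors the abstract's point that tree weighting matters most when average responses differ between categories.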

SUBMITTER: Rahman R 

PROVIDER: S-EPMC5595802 | biostudies-other | 2017 Sep

REPOSITORIES: biostudies-other


Publications

Heterogeneity Aware Random Forest for Drug Sensitivity Prediction.

Raziur Rahman, Kevin Matlock, Souparno Ghosh, Ranadip Pal

Scientific Reports, 2017-09-12, issue 1



Similar Datasets

2005-07-30 | E-GEOD-3034 | biostudies-arrayexpress
2012-05-09 | E-GEOD-37858 | biostudies-arrayexpress
2012-05-10 | GSE37858 | GEO
2005-07-30 | GSE3034 | GEO
2022-05-16 | GSE189510 | GEO
| S-EPMC3530872 | biostudies-other
| S-EPMC4684346 | biostudies-literature
| S-EPMC8425567 | biostudies-literature
| S-EPMC5881105 | biostudies-other
| S-EPMC3018816 | biostudies-other