Dataset Information

Feature selection strategies for drug sensitivity prediction.


ABSTRACT: Drug sensitivity prediction constitutes one of the main challenges in personalized medicine. Critically, the sensitivity of cancer cells to treatment depends on an unknown subset of a large number of biological features. Here, we compare standard, data-driven feature selection approaches to feature selection driven by prior knowledge of drug targets, target pathways, and gene expression signatures. We assess these methodologies on the Genomics of Drug Sensitivity in Cancer (GDSC) dataset, evaluating 2484 unique models. For 23 drugs, better predictive performance is achieved when the features are selected according to prior knowledge of drug targets and pathways. The best correlation of observed and predicted response on the test set is achieved for Linifanib (r = 0.75). Extending the drug-dependent features with gene expression signatures yields the most predictive models for 60 drugs, with Dabrafenib as the best-performing example. For many compounds, even a very small subset of drug-related features is highly predictive of drug sensitivity. Small feature sets selected using prior knowledge are more predictive for drugs targeting specific genes and pathways, while models with wider feature sets perform better for drugs affecting general cellular mechanisms. Appropriate feature selection strategies facilitate the development of interpretable models that are indicative for therapy design.
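The contrast the abstract draws between prior-knowledge and data-driven feature selection can be illustrated with a minimal sketch. This is not the authors' pipeline: the record does not state which selection methods or models were used, so the function names, the top-k correlation filter, the gene labels, and the toy values below are all assumptions made for illustration. The Pearson correlation function corresponds to the r statistic the abstract reports between observed and predicted response.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant input: r is undefined, treated as 0 here
    return cov / sqrt(var_x * var_y)


def select_by_prior_knowledge(feature_names, known_genes):
    """Keep only features listed in a curated gene set (e.g. drug targets)."""
    known = set(known_genes)
    return [name for name in feature_names if name in known]


def select_by_correlation(features, response, k):
    """Data-driven alternative (assumed): top-k features by |r| with response."""
    scores = {
        name: abs(pearson_r(values, response))
        for name, values in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: one value per cell line for each feature.
features = {
    "BRAF": [0.1, 0.9, 0.4, 0.8, 0.2],
    "GENE_A": [0.5, 0.4, 0.6, 0.5, 0.5],
    "GENE_B": [0.9, 0.1, 0.6, 0.2, 0.8],
}
response = [1.0, 4.0, 2.0, 3.8, 1.2]  # hypothetical drug response values

prior_features = select_by_prior_knowledge(features, ["BRAF"])
data_driven_features = select_by_correlation(features, response, k=2)
```

In a real comparison, each selected feature set would be used to train a separate model, and the models would be ranked by the correlation between observed and predicted response on held-out cell lines.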

SUBMITTER: Koras K 

PROVIDER: S-EPMC7287073 | biostudies-literature | 2020 Jun

REPOSITORIES: biostudies-literature

Publications

Feature selection strategies for drug sensitivity prediction.

Krzysztof Koras, Dilafruz Juraeva, Julian Kreis, Johanna Mazur, Eike Staub, Ewa Szczurek

Scientific Reports, 2020 Jun 10, issue 1


Similar Datasets

| S-EPMC4485860 | biostudies-literature
2005-07-30 | E-GEOD-3034 | biostudies-arrayexpress
| S-EPMC11009020 | biostudies-literature
2005-07-30 | GSE3034 | GEO
| S-EPMC11371799 | biostudies-literature
| S-EPMC3850986 | biostudies-literature
| S-EPMC6245785 | biostudies-other
| S-EPMC6047368 | biostudies-literature
| S-EPMC3376124 | biostudies-other
| S-EPMC7989622 | biostudies-literature