Dataset Information

The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling.


ABSTRACT: Pathway-centric approaches are widely used to interpret and contextualize -omics data. However, databases contain different representations of the same biological pathway, which may lead to different results of statistical enrichment analysis and predictive models in the context of precision medicine. We have performed an in-depth benchmarking of the impact of pathway database choice on statistical enrichment analysis and predictive modeling. We analyzed five cancer datasets using three major pathway databases and developed an approach to merge several databases into a single integrative one: MPath. Our results show that equivalent pathways from different databases yield disparate results in statistical enrichment analysis. Moreover, we observed a significant dataset-dependent impact on the performance of machine learning models on different prediction tasks. In some cases, MPath significantly improved prediction performance and also reduced the variance of prediction performances. Furthermore, MPath yielded more consistent and biologically plausible results in statistical enrichment analyses. In summary, this benchmarking study demonstrates that pathway database choice can influence the results of statistical enrichment analysis and predictive modeling. Therefore, we recommend the use of multiple pathway databases or integrative ones.
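The central claim of the abstract, that different gene-set definitions of the "same" pathway can give different enrichment results, can be illustrated with a small over-representation test. The following is a minimal sketch and not the authors' pipeline: the gene identifiers, the two database versions of the pathway, and the helper names `hypergeom_pvalue` and `enrich` are all hypothetical, and the "merged" pathway is a plain union of gene sets, which is an assumption and not necessarily how MPath is built.

```python
from math import comb


def hypergeom_pvalue(universe_size, pathway_size, query_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X counts how many of the `query_size` genes drawn from a universe of
    `universe_size` genes fall into a pathway of `pathway_size` genes.
    """
    total = comb(universe_size, query_size)
    upper = min(pathway_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(query, pathway, universe):
    """One-sided over-representation p-value of `pathway` in `query`."""
    universe = set(universe)
    query = set(query) & universe      # ignore genes outside the universe
    pathway = set(pathway) & universe
    overlap = len(query & pathway)
    return hypergeom_pvalue(len(universe), len(pathway), len(query), overlap)


# Toy data (hypothetical): 100 background genes, 10 genes of interest.
universe = {f"G{i}" for i in range(1, 101)}
query = {f"G{i}" for i in range(1, 11)}

# Two hypothetical database representations of the same pathway,
# both with 20 genes but sharing only part of their membership.
pathway_db_a = {f"G{i}" for i in range(1, 9)} | {f"G{i}" for i in range(50, 62)}
pathway_db_b = {f"G{i}" for i in range(1, 4)} | {f"G{i}" for i in range(50, 67)}

# Assumed integration strategy: simple union of the two gene sets.
pathway_merged = pathway_db_a | pathway_db_b

p_a = enrich(query, pathway_db_a, universe)
p_b = enrich(query, pathway_db_b, universe)
p_merged = enrich(query, pathway_merged, universe)

for name, p in [("database A", p_a), ("database B", p_b), ("merged", p_merged)]:
    print(f"{name}: p = {p:.3g}")
```

In this toy setting the same query list looks strongly enriched under one representation and not under the other, while the union lies in between. Real analyses would also need multiple-testing correction across pathways, which the sketch omits.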

SUBMITTER: Mubeen S 

PROVIDER: S-EPMC6883970 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling.

Sarah Mubeen, Charles Tapley Hoyt, André Gemünd, Martin Hofmann-Apitius, Holger Fröhlich, Daniel Domingo-Fernández

Frontiers in Genetics, 2019-11-22


Similar Datasets

| S-EPMC3439721 | biostudies-literature
| S-EPMC5462748 | biostudies-literature
| S-EPMC7802636 | biostudies-literature
| S-EPMC10471899 | biostudies-literature
| S-EPMC5876005 | biostudies-literature
| S-EPMC9534387 | biostudies-literature
| S-EPMC9076597 | biostudies-literature
| S-EPMC4275602 | biostudies-literature
| S-EPMC3439733 | biostudies-literature
| S-EPMC8453775 | biostudies-literature