Dataset Information

Coherent pathway enrichment estimation by modeling inter-pathway dependencies using regularized regression.


ABSTRACT:

Motivation

Gene set enrichment methods are a common tool to improve the interpretability of gene lists as obtained, for example, from differential gene expression analyses. They are based on computing whether dysregulated genes are located in certain biological pathways more often than expected by chance. Gene set enrichment tools rely on pre-existing pathway databases such as KEGG, Reactome, or the Gene Ontology. These databases are increasing in size and in the number of redundancies between pathways, which complicates the statistical enrichment computation.
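As an illustration of the "more often than expected by chance" computation described above, the following is a minimal sketch of a classic over-representation test based on the hypergeometric distribution. It is not part of pareg; the function name, argument names, and toy numbers are assumptions made for this example only.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_pathway, n_hits, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_universe: number of genes considered in total
    n_pathway:  number of those genes annotated to the pathway
    n_hits:     number of dysregulated genes (drawn without replacement)
    n_overlap:  number of dysregulated genes that fall into the pathway
    """
    total = comb(n_universe, n_hits)
    upper = min(n_pathway, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 genes, 3 in the pathway, 3 dysregulated,
# and all 3 dysregulated genes lie in the pathway.
p_value = hypergeom_enrichment_pvalue(10, 3, 3, 3)
```

Because each pathway is tested separately in this scheme, two pathways that share many genes tend to produce correlated results, which is the redundancy problem the abstract refers to.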

Results

We address this problem and develop a novel gene set enrichment method, called pareg, which is based on a regularized generalized linear model and directly incorporates dependencies between gene sets related to certain biological functions, for example, due to shared genes, in the enrichment computation. We show that pareg is more robust to noise than competing methods. Additionally, we demonstrate the ability of our method to recover known pathways as well as to suggest novel treatment targets in an exploratory analysis using breast cancer samples from TCGA.
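The idea of letting shared genes couple pathway estimates can be sketched with a toy network-regularized regression. This is not the pareg implementation: the squared-error loss, the Jaccard-weighted fusion penalty, the plain gradient descent with a fixed step size, and all names and data below are assumptions chosen to keep the example short and dependency-free.

```python
def jaccard(a, b):
    """Jaccard similarity of two gene sets (shared genes / all genes)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fit_network_regularized(gene_sets, scores, lam, step=0.1, n_iter=2000):
    """Minimize (1/2n)*||y - X b||^2 + (lam/2) * sum_{j<k} w_jk (b_j - b_k)^2.

    X is the gene-by-pathway membership matrix, y the per-gene scores,
    and w_jk the Jaccard similarity between pathways j and k.
    The fixed step size suits this toy problem, not arbitrary data.
    """
    names = sorted(gene_sets)
    genes = sorted(scores)
    n, p = len(genes), len(names)
    X = [[1.0 if g in gene_sets[s] else 0.0 for s in names] for g in genes]
    y = [scores[g] for g in genes]
    W = [
        [0.0 if s == t else jaccard(gene_sets[s], gene_sets[t]) for t in names]
        for s in names
    ]
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residuals of the current fit, one per gene.
        resid = [
            sum(x * b for x, b in zip(row, beta)) - yi
            for row, yi in zip(X, y)
        ]
        grad = []
        for j in range(p):
            g_fit = sum(r * row[j] for r, row in zip(resid, X)) / n
            # Fusion term: pulls coefficients of similar pathways together.
            g_net = lam * sum(W[j][k] * (beta[j] - beta[k]) for k in range(p))
            grad.append(g_fit + g_net)
        beta = [b - step * g for b, g in zip(beta, grad)]
    return dict(zip(names, beta))


# Hypothetical toy data: pathways A and B share two genes, C is disjoint.
gene_sets = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g3", "g4", "g5", "g6"},
    "C": {"g7", "g8"},
}
# Score 1.0 marks a dysregulated gene; here exactly the genes of pathway A.
scores = {f"g{i}": 1.0 if i <= 4 else 0.0 for i in range(1, 9)}

unpenalized = fit_network_regularized(gene_sets, scores, lam=0.0)
fused = fit_network_regularized(gene_sets, scores, lam=1.0)
```

In this toy setting, the penalty shrinks the gap between the coefficients of the overlapping pathways A and B, while the disjoint pathway C is unaffected. A full method would also need an appropriate likelihood for the gene-level statistic and a data-driven choice of the penalty weight, which this sketch leaves out.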

Availability and implementation

pareg is freely available as an R package on Bioconductor (https://bioconductor.org/packages/release/bioc/html/pareg.html) as well as on https://github.com/cbg-ethz/pareg. The GitHub repository also contains the Snakemake workflows needed to reproduce all results presented here.

SUBMITTER: Jablonski KP 

PROVIDER: S-EPMC10471899 | biostudies-literature | 2023 Aug

REPOSITORIES: biostudies-literature

Publications

Coherent pathway enrichment estimation by modeling inter-pathway dependencies using regularized regression.

Kim Philipp Jablonski, Niko Beerenwinkel

Bioinformatics (Oxford, England), 2023-08-01, issue 8


Similar Datasets

| S-EPMC4672891 | biostudies-literature
| S-EPMC2242820 | biostudies-literature
| S-EPMC5559077 | biostudies-literature
| S-EPMC3750505 | biostudies-literature
| S-EPMC9733318 | biostudies-literature
| S-EPMC6305753 | biostudies-literature
| S-EPMC11842445 | biostudies-literature
| S-EPMC8903146 | biostudies-literature
| S-EPMC5802704 | biostudies-literature
| S-EPMC3667751 | biostudies-literature