Dataset Information

PhenoRank: reducing study bias in gene prioritization through simulation.

ABSTRACT:

Motivation

Genome-wide association studies have identified thousands of loci associated with human disease, but identifying the causal genes at these loci is often difficult. Several methods prioritize genes most likely to be disease causing through the integration of biological data, including protein-protein interaction and phenotypic data. Data availability is not the same for all genes, however, which may influence the performance of these methods.

Results

We demonstrate that whilst disease genes tend to be associated with greater numbers of data, this may be at least partially a result of them being better studied. With this observation we develop PhenoRank, which prioritizes disease genes whilst avoiding being biased towards genes with more available data. Bias is avoided by comparing gene scores generated for the query disease against gene scores generated using simulated sets of phenotype terms, which ensures that differences in data availability do not affect the ranking of genes. We demonstrate that whilst existing prioritization methods are biased by data availability, PhenoRank is not similarly biased. Avoiding this bias allows PhenoRank to effectively prioritize genes with fewer available data and improves its overall performance. PhenoRank outperforms three available prioritization methods in cross-validation (PhenoRank area under receiver operating characteristic curve [AUC] = 0.89, DADA AUC = 0.87, EXOMISER AUC = 0.71, PRINCE AUC = 0.83, P < 2.2 × 10^-16).
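
The comparison against simulated phenotype term sets described above can be illustrated with an empirical p-value. The following is a minimal sketch under stated assumptions, not PhenoRank's actual implementation: the gene names, the `toy_scores` scoring function, and the `empirical_p_values` helper are all hypothetical, and the toy scores simply grow with the amount of data available for a gene.

```python
import random


def empirical_p_values(observed, simulated):
    """Compare each gene's observed score with its own simulated scores.

    observed:  dict mapping gene -> score for the query disease
    simulated: list of dicts (gene -> score), one per simulated term set
    Returns a dict mapping gene -> empirical p-value, with the usual +1
    correction so that no p-value is exactly zero.
    """
    n_sim = len(simulated)
    p_values = {}
    for gene, score in observed.items():
        at_least = sum(1 for sim in simulated if sim[gene] >= score)
        p_values[gene] = (at_least + 1) / (n_sim + 1)
    return p_values


def toy_scores(data_counts, signal, rng):
    """Hypothetical scoring: noise scaled by data availability, plus an
    optional disease-specific signal. Assumption for illustration only."""
    return {
        gene: data_counts[gene] * rng.random() + signal.get(gene, 0.0)
        for gene in data_counts
    }


rng = random.Random(0)

# GENE_A is well studied (many data); GENE_B and GENE_C are not.
data_counts = {"GENE_A": 50, "GENE_B": 5, "GENE_C": 5}

# Only GENE_B carries a real signal for the query disease.
observed = toy_scores(data_counts, {"GENE_B": 10.0}, rng)

# Scores for simulated term sets carry no disease signal.
simulated = [toy_scores(data_counts, {}, rng) for _ in range(999)]

p = empirical_p_values(observed, simulated)
ranking = sorted(p, key=p.get)  # smallest p-value first
print(ranking)
```

In this toy setup the well-studied gene has the highest raw score, yet the gene with the real signal ranks first once each gene is judged only against its own simulated score distribution. That is the sense in which a simulation-based comparison can remove the influence of data availability on the ranking.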

Availability and implementation

PhenoRank is freely available for download at https://github.com/alexjcornish/PhenoRank.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Cornish AJ

PROVIDER: S-EPMC5949213 | biostudies-literature | 2018 Jun

REPOSITORIES: biostudies-literature

Publications

PhenoRank: reducing study bias in gene prioritization through simulation.

Cornish AJ, David A, Sternberg MJE

Bioinformatics (Oxford, England), 2018-06-01, issue 12

PMID: 29360927

Similar Datasets

Assessment of regression-based methods to adjust for publication bias through a comprehensive simulation study.
| S-EPMC2649158 | biostudies-literature

Incident Diabetes and Mobility Limitations: Reducing Bias Through Risk-set Matching.
| S-EPMC4481690 | biostudies-literature

A simulation study of diagnostics for selection bias.
| S-EPMC8460089 | biostudies-literature

Effect of the 2010 Chilean earthquake on posttraumatic stress: reducing sensitivity to unmeasured bias through study design.
| S-EPMC3580201 | biostudies-literature

Quantifying lead-time bias in risk factor studies of cancer through simulation.
| S-EPMC3839248 | biostudies-literature

Evaluating bias-reducing protocols for RNA sequencing library preparation
2014-06-24 | E-MTAB-2566 | biostudies-arrayexpress

Simulation study - Oral Microbiome
| PRJEB26333 | ENA

Simulation study - Oral Microbiome
| PRJEB25791 | ENA

Guilt by rewiring: gene prioritization through network rewiring in genome wide association studies.
| S-EPMC3990172 | biostudies-literature

Gene Prioritization through Consensus Strategy, Enrichment Methodologies Analysis, and Networking for Osteosarcoma Pathogenesis.
| S-EPMC7038221 | biostudies-literature