Dataset Information

Improved exome prioritization of disease genes through cross-species phenotype comparison.

ABSTRACT: Numerous new disease-gene associations have been identified by whole-exome sequencing studies in the last few years. However, many cases remain unsolved due to the sheer number of candidate variants remaining after common filtering strategies such as removing low quality and common variants and those deemed unlikely to be pathogenic. The observation that each of our genomes contains about 100 genuine loss-of-function variants makes identification of the causative mutation problematic when using these strategies alone. We propose using the wealth of genotype to phenotype data that already exists from model organism studies to assess the potential impact of these exome variants. Here, we introduce PHenotypic Interpretation of Variants in Exomes (PHIVE), an algorithm that integrates the calculation of phenotype similarity between human diseases and genetically modified mouse models with evaluation of the variants according to allele frequency, pathogenicity, and mode of inheritance approaches in our Exomiser tool. Large-scale validation of PHIVE analysis using 100,000 exomes containing known mutations demonstrated a substantial improvement (up to 54.1-fold) over purely variant-based (frequency and pathogenicity) methods with the correct gene recalled as the top hit in up to 83% of samples, corresponding to an area under the ROC curve of >95%. We conclude that incorporation of phenotype data can play a vital role in translational bioinformatics and propose that exome sequencing projects should systematically capture clinical phenotypes to take advantage of the strategy presented here.

SUBMITTER: Robinson PN

PROVIDER: S-EPMC3912424 | biostudies-literature | 2014 Feb

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

Improved exome prioritization of disease genes through cross-species phenotype comparison.

Robinson Peter N PN Köhler Sebastian S Oellrich Anika A Wang Kai K Mungall Christopher J CJ Lewis Suzanna E SE Washington Nicole N Bauer Sebastian S Seelow Dominik D Krawitz Peter P Gilissen Christian C Haendel Melissa M Smedley Damian D

Genome research 20131025 2

Numerous new disease-gene associations have been identified by whole-exome sequencing studies in the last few years. However, many cases remain unsolved due to the sheer number of candidate variants remaining after common filtering strategies such as removing low quality and common variants and those deemed unlikely to be pathogenic. The observation that each of our genomes contains about 100 genuine loss-of-function variants makes identification of the causative mutation problematic when using ...[more]

PMID: 24162188

Similar Datasets

Project description:BackgroundUnderstanding complex networks that modulate development in humans is hampered by genetic and phenotypic heterogeneity within and between populations. Here we present a method that exploits natural variation in highly diverse mouse genetic reference panels in which genetic and environmental factors can be tightly controlled. The aim of our study is to test a cross-species genetic mapping strategy, which compares data of gene mapping in human patients with functional data obtained by QTL mapping in recombinant inbred mouse strains in order to prioritize human disease candidate genes.MethodologyWe exploit evolutionary conservation of developmental phenotypes to discover gene variants that influence brain development in humans. We studied corpus callosum volume in a recombinant inbred mouse panel (C57BL/6J×DBA/2J, BXD strains) using high-field strength MRI technology. We aligned mouse mapping results for this neuro-anatomical phenotype with genetic data from patients with abnormal corpus callosum (ACC) development.Principal findingsFrom the 61 syndromes which involve an ACC, 51 human candidate genes have been identified. Through interval mapping, we identified a single significant QTL on mouse chromosome 7 for corpus callosum volume with a QTL peak located between 25.5 and 26.7 Mb. Comparing the genes in this mouse QTL region with those associated with human syndromes (involving ACC) and those covered by copy number variations (CNV) yielded a single overlap, namely HNRPU in humans and Hnrpul1 in mice. Further analysis of corpus callosum volume in BXD strains revealed that the corpus callosum was significantly larger in BXD mice with a B genotype at the Hnrpul1 locus than in BXD mice with a D genotype at Hnrpul1 (F = 22.48, p<9.87*10(-5)).ConclusionThis approach that exploits highly diverse mouse strains provides an efficient and effective translational bridge to study the etiology of human developmental disorders, such as autism and schizophrenia.

Dataset Information

Improved exome prioritization of disease genes through cross-species phenotype comparison.

Publications

Improved exome prioritization of disease genes through cross-species phenotype comparison.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets