
Dataset Information


Which Genetics Variants in DNase-Seq Footprints Are More Likely to Alter Binding?


ABSTRACT: Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict the impact of a genetic variant on TF binding across 153 tissues and 1,372 TF motifs. Each annotation we derived is specific for a cell-type condition or assay and is locally motif-driven. We found 5.8 million genetic variants in footprints, 66% of which are predicted by our model to affect TF binding. Comprehensive examination using allele-specific hypersensitivity (ASH) reveals that only the latter group consistently shows evidence for ASH (3,217 SNPs at 20% FDR), suggesting that most (97%) genetic variants in footprinted regulatory regions are indeed silent. Combining this information with GWAS data reveals that our annotation helps in computationally fine-mapping 86 SNPs in GWAS hit regions with at least a 2-fold increase in the posterior odds of picking the causal SNP. The rich meta information provided by the tissue-specificity and the identity of the putative TF binding site being affected also helps in identifying the underlying mechanism supporting the association. As an example, the enrichment for LDL level-associated SNPs is 9.1-fold higher among SNPs predicted to affect HNF4 binding sites than in a background model already including tissue-specific annotation.
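To make the idea of a "motif-driven" variant effect concrete, the sketch below scores a reference and an alternate allele against a position weight matrix (PWM) and reports the change in log-odds score. It is a generic illustration, not the authors' model: the toy count matrix, the pseudocount, the uniform background, and the function names (`log_odds_pwm`, `score`, `allelic_score_change`) are all assumptions introduced here.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores.

    Assumes a uniform background frequency; both defaults are illustrative.
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm

def score(pwm, seq):
    """Sum the log-odds of each base in seq (same length as the motif)."""
    return sum(col[base] for col, base in zip(pwm, seq))

def allelic_score_change(pwm, ref_seq, offset, alt_base):
    """Score difference (alt minus ref) for a substitution at offset."""
    alt_seq = ref_seq[:offset] + alt_base + ref_seq[offset + 1:]
    return score(pwm, alt_seq) - score(pwm, ref_seq)

# Hypothetical 4-bp motif: three informative positions, one uninformative.
toy_counts = [
    {"A": 97, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 97, "G": 1, "T": 1},
    {"A": 1, "C": 1, "G": 97, "T": 1},
    {"A": 25, "C": 25, "G": 25, "T": 25},
]
toy_pwm = log_odds_pwm(toy_counts)

# A substitution at a conserved position lowers the score sharply,
# while one at the uninformative position leaves it unchanged.
disruptive = allelic_score_change(toy_pwm, "ACGT", 0, "C")
neutral = allelic_score_change(toy_pwm, "ACGT", 3, "A")
print(f"conserved position: {disruptive:.2f}, uninformative position: {neutral:.2f}")
```

In this simplified view, a variant inside a footprint counts as potentially binding-altering only when it shifts the motif score substantially, which is one way to see why many footprint variants could be silent.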

SUBMITTER: Moyerbrailean GA 

PROVIDER: S-EPMC4764260 | biostudies-literature | 2016 Feb

REPOSITORIES: biostudies-literature


Publications

Which Genetics Variants in DNase-Seq Footprints Are More Likely to Alter Binding?

Gregory A. Moyerbrailean, Cynthia A. Kalita, Chris T. Harvey, Xiaoquan Wen, Francesca Luca, Roger Pique-Regi

PLoS Genetics, 22 February 2016, issue 2



Similar Datasets

| S-EPMC6385462 | biostudies-literature
| S-EPMC5741056 | biostudies-literature
| S-EPMC3834841 | biostudies-other
| S-EPMC5351933 | biostudies-literature
| S-EPMC5630036 | biostudies-literature
| S-EPMC4772017 | biostudies-literature
| S-EPMC4978937 | biostudies-literature
| S-EPMC6549478 | biostudies-literature
| S-EPMC11304670 | biostudies-literature
| S-EPMC4133688 | biostudies-literature