Dataset Information

Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-Enhanced Read Mapping.


ABSTRACT: Segmental duplications and other highly repetitive regions of genomes contribute significantly to cells' regulatory programs. Advances in next-generation sequencing have enabled genome-wide profiling of protein-DNA interactions by chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). However, interactions in highly repetitive regions of genomes have proven difficult to map, since short reads of 50-100 base pairs (bp) from these regions map to multiple locations in reference genomes. Standard analytical methods discard such multi-mapping reads, and the few that can accommodate them are prone to high false positive and false negative rates. We developed Perm-seq, a prior-enhanced read allocation method for ChIP-seq experiments that can allocate multi-mapping reads in highly repetitive regions of genomes with high accuracy. We comprehensively evaluated Perm-seq and found that our prior-enhanced approach significantly improves multi-read allocation accuracy over approaches that do not utilize additional data types. The statistical formalism underlying our approach facilitates supervision of multi-read allocation with a variety of data sources, including histone ChIP-seq. We applied Perm-seq to 64 ENCODE ChIP-seq datasets from GM12878 and K562 cells and identified many novel protein-DNA interactions in segmental duplication regions. Our analysis reveals that although protein-DNA interaction sites are evolutionarily less conserved in repetitive regions, they share the overall sequence characteristics of protein-DNA interactions in non-repetitive regions.
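
The idea of prior-enhanced multi-read allocation described in the abstract can be illustrated with a toy sketch. This is not the Perm-seq implementation or its statistical model; it is a minimal EM-style iteration under stated assumptions: each multi-mapping read is split fractionally among its candidate loci in proportion to the current abundance estimate at each locus, and a hypothetical `prior` dictionary (standing in for an external signal such as histone ChIP-seq) contributes positive pseudocounts. The function name, parameters, and locus labels are all invented for illustration.

```python
def allocate_multireads(unique_counts, multireads, prior=None, n_iter=50):
    """Toy prior-weighted allocation of multi-mapping reads (illustrative only).

    unique_counts: dict mapping locus -> number of uniquely mapped reads
    multireads:    list of lists; each inner list holds one read's candidate loci
    prior:         hypothetical dict mapping locus -> positive pseudocount,
                   e.g. derived from an external data source; defaults to 1.0
    Returns a list of dicts, one per multi-read, mapping locus -> probability.
    """
    prior = prior or {}
    loci = set(unique_counts)
    for candidates in multireads:
        loci.update(candidates)

    # Pseudocounts keep every abundance strictly positive.
    pseudo = {locus: prior.get(locus, 1.0) for locus in loci}
    abundance = {locus: unique_counts.get(locus, 0) + pseudo[locus] for locus in loci}

    allocation = []
    for _ in range(n_iter):
        # Allocation step: split each read in proportion to current abundance.
        allocation = []
        for candidates in multireads:
            total = sum(abundance[locus] for locus in candidates)
            allocation.append({locus: abundance[locus] / total for locus in candidates})

        # Update step: recompute abundance from unique reads, prior, and fractions.
        abundance = {locus: unique_counts.get(locus, 0) + pseudo[locus] for locus in loci}
        for fractions in allocation:
            for locus, p in fractions.items():
                abundance[locus] += p
    return allocation


# Four reads that each map equally well to hypothetical loci "A" and "B".
reads = [["A", "B"]] * 4
no_prior = allocate_multireads({"A": 10, "B": 0}, reads)
with_prior = allocate_multireads({"A": 10, "B": 0}, reads, prior={"B": 50.0})
```

Without a prior, the unique reads at "A" pull the multi-reads toward "A"; a strong prior at "B" shifts the allocation toward "B". That is the general sense in which an additional data type can supervise allocation in this sketch.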

SUBMITTER: Zeng X 

PROVIDER: S-EPMC4618727 | biostudies-literature | 2015 Oct

REPOSITORIES: biostudies-literature

Publications

Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-Enhanced Read Mapping.

Xin Zeng, Bo Li, Rene Welch, Constanza Rojo, Ye Zheng, Colin N. Dewey, Sündüz Keleş

PLoS Computational Biology, 2015 Oct 20, issue 10


Similar Datasets

| S-EPMC5507458 | biostudies-literature
| S-EPMC2374708 | biostudies-literature
| S-EPMC2820677 | biostudies-literature
| S-EPMC3468387 | biostudies-literature
| S-EPMC3975067 | biostudies-literature
| S-EPMC2736440 | biostudies-literature
| S-EPMC4833417 | biostudies-literature
| S-EPMC5983011 | biostudies-literature