Dataset Information

Important biological information uncovered in previously unaligned reads from chromatin immunoprecipitation experiments (ChIP-Seq).


ABSTRACT: Establishing the architecture of gene regulatory networks (GRNs) relies on chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) methods that provide genome-wide transcription factor binding sites (TFBSs). ChIP-Seq furnishes millions of short reads that, after alignment, describe the genome-wide binding sites of a particular TF. However, in all organisms investigated an average of 40% of reads fail to align to the corresponding genome, with some datasets having as much as 80% of reads failing to align. We describe here the provenance of previously unaligned reads in ChIP-Seq experiments from animals and plants. We show that a substantial portion corresponds to sequences of bacterial and metazoan origin, irrespective of the ChIP-Seq chromatin source. Unforeseen was the finding that 30%-40% of unaligned reads were actually alignable. To validate these observations, we investigated the characteristics of the previously unaligned reads corresponding to TAL1, a human TF involved in lineage specification of hemopoietic cells. We show that, while unmapped ChIP-Seq read datasets contain foreign DNA sequences, additional TFBSs can be identified from the previously unaligned ChIP-Seq reads. Our results indicate that the re-evaluation of previously unaligned reads from ChIP-Seq experiments will significantly contribute to TF target identification and determination of emerging properties of GRNs.
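
The unaligned-read percentages quoted in the abstract can be checked on any alignment output. The following is a minimal sketch, not the authors' pipeline: it assumes the aligner wrote uncompressed SAM text and relies only on the SAM FLAG field, where bit 0x4 marks an unmapped read. The function name `unaligned_fraction` and the toy records are hypothetical and exist only for illustration.

```python
def unaligned_fraction(sam_lines):
    """Return the fraction of primary SAM records flagged as unmapped (FLAG bit 0x4)."""
    total = 0
    unmapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second column of a SAM record is the FLAG
        # Skip secondary (0x100) and supplementary (0x800) records so that
        # each read is counted once.
        if flag & 0x900:
            continue
        total += 1
        if flag & 0x4:
            unmapped += 1
    return unmapped / total if total else 0.0


# Toy records (hypothetical): two mapped reads, two unmapped reads, and one
# secondary alignment that is ignored.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
    "r1\t256\tchr2\t300\t0\t4M\t*\t0\t0\tACGT\tIIII",
]
fraction = unaligned_fraction(example)
```

On a real file the same function could be called with an open file handle, for example `unaligned_fraction(open(path))`, where `path` is whatever SAM file the alignment step produced.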

SUBMITTER: Ouma WZ 

PROVIDER: S-EPMC4345404 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Important biological information uncovered in previously unaligned reads from chromatin immunoprecipitation experiments (ChIP-Seq).

Ouma, Wilberforce Zachary; Mejia-Guerra, Maria Katherine; Yilmaz, Alper; Pareja-Tobes, Pablo; Li, Wei; Doseff, Andrea I; Grotewold, Erich

Scientific Reports, 2015-03-02


Similar Datasets

| S-EPMC6701478 | biostudies-literature
| S-EPMC4053718 | biostudies-literature
| S-EPMC5544034 | biostudies-other
| S-EPMC3351193 | biostudies-literature
| S-EPMC4053851 | biostudies-literature
| S-EPMC4620941 | biostudies-literature
| S-EPMC3665311 | biostudies-other
| S-EPMC4447067 | biostudies-literature
| S-EPMC5908393 | biostudies-literature
| S-EPMC2367313 | biostudies-literature