Dataset Information

Application of experimentally verified transcription factor binding sites models for computational analysis of ChIP-Seq data.

ABSTRACT:

Background

ChIP-Seq is widely used to detect genomic segments bound by transcription factors (TFs), either directly at DNA binding sites (BSs) or indirectly via other proteins. Many software tools implement different approaches to identifying TF binding sites (TFBSs) within ChIP-Seq peaks. However, their predictions usually lack direct experimental verification, which makes it difficult both to set a threshold that limits false-positive BSs and to compare the actual performance of different models.

Results

Using ChIP-Seq data for FoxA2 binding loci in adult mouse liver and human HepG2 cells, we compared FoxA binding-site predictions from four computational models of two fundamental classes: pattern matching based on an existing training set of experimentally confirmed TFBSs (oPWM and SiteGA) and de novo motif discovery (ChIPMunk and diChIPMunk). To select prediction thresholds for the models, we experimentally evaluated the affinity of 64 predicted FoxA BSs using EMSA, which reliably distinguishes sequences able to bind the TF. As a result, we identified thousands of reliable FoxA BSs within ChIP-Seq loci from mouse liver and human HepG2 cells. Conventional position weight matrix (PWM) models performed worst, with the highest false-positive rate. In contrast, the best recognition efficiency was achieved by combining the SiteGA and diChIPMunk/ChIPMunk models, which identified FoxA BSs in up to 90% of loci in both the mouse and human ChIP-Seq datasets.
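To illustrate why the prediction threshold matters for a PWM model, the following is a minimal sketch of generic log-odds PWM scanning. It is not the oPWM implementation used in the study: the count matrix, pseudocount, uniform background, threshold values, and the function names `build_pwm` and `scan` are all assumptions made for illustration, and only the forward strand is scanned.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
# These counts are illustrative only and are NOT FoxA binding data.
COUNTS = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 4, "C": 0, "G": 0, "T": 4},
]


def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights.

    Assumes a uniform background frequency for all four bases.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for forward-strand windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = build_pwm(COUNTS)
strict_hits = scan("CCATGACC", pwm, threshold=4.0)     # high threshold
loose_hits = scan("CCATGACC", pwm, threshold=-100.0)   # effectively no threshold
```

With the strict threshold only the window `ATGA` is reported, whereas the loose threshold reports every window. The threshold therefore directly controls how many candidate sites a model emits, which is why calibrating it against experimentally verified sites is useful.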

Conclusions

Experimentally testing TF binding to oligonucleotides that correspond to predicted sites increases the reliability of computational TFBS recognition in ChIP-Seq data analysis. For ChIP-Seq data interpretation, basic PWMs have lower TFBS recognition quality than the more sophisticated SiteGA and de novo motif discovery methods. Combining models based on different principles allowed reliable identification of TFBSs.
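The idea of combining models can be sketched as follows. The abstract does not specify how predictions were merged, so this example assumes the simplest rule: a locus counts as covered if any model predicts at least one site in it. The function names, model labels, and peak identifiers are hypothetical.

```python
def loci_with_site(predictions):
    """Return the set of loci for which a model predicted at least one site."""
    return {locus for locus, sites in predictions.items() if sites}


def combined_coverage(all_loci, predictions_by_model):
    """Fraction of loci with a predicted site from at least one model (union rule)."""
    if not all_loci:
        return 0.0
    covered = set()
    for predictions in predictions_by_model.values():
        covered |= loci_with_site(predictions)
    covered &= set(all_loci)  # ignore predictions outside the locus list
    return len(covered) / len(all_loci)


# Hypothetical example: site positions per locus for two placeholder models.
loci = ["peak1", "peak2", "peak3", "peak4"]
predictions_by_model = {
    "model_a": {"peak1": [10], "peak2": []},
    "model_b": {"peak2": [5], "peak3": []},
}
coverage = combined_coverage(loci, predictions_by_model)
```

Here each model alone covers one of four loci, while the union covers two. A union rule raises coverage but also accumulates each model's false positives, so it only makes sense when every model's threshold has been calibrated separately.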

SUBMITTER: Levitsky VG

PROVIDER: S-EPMC4234207 | biostudies-literature | 2014 Jan

REPOSITORIES: biostudies-literature

Publications

Application of experimentally verified transcription factor binding sites models for computational analysis of ChIP-Seq data.

Levitsky VG, Kulakovskiy IV, Ershov NI, Oshchepkov DY, Makeev VJ, Hodgman TC, Merkulova TI

BMC Genomics, 2014-01-29

PMID: 24472686

Similar Datasets

Pinpointing transcription factor binding sites from ChIP-seq data with SeqSite.
| S-EPMC3287483 | biostudies-literature

Optimized detection of transcription factor-binding sites in ChIP-seq experiments.
| S-EPMC3245948 | biostudies-literature

Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data.
| S-EPMC2917543 | biostudies-literature

PolyaPeak: detecting transcription factor binding sites from ChIP-seq using peak shape information.
| S-EPMC3946423 | biostudies-literature

Identification of transcription factor binding sites from ChIP-seq data at high resolution.
| S-EPMC3799470 | biostudies-literature

On the detection and refinement of transcription factor binding sites using ChIP-Seq data.
| S-EPMC2853110 | biostudies-literature

A practical comparison of methods for detecting transcription factor binding sites in ChIP-seq experiments.
| S-EPMC2804666 | biostudies-literature

Identifying differential transcription factor binding in ChIP-seq.
| S-EPMC4413818 | biostudies-literature

Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment.
| S-EPMC4082612 | biostudies-literature

Genome-wide map of human and mouse transcription factor binding sites aggregated from ChIP-Seq data.
| S-EPMC6199713 | biostudies-literature