Dataset Information

A ChIP-Seq benchmark shows that sequence conservation mainly improves detection of strong transcription factor binding sites.

ABSTRACT:

Background

Transcription factors are important controllers of gene expression, and mapping transcription factor binding sites (TFBS) is key to inferring transcription factor regulatory networks. Several methods for predicting TFBS exist, but there are no standard genome-wide datasets on which to assess the performance of these prediction methods. Also, it is believed that information about sequence conservation across different genomes can generally improve the accuracy of motif-based predictors, but it is not clear under what circumstances the use of conservation is most beneficial.

Results

Here we use published ChIP-seq data and an improved peak detection method to create comprehensive benchmark datasets for prediction methods which use known descriptors or binding motifs to detect TFBS in genomic sequences. We use this benchmark to assess the performance of five different prediction methods and find that the methods that use information about sequence conservation generally perform better than simpler motif-scanning methods. The difference is greater on high-affinity peaks and when using short and information-poor motifs. However, if the motifs are specific and information-rich, we find that simple motif-scanning methods can perform better than conservation-based methods.
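To make the contrast between information-poor and information-rich motifs concrete, the following is a minimal sketch of a simple motif scan with a log-odds position weight matrix. It is an illustration only and is not one of the five methods benchmarked here. The function names, the toy motif counts, the pseudocount, the uniform background, and the score threshold are all hypothetical choices made for this example.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def information_content(counts, pseudocount=1.0, background=0.25):
    """Total information content of a motif in bits, relative to the background."""
    bits = 0.0
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        for b in BASES:
            p = (column[b] + pseudocount) / total
            bits += p * math.log2(p / background)
    return bits


def scan(sequence, matrix, threshold):
    """Return (start, score) for each window scoring at or above the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other symbols
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical toy motif (consensus ACGT), not taken from the benchmark.
toy_counts = [
    {"A": 18, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 19, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 19, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 19},
]
pwm = log_odds_matrix(toy_counts)
hits = scan("TTACGTGGACGTAA", pwm, threshold=4.0)
motif_bits = information_content(toy_counts)
```

A short motif like this one carries few bits in total, so it matches by chance many times in a genome. That is the setting in which the abstract reports the largest gain from adding conservation information. This sketch scans a single strand only; a practical scanner would also score the reverse complement.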

Conclusions

Our benchmark provides a comprehensive test that can be used to rank the relative performance of transcription factor binding site prediction methods. Moreover, our results show that, contrary to previous reports, sequence conservation is better suited for predicting strong than weak transcription factor binding sites.

SUBMITTER: Handstad T

PROVIDER: S-EPMC3077367 | biostudies-literature | 2011 Apr

REPOSITORIES: biostudies-literature

Publications

A ChIP-Seq benchmark shows that sequence conservation mainly improves detection of strong transcription factor binding sites.

Tony Håndstad, Morten Beck Rye, Finn Drabløs, Pål Sætrom

PLoS ONE, 2011 Apr 14, issue 4

PMID: 21533218

Similar Datasets

De novo motif identification improves the accuracy of predicting transcription factor binding sites in ChIP-Seq data analysis.
| S-EPMC2887977 | biostudies-literature

Pinpointing transcription factor binding sites from ChIP-seq data with SeqSite.
| S-EPMC3287483 | biostudies-literature

Optimized detection of transcription factor-binding sites in ChIP-seq experiments.
| S-EPMC3245948 | biostudies-literature

Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data.
| S-EPMC2917543 | biostudies-literature

Identification of transcription factor binding sites from ChIP-seq data at high resolution.
| S-EPMC3799470 | biostudies-literature

PolyaPeak: detecting transcription factor binding sites from ChIP-seq using peak shape information.
| S-EPMC3946423 | biostudies-literature

On the detection and refinement of transcription factor binding sites using ChIP-Seq data.
| S-EPMC2853110 | biostudies-literature

A practical comparison of methods for detecting transcription factor binding sites in ChIP-seq experiments.
| S-EPMC2804666 | biostudies-literature

Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment.
| S-EPMC4082612 | biostudies-literature

Genome-wide map of human and mouse transcription factor binding sites aggregated from ChIP-Seq data.
| S-EPMC6199713 | biostudies-literature