Dataset Information

Assessing the effects of symmetry on motif discovery and modeling.


ABSTRACT:

Background

Identifying the DNA binding sites of transcription factors is a key task in modeling the gene regulatory network of a cell. Computational prediction of DNA binding sites suffers from high false-positive and false-negative rates, which have several causes, including inaccurate models of transcription factor specificity. One source of inaccuracy in the specificity models is the assumption of asymmetry for motifs that are in fact symmetric.

Methodology/principal findings

We use simulation studies, in which the correct binding site model is known and the parameters of the process can be controlled systematically, to test different motif-finding algorithms on both symmetric and asymmetric binding site data. If the true binding site is asymmetric, the results are unambiguous: the asymmetric model is clearly superior to the symmetric model. If the true binding specificity is symmetric, however, commonly used methods can incorrectly infer that the motif is asymmetric. The resulting inaccurate motifs give lower sensitivity and specificity than the correct, symmetric models would. We also show how the correct model can be obtained by using appropriate measures of statistical significance.
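To make the symmetric/asymmetric distinction concrete, the following is a minimal sketch, not taken from the study itself. It assumes that a motif is represented as a position-by-base count matrix over aligned sites, and that "symmetric" means the matrix equals its own reverse complement. The function names (`count_matrix`, `reverse_complement_matrix`, `symmetrize`, `is_symmetric`) are hypothetical and chosen for illustration only.

```python
BASES = "ACGT"  # column order; the complement of column i is column 3 - i


def count_matrix(sites):
    """Count bases at each position of equal-length, aligned sites."""
    length = len(sites[0])
    counts = [[0] * 4 for _ in range(length)]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos][BASES.index(base)] += 1
    return counts


def reverse_complement_matrix(counts):
    """Reverse the position order and swap A<->T, C<->G in every row."""
    return [row[::-1] for row in reversed(counts)]


def symmetrize(counts):
    """Add each count matrix to its reverse complement (assumed constraint)."""
    rc = reverse_complement_matrix(counts)
    return [[a + b for a, b in zip(row, rc_row)]
            for row, rc_row in zip(counts, rc)]


def is_symmetric(counts):
    """True when the matrix is identical to its reverse complement."""
    return counts == reverse_complement_matrix(counts)
```

For example, a matrix built from the palindromic site `GATC` passes `is_symmetric`, while one built from `GAAC` does not until it has been passed through `symmetrize`.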

Conclusions/significance

This study demonstrates that the most commonly used motif-finding approaches usually model symmetric motifs incorrectly, which leads to more false predictions than necessary. It also demonstrates how alternative motif-finding methods can correct the problem, providing more accurate motif models and reducing the errors. Furthermore, it provides criteria for determining whether a symmetric or an asymmetric model is more appropriate for a given experimental dataset.
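The abstract does not say which criteria the study uses. As one possible illustration only, the sketch below compares a symmetric and an asymmetric position-specific model with the Bayesian information criterion (BIC). The choice of BIC, the assumption that sites are aligned and position-independent, and all function names are assumptions made here, not the authors' method.

```python
import math

BASES = "ACGT"  # column order; the complement of column i is column 3 - i


def count_matrix(sites):
    """Count bases at each position of equal-length, aligned sites."""
    counts = [[0] * 4 for _ in range(len(sites[0]))]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos][BASES.index(base)] += 1
    return counts


def symmetrized(counts):
    """Add the matrix to its reverse complement (symmetric ML estimate)."""
    rc = [row[::-1] for row in reversed(counts)]
    return [[a + b for a, b in zip(r, q)] for r, q in zip(counts, rc)]


def log_likelihood(counts, model_counts):
    """Log-likelihood of observed counts under frequencies from model_counts."""
    total = 0.0
    for obs_row, model_row in zip(counts, model_counts):
        row_sum = sum(model_row)
        for obs, mod in zip(obs_row, model_row):
            if obs > 0:  # zero counts contribute nothing
                total += obs * math.log(mod / row_sum)
    return total


def free_parameters(length, symmetric):
    """3 per free position; a symmetric middle position has only 1."""
    if not symmetric:
        return 3 * length
    return 3 * (length // 2) + (length % 2)


def bic(sites, symmetric):
    """BIC = k * ln(n) - 2 * ln(L); lower values are preferred."""
    counts = count_matrix(sites)
    model = symmetrized(counts) if symmetric else counts
    k = free_parameters(len(counts), symmetric)
    return k * math.log(len(sites)) - 2.0 * log_likelihood(counts, model)


def prefer_symmetric(sites):
    """True when the symmetric model has the lower (or equal) BIC."""
    return bic(sites, True) <= bic(sites, False)
```

The design point is that the symmetric model has roughly half as many free parameters, so it is favored whenever the symmetry constraint costs little likelihood. Other penalized or significance-based comparisons could be substituted for BIC.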

SUBMITTER: Motlhabi LM 

PROVIDER: S-EPMC3176789 | biostudies-literature | 2011

REPOSITORIES: biostudies-literature

Publications

Assessing the effects of symmetry on motif discovery and modeling.

Motlhabi LM, Stormo GD

PLoS ONE, 2011-09-20, issue 9


Similar Datasets

| S-EPMC2687942 | biostudies-literature
| S-EPMC6030969 | biostudies-literature
| S-EPMC7746960 | biostudies-literature
| S-EPMC3390389 | biostudies-literature
| S-EPMC6408154 | biostudies-literature
| S-EPMC2794970 | biostudies-literature
| S-EPMC10581916 | biostudies-literature
| S-EPMC2311304 | biostudies-literature
| S-EPMC3855595 | biostudies-literature
| S-EPMC1903367 | biostudies-literature