Dataset Information

Extracting transcription factor binding sites from unaligned gene sequences with statistical models.


ABSTRACT:

Background

Transcription factor binding sites (TFBSs) are crucial in the regulation of gene transcription. Recently, chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP-chip array) has been used to identify potential regulatory sequences, but the procedure can only map probable protein-DNA interaction loci at a resolution of 1-2 kb. To find the exact binding motifs, a computational method is needed to examine the sequences bound in ChIP-chip arrays and search for motifs that represent the transcription factor binding sites.

Results

We developed a program that locates motif sites accurately in a set of unaligned DNA sequences from the yeast genome. Overall, the prediction results suggest that our algorithm outperforms MDscan: its predicted motifs are more consistent with the known specificities reported in the literature and have better prediction ranks. Our program also outperforms the constraint-less Cosmo program, especially in eliminating false positives.

Conclusion

In this study, we propose an improved sampling algorithm that incorporates the binomial probability model to build significant initial candidate motif sets. To account for the statistical dependence between base positions in TFBSs, the method combines dependency graphs with their expanded Bayesian networks. The results show that our program satisfactorily extracts transcription factor binding sites from unaligned gene sequences.
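
As an illustration of how a binomial model can flag significant candidate words, here is a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify their exact scoring. The sketch assumes a uniform background (each base with probability 0.25), treats positions as independent, and scores each word of length w by the binomial tail probability of seeing it in at least k of n sequences. The function names (`binomial_tail`, `background_hit_prob`, `score_words`) are hypothetical.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def background_hit_prob(word_len, seq_len, base_prob=0.25):
    """Approximate chance that one fixed word occurs at least once in a
    random sequence. Assumption: uniform bases, independent positions."""
    p_site = base_prob**word_len
    positions = max(seq_len - word_len + 1, 0)
    return 1 - (1 - p_site) ** positions


def score_words(sequences, word_len):
    """Rank every observed word by how surprising its sequence coverage is.

    Returns (tail_probability, word, sequences_containing_word) tuples,
    most significant first. Assumes a non-empty list of sequences.
    """
    n = len(sequences)
    mean_len = sum(len(s) for s in sequences) // n
    p = background_hit_prob(word_len, mean_len)

    # Collect every distinct word of the requested length.
    words = {
        s[i : i + word_len]
        for s in sequences
        for i in range(len(s) - word_len + 1)
    }

    scored = []
    for word in words:
        k = sum(1 for s in sequences if word in s)  # sequences hit, not sites
        scored.append((binomial_tail(n, k, p), word, k))
    return sorted(scored)
```

Calling `score_words(sequences, 7)` on a list of DNA strings returns the candidate words ordered by tail probability, so the lowest-probability words could seed a later refinement step. A real background model would normally be estimated from genomic base or word frequencies, not fixed at 0.25.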

SUBMITTER: Lu CC 

PROVIDER: S-EPMC2638147 | biostudies-literature | 2008 Dec

REPOSITORIES: biostudies-literature

Publications

Extracting transcription factor binding sites from unaligned gene sequences with statistical models.

Lu Chung-Chin, Yuan Wei-Hao, Chen Te-Ming

BMC Bioinformatics, 2008 Dec 12



Similar Datasets

| S-EPMC2241927 | biostudies-literature
| S-EPMC3240832 | biostudies-literature
| S-EPMC5644988 | biostudies-literature
| S-EPMC5481346 | biostudies-literature
| S-EPMC7736823 | biostudies-literature
| S-EPMC2908699 | biostudies-literature
| S-EPMC2647310 | biostudies-literature
| S-EPMC3639912 | biostudies-literature
| S-EPMC2800119 | biostudies-literature
| S-EPMC7826281 | biostudies-literature