Dataset Information

Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress.


ABSTRACT: In our previous studies, we found that the sites in prokaryotic genomes which are most susceptible to duplex destabilization under the negative superhelical stresses that occur in vivo are statistically highly significantly associated with intergenic regions that are known or inferred to contain promoters. In this report we investigate how this structural property, either alone or together with other structural and sequence attributes, may be used to search prokaryotic genomes for promoters.

We show that the propensity for stress-induced DNA duplex destabilization (SIDD) is closely associated with specific promoter regions. The extent of destabilization in promoter-containing regions is found to be bimodally distributed. When compared with DNA curvature, deformability, thermostability or sequence motif scores within the -10 region, SIDD is found to be the most informative DNA property regarding promoter locations in the E. coli K12 genome. SIDD properties alone perform better at detecting promoter regions than other programs trained on this genome. Because this approach has a very low false positive rate, it can be used to predict with high confidence the subset of promoters that are strongly destabilized. When SIDD properties are combined with -10 motif scores in a linear classification function, they predict promoter regions with better than 80% accuracy. When these methods were tested with promoter and non-promoter sequences from Bacillus subtilis, they achieved similar or higher accuracies. We also present a strictly SIDD-based predictor for annotating promoter sequences in complete microbial genomes.

In this report we show that the propensity to undergo stress-induced duplex destabilization (SIDD) is a distinctive structural attribute of many prokaryotic promoter sequences. We have developed methods to identify promoter sequences in prokaryotic genomes that use SIDD either as a sole predictor or in combination with other DNA structural and sequence properties. Although these methods cannot predict all the promoter-containing regions in a genome, they do find large sets of potential regions that have high probabilities of being true positives. This approach could be especially valuable for annotating those genomes about which there is limited experimental data.
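
The abstract mentions combining SIDD properties with -10 motif scores in a linear classification function. The following is only a minimal sketch of what such a function could look like, not the authors' implementation. It assumes both features are already computed per candidate region and oriented so that larger values are more promoter-like. The class name, the feature names, and all weights are hypothetical placeholders and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearPromoterClassifier:
    """Linear classification function over two pre-computed features.

    The weights and bias are hypothetical placeholders, not fitted values.
    """

    w_sidd: float   # weight on the SIDD-derived feature (assumed name)
    w_motif: float  # weight on the -10 motif score (assumed name)
    bias: float     # offset that sets the decision boundary

    def score(self, sidd: float, motif: float) -> float:
        # Weighted sum of the two features plus the offset.
        return self.w_sidd * sidd + self.w_motif * motif + self.bias

    def is_promoter(self, sidd: float, motif: float) -> bool:
        # Regions strictly on the positive side are called promoters.
        return self.score(sidd, motif) > 0.0


# Illustrative weights only; real values would be fitted to labelled
# promoter and non-promoter regions.
clf = LinearPromoterClassifier(w_sidd=1.0, w_motif=0.5, bias=-2.0)
```

In practice the weights would be estimated from training data, for example with a linear discriminant fit, and the decision threshold would be chosen to trade sensitivity against the false positive rate.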

SUBMITTER: Wang H 

PROVIDER: S-EPMC1468432 | biostudies-literature | 2006 May

REPOSITORIES: biostudies-literature

Publications

Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress.

Huiquan Wang, Craig J. Benham

BMC Bioinformatics, 2006-05-05


Similar Datasets

| S-EPMC7856248 | biostudies-literature
| S-EPMC7789693 | biostudies-literature
| S-EPMC2688272 | biostudies-literature
| S-EPMC3049762 | biostudies-literature
| S-EPMC4847265 | biostudies-literature
| S-EPMC5531587 | biostudies-literature
| S-EPMC2211533 | biostudies-literature
| S-EPMC8743544 | biostudies-literature
| S-EPMC4916984 | biostudies-literature
| S-EPMC6837872 | biostudies-literature