Dataset Information

A novel method for accurate operon predictions in all sequenced prokaryotes.

ABSTRACT: We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85% and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC 6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
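
The distance-only idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it ignores the comparative genomic measures and does not tailor itself to each genome. The `Gene` record, the function names, the example coordinates, and the 50 bp cutoff are all hypothetical choices made for illustration. The cutoff is exposed as a parameter because the abstract notes that operon spacing differs between genomes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def intergenic_distance(a: Gene, b: Gene) -> int:
    """Bases separating two genes; negative when the genes overlap."""
    first, second = sorted((a, b), key=lambda g: g.start)
    return second.start - first.end - 1


def same_strand_pairs(genes: List[Gene]) -> List[Tuple[Gene, Gene]]:
    """Adjacent gene pairs on the same strand (the only operon candidates)."""
    ordered = sorted(genes, key=lambda g: g.start)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if a.strand == b.strand]


def predict_operon_pairs(
    genes: List[Gene], max_distance: int = 50  # hypothetical cutoff, not from the paper
) -> List[Tuple[str, str, bool]]:
    """Label each same-strand adjacent pair as co-transcribed if it is closely spaced."""
    return [
        (a.name, b.name, intergenic_distance(a, b) <= max_distance)
        for a, b in same_strand_pairs(genes)
    ]


# Hypothetical coordinates, for illustration only.
genes = [
    Gene("g1", 100, 400, "+"),
    Gene("g2", 410, 900, "+"),
    Gene("g3", 1200, 1800, "+"),
    Gene("g4", 1850, 2500, "-"),
]
predictions = predict_operon_pairs(genes)
```

Here `g1`/`g2` are 9 bp apart and are labelled as a same-operon pair, `g2`/`g3` are 299 bp apart and are not, and `g3`/`g4` are skipped because they lie on opposite strands. A fixed cutoff like this is exactly what the abstract argues against applying across genomes, which is why a per-genome value would be passed as `max_distance`.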

SUBMITTER: Price MN

PROVIDER: S-EPMC549399 | biostudies-literature | 2005

REPOSITORIES: biostudies-literature

Publications

A novel method for accurate operon predictions in all sequenced prokaryotes.

Morgan N. Price, Katherine H. Huang, Eric J. Alm, Adam P. Arkin

Nucleic Acids Research, 2005-02-08, issue 3

PMID: 15701760

Similar Datasets

Does the "surprisingly popular" method yield accurate crowdsourced predictions?
| S-EPMC7658271 | biostudies-literature

Transcriptome dynamics-based operon prediction in prokaryotes.
| S-EPMC4235196 | biostudies-literature

Capping-RACE: a simple, accurate, and sensitive 5' RACE method for use in prokaryotes.
| S-EPMC6265449 | biostudies-literature

Operon prediction for sequenced bacterial genomes without experimental information.
| S-EPMC1800777 | biostudies-literature

RepurposeVS: A Drug Repurposing-Focused Computational Method for Accurate Drug-Target Signature Predictions.
| S-EPMC5848469 | biostudies-literature

The distinctive signatures of promoter regions and operon junctions across prokaryotes.
| S-EPMC1557821 | biostudies-literature

How accurate can genetic predictions be?
| S-EPMC3534619 | biostudies-literature

A novel method of amplification of FFPET derived-RNA enables accurate disease classification with microarrays
2010-06-09 | E-GEOD-19246 | biostudies-arrayexpress

Accurate protein stability predictions from homology models.
| S-EPMC9729920 | biostudies-literature

Accurate Physical Property Predictions via Deep Learning.
| S-EPMC8912091 | biostudies-literature