Dataset Information

Seeking an ancient enzyme in Methanococcus jannaschii using ORF, a program based on predicted secondary structure comparisons.


ABSTRACT: We have developed a simple procedure to identify protein homologs in genomic databases. The program, called ORF, is based on comparisons of predicted secondary structure. Protein structure is far better conserved than amino acid sequence, and structure-based methods have been effective in exploiting this fact to find homologs, even among proteins with scant sequence identity. ORF is a secondary structure-based method that operates solely on predictions from sequence and requires no experimentally determined information about the structure. The approach is illustrated by an example: Thymidylate synthase, a highly conserved enzyme essential to thymidine biosynthesis in both prokaryotes and eukaryotes, is thought to be used by Archaea, but a corresponding gene has yet to be identified. Here, a candidate thymidylate synthase is identified as a previously unassigned open reading frame from the genome of Methanococcus jannaschii, viz., MJ0757. Using primary structure information alone, the optimally aligned sequence identity between MJ0757 and Escherichia coli thymidylate synthase is 7%, well below the threshold of sensitivity for detection by sequence-based methods.
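The contrast drawn in the abstract, between low sequence identity and conserved secondary structure, can be illustrated with a small sketch. This is not the ORF program and does not reproduce its scoring. The helper name `percent_identity` and the toy strings below are invented for illustration. The sketch assumes two pre-aligned strings of equal length, either residues or three-state secondary structure labels (H, E, C), and reports the percentage of non-gap columns that match.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns whose symbols are identical.

    Works for any pair of pre-aligned strings: amino acid sequences or
    predicted secondary structure strings (e.g. H/E/C labels).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned strings must have equal length")
    # Keep only columns where neither string has a gap character.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy, made-up data (NOT taken from MJ0757 or E. coli thymidylate synthase).
seq_a = "MKTAYIAKQR"
seq_b = "GLSAEDVNHW"
ss_a = "CHHHHHHCEE"   # hypothetical predicted secondary structure for seq_a
ss_b = "CHHHHHCCEE"   # hypothetical predicted secondary structure for seq_b

sequence_identity = percent_identity(seq_a, seq_b)
structure_agreement = percent_identity(ss_a, ss_b)
print(f"sequence identity:   {sequence_identity:.0f}%")
print(f"structure agreement: {structure_agreement:.0f}%")
```

In this toy case the two sequences share almost no residues while their secondary structure strings agree at most positions, which is the kind of signal a secondary-structure-based comparison is meant to pick up.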

SUBMITTER: Aurora R 

PROVIDER: S-EPMC19652 | biostudies-literature | 1998 Mar

REPOSITORIES: biostudies-literature

Publications

Seeking an ancient enzyme in Methanococcus jannaschii using ORF, a program based on predicted secondary structure comparisons.

Aurora R, Rose GD

Proceedings of the National Academy of Sciences of the United States of America, 1998-03-01 (6)



Similar Datasets

| S-EPMC2743043 | biostudies-literature
| S-EPMC5624517 | biostudies-literature
| S-EPMC4805478 | biostudies-literature
| S-EPMC2699501 | biostudies-literature
| S-EPMC134845 | biostudies-literature
| S-EPMC107589 | biostudies-literature
| S-EPMC2206692 | biostudies-other
2005-01-01 | E-TIGR-18 | biostudies-arrayexpress
| S-EPMC55802 | biostudies-literature
| S-EPMC29280 | biostudies-literature