Dataset Information

Evolutionarily consistent families in SCOP: sequence, structure and function.


ABSTRACT:

Background

SCOP is a hierarchical domain classification system for proteins of known structure. The superfamily level has a clear definition: protein domains belong to the same superfamily if there is structural, functional and sequence evidence for a common evolutionary ancestor. Superfamilies are sub-classified into families; however, the basis for the family-level groupings is less clear. Do SCOP families group together domains with similar sequence, with similar structure, or with a common function? We address these questions and, most importantly, ask whether each family represents a distinct phylogenetic group within its superfamily.

Results

Several phylogenetic trees were generated for each superfamily: one derived from a multiple sequence alignment, one based on structural distances, and the final two from presence/absence of GO terms or EC numbers assigned to domains. The topologies of the resulting trees and confidence values were compared to the SCOP family classification.
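As an illustration of the function-based comparison described above, the following is a minimal sketch, not the authors' pipeline. It assumes Jaccard distance over presence/absence annotation sets and uses average-linkage (UPGMA) clustering as a simple stand-in for a tree-building method, since the abstract does not name the one used. The domain names (`d1` to `d4`), term labels (`t0`, `t1`, ...) and family names are hypothetical toy data.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of annotation terms."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def mean_linkage(leaves_a, leaves_b, dist):
    """Average leaf-to-leaf distance between two clusters."""
    total = sum(dist[frozenset((x, y))] for x in leaves_a for y in leaves_b)
    return total / (len(leaves_a) * len(leaves_b))


def upgma(terms):
    """Build a nested-tuple tree from {domain: set_of_terms} (assumed method)."""
    names = sorted(terms)
    dist = {
        frozenset((x, y)): jaccard_distance(terms[x], terms[y])
        for x, y in combinations(names, 2)
    }
    # Each key is a (sub)tree; each value lists the leaves under it.
    clusters = {name: [name] for name in names}
    while len(clusters) > 1:
        left, right = min(
            combinations(clusters, 2),
            key=lambda p: mean_linkage(clusters[p[0]], clusters[p[1]], dist),
        )
        clusters[(left, right)] = clusters.pop(left) + clusters.pop(right)
    return next(iter(clusters))


def clades(tree):
    """Return the leaf set of every node; the last entry is the whole tree."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    result, leaves = [], frozenset()
    for child in tree:
        child_clades = clades(child)
        result.extend(child_clades)
        leaves |= child_clades[-1]
    result.append(leaves)
    return result


def family_is_clade(tree, members):
    """True if the family's domains form exactly one subtree."""
    return frozenset(members) in clades(tree)


# Hypothetical toy annotations for one superfamily (placeholders, not real GO terms).
toy_terms = {
    "d1": {"t0", "t1", "t2"},
    "d2": {"t0", "t1", "t2", "t3"},
    "d3": {"t0", "t7", "t8"},
    "d4": {"t0", "t7", "t8", "t9"},
}
families = {"fam_A": {"d1", "d2"}, "fam_B": {"d3", "d4"}}

tree = upgma(toy_terms)
consistent = {fam: family_is_clade(tree, m) for fam, m in families.items()}
```

Here a family counts as consistent with the tree only if its members form a single subtree. The same check could be applied unchanged to trees built from sequence or structural distances.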

Conclusions

We show that SCOP family groupings are evolutionarily consistent to a very high degree with respect to classical sequence phylogenetics. The trees built from automatically generated structural distances correlate well with the hand-annotated SCOP groupings but are not always consistent with them. Trees derived from functional data are less consistent with the family level than those from structure or sequence, though the majority still agree. Much of the GO and EC annotation applies directly to one family or a subset of a family; relatively few terms apply at the superfamily level. Maximum sequence diversity within a family is on average 22% but close to zero for superfamilies.

SUBMITTER: Pethica RB 

PROVIDER: S-EPMC3495643 | biostudies-literature | 2012 Oct

REPOSITORIES: biostudies-literature

Publications

Evolutionarily consistent families in SCOP: sequence, structure and function.

Pethica RB, Levitt M, Gough J

BMC Structural Biology, 2012-10-18


Similar Datasets

| S-EPMC308773 | biostudies-literature
| S-EPMC99153 | biostudies-literature
| S-EPMC148146 | biostudies-other
| S-EPMC194751 | biostudies-literature
| S-EPMC4702857 | biostudies-literature
| S-EPMC1941674 | biostudies-literature
| S-EPMC3106286 | biostudies-literature
| S-EPMC7730384 | biostudies-literature
2014-06-30 | E-GEOD-48324 | biostudies-arrayexpress
| S-EPMC93682 | biostudies-literature