Dataset Information

Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era.


ABSTRACT: Recently developed methods have shown considerable promise in predicting residue-residue contacts in protein 3D structures using evolutionary covariance information. However, these methods require large numbers of evolutionarily related sequences to robustly assess the extent of residue covariation, and the larger the protein family, the more likely that contact information is unnecessary because a reasonable model can be built based on the structure of a homolog. Here we describe a method that integrates sequence coevolution and structural context information using a pseudolikelihood approach, allowing more accurate contact predictions from fewer homologous sequences. We rigorously assess the utility of predicted contacts for protein structure prediction using large and representative sequence and structure databases from recent structure prediction experiments. We find that contact predictions are likely to be accurate when the number of aligned sequences (with sequence redundancy reduced to 90%) is greater than five times the length of the protein, and that accurate predictions are likely to be useful for structure modeling if the aligned sequences are more similar to the protein of interest than to the closest homolog of known structure. These conditions are currently met by 422 of the protein families collected in the Pfam database.
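The two conditions stated in the abstract can be written as a simple check. The following is a minimal sketch, not the authors' code: the class `FamilyStats`, its field names, and the use of a single similarity score per comparison are assumptions made for illustration; the abstract does not say how similarity is measured.

```python
from dataclasses import dataclass


@dataclass
class FamilyStats:
    """Hypothetical summary of one protein family (all names are illustrative)."""
    length: int                      # length of the protein of interest, in residues
    n_seqs_nr90: int                 # aligned sequences after redundancy reduction to 90%
    sim_to_target: float             # similarity of aligned sequences to the protein of interest
    sim_to_structure_homolog: float  # similarity of aligned sequences to the closest homolog of known structure


def predictions_likely_accurate(stats: FamilyStats) -> bool:
    # Abstract: accurate when the number of aligned sequences
    # is greater than five times the length of the protein.
    return stats.n_seqs_nr90 > 5 * stats.length


def predictions_likely_useful(stats: FamilyStats) -> bool:
    # Abstract: accurate predictions are likely useful for modeling when the
    # aligned sequences are more similar to the protein of interest than to
    # the closest homolog of known structure.
    return (
        predictions_likely_accurate(stats)
        and stats.sim_to_target > stats.sim_to_structure_homolog
    )


if __name__ == "__main__":
    example = FamilyStats(length=100, n_seqs_nr90=600,
                          sim_to_target=0.40, sim_to_structure_homolog=0.30)
    print(predictions_likely_accurate(example), predictions_likely_useful(example))
```

The comparison is strict (`>`) because the abstract says "greater than five times"; a family with exactly five sequences per residue would not pass under this reading.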

SUBMITTER: Kamisetty H 

PROVIDER: S-EPMC3785744 | biostudies-literature | 2013 Sep

REPOSITORIES: biostudies-literature

Publications

Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era.

Hetunandan Kamisetty, Sergey Ovchinnikov, David Baker

Proceedings of the National Academy of Sciences of the United States of America, 2013-09-05, issue 39


Similar Datasets

| S-EPMC3226919 | biostudies-literature
| S-EPMC3154634 | biostudies-literature
| S-EPMC6851495 | biostudies-literature
| S-EPMC3348156 | biostudies-literature
| S-EPMC3261191 | biostudies-literature
| S-EPMC5820155 | biostudies-literature
| S-EPMC4621035 | biostudies-literature
| S-EPMC5860164 | biostudies-literature
| S-EPMC5870574 | biostudies-literature
| S-EPMC3823628 | biostudies-literature