
Dataset Information


Active site prediction using evolutionary and structural information.


ABSTRACT: The identification of catalytic residues is a key step in understanding the function of enzymes. While a variety of computational methods have been developed for this task, accuracies have remained fairly low. The best existing method exploits information from sequence and structure to achieve a precision (the fraction of predicted catalytic residues that are catalytic) of 18.5% at a corresponding recall (the fraction of catalytic residues identified) of 57% on a standard benchmark. Here we present a new method, Discern, which provides a significant improvement over the state-of-the-art through the use of statistical techniques to derive a model with a small set of features that are jointly predictive of enzyme active sites. In cross-validation experiments on two benchmark datasets from the Catalytic Site Atlas and CATRES resources containing a total of 437 manually curated enzymes spanning 487 SCOP families, Discern increases catalytic site recall between 12% and 20% over methods that combine information from both sequence and structure, and by ≥50% over methods that make use of sequence conservation signal only. Controlled experiments show that Discern's improvement in catalytic residue prediction is derived from the combination of three ingredients: the use of the INTREPID phylogenomic method to extract conservation information; the use of 3D structure data, including features computed for residues that are proximal in the structure; and a statistical regularization procedure to prevent overfitting.
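
The precision and recall definitions quoted in the abstract can be made concrete with a minimal sketch. The function name `precision_recall` and the residue positions below are hypothetical and chosen only for illustration; they are not taken from Discern or from the benchmark datasets.

```python
def precision_recall(predicted, catalytic):
    """Return (precision, recall) for two collections of residue positions.

    precision: fraction of predicted catalytic residues that are catalytic
    recall:    fraction of catalytic residues that were identified
    """
    predicted = set(predicted)
    catalytic = set(catalytic)
    true_positives = len(predicted & catalytic)
    # Guard against empty inputs so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(catalytic) if catalytic else 0.0
    return precision, recall


# Hypothetical toy enzyme: positions are illustrative, not real annotations.
predicted_sites = {12, 45, 57, 102, 195}   # residues a predictor flagged
catalytic_sites = {57, 102, 195, 214}      # residues curated as catalytic

precision, recall = precision_recall(predicted_sites, catalytic_sites)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the five predictions are correct and three of the four catalytic residues are recovered, so precision is 0.60 and recall is 0.75. Reporting precision at a fixed recall, as the abstract does, amounts to choosing the score threshold at which the predictor reaches that recall.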

SUBMITTER: Sankararaman S 

PROVIDER: S-EPMC2828116 | biostudies-literature | 2010 Mar

REPOSITORIES: biostudies-literature


Publications

Active site prediction using evolutionary and structural information.

Sriram Sankararaman, Fei Sha, Jack F. Kirsch, Michael I. Jordan, Kimmen Sjölander

Bioinformatics (Oxford, England), 2010-01-14, issue 5



Similar Datasets

| S-EPMC3044306 | biostudies-literature
| S-EPMC3161874 | biostudies-other
| S-EPMC9235490 | biostudies-literature
| S-EPMC7657543 | biostudies-literature
| S-EPMC3549808 | biostudies-other
| S-EPMC3925371 | biostudies-literature
| S-EPMC10326337 | biostudies-literature
| S-EPMC1891676 | biostudies-literature
| S-EPMC2258193 | biostudies-literature
| S-EPMC2567998 | biostudies-literature