Dataset Information

Towards fully automated structure-based function prediction in structural genomics: a case study.


ABSTRACT: As the global Structural Genomics projects have picked up pace, the number of structures annotated in the Protein Data Bank as "hypothetical protein" or "unknown function" has grown significantly. A major challenge now involves the development of computational methods to assign functions to these proteins accurately and automatically. As part of the Midwest Center for Structural Genomics (MCSG), we have developed a fully automated functional analysis server, ProFunc, which performs a battery of analyses on a submitted structure. The analyses combine a number of sequence-based and structure-based methods to identify functional clues. After the first stage of the Protein Structure Initiative (PSI), we review the success of the pipeline and the importance of structure-based function prediction. As a dataset, we have chosen all structures solved by the MCSG during the 5 years of the first PSI. Our analysis suggests that two of the structure-based methods are particularly successful and provide examples of local similarity that is difficult to identify using current sequence-based methods. No one method is successful in all cases, so, through the use of a number of complementary sequence and structural approaches, the ProFunc server increases the chances that at least one method will find a significant hit that can help elucidate function. Manual assessment of the results is a time-consuming process and subject to individual interpretation and human error. We present a method based on the Gene Ontology (GO) schema using GO-slims that can allow the automated assessment of hits with a success rate approaching that of expert manual assessment.
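The abstract does not spell out how the GO-slim assessment works, so the following is only a minimal sketch of one way such a comparison could be set up, not the authors' procedure. It assumes that a hit counts as correct when its GO terms and the reference annotation share at least one GO-slim term. The slim table, the function names, and the "share at least one term" rule are all illustrative assumptions.

```python
# Toy GO-slim table: fine-grained GO term -> broad slim term.
# Hand-written for illustration (an assumption); a real run would derive
# the mapping from the GO graph and a published GO-slim subset.
TOY_SLIM = {
    "GO:0016787": "GO:0003824",  # hydrolase activity -> catalytic activity
    "GO:0016301": "GO:0003824",  # kinase activity -> catalytic activity
    "GO:0003677": "GO:0005488",  # DNA binding -> binding
}


def to_slim(go_terms, slim_map=TOY_SLIM):
    """Map GO terms to slim terms; terms missing from the table are kept as-is."""
    return {slim_map.get(term, term) for term in go_terms}


def hit_agrees(hit_terms, reference_terms, slim_map=TOY_SLIM):
    """Return True if hit and reference share at least one slim term (assumed rule)."""
    return bool(to_slim(hit_terms, slim_map) & to_slim(reference_terms, slim_map))


def success_rate(pairs, slim_map=TOY_SLIM):
    """Fraction of (hit_terms, reference_terms) pairs judged to agree."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    agreed = sum(hit_agrees(hit, ref, slim_map) for hit, ref in pairs)
    return agreed / len(pairs)
```

Under this toy table, a hydrolase hit would be scored as agreeing with a kinase reference at the slim level (both collapse to catalytic activity), whereas a DNA-binding hit would not. That coarseness is the trade-off of comparing at slim level rather than at the level of specific terms.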

SUBMITTER: Watson JD 

PROVIDER: S-EPMC2566530 | biostudies-literature | 2007 Apr

REPOSITORIES: biostudies-literature

Publications

Towards fully automated structure-based function prediction in structural genomics: a case study.

Watson JD, Sanderson S, Ezersky A, Savchenko A, Edwards A, Orengo C, Joachimiak A, Laskowski RA, Thornton JM

Journal of Molecular Biology, 2007-01-30, issue 5



Similar Datasets

S-EPMC2782770 | biostudies-literature
S-EPMC5436848 | biostudies-literature
S-EPMC28018 | biostudies-literature
S-EPMC4684744 | biostudies-literature
S-EPMC8673552 | biostudies-literature
S-EPMC3001128 | biostudies-literature
S-EPMC4105012 | biostudies-literature
S-EPMC8123492 | biostudies-literature
S-EPMC6853954 | biostudies-literature
S-EPMC2920418 | biostudies-literature