Dataset Information

A fast SCOP fold classification system using content-based E-Predict algorithm.

ABSTRACT:

Background

Domain experts manually construct the Structural Classification of Proteins (SCOP) database to categorize and compare protein structures. Although the SCOP database is believed to be more reliable than classifications produced by other methods, building it is labor intensive. To mimic the human classification process, we develop an automatic SCOP fold classification system that assigns known SCOP folds to newly discovered proteins and recognizes novel folds.

Results

With a sufficient amount of ground truth data, our system assigns known folds to newly discovered proteins in the latest SCOP v1.69 release with 92.17% accuracy. It also recognizes novel folds with 89.27% accuracy using 10-fold cross-validation. The average response times to complete classification for proteins with 500 and 1409 amino acids are 4.1 and 17.4 seconds, respectively. Compared with several structural alignment algorithms, our approach outperforms previous methods in both classification accuracy and efficiency.
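The cross-validation protocol mentioned above can be sketched as follows. This is only an illustration of how a 10-fold accuracy figure is computed in general, not the E-Predict pipeline: the 1-nearest-neighbor classifier, the two-dimensional toy vectors, and every function name here are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k near-equal folds."""
    if n_samples < k:
        raise ValueError("need at least one sample per fold")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_neighbor_predict(train, query):
    """Return the label of the training vector closest to `query`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda item: sq_dist(item[0], query))[1]


def cross_validated_accuracy(samples, k=10, seed=0):
    """Hold each fold out once, train on the rest, and average the accuracies."""
    folds = k_fold_indices(len(samples), k, seed)
    fold_accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [s for i, s in enumerate(samples) if i not in held_out_set]
        correct = sum(
            nearest_neighbor_predict(train, samples[i][0]) == samples[i][1]
            for i in held_out
        )
        fold_accuracies.append(correct / len(held_out))
    return sum(fold_accuracies) / k


def make_toy_data(n_per_class=50, seed=1):
    """Two well-separated synthetic classes standing in for fold labels (hypothetical)."""
    rng = random.Random(seed)
    data = []
    for label, center in (("fold_A", (0.0, 0.0)), ("fold_B", (10.0, 10.0))):
        for _ in range(n_per_class):
            vec = (center[0] + rng.gauss(0, 1), center[1] + rng.gauss(0, 1))
            data.append((vec, label))
    return data


if __name__ == "__main__":
    accuracy = cross_validated_accuracy(make_toy_data(), k=10)
    print(f"10-fold cross-validated accuracy: {accuracy:.2%}")
```

Every sample is used for testing exactly once, so the averaged figure reflects performance on data the classifier did not see during that round.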

Conclusion

In this paper, we build an advanced, non-parametric classifier to accelerate the manual classification processes of SCOP. With satisfactory ground truth data from the SCOP database, our approach identifies relevant domain knowledge and yields reasonably accurate classifications. Our system is publicly accessible at http://ProteinDBS.rnet.missouri.edu/E-Predict.php.
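As a rough illustration of how a non-parametric classifier can both assign a known label and flag an unfamiliar input, the sketch below uses nearest-neighbor lookup with a distance cutoff. It is not the authors' method: the feature vectors, the `novelty_threshold` parameter, and the function name are hypothetical choices for this example.

```python
import math


def classify_or_flag_novel(query, labeled_vectors, novelty_threshold):
    """Return (label, distance) of the nearest labeled vector.

    If even the nearest neighbor is farther away than `novelty_threshold`,
    return ("novel", distance) instead of forcing a known label.
    """
    best_label, best_dist = None, math.inf
    for vec, label in labeled_vectors:
        d = math.dist(vec, query)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist > novelty_threshold:
        return "novel", best_dist
    return best_label, best_dist


if __name__ == "__main__":
    # Hypothetical feature vectors for two known classes.
    known = [((0.0, 0.0), "fold_A"), ((10.0, 0.0), "fold_B")]
    print(classify_or_flag_novel((1.0, 0.0), known, novelty_threshold=3.0))
    print(classify_or_flag_novel((5.0, 20.0), known, novelty_threshold=3.0))
```

The cutoff trades off the two error types: a small value flags more inputs as novel, while a large value forces more inputs into existing classes.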

SUBMITTER: Chi PH

PROVIDER: S-EPMC1579235 | biostudies-literature | 2006 Jul

REPOSITORIES: biostudies-literature

Publications

A fast SCOP fold classification system using content-based E-Predict algorithm.

Chi Pin-Hao, Shyu Chi-Ren, Xu Dong

BMC Bioinformatics, 2006-07-26

PMID: 16872501

Similar Datasets

Project description: In mammalian ventricular cardiomyocytes, invaginations of the surface membrane form the transverse tubular system (T-system), which consists of transverse tubules (TTs) that align with sarcomeres and Z-lines as well as longitudinal tubules (LTs) that are present between Z-lines in some species. In many cardiac disease etiologies, the T-system is perturbed, which is believed to promote spatially heterogeneous, dyssynchronous Ca2+ release and inefficient contraction. In general, T-system characterization approaches have been directed primarily at isolated cells and do not detect subcellular T-system heterogeneity. Here, we present MatchedMyo, a matched-filter-based algorithm for subcellular T-system characterization in isolated cardiomyocytes and millimeter-scale myocardial sections. The algorithm utilizes "filters" representative of TTs, LTs, and T-system absence. Application of the algorithm to cardiomyocytes isolated from rat disease models of myocardial infarction (MI), dilated cardiomyopathy induced via aortic banding, and sham surgery confirmed and quantified heterogeneous T-system structure and remodeling. Cardiomyocytes from post-MI hearts exhibited increasing T-system disarray as proximity to the infarct increased. We found significant (p < 0.05, Welch's t-test) increases in LT density within cardiomyocytes proximal to the infarct (12 ± 3%, data reported as mean ± SD, n = 3) versus sham (4 ± 2%, n = 5), but not distal to the infarct (7 ± 1%, n = 3). The algorithm also detected decreases in TTs within 5° of the myocyte minor axis for isolated aortic banding (36 ± 9%, n = 3) and MI cardiomyocytes located intermediate (37 ± 4%, n = 3) and proximal (34 ± 4%, n = 3) to the infarct versus sham (57 ± 12%, n = 5). Application of bootstrapping to rabbit MI tissue revealed distal sections comprised 18.9 ± 1.0% TTs, whereas proximal sections comprised 10.1 ± 0.8% TTs (p < 0.05), a 46.6% decrease. The matched-filter approach therefore provides a robust and scalable technique for T-system characterization from isolated cells through millimeter-scale myocardial sections.
