Dataset Information

Protein disorder prediction by condensed PSSM considering propensity for order or disorder.

ABSTRACT:

Background

A growing number of disordered regions have been discovered in protein sequences, and many of them have been found to be functionally significant. Previous studies reveal that the disordered regions of a protein can be predicted from its primary structure, the amino acid sequence. One widely accepted observation is that ordered regions usually have a compositional bias toward hydrophobic amino acids, whereas disordered regions are biased toward charged amino acids. Recent studies further show that employing evolutionary information such as position specific scoring matrices (PSSMs) improves the prediction accuracy of protein disorder. As more machine learning techniques are introduced to protein disorder detection, extracting useful features that carry biological insight is attracting increasing attention.

Results

This paper first studies the effect of a condensed position specific scoring matrix with respect to physicochemical properties (PSSMP) on prediction accuracy, where the PSSMP is derived by merging the amino acid columns of a PSSM that belong to a given property into a single column. Next, we decompose each conventional physicochemical property of amino acids into two disjoint groups that have a propensity for order and for disorder, respectively, and show experimentally that some of the new properties perform better than their parent properties in predicting protein disorder. To obtain an effective and compact feature set for this problem, we propose a hybrid feature selection method that combines the efficiency of univariate analysis with the effectiveness of stepwise feature selection, which explores combinations of multiple features. The experimental results show that the selected feature set improves the performance of a classifier built with Radial Basis Function Networks (RBFN) compared with feature sets constructed from PSSMs or from PSSMPs that simply adopt the conventional physicochemical properties.
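The column-merging step that produces a PSSMP can be illustrated with a minimal sketch. The abstract does not say how the columns are combined, so summation is an assumption here, as are the column order (the PSI-BLAST convention), the function name `condense_pssm`, and the two property groups, which are hypothetical examples rather than the groupings used in the paper.

```python
# Assumed column order of a 20-column PSSM (PSI-BLAST convention).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Hypothetical property groups, for illustration only; the paper's
# actual groupings are not given in the abstract.
PROPERTY_GROUPS = {
    "hydrophobic": "AVLIMFWC",
    "charged": "DEKR",
}


def condense_pssm(pssm, groups=PROPERTY_GROUPS, alphabet=AMINO_ACIDS):
    """Merge the amino acid columns of each property into one column.

    pssm   : list of rows, one per sequence position, each holding one
             score per amino acid in `alphabet` order.
    groups : mapping from property name to the amino acids it covers.
    Returns (property_names, condensed_rows). Merging by summation is
    an assumption of this sketch.
    """
    index = {aa: i for i, aa in enumerate(alphabet)}
    names = list(groups)
    condensed = []
    for row in pssm:
        if len(row) != len(alphabet):
            raise ValueError("each PSSM row needs one score per amino acid")
        condensed.append(
            [sum(row[index[aa]] for aa in groups[name]) for name in names]
        )
    return names, condensed


# Toy two-position PSSM: a row of ones and a row of column indices.
toy_pssm = [[1] * 20, list(range(20))]
names, pssmp = condense_pssm(toy_pssm)
```

Each position is thereby described by one score per property instead of twenty scores per amino acid, which is what makes the feature set compact. Splitting a property into order-prone and disorder-prone subgroups, as the abstract describes, would amount to passing two smaller disjoint groups in place of the parent group.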

Conclusion

Distinguishing disordered regions from ordered regions in protein sequences facilitates the exploration of protein structures and functions. Results on independent testing data reveal that the proposed prediction model, DisPSSMP, performs best among several existing packages for the same task, neither under-predicting nor over-predicting the disordered regions. Furthermore, the selected properties are shown to be useful in finding discriminating patterns for order/disorder classification.

SUBMITTER: Su CT

PROVIDER: S-EPMC1526762 | biostudies-literature | 2006 Jun

REPOSITORIES: biostudies-literature

Publications

Protein disorder prediction by condensed PSSM considering propensity for order or disorder.

Chung-Tsai Su, Chien-Yu Chen, Yu-Yen Ou

BMC Bioinformatics, 2006 Jun 23

PMID: 16796745

Similar Datasets

ODiNPred: comprehensive prediction of protein order and disorder.
| S-EPMC7479119 | biostudies-literature

SODA: prediction of protein solubility from disorder and aggregation propensity.
| S-EPMC7059794 | biostudies-literature

Real value prediction of protein solvent accessibility using enhanced PSSM features.
| S-EPMC2638152 | biostudies-literature

Computational Prediction of MoRFs, Short Disorder-to-order Transitioning Protein Binding Regions.
| S-EPMC6453775 | biostudies-literature

Protein structure along the order-disorder continuum.
| S-EPMC3125948 | biostudies-literature

Predicting protein folds with fold-specific PSSM libraries.
| S-EPMC3116844 | biostudies-literature

Tunable order-disorder continuum in protein-DNA interactions.
| S-EPMC6158747 | biostudies-literature

Rotational order-disorder structure of fluorescent protein FP480.
| S-EPMC2733879 | biostudies-literature

Stereochemistry in the disorder-order continuum of protein interactions.
| S-EPMC11655355 | biostudies-literature

Order propensity of an intrinsically disordered protein, the cyclin-dependent-kinase inhibitor Sic1.
| S-EPMC2754754 | biostudies-literature