ABSTRACT:
Background
Hydroxylation is an important post-translational modification and is closely related to various diseases. Besides biotechnological experiments, in silico prediction methods offer an alternative way to identify potential hydroxylation sites.
Methodology/principal findings
In this study, we developed a novel sequence-based method for identifying the two main types of hydroxylation sites, hydroxyproline and hydroxylysine. First, feature selection was performed on three kinds of features: amino acid indices (AAindex), which cover various physicochemical and biochemical properties of amino acids; Position-Specific Scoring Matrices (PSSM), which represent the evolutionary information of amino acids; and the structural disorder of amino acids, within a sliding window of 13 amino acids. The prediction model was then built using the incremental feature selection method. The resulting prediction accuracies, evaluated by jackknife cross-validation, are 76.0% on the hydroxyproline dataset and 82.1% on the hydroxylysine dataset. Feature analysis suggested that the physicochemical properties, biochemical properties, and evolutionary information of amino acids contribute substantially to the identification of protein hydroxylation sites, while structural disorder had little relation to protein hydroxylation. It was also found that amino acids adjacent to the hydroxylation site tend to exert more influence on hydroxylation determination than those at other positions.
Conclusions/significance
These findings may provide useful insights for exploiting the mechanisms of hydroxylation.
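The windowing and model-building steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The padding symbol `X`, the function names, the choice of a 1-nearest-neighbour classifier, and the assumption that features arrive already ranked are all hypothetical; the abstract specifies only the 13-residue window, incremental feature selection, and jackknife evaluation.

```python
from typing import List, Sequence, Tuple

WINDOW = 13          # window length stated in the abstract
HALF = WINDOW // 2   # 6 residues on each side of the candidate site
PAD = "X"            # hypothetical padding symbol for positions past the termini


def extract_windows(seq: str, residue: str) -> List[Tuple[int, str]]:
    """Return (0-based position, 13-residue window) for every `residue` in `seq`."""
    padded = PAD * HALF + seq + PAD * HALF
    # seq[i] sits at padded[i + HALF], so padded[i:i + WINDOW] is centred on it.
    return [(i, padded[i:i + WINDOW]) for i, aa in enumerate(seq) if aa == residue]


def jackknife_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                       cols: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (assumed) on `cols`."""
    correct = 0
    for i in range(len(X)):
        # Nearest other sample by squared Euclidean distance over the chosen columns.
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((X[i][c] - X[j][c]) ** 2 for c in cols),
        )
        correct += int(y[nearest] == y[i])
    return correct / len(X)


def incremental_feature_selection(X: Sequence[Sequence[float]], y: Sequence[int],
                                  ranked: Sequence[int]) -> Tuple[List[int], float]:
    """Add pre-ranked features one at a time; keep the prefix with the best accuracy."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranked) + 1):
        acc = jackknife_accuracy(X, y, ranked[:k])
        if acc > best_acc:
            best_k, best_acc = k, acc
    return list(ranked[:best_k]), best_acc
```

For example, `extract_windows(sequence, "P")` lists candidate hydroxyproline windows and `extract_windows(sequence, "K")` lists candidate hydroxylysine windows. Each window would then be encoded with AAindex, PSSM, and disorder features before being passed to `incremental_feature_selection`.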
SUBMITTER: Hu LL
PROVIDER: S-EPMC3013141 | biostudies-literature | 2010 Dec
REPOSITORIES: biostudies-literature
Hu Le-Le, Niu Shen, Huang Tao, Wang Kai, Shi Xiao-He, Cai Yu-Dong
PLoS One, 2010 Dec 31, issue 12