
Dataset Information


Computational prediction of N-linked glycosylation incorporating structural properties and patterns.


ABSTRACT:

MOTIVATION: N-linked glycosylation occurs predominantly at the N-X-T/S motif, where X is any amino acid except proline. Not all N-X-T/S sequons are glycosylated, and a number of web servers have been developed that predict N-linked glycan occupancy from sequence and/or residue pattern information. None of the currently available servers, however, uses protein structural information to predict N-glycan occupancy.

RESULTS: Here, we describe a novel classifier algorithm, NGlycPred, for the prediction of glycan occupancy at N-X-T/S sequons. The algorithm uses both structural and residue pattern information and was trained on a set of glycosylated protein structures using the Random Forest algorithm. The best predictor achieved a balanced accuracy of 0.687 under 10-fold cross-validation on a curated dataset of 479 N-X-T/S sequons and outperformed sequence-based predictors when evaluated on the same dataset. The incorporation of structural information, including local contact order, surface accessibility/composition and secondary structure, thus improves the prediction accuracy of glycan occupancy at the N-X-T/S consensus sequon.

AVAILABILITY AND IMPLEMENTATION: NGlycPred is freely available to non-commercial users as a web-based server at http://exon.niaid.nih.gov/nglycpred/.
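The abstract relies on two ideas that a short example can make concrete: the N-X-T/S sequon definition and the balanced accuracy metric. The Python sketch below is illustrative only and is not part of NGlycPred. The function names `find_sequons` and `balanced_accuracy` are hypothetical, and balanced accuracy is assumed to mean the usual average of sensitivity and specificity, which the abstract does not spell out.

```python
import re

# N, then any residue except proline (P), then S or T.
# The lookahead reports overlapping sequons (e.g. "NNTS" holds two).
# Assumption: the input is a one-letter amino acid sequence; other
# characters are not validated here.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, three-residue motif) pairs."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


def balanced_accuracy(y_true, y_pred):
    """Average of sensitivity and specificity; labels are 1 (occupied) or 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Made-up toy sequence, not taken from the curated dataset.
    print(find_sequons("MNGTANPSNNTS"))
    print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

Finding a sequon is only the first step: as the abstract notes, not every N-X-T/S sequon is glycosylated, so a scan like this yields candidate sites for a classifier rather than a prediction of occupancy.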

SUBMITTER: Chuang GY 

PROVIDER: S-EPMC3426846 | biostudies-literature | 2012 Sep

REPOSITORIES: biostudies-literature


Publications

Computational prediction of N-linked glycosylation incorporating structural properties and patterns.

Gwo-Yu Chuang, Jeffrey C Boyington, M Gordon Joyce, Jiang Zhu, Gary J Nabel, Peter D Kwong, Ivelin Georgiev

Bioinformatics (Oxford, England), 2012 Jul 10, issue 17



Similar Datasets

| S-EPMC4054257 | biostudies-literature
| S-EPMC7226087 | biostudies-literature
| S-EPMC6483403 | biostudies-literature
| S-EPMC6645310 | biostudies-literature
| S-EPMC5552137 | biostudies-literature
| S-EPMC8658957 | biostudies-literature