
Dataset Information


N-GlyDE: a two-stage N-linked glycosylation site prediction incorporating gapped dipeptides and pattern-based encoding.


ABSTRACT: N-linked glycosylation is one of the predominant post-translational modifications involved in a number of biological functions. Since experimental characterization of glycosites is challenging, glycosite prediction is crucial. Several predictors have been made available and report high performance. Most of them evaluate their performance at every asparagine in protein sequences, not confined to asparagine in the N-X-S/T sequon. In this paper, we present N-GlyDE, a two-stage prediction tool trained on rigorously constructed non-redundant datasets to predict N-linked glycosites in the human proteome. The first stage uses a protein similarity voting algorithm trained on both glycoproteins and non-glycoproteins to predict a score for a protein to improve glycosite prediction. The second stage uses a support vector machine to predict N-linked glycosites by utilizing features of gapped dipeptides, pattern-based predicted surface accessibility, and predicted secondary structure. N-GlyDE's final predictions are derived from a weight adjustment of the second-stage prediction results based on the first-stage prediction score. Evaluated on N-X-S/T sequons of an independent dataset composed of 53 glycoproteins and 33 non-glycoproteins, N-GlyDE achieves an accuracy and MCC of 0.740 and 0.499, respectively, outperforming the compared tools. The N-GlyDE web server is available at http://bioapp.iis.sinica.edu.tw/N-GlyDE/ .
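
The abstract restricts evaluation to asparagines inside N-X-S/T sequons and reports accuracy and MCC. The following is a minimal sketch of those two ideas, not N-GlyDE's own code: it assumes the conventional sequon definition in which X is any residue except proline (the abstract does not state this), and the names `find_sequons`, `accuracy`, and `mcc` are hypothetical helpers introduced for illustration only.

```python
import math
import re

# Assumption: X in N-X-S/T is any residue except proline (conventional
# definition; not stated in the abstract). A lookahead is used so that
# overlapping sequons such as "NNSS" are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def accuracy(tp, tn, fp, fn):
    """Fraction of sequons classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy sequence (made up for illustration): N6 is followed by P, so it is skipped.
    print(find_sequons("MNGSANPTNNSS"))  # [2, 9, 10]
```

Only the asparagines returned by a scan like this would be passed to a site-level classifier and counted in the confusion matrix, which is what distinguishes sequon-restricted evaluation from scoring every asparagine in the sequence.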

SUBMITTER: Pitti T 

PROVIDER: S-EPMC6828726 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature


Publications

N-GlyDE: a two-stage N-linked glycosylation site prediction incorporating gapped dipeptides and pattern-based encoding.

Thejkiran Pitti, Ching-Tai Chen, Hsin-Nan Lin, Wai-Kok Choong, Wen-Lian Hsu, Ting-Yi Sung

Scientific Reports, 4 Nov 2019, issue 1



Similar Datasets

| S-EPMC3426846 | biostudies-literature
| S-EPMC8658957 | biostudies-literature
| S-EPMC2220009 | biostudies-literature
| S-EPMC2930451 | biostudies-literature
| S-EPMC4102394 | biostudies-literature
| S-EPMC5951161 | biostudies-literature
| S-EPMC3692120 | biostudies-literature
| S-EPMC5552137 | biostudies-literature