Dataset Information

N-GlyDE: a two-stage N-linked glycosylation site prediction incorporating gapped dipeptides and pattern-based encoding.

ABSTRACT: N-linked glycosylation is one of the predominant post-translational modifications and is involved in a number of biological functions. Since experimental characterization of glycosites is challenging, glycosite prediction is crucial. Several predictors have been made available and report high performance. Most of them evaluate their performance at every asparagine in protein sequences rather than only at asparagines within the N-X-S/T sequon. In this paper, we present N-GlyDE, a two-stage prediction tool trained on rigorously constructed non-redundant datasets to predict N-linked glycosites in the human proteome. The first stage uses a protein similarity voting algorithm, trained on both glycoproteins and non-glycoproteins, to predict a protein-level score that is used to improve glycosite prediction. The second stage uses a support vector machine to predict N-linked glycosites from gapped dipeptide features, pattern-based predicted surface accessibility, and predicted secondary structure. N-GlyDE's final predictions are derived by adjusting the weights of the second-stage prediction results according to the first-stage prediction score. Evaluated on N-X-S/T sequons of an independent dataset comprising 53 glycoproteins and 33 non-glycoproteins, N-GlyDE achieves an accuracy of 0.740 and an MCC of 0.499, outperforming the compared tools. The N-GlyDE web server is available at http://bioapp.iis.sinica.edu.tw/N-GlyDE/ .
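
To make two terms from the abstract concrete, the following is a minimal sketch of locating N-X-S/T sequons and counting gapped dipeptides in a window around each one. It is not N-GlyDE's implementation: the function names, the window size (`flank`), the maximum gap (`max_gap`), and the exclusion of proline at the X position (the common textbook definition of the sequon) are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import Counter

# Assumption: X in N-X-S/T is any residue except proline (textbook definition;
# the abstract itself only says "N-X-S/T").
# The lookahead lets overlapping sequons such as "NNST" both be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 0-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(seq)]


def sequon_windows(seq, flank=5):
    """Yield (position, window) per sequon; windows are clipped at sequence ends."""
    for pos in find_sequons(seq):
        yield pos, seq[max(0, pos - flank): pos + flank + 1]


def gapped_dipeptides(window, max_gap=2):
    """Count residue pairs (a, gap, b) with exactly `gap` residues between a and b."""
    counts = Counter()
    for gap in range(max_gap + 1):
        for i in range(len(window) - gap - 1):
            counts[(window[i], gap, window[i + gap + 1])] += 1
    return counts


if __name__ == "__main__":
    protein = "MKNVSAANPTQNGT"  # hypothetical toy sequence, not from the dataset
    for pos, window in sequon_windows(protein, flank=3):
        print(pos, window, dict(gapped_dipeptides(window, max_gap=1)))
```

In a pipeline of the kind the abstract describes, counts like these would be turned into a fixed-length feature vector and combined with other features before being passed to a classifier; how N-GlyDE encodes and weights them is described in the paper, not here.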

SUBMITTER: Pitti T

PROVIDER: S-EPMC6828726 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Publications

N-GlyDE: a two-stage N-linked glycosylation site prediction incorporating gapped dipeptides and pattern-based encoding.

Pitti T, Chen CT, Lin HN, Choong WK, Hsu WL, Sung TY

Scientific Reports, 4 Nov 2019, issue 1

PMID: 31685900

Similar Datasets

Computational prediction of N-linked glycosylation incorporating structural properties and patterns. | S-EPMC3426846 | biostudies-literature

DeepNGlyPred: A Deep Neural Network-Based Approach for Human N-Linked Glycosylation Site Prediction. | S-EPMC8658957 | biostudies-literature

Unique N-linked glycosylation of murine coronavirus MHV-2 membrane protein at the conserved O-linked glycosylation site. | S-EPMC7125849 | biostudies-literature

Glycosylation site prediction using ensembles of Support Vector Machine classifiers. | S-EPMC2220009 | biostudies-literature

Prediction of metabolic fluxes by incorporating genomic context and flux-converging pattern analyses. | S-EPMC2930451 | biostudies-literature

Enhanced regulatory sequence prediction using gapped k-mer features. | S-EPMC4102394 | biostudies-literature

Site-Directed Glycosylation of Peptide/Protein with Homogeneous O-Linked Eukaryotic N-Glycans. | S-EPMC5951161 | biostudies-literature

HIV N-linked glycosylation site analyzer and its further usage in anchored alignment. | S-EPMC3692120 | biostudies-literature

Incorporating significant amino acid pairs to identify O-linked glycosylation sites on transmembrane proteins and non-transmembrane proteins. | S-EPMC2989983 | biostudies-literature

Prediction of N-linked glycosylation sites using position relative features and statistical moments. | S-EPMC5552137 | biostudies-literature