Automated N-glycopeptide identification in glycoproteomics
ABSTRACT: Recent advances in software-driven glycopeptide identification in LC-MS/MS-based N-glycoproteomics have facilitated biochemical studies reporting thousands of intact N-glycopeptides, i.e. N-glycan-conjugated peptides, but the automated identification process itself has yet to be scrutinised. Herein, we explore the efficiency of site-specific glycoprofiling using the PTM-centric search engine Byonic relative to manual expert annotation. To allow an appropriately deep comparison, the study used typical glycoproteomics acquisition and data analysis strategies, but applied them to a single glycoprotein, the uncharacterised, N-glycosylated (Asn160, Asn268 and Asn302) human basigin. Detailed site-specific reference glycoprofiles of purified basigin were established manually using ion trap CID-MS/MS and high-resolution Q-Exactive Orbitrap HCD-MS/MS acquisition of tryptic N-glycopeptides and released N-glycans. The basigin N-glycosylation sites, which showed extensive micro- and macro-heterogeneity, were then glycoprofiled using Byonic, with or without a background of complex peptides, using Q-Exactive Orbitrap HCD-MS/MS data. The glycoprofiling efficiencies were assessed against the site-specific reference glycoprofiles and against target and decoy proteome databases. The search criteria and confidence thresholds (Byonic scores) recommended by the vendor provided high glycoprofiling accuracy and coverage (both >80%) and low peptide FDRs (<1%). The glycoprofiling efficiency and analysis time were affected by the data complexity, by the search parameters, including search space (proteome/glycome size), mass tolerance and peptide modifications, and by the confidence thresholds. Two tasks remain challenging: the automated identification of peptide modifications (methionine oxidation/carbamidomethylation) that coincide with monosaccharide mass differences (Fuc/Hex/HexNAc), and the accurate distinction of isobaric (Hex1NeuAc1-R/Fuc1NeuGc1-R) or near-isobaric (NeuAc1-R/Fuc2-R) monosaccharide sub-compositions. Such "difficult-to-identify" N-glycopeptides therefore warrant particular attention. The presented analysis provides valuable insights into automated glycopeptide identification, knowledge that facilitates further developments in FDR-based glycoproteomics.
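The mass coincidences named in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the deposited dataset or of Byonic: it assumes standard monoisotopic element masses and the usual elemental compositions of glycosidically linked monosaccharide residues (free sugar minus water), and the helper name `composition_mass` is hypothetical.

```python
# Monoisotopic element masses in Da (standard values, assumed here).
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}

# Elemental compositions of monosaccharide residues (free sugar minus H2O)
# and of the two peptide modifications named in the abstract.
FORMULA = {
    "Hex": {"C": 6, "H": 10, "O": 5},
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},
    "Fuc": {"C": 6, "H": 10, "O": 4},
    "NeuAc": {"C": 11, "H": 17, "N": 1, "O": 8},
    "NeuGc": {"C": 11, "H": 17, "N": 1, "O": 9},
    "Oxidation": {"O": 1},                                 # e.g. on methionine
    "Carbamidomethyl": {"C": 2, "H": 3, "N": 1, "O": 1},   # e.g. on cysteine
}


def composition_mass(counts):
    """Monoisotopic mass (Da) of a composition such as {"Hex": 1, "NeuAc": 1}."""
    total = 0.0
    for name, n in counts.items():
        unit = sum(MONO[el] * k for el, k in FORMULA[name].items())
        total += n * unit
    return total


# Isobaric pair: Hex1NeuAc1 and Fuc1NeuGc1 share one elemental composition.
isobaric_delta = (composition_mass({"Hex": 1, "NeuAc": 1})
                  - composition_mass({"Fuc": 1, "NeuGc": 1}))

# Near-isobaric pair: Fuc2 is roughly 1.02 Da heavier than NeuAc1.
near_isobaric_delta = (composition_mass({"Fuc": 2})
                       - composition_mass({"NeuAc": 1}))

# Peptide modifications that coincide with monosaccharide mass differences.
hex_minus_fuc = composition_mass({"Hex": 1}) - composition_mass({"Fuc": 1})
hexnac_minus_fuc = composition_mass({"HexNAc": 1}) - composition_mass({"Fuc": 1})

if __name__ == "__main__":
    print(f"Hex1NeuAc1 - Fuc1NeuGc1: {isobaric_delta:.6f} Da")
    print(f"Fuc2 - NeuAc1:           {near_isobaric_delta:.6f} Da")
    print(f"Hex - Fuc:               {hex_minus_fuc:.6f} Da "
          f"(oxidation: {composition_mass({'Oxidation': 1}):.6f} Da)")
    print(f"HexNAc - Fuc:            {hexnac_minus_fuc:.6f} Da "
          f"(carbamidomethyl: {composition_mass({'Carbamidomethyl': 1}):.6f} Da)")
```

Under these assumptions, a Fuc-containing glycopeptide with an oxidised peptide has the same precursor mass as its Hex-containing counterpart with an unmodified peptide, and likewise for carbamidomethylation and the Fuc/HexNAc pair, so precursor mass alone cannot separate such candidates.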
INSTRUMENT(S): Bruker Daltonics HCT Series, Q Exactive
ORGANISM(S): Homo sapiens (human)
TISSUE(S): Permanent Cell Line Cell
DISEASE(S): Liver Disease
SUBMITTER: Ling Lee
LAB HEAD: Morten Andersen
PROVIDER: PXD004243 | Pride | 2016-08-17
REPOSITORIES: Pride