Unsupervised mining of HLA-I peptidomes reveals unsuspected false positives and new binding motifs
ABSTRACT: Modern antigen vaccine design and studies of human leukocyte antigen (HLA)-mediated immune responses rely heavily on knowledge of HLA allele-specific binding motifs and on computational prediction of antigen-HLA binding affinity. Breakthroughs in HLA peptidomics have considerably expanded the databases of natural HLA antigens and enabled detailed characterization of antigen-HLA binding specificity. However, HLA peptidomics data must be analyzed with caution because identified peptides may be contaminants or may bind only weakly to the HLA molecules. Here, a hybrid de novo peptide sequencing approach was applied to large-scale mono-allelic HLA peptidomics datasets to uncover new antigens and refine current knowledge of HLA binding motifs. Tryptic peptide contamination of up to 12-40% was identified in the peptidomics data of HLA alleles whose binding motifs do not involve an arginine or a lysine at the C-terminus. Thousands of these peptides had been reported in a community database as positive antigens and might be erroneously used to train prediction models. Furthermore, unsupervised clustering of identified antigens not only revealed additional binding motifs for several HLA class I alleles but also effectively isolated outliers, which were confirmed to be false positives in a binding experiment. Overall, our findings expand the knowledge of HLA binding specificity and indicate that a more careful HLA peptidomics data interpretation protocol is needed to ensure the high quality of community antigen databases.
INSTRUMENT(S): Q Exactive Plus
ORGANISM(S): Homo sapiens (human)
TISSUE(S): B Cell, Blood
SUBMITTER: Poorichaya Somparn
LAB HEAD: Poorichaya Somparn
PROVIDER: PXD028088 | PRIDE | 2021-09-17
REPOSITORIES: PRIDE
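
The abstract notes that tryptic peptides (which end in lysine or arginine) appear as contaminants in data for HLA alleles whose binding motifs do not involve K or R at the C-terminus. A minimal sketch of such a screen is given below. It is not the study's pipeline, which relied on hybrid de novo sequencing; the function names and the example peptide strings are hypothetical and chosen only for illustration, and the check is meaningful only for alleles whose motif lacks a C-terminal K/R.

```python
from typing import Iterable

# Trypsin cleaves after lysine (K) and arginine (R), so tryptic peptides
# typically end in one of these two residues.
TRYPTIC_C_TERMINI = frozenset("KR")


def is_tryptic_like(peptide: str) -> bool:
    """Return True if the peptide ends in K or R (hypothetical helper)."""
    peptide = peptide.strip().upper()
    return bool(peptide) and peptide[-1] in TRYPTIC_C_TERMINI


def tryptic_fraction(peptides: Iterable[str]) -> float:
    """Fraction of non-empty peptides with a C-terminal K or R.

    Assumption: the input comes from a mono-allelic dataset for an allele
    whose binding motif does NOT involve K/R at the C-terminus; otherwise
    a high fraction is expected and does not indicate contamination.
    """
    cleaned = [p.strip() for p in peptides if p.strip()]
    if not cleaned:
        return 0.0
    flagged = sum(is_tryptic_like(p) for p in cleaned)
    return flagged / len(cleaned)


# Illustrative, made-up peptide list (not taken from the dataset).
example_peptides = ["GILGFVFTL", "NLVPMVATV", "AAAAAAAAK", "DDDDDDDDR"]
example_fraction = tryptic_fraction(example_peptides)
```

A high fraction for such an allele would only suggest that the peptide list deserves closer inspection; confirming false positives, as in the study, requires a binding experiment.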