Thousands of novel unannotated proteins expand the MHC I immunopeptidome in cancer
ABSTRACT: Tumor epitopes, peptides presented on surface-bound MHC I proteins, provide targets for cancer immunotherapy and have been identified extensively in the annotated protein-coding regions of the genome. Motivated by the recent discovery of translated novel unannotated open reading frames (nuORFs) using ribosome profiling (Ribo-seq), we hypothesized that cancer-associated processes could generate nuORFs that serve as a new source of tumor antigens, either by harboring somatic mutations or by showing tumor-specific expression. To identify cancer-specific nuORFs, we generated Ribo-seq profiles for 29 malignant and healthy samples. These included primary normal and chronic lymphocytic leukemia (CLL) B cells, patient-derived primary glioblastoma (GBM) and melanoma cell cultures, primary healthy melanocytes, and established colon carcinoma and melanoma cell lines. They also included B721.221 cells, the parental cell line previously used to generate 92 single HLA allele-expressing lines from which we collected mono-allelic MHC I immunopeptidome data. We developed a hierarchical approach to identifying translated nuORFs that leverages the large amount of Ribo-seq data we generated to uncover lowly expressed nuORFs while preserving tissue specificity. We constructed a database of nuORFs (nuORFdb) and used it to analyze mass spectrometry datasets of MHC I-bound peptides, in which we detected peptides from 3,555 nuORFs. Additionally, we used nuORFdb to identify cancer-specific nuORFs, as well as nuORFs harboring cancer-specific somatic variants, as potential sources of neoantigens in cancer. This repository contains data from the publicly available cell lines used in this study, including B721.221 cells, A375 cells, HCT116 cells and primary healthy melanocytes (Thermo C0025C). The data from the primary patient samples in this study are deposited in dbGaP.
ORGANISM(S): Homo sapiens
PROVIDER: GSE143263 | GEO | 2020/08/01
REPOSITORIES: GEO