Dataset Information

Generation of ENSEMBL-based proteogenomics databases boosts the identification of novel peptides - Mouse dataset

ABSTRACT: A novel bioinformatics tool, pypgatk, and the pgdb workflow are presented in this study to create proteogenomics databases based on ENSEMBL resources. The tools allow the generation of protein sequences from novel protein-coding transcripts by performing a three-frame translation of pseudogenes, lncRNAs, and other non-canonical transcripts, such as those produced by alternative splicing events. They also include exonic out-of-frame translation from otherwise canonical protein-coding mRNAs. Moreover, the tools enable the generation of variant protein sequences from multiple sources of genomic variants, including COSMIC, cBioPortal, gnomAD, and mutations detected from sequencing of patient samples. pypgatk and pgdb provide multiple functionalities for database handling, notably optimized target/decoy generation by the DecoyPyrat algorithm.
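
The three-frame translation mentioned in the abstract can be illustrated with a minimal sketch. This is not pypgatk's implementation: the function names `translate` and `three_frame_translation` are hypothetical, the standard genetic code is assumed, and only the forward strand is translated. Codons containing ambiguous bases are rendered as `X`, and stop codons as `*`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard nuclear code; other codon tables are not handled here).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a nucleotide sequence in its first reading frame.

    Trailing bases that do not form a full codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq[offset:]) for offset in range(3)]


if __name__ == "__main__":
    # Hypothetical toy transcript, for illustration only.
    for frame, protein in enumerate(three_frame_translation("ATGGCCTAA"), start=1):
        print(f"frame {frame}: {protein}")
```

In a proteogenomics setting, each frame would typically be split at stop codons and short fragments discarded before the sequences are written to a database; those steps are left out here to keep the sketch short.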

INSTRUMENT(S): LTQ, Q Exactive HF

ORGANISM(S): Mus musculus (mouse)

DISEASE(S): Melanoma

SUBMITTER: Yasset Perez-Riverol

LAB HEAD: Yasset Perez-Riverol

PROVIDER: PXD029362 | PRIDE | 2021-10-26

REPOSITORIES: PRIDE

ACCESS DATA

Dataset's files

	File	Format
	20171026_AM_Hector_Ana_E06_F1_BIO_01.mzML.gz	mzML
	20171026_AM_Hector_Ana_E06_F1_BIO_01b.mzML.gz	mzML
	20171026_AM_Hector_Ana_E06_F1_BIO_02.mzML.gz	mzML
	20171026_AM_Hector_Ana_E06_F1_BIO_02c.mzML.gz	mzML
	20171026_AM_Hector_Ana_E06_F1_BIO_03.mzML.gz	mzML

Showing files 1 - 5 of 114

Publications

Generation of ENSEMBL-based proteogenomics databases boosts the identification of non-canonical peptides.

Umer HM, Audain E, Zhu Y, Pfeuffer J, Sachsenberg T, Lehtiö J, Branca RM, Perez-Riverol Y

Bioinformatics (Oxford, England), 2022-02-01, issue 5

Summary: We have implemented the pypgatk package and the pgdb workflow to create proteogenomics databases based on ENSEMBL resources. The tools allow the generation of protein sequences from novel protein-coding transcripts by performing a three-frame translation of pseudogenes, lncRNAs and other non-canonical transcripts, such as those produced by alternative splicing events. It also includes exonic out-of-frame translation from otherwise canonical protein-coding mRNAs. Moreover, the too ...

PMID: 34904638

Similar Datasets

Project description: Various cancer immunotherapies rely on the T cell recognition of peptide antigens presented on human leukocyte antigens (HLA). However, the identification and selection of naturally presented peptide targets for the development of personalized as well as off-the-shelf immunotherapy approaches remain challenging. Here, we introduce the open-access Peptides for Cancer Immunotherapy Database (PCI-DB, https://pci-db.org/), a comprehensive resource of immunopeptidome data originating from various malignant and benign primary tissues that provides the research community with a convenient tool to facilitate the identification of peptide targets for immunotherapy development. The PCI-DB includes > 6.6 million HLA class I and > 3.4 million HLA class II peptides from over 40 tissue types and cancer entities, analyzed uniformly using highly sensitive nf-core bioinformatics pipelines and applying a global peptide false discovery rate (FDR) approach. A first application of the database provided insights into the representation of cancer-testis antigens (CTA) across malignant and benign tissues and enabled the identification and characterization of cross-tumor-entity and entity-specific tumor-associated antigens as well as naturally presented neoepitopes from frequent cancer mutations. Further, we used the PCI-DB to design personalized peptide vaccines for two patients suffering from metastatic cancer. PCI-DB enabled the composition of both a multi-peptide vaccine comprising non-mutated, highly frequent tumor-associated antigens matching the immunopeptidome of the individual patient's tumor and a neoepitope-based vaccine matching the mutational profile of the cancer patient. Both vaccine approaches induced potent and long-lasting T-cell responses, accompanied by long-term survival of these advanced cancer patients.
In conclusion, the PCI-DB provides a highly versatile tool to broaden the understanding of cancer-related antigen presentation and, ultimately, supports the development of novel immunotherapies.