Project description:The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
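As a rough illustration of steps (3) and (4) and the gene-level "rolled-up" spectral counts described above, the following Python sketch filters PSMs with a simple target-decoy FDR estimate and then counts the accepted spectra per gene. It is not the CDAP implementation. The `PSM` record, its field names, the decoy flag, the "higher score is better" convention, and the decoys/targets FDR estimate are all assumptions made for this example.

```python
from collections import Counter
from typing import NamedTuple


class PSM(NamedTuple):
    """Hypothetical PSM record; field names are illustrative, not CDAP column names."""
    spectrum_id: str
    gene: str
    score: float      # assumption: higher score means a better match
    is_decoy: bool    # assumption: decoy hits are flagged by the search step


def filter_by_fdr(psms, max_fdr=0.01):
    """Return target PSMs from the largest top-ranked set whose estimated FDR
    (decoys / targets) does not exceed max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to accept
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def gene_spectral_counts(psms):
    """Roll accepted PSMs up to gene level by counting spectra per gene."""
    return Counter(p.gene for p in psms)


# Toy data, invented for illustration only.
example = [
    PSM("s1", "GENE_A", 10.0, False),
    PSM("s2", "GENE_A", 9.0, False),
    PSM("s3", "GENE_B", 8.0, False),
    PSM("s4", "GENE_X", 7.0, True),
    PSM("s5", "GENE_C", 6.0, False),
]
accepted = filter_by_fdr(example, max_fdr=0.2)
counts = gene_spectral_counts(accepted)
```

With the toy data, raising `max_fdr` to 0.25 would also admit the lowest-scoring target, which shows how the threshold trades depth of coverage against expected false identifications.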
Project description:<p>Recently, significant progress has been made in characterizing and sequencing the genomic alterations in statistically robust numbers of samples from several types of cancer. For example, The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC) and other similar efforts are identifying genomic alterations associated with specific cancers (e.g., copy number aberrations, rearrangements, point mutations, epigenomic changes, etc.). The availability of these multi-dimensional data to the scientific community sets the stage for the development of new molecularly targeted cancer interventions. Understanding the comprehensive functional changes in cancer proteomes arising from genomic alterations and other factors is the next logical step in the development of high-value candidate protein biomarkers. Hence, proteomics can greatly advance the understanding of molecular mechanisms of disease pathology via the analysis of changes in protein expression, their modifications and variations, as well as protein-protein interaction, signaling pathways and networks responsible for cellular functions such as apoptosis and oncogenesis.</p> <p>Realizing this great potential, the NCI launched the second phase of the CPTC initiative in September 2011. Renamed the Clinical Proteomic Tumor Analysis Consortium, CPTAC is beginning to leverage its analytical outputs from Phase I to define cancer proteomes on genomically-characterized biospecimens. The purpose of this integrative approach is to provide the broad scientific community with knowledge that links genotype to proteotype and ultimately phenotype.</p> <p>The data contained in this dataset are derived from samples designed to confirm CPTAC findings from the TCGA samples. These confirmatory samples contain breast, ovarian, colon, and lung tumors collected via a protocol optimized for proteomics. 
Specifically, ischemic time of the sample was controlled and restricted to less than 30 minutes.</p> <p>ACGT, Inc. produced whole exome, mRNAseq, and miRNAseq for these samples. Corresponding proteomic data are available at: <a href="https://cptac-data-portal.georgetown.edu/cptacPublic/">https://cptac-data-portal.georgetown.edu/cptacPublic/</a></p> <p>The study design was to profile colon, breast, ovarian, and lung tumors both genomically and proteomically. Germline DNA was obtained from blood. Normal control samples for proteomics varied by organ site: adjacent colon tissue for colon cases, contralateral breast tissue for some breast cases, and Fallopian tube fimbria for some ovarian cases. Lung cases had no normal control for proteomic analysis. All cancer samples were derived from primary and untreated tumors.</p>
Project description:<p>Recently, significant progress has been made in characterizing and sequencing the genomic alterations in statistically robust numbers of samples from several types of cancer. For example, The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC) and other similar efforts are identifying genomic alterations associated with specific cancers (e.g., copy number aberrations, rearrangements, point mutations, epigenomic changes, etc.). The availability of these multi-dimensional data to the scientific community sets the stage for the development of new molecularly targeted cancer interventions. Understanding the comprehensive functional changes in cancer proteomes arising from genomic alterations and other factors is the next logical step in the development of high-value candidate protein biomarkers. Hence, proteomics can greatly advance the understanding of molecular mechanisms of disease pathology via the analysis of changes in protein expression, their modifications and variations, as well as protein-protein interaction, signaling pathways and networks responsible for cellular functions such as apoptosis and oncogenesis. Realizing this great potential, the NCI launched the third phase of the CPTC initiative in September 2016. As the Clinical Proteomic Tumor Analysis Consortium, CPTAC continues to define cancer proteomes on genomically-characterized biospecimens. The purpose of this integrative approach is to provide the broad scientific community with knowledge that links genotype to proteotype and ultimately phenotype. 
In this third phase of CPTAC, the program aims to expand on CPTAC II and genomically and proteomically characterize over 2000 samples from 10 cancer types (Lung Adenocarcinoma, Pancreatic Ductal Adenocarcinoma, Glioblastoma Multiforme, Acute Myeloid Leukemia, Clear Cell Renal Carcinoma, Head and Neck Squamous Cell Carcinoma, Cutaneous Melanoma, Sarcoma, Lung Squamous Cell Carcinoma, Uterine Corpus Endometrial Carcinoma). Germline DNA is obtained from blood, and normal control samples for proteomics vary by organ site. All cancer samples were derived from primary and untreated tumors.</p>
Project description:The objective of the "System Suitability (CompRef) Study" was to validate mass spectrometry protocols used at each Proteome Characterization Center (PCC). Please include this attribution in publications, "Data used in this publication was created by the Clinical Proteomics Tumor Analysis Consortium (NCI/NIH)."
Project description:The objective of the "System Suitability (CompRef) Study" was to validate mass spectrometry protocols used at each Proteome Characterization Center (PCC). Please include this attribution in publications, "Data used in this publication was created by the Clinical Proteomics Tumor Analysis Consortium (NCI/NIH)."
Project description:The objective of the "System Suitability (CompRef) Study" was to validate mass spectrometry protocols used at each Proteome Characterization Center (PCC). Please include this attribution in publications, "Data used in this publication was created by the Clinical Proteomics Tumor Analysis Consortium (NCI/NIH)."
Project description:This SuperSeries is composed of the SubSeries listed below. Consortium contacts: Maria Pedersen (mpedersen@nygenome.org); Hemali Phatnani (hphatnani@nygenome.org); NYGC ALS Consortium (cgndhelp@nygenome.org).
Project description:The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has provided some of the most in-depth analyses of the phenotypes of human tumors ever constructed. Today, the majority of proteomic data analysis is still performed using software housed on desktop computers, which limits the number of sequence variants and post-translational modifications (PTMs) that can be considered. The original CPTAC studies limited the search for PTMs to samples that were chemically enriched for those modified peptides. Similarly, the only sequence variants considered were those with strong evidence at the exon or transcript level. In this multi-institutional collaborative reanalysis, we utilized unbiased protein databases containing millions of human sequence variants in conjunction with hundreds of common post-translational modifications. Using these tools, we identified tens of thousands of high-confidence PTMs and sequence variants. We identified 4132 phosphorylated peptides in nonenriched samples, 93% of which were confirmed in the samples that were chemically enriched for phosphopeptides. In addition, our results cover 90% of the high-confidence variants reported by the original proteogenomics study, without the need for sample-specific next-generation sequencing. Finally, we report fivefold more somatic and germline variants that have independent evidence at the peptide level, including mutations in ERBB2 and BCAS1. In this reanalysis of CPTAC proteomic data with cloud computing, we present an openly available and searchable web resource of the highest-coverage proteomic profiling of human tumors described to date.