Dataset Information

An integrated landscape of protein expression in human cancer.

ABSTRACT: Using 11 proteomics datasets, mostly available through the PRIDE database, we assembled a reference expression map for 191 cancer cell lines and 246 clinical tumour samples, across 13 lineages. We found unique peptides identified only in tumour samples despite a much higher coverage in cell lines. These were mainly mapped to proteins related to regulation of signalling receptor activity. Correlations between baseline expression in cell lines and tumours were calculated. We found these to be highly similar across all samples with most similarity found within a given sample type. Integration of proteomics and transcriptomics data showed median correlation across cell lines to be 0.58 (range between 0.43 and 0.66). Additionally, in agreement with previous studies, variation in mRNA levels was often a poor predictor of changes in protein abundance. To our knowledge, this work constitutes the first meta-analysis focusing on cancer-related public proteomics datasets. We therefore also highlight shortcomings and limitations of such studies. All data is available through PRIDE dataset identifier PXD013455 and in Expression Atlas.

SUBMITTER: Jarnuczak AF

PROVIDER: S-EPMC8065022 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Project description:

Background: Microarray technology enables a standardized, objective assessment of oncological diagnosis and prognosis. However, such studies are typically specific to certain cancer types, and the results have limited use due to inadequate validation in large patient cohorts. Discovery of genes commonly regulated in cancer may have an important implication in understanding the common molecular mechanism of cancer.

Methods and findings: We described an integrated gene-expression analysis of 2,186 samples from 39 studies to identify and validate a cancer type-independent gene signature that can identify cancer patients for a wide variety of human malignancies. The commonness of gene expression in 20 types of common cancer was assessed in 20 training datasets. The discriminative power of a signature defined by these common cancer genes was evaluated in the other 19 independent datasets including novel cancer types. QRT-PCR and tissue microarray were used to validate commonly regulated genes in multiple cancer types. We identified 187 genes dysregulated in nearly all cancerous tissue samples. The 187-gene signature can robustly predict cancer versus normal status for a wide variety of human malignancies with an overall accuracy of 92.6%. We further refined our signature to 28 genes confirmed by QRT-PCR. The refined signature still achieved 80% accuracy of classifying samples from mixed cancer types. This signature performs well in the prediction of novel cancer types that were not represented in training datasets. We also identified three biological pathways including glycolysis, cell cycle checkpoint II and plk3 pathways in which most genes are systematically up-regulated in many types of cancer.

Conclusions: The identified signature has captured essential transcriptional features of neoplastic transformation and progression in general. These findings will help to elucidate the common molecular mechanism of cancer, and provide new insights into cancer diagnostics, prognostics and therapy.

OmicsDI is part of the ELIXIR infrastructure
