Dataset Information

From genome to phenome: Predicting multiple cancer phenotypes based on somatic genomic alterations via the genomic impact transformer.

ABSTRACT: Cancers are mainly caused by somatic genomic alterations (SGAs) that perturb cellular signaling systems and eventually activate oncogenic processes. Therefore, understanding the functional impact of SGAs is a fundamental task in cancer biology and precision oncology. Here, we present a deep neural network model with an encoder-decoder architecture, referred to as the genomic impact transformer (GIT), to infer the functional impact of SGAs on cellular signaling systems by modeling the statistical relationships between SGA events and differentially expressed genes (DEGs) in tumors. The model uses a multi-head self-attention mechanism to identify SGAs that likely cause DEGs; in other words, to differentiate potential driver SGAs from passenger ones in a tumor. The GIT model learns a vector (gene embedding) as an abstract representation of functional impact for each SGA-affected gene. Given the SGAs of a tumor, the model can instantiate the states of the hidden layer, providing an abstract representation (tumor embedding) that reflects characteristics of perturbed molecular/cellular processes in the tumor, which in turn can be used to predict multiple phenotypes. We apply the GIT model to 4,468 tumors profiled by The Cancer Genome Atlas (TCGA) project. The attention mechanism enables the model to capture the statistical relationship between SGAs and DEGs better than conventional methods, and to distinguish cancer drivers from passengers. The learned gene embeddings capture the functional similarity of SGAs perturbing common pathways. The tumor embeddings are shown to be useful for tumor status representation and for phenotype prediction, including patient survival time and drug response of cancer cell lines.
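The idea of pooling gene embeddings into a tumor embedding through attention can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the per-head query vectors, the dimensions, and the random embedding values are all assumptions made for illustration, and the gene symbols are placeholders for a tumor's SGA-affected genes. Each head assigns a softmax weight to every altered gene, and the tumor embedding is the weighted sum of gene embeddings accumulated over heads.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tumor_embedding(sga_genes, gene_embeddings, head_queries):
    """Attention-pool the embeddings of a tumor's altered genes.

    Hypothetical helper: one query vector per head scores each gene
    by dot product. Returns (tumor_vector, per_gene_weights), where
    per_gene_weights sums the attention a gene receives across heads.
    """
    vectors = [gene_embeddings[g] for g in sga_genes]
    dim = len(vectors[0])
    tumor = [0.0] * dim
    gene_weights = {g: 0.0 for g in sga_genes}
    for query in head_queries:
        weights = softmax([dot(query, v) for v in vectors])
        for g, w, v in zip(sga_genes, weights, vectors):
            gene_weights[g] += w
            for i in range(dim):
                tumor[i] += w * v[i]
    return tumor, gene_weights


# Illustrative inputs only: random embeddings, not learned values.
rng = random.Random(0)
DIM, HEADS = 4, 2
genes = ["TP53", "PIK3CA", "TTN"]
gene_embeddings = {g: [rng.uniform(-1, 1) for _ in range(DIM)] for g in genes}
head_queries = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(HEADS)]

tumor_vec, weights = tumor_embedding(genes, gene_embeddings, head_queries)
```

In a trained model of this kind, genes that receive consistently high attention weights would be the candidates for driver SGAs, and `tumor_vec` would feed a downstream predictor; here the weights only reflect the random inputs.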

SUBMITTER: Tao Y

PROVIDER: S-EPMC6932864 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature


Publications

From genome to phenome: Predicting multiple cancer phenotypes based on somatic genomic alterations via the genomic impact transformer.

Yifeng Tao, Chunhui Cai, William W. Cohen, Xinghua Lu

Pacific Symposium on Biocomputing, 2020-01-01


PMID: 31797588

Similar Datasets

Project description:

Purpose: Multiple myeloma (MM) is accompanied by heterogeneous somatic alterations. The overall goal of this study was to describe the genomic landscape of myeloma using deep whole-genome sequencing (WGS) and develop a model that identifies patients with long survival.

Methods: We analyzed deep WGS data from 183 newly diagnosed patients with MM treated with lenalidomide, bortezomib, and dexamethasone (RVD) alone or RVD + autologous stem cell transplant (ASCT) in the IFM/DFCI 2009 study (ClinicalTrials.gov identifier: NCT01191060). We integrated genomic markers with clinical data.

Results: We report significant variability in mutational load and processes within MM subgroups. The timeline of observed activation of mutational processes provides the basis for 2 distinct models of acquisition of mutational changes detected at the time of diagnosis of myeloma. Virtually all MM subgroups have activated DNA repair-associated signature as a prominent late mutational process, whereas APOBEC signature targeting C>G is activated in the intermediate phase of disease progression in high-risk MM. Importantly, we identify a genomically defined MM subgroup (17% of newly diagnosed patients) with low DNA damage (low genomic scar score with chromosome 9 gain) and a superior outcome (100% overall survival at 69 months), which was validated in a large independent cohort. This subgroup allowed us to distinguish patients with low- and high-risk hyperdiploid MM and identify patients with prolongation of progression-free survival. Genomic characteristics of this subgroup included lower mutational load with significant contribution from age-related mutations as well as frequent NRAS mutation. Surprisingly, their overall survival was independent of International Staging System and minimal residual disease status.

Conclusion: This is a comprehensive study identifying genomic markers of a good-risk group with prolonged survival. Identification of this patient subgroup will affect future therapeutic algorithms and research planning.

