Dataset Information

Deep learning identified glioblastoma subtypes based on internal genomic expression ranks.

ABSTRACT:

Background

Glioblastoma (GBM) can be divided into four subtypes according to genomic features: Proneural (PN), Neural (NE), Classical (CL) and Mesenchymal (ME). However, it is difficult to unify genomic expression profiles that were standardized with different procedures in different studies, and to manually assign a given GBM sample to a subtype.

Methods

An algorithm was developed to unify the genomic profiles of GBM samples into a standardized normal distribution (SND), based on each sample's internal expression ranks. Deep neural network (DNN) and convolutional DNN (CDNN) models were trained on the original and SND data. In addition, expanded SND data, obtained by combining various The Cancer Genome Atlas (TCGA) datasets, were used to improve the robustness and generalization capacity of the CDNN models.
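The abstract does not spell out how ranks are mapped to the SND, so the following is only a minimal sketch of one common approach, a rank-based inverse normal transformation, and not necessarily the authors' algorithm. The function name `rank_to_snd`, the `(rank - 0.5) / n` quantile offset, the tie-breaking rule and the two toy samples are all assumptions made for illustration.

```python
from statistics import NormalDist


def rank_to_snd(expression):
    """Map one sample's expression values to standard-normal scores by rank.

    Assumption: ties are broken by original position; the source does not
    say how the published algorithm treats tied values.
    """
    n = len(expression)
    # Gene indices ordered from lowest to highest expression in this sample.
    order = sorted(range(n), key=lambda i: expression[i])
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    scores = [0.0] * n
    for rank, gene_index in enumerate(order, start=1):
        # The 0.5 offset keeps every quantile strictly inside (0, 1).
        quantile = (rank - 0.5) / n
        scores[gene_index] = std_normal.inv_cdf(quantile)
    return scores


# Two hypothetical samples on very different scales with the same gene ordering.
sample_a = [2.1, 8.7, 5.0, 0.3, 6.4]
sample_b = [210.0, 8700.0, 500.0, 30.0, 640.0]
snd_a = rank_to_snd(sample_a)
snd_b = rank_to_snd(sample_b)
```

Because only the within-sample ordering is used, the two toy samples map to identical scores even though their raw scales differ, which is the property that lets profiles from differently normalized studies be compared.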

Results

The SND data retained a unimodal distribution similar to that of the original data, and preserved the internal expression ranks of all genes within each sample. CDNN models trained on the SND data showed significantly higher accuracy than DNN and CDNN models trained on the primary expression data. Interestingly, the CDNN models classified the NE subtype with the lowest accuracy in the GBM datasets, in the expanded datasets and in IDH wild-type GBMs, consistent with recent studies suggesting that the NE subtype should be excluded. Furthermore, the CDNN models also recognized independent GBM datasets, even with a small set of genomic expression values.

Conclusions

GBM expression profiles can be transformed into unified SND data, which can be used to train CDNN models with high accuracy and generalization capacity. These models suggest that the NE subtype may not be compatible with the four-subtype classification system.

SUBMITTER: Mao XG

PROVIDER: S-EPMC8780813 | biostudies-literature | 2022 Jan

REPOSITORIES: biostudies-literature


Publications

Deep learning identified glioblastoma subtypes based on internal genomic expression ranks.

Mao XG, Xue XY, Wang L, Lin W, Zhang X

BMC Cancer, 2022 Jan 20, issue 1


PMID: 35057766

Similar Datasets

Project description: Breast cancer is a profoundly heterogeneous disease with respect to biological and clinical behavior. Gene expression profiling has been used to dissect this complexity and stratify tumors into intrinsic gene expression subtypes associated with distinct biology, patient outcome and different genomic alterations. Additionally, breast tumors occurring in individuals with germline BRCA1 or BRCA2 mutations typically fall into distinct subtypes. We applied global DNA copy number and gene expression profiling in 359 breast tumors. All tumors were classified according to intrinsic gene expression subtypes and included cases from genetically predisposed women. The Genomic Identification of Significant Targets in Cancer (GISTIC) algorithm was used to identify significant DNA copy number aberrations and genomic subgroups of breast cancer. We identified 31 genomic regions that were highly amplified in >1% of the 359 breast tumors. Several amplicons were found to co-occur, the 8p12 and 11q13.3 regions being the most frequent combination besides amplicons on the same chromosomal arm. Unsupervised hierarchical clustering with 133 significant GISTIC regions (66 and 67 with DNA copy number gain and loss, respectively) revealed six genomic subtypes, termed: 17q12, basal-complex, luminal-simple, luminal-complex, amplifier and mixed subtype. Four of them had striking similarity to intrinsic gene expression subtypes and showed association to conventional tumor biomarkers and clinical outcome. However, luminal A-classified tumors were distributed in two main genomic subtypes, luminal-simple and luminal-complex, the former group having better prognosis while the latter group also included luminal B and the majority of BRCA2-mutated tumors. The basal-complex subtype displayed extensive genomic homogeneity and harbored the majority of BRCA1-mutated tumors. The 17q12 subtype comprised mostly HER2-amplified and HER2-enriched subtype tumors and had the worst prognosis. The amplifier and mixed subtypes contained tumors from all gene expression subtypes, the former being enriched for 8p12-amplified cases while the mixed subtype included many tumors with predominantly DNA copy number losses and poor prognosis. Genomic profiling of 359 breast tumors using tiling BAC aCGH. A number of cases were hybridized as replicates or as dye-swap replicates. Gene expression profiling of 359 breast tumors using 55K oligonucleotide microarrays.

Project description: Molecular subtypes of colorectal cancer (CRC) significantly influence treatment decisions. While convolutional neural networks (CNNs) have recently been introduced for automated CRC subtype identification using H&E stained histopathological images, the correlation between CRC subtype genomic variants and their corresponding cellular morphology expressed by their imaging phenotypes is yet to be fully explored. The goal of this study was to determine such correlations by incorporating genomic variants in CNN models for CRC subtype classification from H&E images. We utilized the publicly available TCGA-CRC-DX dataset, which comprises whole slide images from 360 CRC-diagnosed patients (260 for training and 100 for testing). This dataset also provides information on CRC subtype classifications and genomic variations. We trained CNN models for CRC subtype classification that account for potential correlation between genomic variations within CRC subtypes and their corresponding cellular morphology patterns. We assessed the interplay between CRC subtypes' genomic variations and cellular morphology patterns by evaluating the CRC subtype classification accuracy of the different models in a stratified 5-fold cross-validation experimental setup using the area under the ROC curve (AUROC) and average precision (AP) as the performance metrics. The CNN models that account for potential correlation between genomic variations within CRC subtypes and their cellular morphology pattern achieved superior accuracy compared to the baseline CNN classification model that does not account for genomic variations when using either single-nucleotide-polymorphism (SNP) molecular features (AUROC: 0.824±0.02 vs. 0.761±0.04, p<0.05, AP: 0.652±0.06 vs. 0.58±0.08) or CpG-Island methylation phenotype (CIMP) molecular features (AUROC: 0.834±0.01 vs. 0.787±0.03, p<0.05, AP: 0.687±0.02 vs. 0.64±0.05). Combining the CNN models that account for variations in CIMP and SNP further improved classification accuracy (AUROC: 0.847±0.01 vs. 0.787±0.03, p = 0.01, AP: 0.68±0.02 vs. 0.64±0.05). The improved accuracy of CNN models for CRC subtype classification that account for potential correlation between genomic variations within CRC subtypes and their corresponding cellular morphology as expressed by H&E imaging phenotypes may elucidate the biological cues impacting cancer histopathological imaging phenotypes. Moreover, considering CRC subtypes' genomic variations has the potential to improve the accuracy of deep-learning models in discerning cancer subtype from histopathological imaging data.
