Dataset Information

A network-based deep learning methodology for stratification of tumor mutations.


ABSTRACT:

Motivation

Tumor stratification has a wide range of biomedical and clinical applications, including diagnosis, prognosis and personalized treatment. However, cancer is driven by combinations of mutated genes that are highly heterogeneous across patients, which makes accurately subdividing tumors into subtypes challenging.

Results

We developed a network-embedding-based stratification (NES) methodology to identify clinically relevant patient subtypes from large-scale somatic mutation profiles. The central hypothesis of NES is that two tumors should be classified into the same subtype if their somatically mutated genes are located in similar network regions of the human interactome. We encoded the genes of the human protein-protein interactome with a network embedding approach and constructed patient vectors by integrating the somatic mutation profiles of 7,344 tumor exomes across 15 cancer types. We first applied the LightGBM classification algorithm to the patient vectors. The AUC is around 0.89 for predicting a patient's cancer type and around 0.78 for predicting the tumor stage within a specific cancer type. This high classification accuracy suggests that network embedding-based patient features are reliable for dividing patients. We therefore conclude that patients with a specific cancer type can be clustered into several subtypes by applying an unsupervised clustering algorithm to the patient vectors. Among the 15 cancer types, the new patient clusters (subtypes) identified by NES are significantly correlated with patient survival in 12 cancer types. In summary, this study offers a powerful network-based deep learning methodology for personalized cancer medicine.
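
The central hypothesis can be illustrated with a minimal sketch. It is not the NES implementation: the two-dimensional gene vectors below are made-up toy values standing in for a real interactome embedding, and averaging the embeddings of a patient's mutated genes is an assumed aggregation, since the abstract does not specify how mutation profiles are integrated. The names `gene_embedding`, `patient_vector` and `cosine` are hypothetical.

```python
from math import sqrt

# Toy gene embeddings (made-up values). In NES these would come from a
# network embedding of the human protein-protein interactome.
gene_embedding = {
    "TP53": [1.0, 0.0],
    "MDM2": [0.9, 0.1],
    "KRAS": [0.0, 1.0],
    "BRAF": [0.1, 0.9],
}


def patient_vector(mutated_genes, embedding):
    """Average the embeddings of a patient's mutated genes (assumed aggregation)."""
    vectors = [embedding[g] for g in mutated_genes if g in embedding]
    if not vectors:
        raise ValueError("no mutated gene has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical patients, each described by a set of mutated genes.
patients = {
    "P1": ["TP53"],
    "P2": ["MDM2"],
    "P3": ["KRAS", "BRAF"],
}
vectors = {pid: patient_vector(genes, gene_embedding)
           for pid, genes in patients.items()}

# P1 and P2 share no mutated gene, yet their vectors are close because the
# toy embedding places TP53 and MDM2 in the same region.
sim_p1_p2 = cosine(vectors["P1"], vectors["P2"])
sim_p1_p3 = cosine(vectors["P1"], vectors["P3"])
```

In this toy setting, patients with disjoint mutation sets can still end up with similar vectors when their mutated genes sit near each other in the embedding space. Vectors of this kind could then be passed to a supervised classifier or an unsupervised clustering algorithm, as the abstract describes.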

Availability and implementation

Source code and data can be downloaded from https://github.com/ChengF-Lab/NES.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Liu C 

PROVIDER: S-EPMC8034530 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC3866081 | biostudies-literature
| S-EPMC5822793 | biostudies-literature
| S-EPMC11352240 | biostudies-literature
| S-EPMC7289260 | biostudies-literature
| S-EPMC6822714 | biostudies-literature
| S-EPMC9490924 | biostudies-literature
| S-EPMC4165739 | biostudies-other
| S-EPMC11334714 | biostudies-literature
| S-EPMC6294939 | biostudies-literature
| S-EPMC7100603 | biostudies-literature