ABSTRACT: Background
Traditionally, mutational burden and mutational signatures have been assessed by tumor-normal pair DNA sequencing. Obtaining both normal and tumor samples is not always feasible from a clinical perspective, which led us to investigate the efficacy of using RNA sequencing of only the tumor sample to determine the mutational burden and signatures, and subsequently the molecular cause of the cancer. The potential advantages include reducing the cost of testing and simultaneously providing information on the gene expression profile and gene fusions present in the tumor.
Results
In this study, we devised supervised and unsupervised learning methods to determine mutational signatures from tumor RNA-seq data. We applied the methods to a training set of 587 TCGA uterine cancer (UCEC) RNA-seq samples and examined them in an independent testing set of 521 TCGA colorectal cancer (COAD) RNA-seq samples. Both diseases are known to be associated with microsatellite instability-high (MSI-H) status and driver defects in DNA polymerase ε (POLε). Among variants called from RNA-seq, we found that the majority (>95%) are likely germline variants, leading to C>T-enriched germline variants (dbSNP) widely applicable in tumor and normal RNA-seq samples. We found significant associations between RNA-derived mutational burdens and MSI/POLε status, and no significant relationship between RNA-seq total coverage and derived mutational burdens. Additionally, we found that over 80% of variants could be explained using COSMIC mutational signatures 5, 6, and 10, which are implicated in natural aging, MSI-H, and POLε, respectively. For classifying tumor type, within UCEC we achieved a recall of 0.56 and 0.78, and a specificity of 0.66 and 0.99, for MSI and POLε respectively. By applying learnt RNA signatures from UCEC to COAD, we were able to improve our classification of both MSI and POLε.
Conclusions
Taken together, our work provides a novel method to detect RNA-seq derived mutational signatures, with effective procedures to remove likely germline variants. It can lead to accurate classification of the underlying driving mechanisms of DNA damage deficiency.
SUBMITTER: Jessen E
PROVIDER: S-EPMC7923324 | biostudies-literature | 2021 Mar
REPOSITORIES: biostudies-literature
Jessen Erik E, Liu Yuanhang Y, Davila Jaime J, Kocher Jean-Pierre JP, Wang Chen C
BMC medical genomics 20210301 1