Project description:Diffuse gliomas (DGs) are the most common and lethal primary neoplasms of the central nervous system. The 2021 WHO Classification of Tumors of the Central Nervous System (CNS) substantially changed the approach to diagnosis and clinical decision-making. As part of the Chinese Glioma Genome Atlas (CGGA) project, we aimed to provide genomic profiling of gliomas in a Chinese cohort. Two hundred eighty-six gliomas of different grades were collected over the last decade. Using the Illumina HiSeq platform, over 75.8 million high-quality 150 bp paired-end reads were generated per sample, yielding a total of 43.4 billion reads. We also collected each patient’s clinical and pathological information and used it to annotate the genetic data. This dataset provides an important reference for researchers and should help advance our understanding of gliomas.
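The sequencing figures in the description above can be checked with a short calculation. This is a minimal sketch, assuming that the 75.8 million figure counts read pairs (two reads each); the description does not state this, but under that assumption 286 samples give roughly the reported 43.4 billion reads. The function names are illustrative only.

```python
# Assumption (not stated in the description): "75.8 million paired-end reads"
# is counted as read pairs, so each pair contributes two individual reads.
SAMPLES = 286
READ_PAIRS_PER_SAMPLE = 75.8e6
READ_LENGTH_BP = 150


def total_reads(samples, pairs_per_sample):
    """Total individual reads when each sample yields `pairs_per_sample` pairs."""
    return samples * pairs_per_sample * 2


def total_bases(samples, pairs_per_sample, read_length_bp):
    """Total sequenced bases across all samples."""
    return total_reads(samples, pairs_per_sample) * read_length_bp


if __name__ == "__main__":
    reads = total_reads(SAMPLES, READ_PAIRS_PER_SAMPLE)
    bases = total_bases(SAMPLES, READ_PAIRS_PER_SAMPLE, READ_LENGTH_BP)
    print(f"{reads / 1e9:.1f} billion reads")
    print(f"{bases / 1e12:.2f} terabases")
```

If the figure instead counted individual reads, the total would be about half as large, so the pair-based reading is the one consistent with the stated total.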
Project description:To support therapeutic decision-making for brain tumor patients at St. Jude Children’s Research Hospital (SJCRH), we developed three robust classifiers: a deep learning neural network (NN), a k-nearest neighbor (kNN) model, and a random forest (RF), each trained on a reference series of DNA-methylation profiles to classify central nervous system (CNS) tumor types. The models’ performance was rigorously validated against 2,054 samples from two independent cohorts. In addition to classic metrics of model performance, we compared the robustness of the three models to reduced tumor purity, a critical consideration for the clinical utility of such classifiers. The NN model exhibited the highest accuracy and maintained a balance between precision and recall. It was also the most resistant to the drop in performance associated with reduced tumor purity, performing well until purity fell below 50%. Our study emphasizes the potential of DNA-methylation-based deep learning methods to improve precision medicine for brain tumor classification in the clinical setting.
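The purity comparison described above can be illustrated with a toy in-silico dilution. This is a minimal sketch, assuming a simple linear mixing model in which the observed methylation beta value is a purity-weighted average of tumor and normal signal. The mixing model, the nearest-centroid stand-in classifier, and all values are hypothetical and are not the NN, kNN, or RF models from the study.

```python
def dilute(tumor_betas, normal_betas, purity):
    """Mix tumor and normal beta values linearly at the given tumor purity."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if len(tumor_betas) != len(normal_betas):
        raise ValueError("profiles must cover the same probes")
    return [purity * t + (1.0 - purity) * n
            for t, n in zip(tumor_betas, normal_betas)]


def nearest_centroid(profile, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((p - c) ** 2 for p, c in zip(profile, centroids[label]))
    return min(centroids, key=distance)


# Hypothetical three-probe profiles, for illustration only.
CENTROIDS = {
    "tumor_A": [0.9, 0.1, 0.8],
    "normal": [0.2, 0.7, 0.3],
}

if __name__ == "__main__":
    tumor, normal = CENTROIDS["tumor_A"], CENTROIDS["normal"]
    for purity in (1.0, 0.7, 0.3):
        mixed = dilute(tumor, normal, purity)
        print(purity, nearest_centroid(mixed, CENTROIDS))
```

In this toy setting the predicted label flips once the normal fraction dominates; a real evaluation would repeat the dilution across many samples and report accuracy at each purity level.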
Project description:Background: Modern neuropathology is challenged by an increasing number of clinically relevant CNS tumor subgroups that require assessment of a multitude of molecular markers for classification, as well as a highly trained medical staff. Failure to meet this challenge leads to tumor misclassification, which can have severe consequences for affected patients. Methods: We compiled a cohort of genome-wide DNA methylation profiles of 2,682 tumors from 82 histologically and/or molecularly distinct CNS tumor classes across all ages and histologies that served as reference for a Random Forest-based diagnostic classifier. This classifier was used to prospectively investigate a further 1,104 CNS tumor samples in order to determine its clinical utility. Results: The classifier was able to reliably assign tumor samples to a given diagnostic category with a misclassification rate of less than 2%. The system functioned robustly across laboratories and with different DNA methylation profiling techniques. Prospective application to clinical samples resulted in a reclassification of 12% of tumors compared with standard practice alone. A further 12% could not be classified by methylation profiling; this subset was highly enriched for unusual syndrome-associated tumors and likely novel entities. Conclusion: This study represents a proof-of-concept for the application of machine learning approaches in molecular diagnostics using a single, easy-to-use assay. The reference cohort and Random Forest-based classifier are available online as a valuable community tool for improving precision in brain tumor diagnostics. We expect that approaches similar to the one presented herein will rapidly restructure diagnostic practice in neurooncology and across tumor pathology.
Project description:The 2021 WHO Classification of Tumors of the Central Nervous System includes several tumor types and subtypes for which the diagnosis relies at least partially on whole-genome methylation profiling. The current approach to array-based DNA methylation profiling uses a reference library of tumor DNA methylation data and a machine learning-based tumor classifier. This approach was pioneered and popularized by the German Cancer Research Center (DKFZ) and University Hospital Heidelberg. This research group has kindly made their classifier for central nervous system tumors freely available as a research tool via a web-based portal. However, this classifier is not maintained in a clinical testing environment. Therefore, we validated our own DNA methylation-based classifier of central nervous system tumors, using the same training and validation datasets as the DKFZ group. In addition, we validated samples tested in our own laboratory and compared the performance of both classifiers. On the validation dataset, our classifier showed high concordance (92%) and comparable accuracy (specificity 94.0% v. 84.9% for DKFZ, sensitivity 88.6% v. 94.7% for DKFZ). Receiver operating characteristic curves showed areas under the curve of 0.964 v. 0.966 for the NM and DKFZ classifiers, respectively. Our classifier performed comparably well with samples tested in our own laboratory and is currently offered for clinical testing.
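The comparison metrics reported above (sensitivity, specificity, and concordance between two classifiers) can be computed as in the following minimal sketch. The counts and labels are hypothetical and chosen only to show the arithmetic; they are not taken from the study, and the description does not specify how its values were derived.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly positive cases that were called positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly negative cases that were called negative."""
    return true_neg / (true_neg + false_pos)


def concordance(calls_a, calls_b):
    """Fraction of samples on which two classifiers give the same call."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty call lists of equal length")
    agree = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return agree / len(calls_a)


if __name__ == "__main__":
    # Hypothetical counts for one tumor class, for illustration only.
    print(f"sensitivity: {sensitivity(true_pos=8, false_neg=2):.2f}")
    print(f"specificity: {specificity(true_neg=45, false_pos=5):.2f}")
    # Hypothetical per-sample calls from two classifiers.
    ours = ["glioma", "ependymoma", "glioma", "medulloblastoma"]
    theirs = ["glioma", "ependymoma", "meningioma", "medulloblastoma"]
    print(f"concordance: {concordance(ours, theirs):.2f}")
```

For a multi-class classifier, sensitivity and specificity are typically computed per class in a one-versus-rest fashion and then summarized, which is one plausible reading of the single figures quoted above.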