Dataset Information

Ensemble learning predicts multiple sclerosis disease course in the SUMMIT study.

ABSTRACT: The rate of disability accumulation varies across multiple sclerosis (MS) patients. Machine learning techniques may offer more powerful means to predict disease course in MS patients. In our study, 724 patients from the Comprehensive Longitudinal Investigation in MS at Brigham and Women's Hospital (CLIMB study) and 400 patients from the EPIC dataset, University of California, San Francisco, were included in the analysis. The primary outcome was an increase in Expanded Disability Status Scale (EDSS) ≥ 1.5 (worsening) or not (non-worsening) at up to 5 years after the baseline visit. Classification models were built using the CLIMB dataset with patients' clinical and MRI longitudinal observations in the first 2 years, and further validated using the EPIC dataset. We compared the performance of three popular machine learning algorithms (SVM, Logistic Regression, and Random Forest) and three ensemble learning approaches (XGBoost, LightGBM, and a Meta-learner L). A "threshold" was established to trade off performance between the two classes. Predictive features were identified and compared among different models. Machine learning models achieved 0.79 and 0.83 AUC scores for the CLIMB and EPIC datasets, respectively, shortly after disease onset. Ensemble learning methods were more effective and robust compared to standalone algorithms. Two ensemble models, XGBoost and LightGBM, were superior to the other four models evaluated in our study. Of the variables evaluated, EDSS, Pyramidal Function, and Ambulatory Index were the top common predictors in forecasting the MS disease course. Machine learning techniques, in particular ensemble methods, offer increased accuracy for the prediction of MS disease course.
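
The abstract does not say how the "threshold" was chosen. As a minimal sketch, assuming it is a cut-off applied to a classifier's predicted probability of worsening, the following Python shows how lowering the cut-off raises sensitivity for the worsening class at the cost of specificity for the non-worsening class. The function names and the toy labels and probabilities are hypothetical and are not taken from the CLIMB or EPIC data; `label_worsening` also simplifies the outcome to a single follow-up EDSS value.

```python
from typing import List, Tuple


def label_worsening(baseline_edss: float, followup_edss: float,
                    min_increase: float = 1.5) -> int:
    """Return 1 (worsening) if EDSS rose by at least min_increase, else 0."""
    return int(followup_edss - baseline_edss >= min_increase)


def sensitivity_specificity(y_true: List[int], y_prob: List[float],
                            threshold: float) -> Tuple[float, float]:
    """Compute sensitivity and specificity when predicting 1 for p >= threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        pred = int(p >= threshold)
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Hypothetical toy data: 3 worsening and 5 non-worsening patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.35, 0.7, 0.4, 0.3, 0.2, 0.1]

for t in (0.5, 0.3):
    sens, spec = sensitivity_specificity(y_true, y_prob, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

On imbalanced outcomes such as this one, a cut-off below 0.5 is a common way to avoid missing the minority (worsening) class; which trade-off is acceptable is a clinical judgment rather than something the code can decide.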

SUBMITTER: Zhao Y

PROVIDER: S-EPMC7567781 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

Ensemble learning predicts multiple sclerosis disease course in the SUMMIT study.

Zhao Y, Wang T, Bove R, Cree B, Henry R, Lokhande H, Polgar-Turcsanyi M, Anderson M, Bakshi R, Weiner HL, Chitnis T

NPJ Digital Medicine, 2020-10-16

PMID: 33083570

Similar Datasets

Project description: Background: It remains unclear whether disease course in multiple sclerosis (MS) is influenced by genetic polymorphisms. Here, we aimed to identify genetic variants associated with benign and aggressive disease courses in MS patients. Methods: MS patients were classified into benign and aggressive phenotypes according to clinical criteria. We performed exome sequencing in a discovery cohort, which included 20 MS patients, 10 with benign and 10 with aggressive disease course, and genotyping in 2 independent validation cohorts. The first validation cohort encompassed 194 MS patients, 107 with benign and 87 with aggressive phenotypes. The second validation cohort comprised 257 patients, of whom 224 had benign phenotypes and 33 had aggressive disease courses. Brain immunohistochemistry was performed using antibodies against disease course-associated gene products. Results: By means of single-nucleotide polymorphism (SNP) detection and comparison of allele frequencies between patients with benign and aggressive phenotypes, a total of 16 SNPs were selected for validation from the exome sequencing data in the discovery cohort. Meta-analysis of genotyping results in the two validation cohorts revealed two polymorphisms, rs28469012 and rs10894768, significantly associated with disease course. SNP rs28469012 is located in CPXM2 (carboxypeptidase X, M14 family, member 2) and was associated with aggressive disease course (uncorrected p value < 0.05). SNP rs10894768, which is positioned in IGSF9B (immunoglobulin superfamily member 9B), was associated with benign phenotype (uncorrected p value < 0.05). In addition, a trend for association with benign phenotype was observed for a third SNP, rs10423927, in NLRP9 (NLR family pyrin domain containing 9). Brain immunohistochemistry in chronic active lesions from MS patients revealed expression of IGSF9B in astrocytes and macrophages/microglial cells, and expression of CPXM2 and NLRP9 restricted to brain macrophages/microglia. Conclusions: Genetic variants located in CPXM2, IGSF9B, and NLRP9 have the potential to modulate disease course in MS patients and may be used as disease activity biomarkers to identify patients with divergent disease courses. Altogether, the reported results from this study support the influence of genetic factors in MS disease course and may help to better understand the complex molecular mechanisms underlying disease pathogenesis.
