Project description: Background and purpose: Mechanical thrombectomy greatly improves stroke outcomes. Nonetheless, some patients fall short of full recovery despite good reperfusion. The purpose of this study was to develop machine learning (ML) models for the pre-interventional prediction of functional outcome at 3 months after thrombectomy in acute ischemic stroke (AIS), using clinical and auto-extractable radiological information consistently available upon first emergency evaluation. Materials and methods: A two-center retrospective cohort of 293 patients with AIS who underwent thrombectomy was analyzed. ML models were developed to predict dichotomized modified Rankin score at 90 days (mRS-90) using clinical and imaging features, both separately and combined. Conventional and experimental imaging biomarkers were quantified using automated image-processing software from non-contrast computed tomography (CT) and computed tomography angiography (CTA). Shapley Additive Explanation (SHAP) was applied for model interpretability and predictor importance analysis of the optimal model. Results: Merging clinical and imaging features returned the best results for mRS-90 prediction. The best performing classifier was Extreme Gradient Boosting (XGB) with an area under the receiver operating characteristic curve (AUC) = 84% using selected features. The most important classifying features were age, baseline National Institutes of Health Stroke Scale (NIHSS), occlusion side, degree of brain atrophy [primarily represented by cortical cerebrospinal fluid (CSF) volume and lateral ventricle volume], early ischemic core [primarily represented by e-Alberta Stroke Program Early CT Score (ASPECTS)], and collateral circulation deficit volume on CTA. Conclusion: Machine learning applied to quantifiable image features from CT and CTA alongside basic clinical characteristics constitutes a promising automated method for the pre-interventional prediction of stroke prognosis. 
Interpretable models allow for exploring which initial features contribute the most to post-thrombectomy outcome prediction overall and for each individual patient outcome.
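As an illustration of the evaluation described above, the following minimal sketch dichotomizes mRS scores and computes the AUC as the probability that a randomly chosen unfavorable case receives a higher predicted score than a randomly chosen favorable one. The cutoff of mRS >= 3 is an assumption (the abstract does not state its threshold), and the function names `dichotomize_mrs` and `roc_auc` are hypothetical; this is not the authors' pipeline.

```python
def dichotomize_mrs(mrs, threshold=3):
    """Return 1 (unfavorable) if mRS >= threshold, else 0.

    The threshold of 3 is an assumed, commonly used cutoff; the
    abstract itself does not say which cutoff was applied.
    """
    return 1 if mrs >= threshold else 0


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Each (positive, negative) pair counts 1 if the positive case has
    the higher score, 0.5 on a tie, and 0 otherwise.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    mrs_90 = [0, 1, 4, 5, 2, 6]                       # made-up outcomes
    predicted = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]  # made-up model scores
    labels = [dichotomize_mrs(m) for m in mrs_90]
    print(labels, round(roc_auc(labels, predicted), 3))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would scale better.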
Project description: Background: Accurate prediction of clinical outcomes in individual patients following acute stroke is vital for healthcare providers to optimize treatment strategies and plan further patient care. Here, we use advanced machine learning (ML) techniques to systematically compare the prediction of functional recovery, cognitive function, depression, and mortality of first-ever ischemic stroke patients and to identify the leading prognostic factors. Methods: We predicted clinical outcomes for 307 patients (151 females, 156 males; 68 ± 14 years) from the PROSpective Cohort with Incident Stroke Berlin study using 43 baseline features. Outcomes included modified Rankin Scale (mRS), Barthel Index (BI), Mini-Mental State Examination (MMSE), Modified Telephone Interview for Cognitive Status (TICS-M), Center for Epidemiologic Studies Depression Scale (CES-D) and survival. The ML models included a Support Vector Machine with a linear kernel and a radial basis function kernel as well as a Gradient Boosting Classifier based on repeated 5-fold nested cross-validation. The leading prognostic features were identified using Shapley additive explanations. Results: The ML models achieved significant prediction performance for mRS at patient discharge and after 1 year, BI and MMSE at patient discharge, TICS-M after 1 and 3 years and CES-D after 1 year. Additionally, we showed that the National Institutes of Health Stroke Scale (NIHSS) was the top predictor for most functional recovery outcomes, and that education was the top predictor for cognitive function and depression. Conclusion: Our machine learning analysis successfully demonstrated the ability to predict clinical outcomes after first-ever ischemic stroke and identified the leading prognostic factors that contribute to this prediction.
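The nested cross-validation mentioned in the methods can be sketched as index bookkeeping: an outer loop holds out a test fold, and an inner loop splits only the remaining training indices for hyperparameter tuning. The sketch below is a simplified, unstratified version under assumed names (`k_fold_indices`, `nested_cv_splits`); it does not reproduce the study's actual splitting or tuning procedure. Repetition would correspond to calling it with different seeds.

```python
import random


def k_fold_indices(indices, k, rng):
    """Shuffle the indices and deal them into k roughly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=5, seed=0):
    """Yield (train_idx, test_idx, inner_splits) for each outer fold.

    inner_splits is a list of (fit_idx, val_idx) pairs drawn only from
    train_idx, so the outer test fold never influences model selection.
    """
    rng = random.Random(seed)
    outer_folds = k_fold_indices(range(n_samples), outer_k, rng)
    for o, test_idx in enumerate(outer_folds):
        train_idx = [i for f, fold in enumerate(outer_folds) if f != o for i in fold]
        inner_folds = k_fold_indices(train_idx, inner_k, rng)
        inner_splits = []
        for v, val_idx in enumerate(inner_folds):
            fit_idx = [i for f, fold in enumerate(inner_folds) if f != v for i in fold]
            inner_splits.append((fit_idx, val_idx))
        yield train_idx, test_idx, inner_splits


if __name__ == "__main__":
    for train_idx, test_idx, inner in nested_cv_splits(307, seed=1):
        print(len(train_idx), len(test_idx), len(inner))
```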
Project description: Background: Venous thromboembolism (VTE) is a life-threatening complication commonly occurring after acute ischemic stroke (AIS), with an increased risk of mortality. Traditional risk assessment tools lack precision in predicting VTE in AIS patients due to the omission of stroke-specific factors. Methods: We developed a machine learning model using clinical data from patients with AIS admitted between December 2021 and December 2023. Predictive models were developed using machine learning algorithms, including Gradient Boosting Machine (GBM), Random Forest (RF), and Logistic Regression (LR). Feature selection involved stepwise logistic regression and LASSO, with SHapley Additive exPlanations (SHAP) used to enhance model interpretability. Model performance was evaluated using area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Results: Among the 1,632 AIS patients analyzed, 4.17% developed VTE. The GBM model achieved the highest predictive accuracy with an AUC of 0.923, outperforming other models such as Random Forest and Logistic Regression. The model demonstrated strong sensitivity (90.83%) and specificity (93.83%) in identifying high-risk patients. SHAP analysis revealed that key predictors of VTE risk included elevated D-dimer levels, premorbid mRS, and large vessel occlusion, offering clinicians valuable insights for personalized treatment decisions. Conclusion: This study provides an accurate and interpretable method to predict VTE risk in patients with AIS using the GBM model, potentially improving early detection rates and reducing morbidity. Further validation is needed to assess its broader clinical applicability.
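The four threshold-dependent metrics listed in the methods all derive from the same 2×2 confusion matrix. A minimal sketch, with made-up labels and a hypothetical function name `binary_metrics`, might look like this:

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = VTE)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # detected among true VTE cases
        "specificity": ratio(tn, tn + fp),  # cleared among non-VTE cases
        "ppv": ratio(tp, tp + fp),          # true VTE among predicted positives
        "npv": ratio(tn, tn + fn),          # true non-VTE among predicted negatives
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # made-up ground truth
    y_pred = [1, 1, 0, 0, 0, 0, 1, 0]  # made-up predictions
    print(binary_metrics(y_true, y_pred))
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the outcome is in the evaluated sample, which matters when the event rate is low.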
Project description: Background and objectives: Post-stroke cognitive impairment (PSCI) occurs in up to 50% of patients with acute ischemic stroke (AIS). Thus, the prediction of cognitive outcomes in AIS may be useful for treatment decisions. This PSCI cohort study aimed to determine the applicability of a machine learning approach for predicting PSCI after stroke. Methods: This retrospective study used a prospective PSCI cohort of patients with AIS. Demographic features, clinical characteristics, and brain imaging variables previously known to be associated with PSCI were included in the analysis. The primary outcome was PSCI at 3–6 months, defined as an adjusted z-score of less than −2.0 standard deviations in at least one of the four cognitive domains (memory, executive/frontal, visuospatial, and language), using the Korean version of the Vascular Cognitive Impairment Harmonization Standards-Neuropsychological Protocol (VCIHS-NP). We developed four machine learning models (logistic regression, support vector machine, extreme gradient boost, and artificial neural network) and compared their accuracies for outcome variables. Results: A total of 951 patients (mean age 65.7 ± 11.9; male 61.5%) with AIS were included in this study. The areas under the curve for the extreme gradient boost and the artificial neural network were the highest (0.7919 and 0.7365, respectively) among the four models for predicting PSCI according to the VCIHS-NP definition. The most important features for predicting PSCI include the presence of cortical infarcts, mesial temporal lobe atrophy, initial stroke severity, stroke history, and strategic lesion infarcts. Conclusion: Our findings indicate that machine-learning algorithms, particularly the extreme gradient boost and the artificial neural network models, can best predict cognitive outcomes after ischemic stroke. Supplementary Information: The online version contains supplementary material available at 10.1186/s13195-023-01289-4.
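The outcome definition above (a z-score below −2.0 in at least one of four domains) translates directly into a labeling rule. The sketch below assumes the adjusted z-scores are already available in a dictionary; the key names and the function `has_psci` are hypothetical.

```python
# Hypothetical keys for the four cognitive domains named in the abstract.
DOMAINS = ("memory", "executive_frontal", "visuospatial", "language")


def has_psci(z_scores, cutoff=-2.0):
    """Return True if any domain z-score is strictly below the cutoff."""
    return any(z_scores[domain] < cutoff for domain in DOMAINS)


if __name__ == "__main__":
    patient = {  # made-up adjusted z-scores
        "memory": -2.3,
        "executive_frontal": 0.1,
        "visuospatial": -0.5,
        "language": 0.0,
    }
    print(has_psci(patient))
```

Because the rule is a strict inequality, a score of exactly −2.0 does not count as impaired under this reading.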
Project description: Acute stroke is often superimposed on chronic damage from previous cerebrovascular events. This background will inevitably modulate the impact of acute injury on clinical outcomes to an extent that will depend on the precise anatomical pattern of damage. Previous attempts to quantify such modulation have employed only reductive models that ignore anatomical detail. The combination of automated image processing, large-scale data, and machine learning now enables us to quantify this impact with high-dimensional multivariate models sensitive to individual variations in the detailed anatomical pattern. We introduce and validate a new automated chronic lesion segmentation routine for use with non-contrast CT brain scans, combining a non-parametric outlier-detection score, Zeta, with an unsupervised 3-dimensional maximum-flow, minimum-cut algorithm. The routine was then applied to a dataset of 1,704 stroke patient scans, obtained at their presentation to a hyper-acute stroke unit (St George's Hospital, London, UK), and used to train a support vector machine (SVM) model to discriminate between low (0-2) and high (3-6) pre-admission and discharge modified Rankin Scale (mRS) scores, quantifying performance by the area under the receiver operating characteristic curve (AUROC). In this single-center retrospective observational study, our SVM models were able to differentiate between low (0-2) and high (3-6) pre-admission and discharge mRS scores with an AUROC of 0.77 (95% confidence interval of 0.74-0.79), and 0.76 (0.74-0.78), respectively. The chronic lesion segmentation routine achieved a mean (standard deviation) sensitivity, specificity and Dice similarity coefficient of 0.746 (0.069), 0.999 (0.001), and 0.717 (0.091), respectively. We have demonstrated that machine learning models capable of capturing the high-dimensional features of chronic injuries are able to stratify patients, at the time of presentation, by pre-admission and discharge mRS scores. 
Our fully automated chronic stroke lesion segmentation routine simplifies this process, and utilizes routinely collected CT head scans, thereby facilitating future large-scale studies to develop supportive clinical decision tools.
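The Dice similarity coefficient reported for the segmentation routine measures the overlap between a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). A minimal sketch on flattened binary masks follows; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two equal-length binary masks (flattened voxels)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / (size_a + size_b)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]  # made-up lesion mask
    reference = [1, 0, 0, 1, 1]  # made-up manual annotation
    print(round(dice_coefficient(predicted, reference), 3))
```

Because lesions occupy a small fraction of the brain volume, specificity is dominated by true-negative voxels and tends to sit near 1, which is why an overlap measure such as Dice is reported alongside it.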
Project description: Background: The prognosis, recurrence rates, and secondary prevention strategies vary significantly among different subtypes of acute ischemic stroke (AIS). Machine learning (ML) techniques can uncover intricate, non-linear relationships within medical data, enabling the identification of factors associated with etiological classification. However, there is currently a lack of research utilizing ML algorithms for predicting AIS etiology. Objective: We aimed to use interpretable ML algorithms to develop AIS etiology prediction models, identify critical factors in etiology classification, and enhance existing clinical categorization. Methods: This study involved patients from the Third China National Stroke Registry (CNSR-III). Nine models, which included Natural Gradient Boosting (NGBoost), Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), Random Forest (RF), Light Gradient Boosting Machine (LGBM), Gradient Boosting Decision Tree (GBDT), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), and logistic regression (LR), were employed to predict large artery atherosclerosis (LAA), small vessel occlusion (SVO), and cardioembolism (CE) using an 80:20 randomly split training and test set. We designed an SFS-XGB with 10-fold cross-validation for feature selection. The primary evaluation metrics for the models included the area under the receiver operating characteristic curve (AUC) for discrimination and the Brier score (or calibration plots) for calibration. Results: A total of 5,213 patients were included, comprising 2,471 (47.4%) with LAA, 2,153 (41.3%) with SVO, and 589 (11.3%) with CE. In both LAA and SVO models, the AUC values of the ML models were significantly higher than that of the LR model (P < 0.001). The optimal model for predicting SVO (AUC [RF model] = 0.932) outperformed the optimal LAA model (AUC [NGB model] = 0.917) and the optimal CE model (AUC [LGBM model] = 0.846). 
Each model displayed relatively satisfactory calibration. Further analysis showed that the optimal CE model could identify potential CE patients in the undetermined etiology (SUE) group: 1,900 of 4,156 (45.7%). Conclusions: The ML algorithm effectively classified patients with LAA, SVO, and CE, demonstrating superior classification performance compared to the LR model. The optimal ML model can identify potential CE patients among SUE patients. These newly identified predictive factors may complement the existing etiological classification system, enabling clinicians to promptly categorize stroke patients' etiology and initiate optimal strategies for secondary prevention.
Project description: Despite the identification of several dozen genetic loci associated with ischemic stroke (IS), the genetic bases of this disease remain largely unexplored. In this research we present the results of genome-wide association studies (GWAS) based on classical statistical testing and machine learning algorithms (logistic regression, gradient boosting on decision trees, and the tabular deep learning model TabNet). To build a consensus on the results obtained by different techniques, the Pareto-Optimal solution was proposed and applied. These methods were applied to real genotypic data of sick and healthy individuals of European ancestry obtained from the Database of Genotypes and Phenotypes (5,581 individuals, 883,749 single nucleotide polymorphisms). Finally, 131 genes were identified as candidates for association with the onset of IS. UBQLN1, TRPS1, and MUSK were previously described as associated with the course of IS in model animals. ACOT11, which takes part in the metabolism of fatty acids, was shown for the first time to be associated with IS. The identified genes were compared with genes from the Illuminating Druggable Genome project. The product of GPR26, a G protein-coupled receptor, can be considered as a therapeutic target for stroke prevention. The approaches presented in this research can be used to reprocess GWAS datasets from other diseases.
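One way to read the Pareto-optimal consensus step is as a search for variants that no other variant beats across all methods at once. The sketch below assumes each variant has one importance score per method, with higher meaning more important; the abstract does not state how the objectives were actually defined, so the scoring, the variant names, and the functions `dominates` and `pareto_front` are all hypothetical.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores_by_variant):
    """Return the sorted names of variants not dominated by any other variant."""
    front = []
    for name, scores in scores_by_variant.items():
        dominated = any(
            dominates(other, scores)
            for other_name, other in scores_by_variant.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


if __name__ == "__main__":
    # Made-up scores: (statistical test, gradient boosting, TabNet).
    scores = {
        "snp_a": (0.9, 0.2, 0.5),
        "snp_b": (0.4, 0.8, 0.6),
        "snp_c": (0.3, 0.1, 0.4),
        "snp_d": (0.9, 0.2, 0.4),
    }
    print(pareto_front(scores))
```

With hundreds of thousands of variants, the quadratic comparison shown here would need pre-filtering or a sort-based algorithm.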
Project description: Objective: This study aimed to develop and validate a machine learning-based predictive model for gait recovery in patients with acute anterior circulation ischemic stroke. Methods: Between May and November 2023, 237 patients with acute anterior circulation ischemic stroke were enrolled. Patients were randomly divided into training and validation sets at a 7:3 ratio. Thirty-one medical characteristics were collected, and the Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to screen predictor variables. Predictive models were developed using the Random Survival Forest (RSF) and COX regression methods. The optimal model was identified based on C-index values. The SHapley Additive exPlanations (SHAP) method was employed to interpret the RSF model globally and locally. Results: Ten predictors were identified through LASSO regression, including age, gender, periventricular white matter hyperintensities (PVWMH), Montreal Cognitive Assessment (MoCA), National Institutes of Health Stroke Scale (NIHSS), enlarged perivascular spaces in basal ganglia (BG-EPVS), lacunes, parietal infarction, basal ganglia infarction, and Timed Up & Go (TUG) test score. The C-index values of the COX regression and RSF models were 0.741 and 0.761 in the training set and 0.705 and 0.725 in the validation set, respectively. SHAP analysis of the RSF model identified BG-EPVS, TUG, MoCA, age, and PVWMH as the top five most influential predictors of gait recovery. Conclusion: The RSF model demonstrated superior performance to the COX regression model in predicting gait recovery, offering a reliable tool for clinical decision-making regarding stroke patients' prognoses.
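The C-index used to compare the two models can be illustrated with a simplified Harrell-style computation on right-censored data. In this sketch the "event" is gait recovery, a higher model score is assumed to mean earlier recovery, and pairs with tied times are ignored; the function name `concordance_index` and the data are hypothetical, and the study's own software may handle ties and censoring differently.

```python
def concordance_index(times, events, scores):
    """Fraction of comparable pairs that the scores order correctly.

    A pair (i, j) is comparable when subject i had the event and did so
    strictly before subject j's event or censoring time. It is concordant
    when i also has the higher score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparison
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    weeks = [2, 4, 6, 8]           # made-up follow-up times
    recovered = [1, 1, 0, 1]       # 1 = gait recovered, 0 = censored
    score = [0.9, 0.7, 0.8, 0.1]   # made-up model output
    print(concordance_index(weeks, recovered, score))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.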
Project description: The unfavorable outcome of acute ischemic stroke (AIS) with large vessel occlusion (LVO) is related to clinical factors at multiple time points. However, models that dynamically predict unfavorable outcomes using clinically relevant preoperative and postoperative variables have not been developed. Our goal was to develop a machine learning (ML) model for the dynamic prediction of unfavorable outcomes. We retrospectively reviewed consecutive patients with AIS who underwent mechanical thrombectomy (MT) at three centers in China between January 2014 and December 2018. Based on the eXtreme gradient boosting (XGBoost) algorithm, we built models using clinical characteristics on admission ("Admission" Model) and additional variables regarding intraoperative management and the postoperative National Institutes of Health Stroke Scale (NIHSS) score ("24-Hour" Model, "3-Day" Model and "Discharge" Model). The outcome was an unfavorable outcome at the three-month mark (modified Rankin scale, mRS 3-6: unfavorable). The area under the receiver operating characteristic curve (AUC) and Brier scores were the main evaluation indexes. An unfavorable outcome at the three-month mark was observed in 156 (62.0%) of 238 patients. The four models had high accuracy, in the range of 75.0% to 87.5%, and good discrimination, with AUC in the range of 0.824 to 0.945 on the testing set. The Brier scores of the four models ranged from 0.122 to 0.083 and showed good predictive ability on the testing set. This is the first dynamic, preoperative and postoperative predictive model constructed for AIS patients who underwent MT, and it is more accurate than the previous prediction model. The preoperative model could be used to predict the clinical outcome before MT and support the decision to perform MT, and the postoperative models would further improve the predictive accuracy of the clinical outcome after MT and support timely adjustment of therapeutic strategies.
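The Brier score used above is the mean squared difference between the predicted probability and the observed binary outcome, so lower values indicate better predictions. A minimal sketch with made-up numbers and a hypothetical function name `brier_score`:

```python
def brier_score(outcomes, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(outcomes) != len(probabilities) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for y, p in zip(outcomes, probabilities)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    unfavorable = [1, 0, 1, 0]          # made-up 3-month outcomes (mRS 3-6 = 1)
    predicted = [0.9, 0.2, 0.6, 0.3]    # made-up predicted probabilities
    print(round(brier_score(unfavorable, predicted), 3))
```

A model that always predicts 0.5 scores 0.25, and perfect predictions score 0.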
Project description: This study aimed to develop and validate an automated machine learning (ML) system that predicts 3-month functional outcomes in acute ischemic stroke (AIS) patients by combining clinical and neuroimaging features. Functional outcomes were categorized as unfavorable (modified Rankin Scale ≥ 3) or not. A clinical model employing optimal clinical features (Model_A), a convolutional neural network model incorporating imaging data (Model_B), and an integrated model combining both imaging and clinical features (Model_C) were developed and tested to predict unfavorable outcomes. The developed models were compared with each other and with traditional risk-scoring models. The dataset comprised 4147 patients from a multicenter stroke registry, with 1268 (30.6%) experiencing unfavorable outcomes. Age, initial NIHSS, and early neurologic deterioration were identified as the most important clinical features. The ML models achieved areas under the curve of 0.757 (95% CI 0.726-0.789) for Model_A, 0.725 (95% CI 0.693-0.755) for Model_B, and 0.786 (95% CI 0.757-0.814) for Model_C in the test set. The integrated model outperformed traditional risk-scoring models by 0.21 (95% CI 0.16-0.25) for HIAT and 0.15 (95% CI 0.11-0.19) for THRIVE. In conclusion, the integrated ML system enhanced stroke outcome prediction by combining imaging data and clinical features, outperforming traditional risk-scoring models.