Project description:(1) Medical research has shown increasing interest in machine learning, which permits massive multivariate data analysis. We therefore developed drug intoxication mortality prediction models and compared machine learning models with traditional logistic regression. (2) A total of 8,937 samples categorized as drug intoxication were extracted from the Korea Centers for Disease Control and Prevention (2008-2017). We trained, validated, and tested each model on these data and compared their performance using three measures: Brier score, calibration slope, and calibration-in-the-large. (3) A chi-square test demonstrated that mortality risk differed statistically significantly according to severity, intent, toxic substance, age, and sex. The multilayer perceptron model (MLP) had the highest area under the curve (AUC) and lowest Brier score in the training and validation phases, while the logistic regression model (LR) showed the highest AUC (0.827) and lowest Brier score (0.0307) in the testing phase. MLP also had the second-highest AUC (0.816) and second-lowest Brier score (0.003258) in the testing phase, demonstrating better performance than the decision-tree model. (4) Given the complexity of choosing tuning parameters, LR proved competitive when using medical datasets, which require strict accuracy.
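The comparison measures named above are straightforward to compute from predicted probabilities. A minimal numpy sketch of the Brier score and calibration-in-the-large, using made-up toy data (illustrative only, not the study's data):

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

def calibration_in_the_large(y_true, y_prob):
    """Observed event rate minus mean predicted probability (0 is ideal)."""
    return float(np.mean(y_true) - np.mean(y_prob))

# Toy example: hypothetical predicted mortality risks and observed outcomes.
y = [0, 0, 1, 0, 1]
p = [0.1, 0.2, 0.8, 0.3, 0.7]
print(brier_score(y, p))               # ~0.054 (lower is better)
print(calibration_in_the_large(y, p))  # ~-0.02 (slight over-prediction)
```

Lower Brier scores indicate better combined discrimination and calibration, which is why the abstract reports them alongside AUC.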
Project description:OBJECTIVES:Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine-learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine-learning algorithms. METHODS:We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine-learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared with the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis, and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics. RESULTS:After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95% confidence interval (CI) 0.56-0.67), whereas the machine-learning algorithm had a c-statistic of 0.64 (95% CI 0.60-0.69) in the validation cohort. The HALT-C model had a c-statistic of 0.60 (95% CI 0.50-0.70) in the validation cohort and was outperformed by the machine-learning algorithm. The machine-learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (P<0.001) and integrated discrimination improvement (P=0.04). 
CONCLUSIONS:Machine-learning algorithms improve the accuracy of risk stratification in patients with cirrhosis and can be used to accurately identify patients at high risk of developing HCC.
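The c-statistic reported for each model above is the concordance probability: the chance that a randomly chosen patient who developed HCC was assigned a higher predicted risk than a randomly chosen patient who did not. A small numpy sketch on toy data (illustrative only):

```python
import numpy as np

def c_statistic(y_true, risk):
    """Probability that a random event has higher predicted risk than a
    random non-event; ties count as 1/2. Equivalent to the ROC AUC."""
    y_true = np.asarray(y_true)
    risk = np.asarray(risk, dtype=float)
    events = risk[y_true == 1]
    nonevents = risk[y_true == 0]
    # Compare every event/non-event pair.
    greater = (events[:, None] > nonevents[None, :]).sum()
    ties = (events[:, None] == nonevents[None, :]).sum()
    return (greater + 0.5 * ties) / (len(events) * len(nonevents))

y = [1, 0, 1, 0, 0]
r = [0.9, 0.4, 0.6, 0.6, 0.2]
c = c_statistic(y, r)  # ~0.917: one of six pairs is tied, rest concordant
```

A c-statistic of 0.5 is chance-level discrimination, which puts the 0.60-0.64 values reported above in context.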
Project description:Aims: This study aimed to review the performance of machine learning (ML) methods compared with conventional statistical models (CSMs) for predicting readmission and mortality in patients with heart failure (HF), and to present an approach to formally evaluate the quality of studies using ML algorithms for prediction modelling. Methods and results: Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, we performed a systematic literature search using MEDLINE, EPUB, Cochrane CENTRAL, EMBASE, INSPEC, ACM Library, and Web of Science. Eligible studies included primary research articles published between January 2000 and July 2020 comparing ML and CSMs in mortality and readmission prognosis of initially hospitalized HF patients. Data were extracted and analysed by two independent reviewers. A modified CHARMS checklist was developed in consultation with ML and biostatistics experts for quality assessment and was utilized to evaluate studies for risk of bias. Of 4322 articles identified and screened by two independent reviewers, 172 were deemed eligible for full-text review. The final set comprised 20 articles and 686 842 patients. ML methods included random forests (n = 11), decision trees (n = 5), regression trees (n = 3), support vector machines (n = 9), neural networks (n = 12), and Bayesian techniques (n = 3). CSMs included logistic regression (n = 16), Cox regression (n = 3), or Poisson regression (n = 3). In 15 studies, readmission was examined at multiple time points ranging from 30-day to 180-day readmission, with the majority of studies (n = 12) presenting prediction models for 30-day readmission outcomes. Of a total of 21 time-point comparisons, ML-derived c-indices were higher than CSM-derived c-indices in 16. In seven studies, mortality was examined at nine time points ranging from in-hospital mortality to 1-year survival; of these nine comparisons, seven reported higher c-indices using ML.
Two of these seven studies reported survival analyses utilizing random survival forests in their ML prediction models. Both reported higher c-indices when using ML compared with CSMs. A limitation of studies using ML techniques was that the majority were not externally validated, and calibration was rarely assessed. In the only study that was externally validated in a separate dataset, ML was superior to CSMs (c-indices 0.913 vs. 0.835). Conclusions: ML algorithms had better discrimination than CSMs in most studies aiming to predict risk of readmission and mortality in HF patients. Based on our review, there is a need for external validation of ML-based studies of prediction modelling. We suggest that ML-based studies should also be evaluated using clinical quality standards for prognosis research. Registration: PROSPERO CRD42020134867.
Project description:Bronchopulmonary dysplasia (BPD) is the most prevalent and clinically significant complication of prematurity. Accurate identification of at-risk infants would enable ongoing intervention to improve outcomes. Although postnatal exposures are known to affect an infant's likelihood of developing BPD, most existing BPD prediction models do not allow risk to be evaluated at different time points, and/or are not suitable for use in ethno-diverse populations. A comprehensive approach to developing clinical prediction models avoids assumptions as to which method will yield the optimal results by testing multiple algorithms/models. We compared the performance of machine learning and logistic regression models in predicting BPD/death. Our main cohort included infants <33 weeks' gestational age (GA) admitted to a Canadian Neonatal Network site from 2016 to 2018 (n = 9,006) with all analyses repeated for the <29 weeks' GA subcohort (n = 4,246). Models were developed to predict, on days 1, 7, and 14 of admission to neonatal intensive care, the composite outcome of BPD/death prior to discharge. Ten-fold cross-validation and a 20% hold-out sample were used to measure area under the curve (AUC). Calibration intercepts and slopes were estimated by regressing the outcome on the log-odds of the predicted probabilities. The model AUCs ranged from 0.811 to 0.886. Model discrimination was lower in the <29 weeks' GA subcohort (AUCs 0.699-0.790). Several machine learning models had a suboptimal calibration intercept and/or slope (k-nearest neighbor, random forest, artificial neural network, stacking neural network ensemble). The top-performing algorithms will be used to develop multinomial models and an online risk estimator for predicting BPD severity and death that does not require information on ethnicity.
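The calibration procedure described above, regressing the outcome on the log-odds of the predicted probabilities, amounts to a two-parameter logistic recalibration fit. A minimal numpy implementation via Newton-Raphson on toy data; the study's actual code is not specified, so this is an assumed, illustrative version:

```python
import numpy as np

def calibration_intercept_slope(y, p, n_iter=25):
    """Fit logit(P(y=1)) = a + b*logit(p) by Newton-Raphson.
    Perfect calibration corresponds to intercept a ~ 0 and slope b ~ 1."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    x = np.log(p / (1 - p))                       # log-odds of the predictions
    X = np.column_stack([np.ones_like(x), x])     # intercept + slope design
    beta = np.zeros(2)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))    # fitted probabilities
        grad = X.T @ (y - mu)                     # score vector
        hess = X.T @ (X * (mu * (1 - mu))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta                                   # [intercept, slope]

# Toy check: observed event rates match the predictions exactly in each
# risk group (1/5 at 0.2, 1/2 at 0.5, 4/5 at 0.8), so the fit should
# return intercept ~ 0 and slope ~ 1.
y = [1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
p = [0.2] * 5 + [0.5] * 2 + [0.8] * 5
a, b = calibration_intercept_slope(y, p)
```

A slope below 1 signals over-fitting (predictions too extreme), which is the pattern the abstract flags for several of its ML models.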
Project description:Acute kidney injury (AKI) after liver transplantation has been reported to be associated with increased mortality. Recently, machine learning approaches were reported to have better predictive ability than classic statistical analysis. We compared the performance of machine learning approaches with that of logistic regression analysis for predicting AKI after liver transplantation. We reviewed 1211 patients, and preoperative and intraoperative anesthesia- and surgery-related variables were obtained. The primary outcome was postoperative AKI defined by Acute Kidney Injury Network criteria. The following machine learning techniques were used: decision tree, random forest, gradient boosting machine, support vector machine, naïve Bayes, multilayer perceptron, and deep belief networks. These techniques were compared with logistic regression analysis with respect to the area under the receiver-operating characteristic curve (AUROC). AKI developed in 365 patients (30.1%). The gradient boosting machine showed the best AUROC among all analyses for predicting AKI of all stages (0.90, 95% confidence interval [CI] 0.86-0.93) or stage 2 or 3 AKI. The AUROC of logistic regression analysis was 0.61 (95% CI 0.56-0.66). Decision tree and random forest techniques showed moderate performance (AUROC 0.86 and 0.85, respectively). The AUROCs of the support vector machine, naïve Bayes, neural network, and deep belief network were smaller than those of the other models. In our comparison of seven machine learning approaches with logistic regression analysis, the gradient boosting machine showed the best performance with the highest AUROC. An internet-based risk estimator was developed based on our gradient boosting model. However, prospective studies are required to validate our results.
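This style of AUROC comparison can be sketched with scikit-learn on synthetic data. Everything here, the generated features, model settings, and split, is a placeholder rather than the study's dataset or tuning:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for perioperative variables (illustrative only);
# weights mimic a ~30% event rate like the AKI outcome above.
X, y = make_classification(n_samples=1000, n_features=15, n_informative=6,
                           weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
aucs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # AUROC is computed on predicted probabilities, not hard labels.
    aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```

Whether boosting beats logistic regression depends on how non-linear the true risk surface is; on mostly linear data the two are often close.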
Project description:Objective: To predict preterm birth in nulliparous women using logistic regression and machine learning. Design: Population-based retrospective cohort. Participants: Nulliparous women (N = 112,963) with a singleton gestation who gave birth between 20-42 weeks gestation in Ontario hospitals from April 1, 2012 to March 31, 2014. Methods: We used data during the first and second trimesters to build logistic regression and machine learning models in a "training" sample to predict overall and spontaneous preterm birth. We assessed model performance using various measures of accuracy including sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) in an independent "validation" sample. Results: During the first trimester, logistic regression identified 13 variables associated with preterm birth, of which the strongest predictors were diabetes (Type I: adjusted odds ratio (AOR): 4.21; 95% confidence interval (CI): 3.23-5.42; Type II: AOR: 2.68; 95% CI: 2.05-3.46) and abnormal pregnancy-associated plasma protein A concentration (AOR: 2.04; 95% CI: 1.80-2.30). During the first trimester, the maximum AUC was 60% (95% CI: 58-62%) with artificial neural networks in the validation sample. During the second trimester, 17 variables were significantly associated with preterm birth, among which complications during pregnancy had the highest AOR (13.03; 95% CI: 12.21-13.90). During the second trimester, the AUC increased to 65% (95% CI: 63-66%) with artificial neural networks in the validation sample. Including complications during the pregnancy yielded an AUC of 80% (95% CI: 79-81%) with artificial neural networks.
All models yielded 94-97% negative predictive values for spontaneous preterm birth during the first and second trimesters. Conclusion: Although artificial neural networks provided a slightly higher AUC than logistic regression, prediction of preterm birth in the first trimester remained elusive. However, including data from the second trimester improved prediction to a moderate level with both logistic regression and machine learning approaches.
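The accuracy measures used above (sensitivity, specificity, positive and negative predictive value) follow directly from a 2x2 confusion matrix. A small pure-Python sketch on toy labels (illustrative only):

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # precision among positive calls
        "npv": tn / (tn + fn),          # reliability of negative calls
    }

# Toy labels: 3 true positives, 3 true negatives, 1 FP, 1 FN.
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 0, 1]
m = binary_metrics(y_true, y_pred)  # each metric is 0.75 here
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on outcome prevalence, which is why the rare preterm-birth outcome above yields high NPVs even when the AUC is modest.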
Project description:Background: Timely and accurate prediction of delayed cerebral ischemia (DCI) is critical for improving the prognosis of patients with aneurysmal subarachnoid hemorrhage (aSAH). Machine learning (ML) algorithms are increasingly regarded as having higher predictive power than conventional logistic regression (LR). This study aims to construct LR and ML models and compare their power to predict DCI after aSAH. Methods: This was a multicenter, retrospective, observational cohort study that enrolled patients with aneurysmal subarachnoid hemorrhage from five hospitals in China. A total of 404 aSAH patients were prospectively enrolled. We randomly divided the patients into training (N = 303) and validation (N = 101) cohorts in a 75:25 ratio. One LR and six popular ML algorithms were used to construct models. The area under the receiver operating characteristic curve (AUC), accuracy, balanced accuracy, confusion matrix, sensitivity, specificity, calibration curve, and Hosmer-Lemeshow test were used to assess and compare model performance. Finally, we calculated the importance of each feature. Results: A total of 112 (27.7%) patients developed DCI. Our results showed that conventional LR, with an AUC of 0.824 (95%CI: 0.73-0.91) in the validation cohort, outperformed the k-nearest neighbor, decision tree, support vector machine, and extreme gradient boosting models, with AUCs of 0.792 (95%CI: 0.68-0.9, P = 0.46), 0.675 (95%CI: 0.56-0.79, P < 0.01), 0.677 (95%CI: 0.57-0.77, P < 0.01), and 0.78 (95%CI: 0.68-0.87, P = 0.50), respectively. However, the random forest (RF) and artificial neural network models, with the same AUC (0.858, 95%CI: 0.78-0.93, P = 0.26), performed better than LR. The accuracy and balanced accuracy of the RF were 20.8% and 11% higher than those of LR, and the RF also showed good calibration in the validation cohort (Hosmer-Lemeshow: P = 0.203).
We found that the CT value of subarachnoid hemorrhage, WBC count, neutrophil count, CT value of cerebral edema, and monocyte count were the five most important features for DCI prediction in the RF model. We then developed an online prediction tool (https://dynamic-nomogram.shinyapps.io/DynNomapp-DCI/) based on these important features to calculate DCI risk precisely. Conclusions: In this multicenter study, we found that several ML methods, particularly RF, outperformed conventional LR. Furthermore, an online prediction tool based on the RF model was developed to identify patients at high risk for DCI after SAH and facilitate timely interventions. Clinical trial registration: http://www.chictr.org.cn, Unique identifier: ChiCTR2100044448.
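The Hosmer-Lemeshow test used above compares observed and expected event counts across groups of ascending predicted risk. A minimal numpy sketch that returns the chi-square statistic (compare it against a chi-square distribution with g - 2 degrees of freedom; illustrative, not the study's implementation):

```python
import numpy as np

def hosmer_lemeshow_stat(y, p, g=10):
    """Hosmer-Lemeshow chi-square statistic over g groups of ascending
    predicted risk (deciles when g=10). Large values indicate poor fit."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    order = np.argsort(p)                  # sort patients by predicted risk
    stat = 0.0
    for idx in np.array_split(order, g):
        n = len(idx)
        obs = y[idx].sum()                 # observed events in the group
        exp = p[idx].sum()                 # expected events in the group
        pbar = exp / n                     # mean predicted risk
        stat += (obs - exp) ** 2 / (n * pbar * (1 - pbar))
    return stat

# Tiny toy example with two groups, just to show the mechanics.
stat = hosmer_lemeshow_stat([0, 1, 0, 1], [0.2, 0.3, 0.6, 0.7], g=2)
```

A non-significant p-value (as the abstract's P = 0.203 for the RF) means no detectable lack of calibration, not proof of good calibration.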
Project description:Criticism of the implementation of existing risk prediction models (RPMs) for cardiovascular diseases (CVDs) in new populations motivates researchers to develop regional models. The predominant use of laboratory features in these RPMs also causes reproducibility issues in low- and middle-income countries (LMICs). Further, conventional logistic regression analysis (LRA) does not consider non-linear associations and interaction terms in developing these RPMs, which might oversimplify the phenomenon. This study aims to develop alternative machine learning (ML)-based RPMs that may perform better at predicting CVD status using nonlaboratory features in comparison to conventional RPMs. The data were based on a case-control study conducted at the Punjab Institute of Cardiology, Pakistan. Data from 460 subjects, aged between 30 and 76 years, with (1:1) gender-based matching, were collected. We tested various ML models to identify the best model/models, considering LRA as the baseline RPM. An artificial neural network and a linear support vector machine outperformed the conventional RPM on the majority of performance metrics. The predictive accuracies of the best-performing ML-based RPMs were between 80.86 and 81.09%, higher than the 79.56% of the baseline RPM. The discriminating capabilities of the ML-based RPMs were also comparable to those of the baseline RPM. Further, the ML-based RPMs identified substantially different orderings of features compared to the baseline RPM. This study concludes that nonlaboratory-feature-based RPMs can be a good choice for early risk assessment of CVDs in LMICs. ML-based RPMs can identify a better ordering of features than the conventional approach, which subsequently provides models with improved prognostic capabilities.
Project description:Objective: To present new classification methods for knee osteoarthritis (KOA) using machine learning and to compare their performance with conventional statistical methods, as classification techniques using machine learning have recently been developed. Methods: A total of 84 KOA patients and 97 normal participants were recruited. KOA patients were clustered into three groups according to the Kellgren-Lawrence (K-L) grading system. All subjects completed gait trials under the same experimental conditions. Machine learning-based classification using a support vector machine (SVM) classifier was performed to classify KOA patients and the severity of KOA. Logistic regression analysis was also performed to compare its results in classifying KOA patients with those of the machine learning method. Results: In the classification between KOA patients and normal subjects, classification accuracy was higher with the machine learning method than with logistic regression analysis. In the classification of KOA severity, accuracy was enhanced through the feature selection process in the machine learning method. The most significant gait feature for classification in the machine learning method was flexion and extension of the knee in the swing phase. Conclusion: The machine learning method is a promising new approach to complement conventional logistic regression analysis in the classification of KOA patients. It can be used clinically for the diagnosis and gait correction of KOA patients.
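A classification pipeline of the kind described above, an SVM preceded by a feature-selection step, can be sketched with scikit-learn. The synthetic data and the choice of k here are placeholders, not the study's gait features or its actual selection procedure:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for gait features (illustrative only).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Scale features (SVMs are scale-sensitive), keep the 5 most informative
# by univariate F-test, then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=5), SVC())
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)  # held-out classification accuracy
```

Putting scaling and selection inside the pipeline ensures both are fit only on the training fold, avoiding the leakage that inflates reported accuracy.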
Project description:In medical research, there is great interest in developing methods for combining biomarkers. We argue that selection of markers should also be considered in the process. Traditional model/variable selection procedures ignore the underlying uncertainty after model selection. In this work, we propose a novel model-combining algorithm for classification in biomarker studies. It works by considering weighted combinations of various logistic regression models; five different weighting schemes are considered in the article. The weights and algorithm are justified using decision theory and risk-bound results. Simulation studies are performed to assess the finite-sample properties of the proposed model-combining method. It is illustrated with an application to data from an immunohistochemical study in prostate cancer.
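A weighted combination of candidate models' predicted probabilities can be sketched generically. The exponential weighting below is one illustrative scheme, not necessarily one of the five considered in the article:

```python
import numpy as np

def combine_predictions(prob_list, losses):
    """Combine each candidate model's predicted probabilities with weights
    proportional to exp(-loss), a softmax-style scheme: models with lower
    estimated loss (e.g. cross-validated error) get larger weights."""
    w = np.exp(-np.asarray(losses, dtype=float))
    w = w / w.sum()                              # normalize to sum to 1
    probs = np.asarray(prob_list, dtype=float)   # shape (n_models, n_samples)
    return w @ probs                             # weighted average per sample

# Toy example: two candidate logistic models with equal estimated loss
# receive equal weight, so the combination is the plain average.
p1 = [0.2, 0.8, 0.6]
p2 = [0.4, 0.6, 0.5]
combined = combine_predictions([p1, p2], losses=[0.5, 0.5])
```

Averaging over models rather than committing to a single selected one is exactly how combining accounts for the post-selection uncertainty the abstract criticizes.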