Project description: Aims: Our aim was to develop a machine learning (ML)-based risk stratification system to predict 1-, 2-, 3-, 4-, and 5-year all-cause mortality from pre-implant parameters of patients undergoing cardiac resynchronization therapy (CRT). Methods and results: Multiple ML models were trained on a retrospective database of 1510 patients undergoing CRT implantation to predict 1- to 5-year all-cause mortality. Thirty-three pre-implant clinical features were selected to train the models. The best performing model [SEMMELWEIS-CRT score (perSonalizEd assessMent of estiMatEd risk of mortaLity With machinE learnIng in patientS undergoing CRT implantation)], along with pre-existing scores (Seattle Heart Failure Model, VALID-CRT, EAARN, ScREEN, and CRT-score), was tested on an independent cohort of 158 patients. There were 805 (53%) deaths in the training cohort and 80 (51%) deaths in the test cohort during the 5-year follow-up period. Among the trained classifiers, random forest demonstrated the best performance. For the prediction of 1-, 2-, 3-, 4-, and 5-year mortality, the areas under the receiver operating characteristic curves of the SEMMELWEIS-CRT score were 0.768 (95% CI: 0.674-0.861; P < 0.001), 0.793 (95% CI: 0.718-0.867; P < 0.001), 0.785 (95% CI: 0.711-0.859; P < 0.001), 0.776 (95% CI: 0.703-0.849; P < 0.001), and 0.803 (95% CI: 0.733-0.872; P < 0.001), respectively. The discriminative ability of our model was superior to that of the other evaluated scores. Conclusion: The SEMMELWEIS-CRT score (available at semmelweiscrtscore.com) exhibited good discriminative capability for the prediction of all-cause death in CRT patients and outperformed the existing risk scores. By capturing non-linear associations among predictors, ML approaches may facilitate optimal candidate selection and prognostication in patients undergoing CRT implantation.
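The model-selection step this abstract describes (train several classifiers on the same pre-implant features, keep the one with the highest held-out AUROC) can be sketched as follows. All data below is synthetic and the candidate models and settings are illustrative assumptions, not the published SEMMELWEIS-CRT pipeline:

```python
# Minimal sketch of model selection by held-out AUROC.
# Synthetic stand-in for 1510 patients with 33 pre-implant features;
# NOT the actual SEMMELWEIS-CRT data or hyperparameters.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1510, n_features=33, n_informative=10,
                           weights=[0.47], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
}

aucs = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    # AUROC on the held-out split, analogous to the independent test cohort.
    aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

best = max(aucs, key=aucs.get)
print(best, round(aucs[best], 3))
```

In practice one ROC curve would be computed per prediction horizon (1- to 5-year mortality), each with its own binary label.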
Project description: Background: The impact of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) on postoperative recovery needs to be understood to inform clinical decision making during and after the COVID-19 pandemic. This study reports 30-day mortality and pulmonary complication rates in patients with perioperative SARS-CoV-2 infection. Methods: This international, multicentre cohort study at 235 hospitals in 24 countries included all patients undergoing surgery who had SARS-CoV-2 infection confirmed within 7 days before or 30 days after surgery. The primary outcome measure was 30-day postoperative mortality and was assessed in all enrolled patients. The main secondary outcome measure was pulmonary complications, defined as pneumonia, acute respiratory distress syndrome, or unexpected postoperative ventilation. Findings: This analysis includes 1128 patients who had surgery between Jan 1 and March 31, 2020, of whom 835 (74.0%) had emergency surgery and 280 (24.8%) had elective surgery. SARS-CoV-2 infection was confirmed preoperatively in 294 (26.1%) patients. 30-day mortality was 23.8% (268 of 1128). Pulmonary complications occurred in 577 (51.2%) of 1128 patients; 30-day mortality in these patients was 38.0% (219 of 577), accounting for 81.7% (219 of 268) of all deaths. In adjusted analyses, 30-day mortality was associated with male sex (odds ratio 1.75 [95% CI 1.28-2.40], p<0.0001), age 70 years or older versus younger than 70 years (2.30 [1.65-3.22], p<0.0001), American Society of Anesthesiologists grades 3-5 versus grades 1-2 (2.35 [1.57-3.53], p<0.0001), malignant versus benign or obstetric diagnosis (1.55 [1.01-2.39], p=0.046), emergency versus elective surgery (1.67 [1.06-2.63], p=0.026), and major versus minor surgery (1.52 [1.01-2.31], p=0.047). Interpretation: Postoperative pulmonary complications occur in half of patients with perioperative SARS-CoV-2 infection and are associated with high mortality.
Thresholds for surgery during the COVID-19 pandemic should be higher than during normal practice, particularly in men aged 70 years and older. Consideration should be given to postponing non-urgent procedures and promoting non-operative treatment to delay or avoid the need for surgery. Funding: National Institute for Health Research (NIHR), Association of Coloproctology of Great Britain and Ireland, Bowel and Cancer Research, Bowel Disease Research Foundation, Association of Upper Gastrointestinal Surgeons, British Association of Surgical Oncology, British Gynaecological Cancer Society, European Society of Coloproctology, NIHR Academy, Sarcoma UK, Vascular Society for Great Britain and Ireland, and Yorkshire Cancer Research.
Project description: Objective: This study retrospectively investigated the effect of dexmedetomidine on outcomes of patients undergoing coronary artery bypass graft (CABG) surgery. Design: Retrospective investigation. Setting: A single tertiary medical center. Participants: A total of 724 patients undergoing CABG surgery met the inclusion criteria and were categorized into 2 groups: 345 in the dexmedetomidine group (DEX) and 379 in the non-dexmedetomidine group (Non-DEX). Interventions: Perioperative dexmedetomidine was administered as an intravenous infusion (0.24 to 0.6 µg/kg/hour) initiated after cardiopulmonary bypass and continued for less than 24 hours postoperatively in the intensive care unit. Measurements and main results: The major outcome measures of this study were in-hospital, 30-day, and 1-year all-cause mortality, delirium, and major adverse cardiocerebral events. Perioperative dexmedetomidine infusion was associated with significant reductions in in-hospital, 30-day, and 1-year mortality compared with patients who did not receive dexmedetomidine. In-hospital, 30-day, and 1-year mortality rates were 1.5% versus 4.0% (adjusted odds ratio [OR], 0.332; 95% CI, 0.155 to 0.708; p = 0.0044), 2.0% versus 4.5% (adjusted OR, 0.487; 95% CI, 0.253 to 0.985; p = 0.0305), and 3.2% versus 6.9% (adjusted OR, 0.421; 95% CI, 0.247 to 0.718; p = 0.0015), respectively. Perioperative dexmedetomidine infusion was also associated with a reduced risk of delirium, from 7.9% to 4.6% (adjusted OR, 0.431; 95% CI, 0.265-0.701; p = 0.0007). Conclusions: Dexmedetomidine infusion during CABG surgery was associated with improved in-hospital, 30-day, and 1-year survival and a significantly lower incidence of delirium.
Project description: Gene expression profiles were generated from 199 primary breast cancer patients. Samples 1-176 were used in another study, GEO Series GSE22820, and form the training data set in this study. Samples 200-222 form a validation set. These data are used to build a machine learning classifier for estrogen receptor (ER) status. RNA was isolated from the 199 primary breast cancer patients, and a machine learning classifier was built to predict ER status using only three gene features.
Project description: Background: Pediatric myocarditis is a rare disease with multiple etiologies; mortality associated with the disease is 5-8%. Prognostic factors have been identified using national hospitalization databases, but applying these identified risk factors for mortality prediction has not been reported. Methods: We used the Kids' Inpatient Database for this project. We manually curated fourteen variables as predictors of mortality based on current knowledge of the disease and compared the mortality-prediction performance of logistic regression models with that of a machine learning (ML) model. For ML, the random forest algorithm was chosen because of the categorical nature of the variables. Based on variable importance scores, a reduced model was also developed for comparison. Results: We identified 4,144 patients from the database for randomization into the primary (model development) and testing (external validation) datasets. The conventional logistic regression model had low sensitivity (~50%) despite high specificity (>95%) and overall accuracy. In contrast, the ML model struck a good balance between sensitivity (89.9%) and specificity (85.8%). A reduced ML model with the top five variables (mechanical ventilation, cardiac arrest, ECMO, acute kidney injury, and ventricular fibrillation) was sufficient to approximate the prediction performance of the full model. Conclusions: The ML algorithm outperformed the logistic regression model for mortality prediction in pediatric myocarditis in this retrospective dataset. Prospective studies are warranted to further validate the applicability of our model in clinical settings.
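The reduced-model idea above (rank predictors by random forest variable importance, then refit on the top five) can be sketched as follows. The data is synthetic and the sample sizes and class balance are only loosely matched to the abstract; the fourteen curated Kids' Inpatient Database predictors are not reproduced here:

```python
# Minimal sketch: full random forest -> importance ranking -> reduced model.
# Synthetic data; NOT the study's curated predictors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4144, n_features=14, n_informative=5,
                           weights=[0.93], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=1)

full = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)

# Keep the five predictors with the highest impurity-based importance.
top5 = np.argsort(full.feature_importances_)[::-1][:5]
reduced = RandomForestClassifier(n_estimators=200, random_state=1)
reduced.fit(X_tr[:, top5], y_tr)

auc_full = roc_auc_score(y_te, full.predict_proba(X_te)[:, 1])
auc_reduced = roc_auc_score(y_te, reduced.predict_proba(X_te[:, top5])[:, 1])
print(round(auc_full, 3), round(auc_reduced, 3))
```

If the informative signal is concentrated in a few variables, as the abstract reports, the reduced model's AUROC closely approximates the full model's.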
Project description: Background: Myocardial injury after noncardiac surgery (MINS) is associated with increased postoperative mortality, but the perioperative factors that contribute to mortality in patients with MINS have not been fully evaluated. Objective: To establish a comprehensive body of knowledge relating to patients with MINS, we searched for the best-performing predictive model based on machine learning algorithms. Methods: Using clinical data from 7629 patients with MINS from the clinical data warehouse, we evaluated 8 machine learning algorithms for accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve to identify the best model for predicting mortality. Feature importance and Shapley Additive Explanations values were analyzed to explain the role of each clinical factor in patients with MINS. Results: Extreme gradient boosting outperformed the other models, showing an AUROC of 0.923 (95% CI 0.916-0.930). The AUROC of the model did not decrease in the test data set (0.86-0.922; P=.06). Antiplatelet drug prescription, elevated C-reactive protein level, and beta-blocker prescription were associated with reduced 30-day mortality. Conclusions: Predicting the mortality of patients with MINS was shown to be feasible using machine learning. By analyzing the impact of predictors, markers that should be closely monitored by clinicians may be identified.
Project description: Due to the continued evolution of the SARS-CoV-2 pandemic, researchers worldwide are working to mitigate and suppress its spread, and to better understand it, by deploying digital signal processing (DSP) and machine learning approaches. This study presents an alignment-free approach to classifying SARS-CoV-2 using complementary DNA, which is DNA synthesized from the single-stranded RNA of the virus. A total of 1582 samples, with genome sequences of different lengths from different regions, were collected from various data sources and divided into a SARS-CoV-2 group and a non-SARS-CoV-2 group. We extracted eight biomarkers based on three-base periodicity using DSP techniques and ranked them with a filter-based feature selection method. The ranked biomarkers were fed into k-nearest neighbor, support vector machine, decision tree, and random forest classifiers for the classification of SARS-CoV-2 against other coronaviruses. The training dataset was used to assess the performance of the classifiers in terms of accuracy and F-measure via 10-fold cross-validation, and kappa scores were estimated to check the influence of unbalanced data. Further, a 10 × 10 cross-validation paired t-test was used to evaluate the best model on unseen data. Random forest was selected as the best model, differentiating SARS-CoV-2 from other coronaviruses and a control group with an accuracy of 97.4%, sensitivity of 96.2%, and specificity of 98.2% when tested with unseen samples. Moreover, the proposed algorithm was computationally efficient, taking only 0.31 s to compute the genome biomarkers, outperforming previous studies.
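One common way to quantify three-base periodicity with DSP, of the kind this abstract draws its biomarkers from, is to map each base to a binary indicator sequence, sum the DFT power spectra, and measure the power at frequency N/3 relative to the background. The sketch below is an illustrative single feature under that standard construction; the sequence, the signal-to-noise definition, and the function name are assumptions, not the study's eight biomarkers:

```python
# Minimal sketch of a period-3 (three-base periodicity) feature via the DFT.
# The SNR definition here is an illustrative assumption.
import numpy as np

def period3_snr(seq: str) -> float:
    seq = seq.upper()
    n = len(seq)
    total_power = np.zeros(n)
    for base in "ACGT":
        # Voss-style binary indicator sequence for this base.
        indicator = np.array([1.0 if b == base else 0.0 for b in seq])
        total_power += np.abs(np.fft.fft(indicator)) ** 2
    # A peak at k = n/3 signals three-base periodicity; normalize by the
    # mean power over all non-zero frequencies.
    peak = total_power[n // 3]
    background = total_power[1:].mean()
    return float(peak / background)

# A repeating codon shows a strong period-3 peak; a random sequence does not.
rng = np.random.default_rng(0)
random_seq = "".join(rng.choice(list("ACGT"), size=180))
print(round(period3_snr("ATG" * 60), 1), round(period3_snr(random_seq), 1))
```

Features like this are alignment-free: they depend only on the sequence itself, so genomes of different lengths and regions can be compared without multiple sequence alignment.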
Project description: Objectives: To determine whether machine learning algorithms can predict PICU mortality better than the Pediatric Logistic Organ Dysfunction-2 (PELOD-2) score. Design: Retrospective study. Setting: Quaternary care medical-surgical PICU. Patients: All patients admitted to the PICU from 2013 to 2019. Interventions: None. Measurements and main results: We investigated the performance of various machine learning algorithms using the same variables used to calculate the PELOD-2 score to predict PICU mortality. We used 10,194 patient records from 2013 to 2017 for training and 4,043 patient records from 2018 to 2019 as a holdout validation cohort. The mortality rate was 3.0% in the training cohort and 3.4% in the validation cohort. The best-performing algorithm was a random forest model (area under the receiver operating characteristic curve, 0.867 [95% CI, 0.863-0.895]; area under the precision-recall curve, 0.327 [95% CI, 0.246-0.414]; F1, 0.396 [95% CI, 0.321-0.468]), which significantly outperformed the PELOD-2 score (area under the receiver operating characteristic curve, 0.761 [95% CI, 0.713-0.810]; area under the precision-recall curve, 0.239 [95% CI, 0.165-0.316]; F1, 0.284 [95% CI, 0.209-0.360]), although this difference was reduced after retraining the PELOD-2 logistic regression model at the study institution. The random forest model also showed better calibration than the PELOD-2 score, and its calibration remained superior to that of the retrained PELOD-2 model. Conclusions: A machine learning model achieved better performance than a logistic regression-based score for predicting ICU mortality. Better estimation of mortality risk can improve our ability to adjust for severity of illness in future studies, although external validation is required before this method can be widely deployed.
Project description: Background: Risk scores can be useful for clinical risk stratification and accurate allocation of medical resources, helping health providers improve patient care. Point-based scores are more understandable and explainable than more complex models and are now widely used in clinical decision making. However, developing a risk scoring model is nontrivial and has not yet been systematically presented, and few studies have investigated methods of clinical score generation using electronic health records. Objective: This study aims to propose AutoScore, a machine learning-based automatic clinical score generator consisting of 6 modules for developing interpretable point-based scores. Future users can employ the AutoScore framework to create clinical scores effortlessly in various clinical applications. Methods: We proposed the AutoScore framework comprising 6 modules: variable ranking, variable transformation, score derivation, model selection, score fine-tuning, and model evaluation. To demonstrate the performance of AutoScore, we used data from the Beth Israel Deaconess Medical Center to build a scoring model for mortality prediction and compared it with other baseline models using receiver operating characteristic analysis. A software package in R 3.5.3 (R Foundation) was also developed to demonstrate the implementation of AutoScore. Results: Implemented on a data set with 44,918 individual admission episodes of intensive care, the AutoScore-created scoring models performed comparably to other standard methods (ie, logistic regression, stepwise regression, least absolute shrinkage and selection operator, and random forest) in terms of predictive accuracy and model calibration, but required fewer predictors and offered high interpretability and accessibility.
The nine-variable, AutoScore-created, point-based scoring model achieved an area under the curve (AUC) of 0.780 (95% CI 0.764-0.798), whereas the logistic regression model with 24 variables had an AUC of 0.778 (95% CI 0.760-0.795). Moreover, by integrating all necessary modules, the AutoScore framework supports continuity and automation across the clinical research workflow. Conclusions: We developed an easy-to-use, machine learning-based automatic clinical score generator, AutoScore; systematically presented its structure; and demonstrated its advantages in predictive performance and interpretability over conventional methods using a benchmark database. AutoScore may emerge as a useful scoring tool in various medical applications.
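The core of a point-based score of the kind this abstract describes is a "score derivation" step: model coefficients are converted into small integer points that clinicians can sum by hand. The sketch below shows one generic way to do that (scale logistic regression coefficients against the smallest one and round); the simulated predictors and effect sizes are assumptions for illustration, not the AutoScore R package's actual six-module algorithm:

```python
# Generic illustration of score derivation from logistic regression
# coefficients; NOT the AutoScore package's implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
# Three hypothetical binary risk factors (e.g. age >= 65, ventilated, low BP).
X = rng.integers(0, 2, size=(n, 3)).astype(float)
# Simulated outcome with true log-odds effects 1.2, 0.8, and 0.4.
logit = -3.0 + 1.2 * X[:, 0] + 0.8 * X[:, 1] + 0.4 * X[:, 2]
y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression().fit(X, y)
coefs = model.coef_[0]

# Scale so the smallest positive coefficient maps to 1 point, then round:
# the simulated effects should map to roughly 3, 2, and 1 points.
points = np.round(coefs / coefs[coefs > 0].min()).astype(int)
print(points)

# A patient's risk score is just the sum of the points for their risk factors.
score = X @ points
```

Rounding to integers trades a little discrimination for interpretability, which is why the abstract reports the nine-variable point score performing on par with a 24-variable logistic regression.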