Dataset Information

Postoperative delirium prediction using machine learning models and preoperative electronic health record data.

ABSTRACT:

Background

Accurate, pragmatic risk stratification for postoperative delirium (POD) is necessary to target preventative resources toward high-risk patients. Machine learning (ML) offers a novel approach to leveraging electronic health record (EHR) data for POD prediction. We sought to develop and internally validate an ML-derived POD risk prediction model using preoperative risk features, and to compare its performance to models developed with traditional logistic regression.

Methods

This was a retrospective analysis of preoperative EHR data from 24,885 adults undergoing a procedure requiring anesthesia care, recovering in the main post-anesthesia care unit, and staying in the hospital at least overnight between December 2016 and December 2019 at either of two hospitals in a tertiary care health system. One hundred fifteen preoperative risk features including demographics, comorbidities, nursing assessments, surgery type, and other preoperative EHR data were used to predict postoperative delirium (POD), defined as any instance of Nursing Delirium Screening Scale ≥2 or positive Confusion Assessment Method for the Intensive Care Unit within the first 7 postoperative days. Two ML models (Neural Network and XGBoost), two traditional logistic regression models ("clinician-guided" and "ML hybrid"), and a previously described delirium risk stratification tool (AWOL-S) were evaluated using the area under the receiver operating characteristic curve (AUC-ROC), sensitivity, specificity, positive likelihood ratio, and positive predictive value. Model calibration was assessed with a calibration curve. Patients with no POD assessments charted or at least 20% of input variables missing were excluded.
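
The threshold-based metrics listed in the Methods (sensitivity, specificity, positive likelihood ratio, and positive predictive value) all follow from a 2x2 confusion table at a chosen risk cutoff. The following is a minimal sketch of those definitions only; the class name, function name, and the counts in the example are hypothetical and are not taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts at one risk cutoff (hypothetical container, not from the paper)."""
    tp: int  # delirium predicted, delirium observed
    fp: int  # delirium predicted, no delirium observed
    tn: int  # no delirium predicted, none observed
    fn: int  # no delirium predicted, delirium observed


def threshold_metrics(c: ConfusionCounts) -> dict:
    """Return the standard threshold-based metrics for a binary classifier."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    false_positive_rate = c.fp / (c.fp + c.tn)
    ppv = c.tp / (c.tp + c.fp)
    # LR+ = sensitivity / (1 - specificity); undefined (infinite) with no false positives.
    plr = sensitivity / false_positive_rate if false_positive_rate > 0 else math.inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_likelihood_ratio": plr,
        "positive_predictive_value": ppv,
    }


# Made-up counts for 1,000 patients with 53 delirium cases (illustration only).
example = threshold_metrics(ConfusionCounts(tp=40, fp=142, tn=805, fn=13))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With a low-incidence outcome, positive predictive value can stay modest even when sensitivity and specificity are reasonably high, which is one reason the likelihood ratio is often reported alongside it.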

Results

POD incidence was 5.3%. The AUC-ROC for Neural Net was 0.841 [95% CI 0.816-0.863] and for XGBoost was 0.851 [95% CI 0.827-0.874], which was significantly better than the clinician-guided (AUC-ROC 0.763 [0.734-0.793], p < 0.001) and ML hybrid (AUC-ROC 0.824 [0.800-0.849], p < 0.001) regression models and AWOL-S (AUC-ROC 0.762 [95% CI 0.713-0.812], p < 0.001). Neural Net, XGBoost, and ML hybrid models demonstrated excellent calibration, while calibration of the clinician-guided and AWOL-S models was moderate; they tended to overestimate delirium risk in those already at highest risk.
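
To make the reported AUC-ROC values concrete: AUC-ROC equals the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are illustrative assumptions, not the study's code or data, and a real analysis would normally use a tested library routine instead.

```python
def auc_roc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of AUC-ROC for binary labels in {0, 1}."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive-negative pairs are ordered correctly.
print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of cases, so it is suitable only for small illustrations; rank-based implementations give the same value more efficiently.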

Conclusion

Using pragmatically collected EHR data, two ML models predicted POD in a broad perioperative population with high discrimination. Optimal application of the models would provide automated, real-time delirium risk stratification to improve perioperative management of surgical patients at risk for POD.

SUBMITTER: Bishara A

PROVIDER: S-EPMC8722098 | biostudies-literature | 2022 Jan

REPOSITORIES: biostudies-literature

Publications

Postoperative delirium prediction using machine learning models and preoperative electronic health record data.

Bishara A, Chiu C, Whitlock EL, Douglas VC, Lee S, Butte AJ, Leung JM, Donovan AL

BMC anesthesiology 20220103 1

PMID: 34979919

Similar Datasets

Project description:

Background: Although machine learning models demonstrate significant potential in predicting postoperative delirium, the advantages of their implementation in real-world settings remain unclear and require a comparison with conventional models in practical applications.

Objective: The objective of this study was to validate the temporal generalizability of decision tree ensemble and sparse linear regression models for predicting delirium after surgery compared with that of the traditional logistic regression model.

Methods: The health record data of patients hospitalized at an advanced emergency and critical care medical center in Kumamoto, Japan, were collected electronically. We developed a decision tree ensemble model using extreme gradient boosting (XGBoost) and a sparse linear regression model using least absolute shrinkage and selection operator (LASSO) regression. To evaluate the predictive performance of the model, we used the area under the receiver operating characteristic curve (AUROC) and the Matthews correlation coefficient (MCC) to measure discrimination and the slope and intercept of the regression between predicted and observed probabilities to measure calibration. The Brier score was evaluated as an overall performance metric. We included 11,863 consecutive patients who underwent surgery with general anesthesia between December 2017 and February 2022. The patients were divided into a derivation cohort before the COVID-19 pandemic and a validation cohort during the COVID-19 pandemic. Postoperative delirium was diagnosed according to the confusion assessment method.

Results: A total of 6497 patients (68.5, SD 14.4 years, women n=2627, 40.4%) were included in the derivation cohort, and 5366 patients (67.8, SD 14.6 years, women n=2105, 39.2%) were included in the validation cohort. Regarding discrimination, the XGBoost model (AUROC 0.87-0.90 and MCC 0.34-0.44) did not significantly outperform the LASSO model (AUROC 0.86-0.89 and MCC 0.34-0.41). The logistic regression model (AUROC 0.84-0.88, MCC 0.33-0.40, slope 1.01-1.19, intercept -0.16 to 0.06, and Brier score 0.06-0.07), with 8 predictors (age, intensive care unit, neurosurgery, emergency admission, anesthesia time, BMI, blood loss during surgery, and use of an ambulance) achieved good predictive performance.

Conclusions: The XGBoost model did not significantly outperform the LASSO model in predicting postoperative delirium. Furthermore, a parsimonious logistic model with a few important predictors achieved comparable performance to machine learning models in predicting postoperative delirium.
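
Two of the metrics named in this description, the Brier score and the Matthews correlation coefficient (MCC), have short closed-form definitions. The sketch below shows those standard formulas only; the function names and the toy inputs are assumptions made for illustration and do not reproduce the study's analysis.

```python
import math


def brier_score(outcomes, probabilities):
    """Mean squared difference between predicted probability and observed 0/1 outcome."""
    if len(outcomes) != len(probabilities) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from confusion counts; returns 0.0 when any marginal total is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy inputs (not study data).
print(brier_score([1, 0, 0, 1], [0.9, 0.1, 0.2, 0.6]))
print(matthews_corrcoef(tp=5, fp=0, tn=5, fn=0))
```

Lower Brier scores indicate better overall performance, while MCC ranges from -1 to 1 and uses all four confusion-table cells, which makes it less sensitive to class imbalance than accuracy.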

Project description:

Background: A major problem in treating acute kidney injury (AKI) is that clinical criteria for recognition are markers of established kidney damage or impaired function; treatment before such damage manifests is desirable. Clinicians could intervene during what may be a crucial stage for preventing permanent kidney injury if patients with incipient AKI and those at high risk of developing AKI could be identified.

Objective: In this study, we evaluate a machine learning algorithm for early detection and prediction of AKI.

Design: We used a machine learning technique, boosted ensembles of decision trees, to train an AKI prediction tool on retrospective data taken from more than 300 000 inpatient encounters.

Setting: Data were collected from inpatient wards at Stanford Medical Center and intensive care unit patients at Beth Israel Deaconess Medical Center.

Patients: Patients older than the age of 18 whose hospital stays lasted between 5 and 1000 hours and who had at least one documented measurement of heart rate, respiratory rate, temperature, serum creatinine (SCr), and Glasgow Coma Scale (GCS).

Measurements: We tested the algorithm's ability to detect AKI at onset and to predict AKI 12, 24, 48, and 72 hours before onset.

Methods: We tested AKI detection and prediction using the National Health Service (NHS) England AKI Algorithm as a gold standard. We additionally tested the algorithm's ability to detect AKI as defined by the Kidney Disease: Improving Global Outcomes (KDIGO) guidelines. We compared the algorithm's 3-fold cross-validation performance to the Sequential Organ Failure Assessment (SOFA) score for AKI identification in terms of area under the receiver operating characteristic (AUROC).

Results: The algorithm demonstrated high AUROC for detecting and predicting NHS-defined AKI at all tested time points. The algorithm achieves AUROC of 0.872 (95% confidence interval [CI], 0.867-0.878) for AKI detection at time of onset. For prediction 12 hours before onset, the algorithm achieves an AUROC of 0.800 (95% CI, 0.792-0.809). For 24-hour predictions, the algorithm achieves AUROC of 0.795 (95% CI, 0.785-0.804). For 48-hour and 72-hour predictions, the algorithm achieves AUROC values of 0.761 (95% CI, 0.753-0.768) and 0.728 (95% CI, 0.719-0.737), respectively.

Limitations: Because of the retrospective nature of this study, we cannot draw any conclusions about the impact the algorithm's predictions will have on patient outcomes in a clinical setting.

Conclusions: The results of these experiments suggest that a machine learning-based AKI prediction tool may offer important prognostic capabilities for determining which patients are likely to suffer AKI, potentially allowing clinicians to intervene before kidney damage manifests.

Project description:

Background: Phenotyping analysis that includes time course is useful for understanding the mechanisms and clinical management of postoperative delirium. However, postoperative delirium has not been fully phenotyped. Hypothesis-free categorization of heterogeneous symptoms may be useful for understanding the mechanisms underlying delirium, although evidence is currently lacking. Therefore, we aimed to explore the phenotypes of postoperative delirium following invasive cancer surgery using a data-driven approach with minimal prior knowledge.

Methods: We recruited patients who underwent elective invasive cancer resection. After surgery, participants completed 5 consecutive days of delirium assessments using the Delirium Rating Scale-Revised-98 (DRS-R-98) severity scale. We categorized 65 (13 questionnaire items/day × 5 days) dimensional DRS-R-98 scores using unsupervised machine learning (K-means clustering) to derive a small set of grouped features representing distinct symptoms across all participants. We then reapplied K-means clustering to this set of grouped features to delineate multiple clusters of delirium symptoms.

Results: Participants were 286 patients, of whom 91 developed delirium defined according to Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, criteria. Following the first K-means clustering, we derived four grouped symptom features: (1) mixed motor, (2) cognitive and higher-order thinking domain with perceptual disturbance and thought content abnormalities, (3) acute and temporal response, and (4) sleep-wake cycle disturbance. Subsequent K-means clustering permitted classification of participants into seven subgroups: (i) cognitive and higher-order thinking domain dominant delirium, (ii) prolonged delirium, (iii) acute and brief delirium, (iv) subsyndromal delirium-enriched, (v) subsyndromal delirium-enriched with insomnia, (vi) insomnia, and (vii) fit.

Conclusion: We found that patients who have undergone invasive cancer resection can be delineated using unsupervised machine learning into three delirium clusters, two subsyndromal delirium clusters, and an insomnia cluster. Validation of clusters and research into the pathophysiology underlying each cluster will help to elucidate the mechanisms of postoperative delirium after invasive cancer surgery.

Project description:

Importance: A variety of perioperative risk factors are associated with postoperative mortality risk. However, the relative contribution of routinely collected intraoperative clinical parameters to short-term and long-term mortality remains understudied.

Objective: To examine the performance of multiple machine learning models with data from different perioperative periods to predict 30-day, 1-year, and 5-year mortality and investigate factors that contribute to these predictions.

Design, setting, and participants: In this prognostic study using prospectively collected data, risk prediction models were developed for short-term and long-term mortality after cardiac surgery. Included participants were adult patients undergoing a first-time valve operation, coronary artery bypass grafting, or a combination of both between 1997 and 2017 in a single center, the University Medical Centre Groningen in the Netherlands. Mortality data were obtained in November 2017. Data analysis took place between February 2020 and August 2021.

Exposure: Cardiac surgery.

Main outcomes and measures: Postoperative mortality rates at 30 days, 1 year, and 5 years were the primary outcomes. The area under the receiver operating characteristic curve (AUROC) was used to assess discrimination. The contribution of all preoperative, intraoperative hemodynamic and temperature, and postoperative factors to mortality was investigated using Shapley additive explanations (SHAP) values.

Results: Data from 9415 patients who underwent cardiac surgery (median [IQR] age, 68 [60-74] years; 2554 [27.1%] women) were included. Overall mortality rates at 30 days, 1 year, and 5 years were 268 patients (2.8%), 420 patients (4.5%), and 612 patients (6.5%), respectively. Models including preoperative, intraoperative, and postoperative data achieved AUROC values of 0.82 (95% CI, 0.78-0.86), 0.81 (95% CI, 0.77-0.85), and 0.80 (95% CI, 0.75-0.84) for 30-day, 1-year, and 5-year mortality, respectively. Models including only postoperative data performed similarly (30 days: 0.78 [95% CI, 0.73-0.82]; 1 year: 0.79 [95% CI, 0.74-0.83]; 5 years: 0.77 [95% CI, 0.73-0.82]). However, models based on all perioperative data provided less clinically usable predictions, with lower detection rates; for example, postoperative models identified a high-risk group with a 2.8-fold increase in risk for 5-year mortality (4.1 [95% CI, 3.3-5.1]) vs an increase of 11.3 (95% CI, 6.8-18.7) for the high-risk group identified by the full perioperative model. Postoperative markers associated with metabolic dysfunction and decreased kidney function were the main factors contributing to mortality risk.

Conclusions and relevance: This study found that the addition of continuous intraoperative hemodynamic and temperature data to postoperative data was not associated with improved machine learning-based identification of patients at increased risk of short-term and long-term mortality after cardiac operations.
