Dataset Information

PREDICTIVE MODELING OF HOSPITAL READMISSION RATES USING ELECTRONIC MEDICAL RECORD-WIDE MACHINE LEARNING: A CASE-STUDY USING MOUNT SINAI HEART FAILURE COHORT.

ABSTRACT: Reduction of preventable hospital readmissions that result from chronic or acute conditions like stroke, heart failure, myocardial infarction and pneumonia remains a significant challenge for improving the outcomes and decreasing the cost of healthcare delivery in the United States. Patient readmission rates are relatively high for conditions like heart failure (HF) despite the implementation of high-quality healthcare delivery operation guidelines created by regulatory authorities. Multiple predictive models are currently available to evaluate potential 30-day readmission rates of patients. Most of these models are hypothesis-driven and repeatedly assess the predictive abilities of the same set of biomarkers as predictive features. In this manuscript, we discuss our attempt to develop a data-driven, electronic medical record-wide (EMR-wide) feature selection approach and subsequent machine learning to predict readmission probabilities. We have assessed a large repertoire of variables from electronic medical records of heart failure patients in a single center. The cohort included 1,068 patients, of whom 178 were readmitted within a 30-day interval (16.66% readmission rate). A total of 4,205 variables were extracted from the EMR, including diagnosis codes (n=1,763), medications (n=1,028), laboratory measurements (n=846), surgical procedures (n=564) and vital signs (n=4). We designed a multistep modeling strategy using the Naïve Bayes algorithm. In the first step, we created individual models to classify the cases (readmitted) and controls (non-readmitted). In the second step, features contributing to predictive risk from independent models were combined into a composite model using a correlation-based feature selection (CFS) method. All models were trained and tested using a 5-fold cross-validation method, with 70% of the cohort used for training and the remaining 30% for testing. Compared to existing predictive models for HF readmission rates (AUCs in the range of 0.6-0.7), results from our EMR-wide predictive model (AUC=0.78; Accuracy=83.19%) and phenome-wide feature selection strategies are encouraging and reveal the utility of such data-driven machine learning. Fine tuning of the model, replication using multi-center cohorts and a prospective clinical trial to evaluate the clinical utility would help the adoption of the model as a clinical decision system for evaluating readmission status.
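The abstract names correlation-based feature selection (CFS) but does not spell out the scoring rule. As a rough illustration only, the sketch below assumes the commonly cited CFS merit heuristic, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-to-class correlation and r_ff the mean feature-to-feature correlation, combined with a greedy forward search. The feature names and toy values are hypothetical, and this is not the authors' code.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / math.sqrt(vx * vy)

def cfs_merit(features, labels, subset):
    """Assumed CFS merit: k * r_cf / sqrt(k + k*(k-1) * r_ff)."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], labels)) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(features, labels):
    """Forward selection: add the feature that most improves merit; stop when none does."""
    selected, best = [], 0.0
    remaining = sorted(features)
    while remaining:
        merit, pick = max((cfs_merit(features, labels, selected + [f]), f) for f in remaining)
        if merit <= best:
            break
        selected.append(pick)
        remaining.remove(pick)
        best = merit
    return selected, best

# Hypothetical toy cohort: 1 = readmitted, 0 = not readmitted.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "lab_a":      [1, 1, 1, 0, 0, 0, 0, 0],
    "lab_a_copy": [1, 1, 1, 0, 0, 0, 0, 0],  # redundant duplicate of lab_a
    "dx_b":       [0, 1, 1, 1, 0, 0, 0, 0],
    "noise":      [1, 0, 1, 0, 1, 0, 1, 0],  # uncorrelated with the label
}
selected, merit = greedy_cfs(features, labels)
print(selected, round(merit, 4))
```

In this toy example the search keeps one copy of the duplicated laboratory feature plus the complementary diagnosis feature, and drops the uninformative one, which is the behaviour a redundancy-penalising selector is meant to show.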

SUBMITTER: Shameer K

PROVIDER: S-EPMC5362124 | biostudies-other | 2017

REPOSITORIES: biostudies-other

Publications

PREDICTIVE MODELING OF HOSPITAL READMISSION RATES USING ELECTRONIC MEDICAL RECORD-WIDE MACHINE LEARNING: A CASE-STUDY USING MOUNT SINAI HEART FAILURE COHORT.

Shameer K, Johnson KW, Yahi A, Miotto R, Li LI, Ricks D, Jebakaran J, Kovatch P, Sengupta PP, Gelijns S, Moskovitz A, Darrow B, David DL, Kasarskis A, Tatonetti NP, Pinney S, Dudley JT

Pacific Symposium on Biocomputing, 2017-01-01

PMID: 27896982

Similar Datasets

Project description: Rationale: Patients transferred from the intensive care unit to the wards who are later readmitted to the intensive care unit have increased length of stay, healthcare expenditure, and mortality compared with those who are never readmitted. Improving risk stratification for patients transferred to the wards could have important benefits for critically ill hospitalized patients. Objectives: We aimed to use a machine-learning technique to derive and validate an intensive care unit readmission prediction model with variables available in the electronic health record in real time and compare it to previously published algorithms. Methods: This observational cohort study was conducted at an academic hospital in the United States with approximately 600 inpatient beds. A total of 24,885 intensive care unit transfers to the wards were included, with 14,962 transfers (60%) in the training cohort and 9,923 transfers (40%) in the internal validation cohort. Patient characteristics, nursing assessments, International Classification of Diseases, Ninth Revision codes from prior admissions, medications, intensive care unit interventions, diagnostic tests, vital signs, and laboratory results were extracted from the electronic health record and used as predictor variables in a gradient-boosted machine model. Accuracy for predicting intensive care unit readmission was compared with the Stability and Workload Index for Transfer score and Modified Early Warning Score in the internal validation cohort and also externally using the Medical Information Mart for Intensive Care database (n = 42,303 intensive care unit transfers). Results: Eleven percent (2,834) of discharges to the wards were later readmitted to the intensive care unit. The machine-learning-derived model had significantly better performance (area under the receiver operating curve, 0.76) than either the Stability and Workload Index for Transfer score (area under the receiver operating curve, 0.65), or Modified Early Warning Score (area under the receiver operating curve, 0.58; P value < 0.0001 for all comparisons). At a specificity of 95%, the derived model had a sensitivity of 28% compared with 15% for Stability and Workload Index for Transfer score and 7% for the Modified Early Warning Score. Accuracy improvements with the derived model over Modified Early Warning Score and Stability and Workload Index for Transfer were similar in the Medical Information Mart for Intensive Care-III cohort. Conclusions: A machine learning approach to predicting intensive care unit readmission was significantly more accurate than previously published algorithms in both our internal validation and the Medical Information Mart for Intensive Care-III cohort. Implementation of this approach could target patients who may benefit from additional time in the intensive care unit or more frequent monitoring after transfer to the hospital ward.
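Reporting sensitivity "at a specificity of 95%" implies fixing a score threshold from the non-readmitted group first and only then counting how many readmitted patients exceed it. The sketch below shows one way to do that, assuming a patient is flagged when the score is strictly greater than the threshold; the function name and the integer risk scores are hypothetical and unrelated to the study's data.

```python
def sensitivity_at_specificity(scores, labels, target_specificity):
    """Return (threshold, sensitivity, specificity) for the lowest observed
    threshold whose specificity meets the target.

    A case is flagged as high risk when score > threshold (assumed convention).
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    for threshold in sorted(set(scores)):
        true_neg = sum(1 for s in negatives if s <= threshold)
        specificity = true_neg / len(negatives)
        if specificity >= target_specificity:
            true_pos = sum(1 for s in positives if s > threshold)
            return threshold, true_pos / len(positives), specificity
    raise ValueError("target specificity not reachable")

# Hypothetical scores: 20 patients who were not readmitted, 5 who were.
neg_scores = list(range(1, 21))
pos_scores = [5, 12, 18, 20, 25]
scores = neg_scores + pos_scores
labels = [0] * len(neg_scores) + [1] * len(pos_scores)

threshold, sensitivity, specificity = sensitivity_at_specificity(scores, labels, 0.95)
print(threshold, sensitivity, specificity)
```

Comparing models at a shared specificity, as the abstract does, keeps the false-alarm rate fixed so that differences in sensitivity are directly comparable.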

Project description: Background: Early unplanned hospital readmissions are associated with increased harm to patients, increased medical costs, and negative hospital reputation. With the identification of at-risk patients, a crucial step toward improving care, appropriate interventions can be adopted to prevent readmission. This study aimed to build machine learning models to predict 14-day unplanned readmissions. Methods: We conducted a retrospective cohort study on 37,091 consecutive hospitalized adult patients with 55,933 discharges between September 1, 2018, and August 31, 2019, in a 1,193-bed university hospital. Patients who were aged < 20 years, were admitted for cancer-related treatment, participated in a clinical trial, were discharged against medical advice, died during admission, or lived abroad were excluded. Predictors for analysis included 7 categories of variables extracted from the hospital's medical record dataset. In total, four machine learning algorithms, namely logistic regression, random forest, extreme gradient boosting, and categorical boosting, were used to build classifiers for prediction. The performance of prediction models for 14-day unplanned readmission risk was evaluated using precision, recall, F1-score, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). Results: In total, 24,722 patients were included for the analysis. The mean age of the cohort was 57.34 ± 18.13 years. The 14-day unplanned readmission rate was 1.22%. Among the 4 machine learning algorithms selected, CatBoost had the best average performance in fivefold cross-validation (precision: 0.9377, recall: 0.5333, F1-score: 0.6780, AUROC: 0.9903, and AUPRC: 0.7515). After incorporating the 21 most influential features in the CatBoost model, its performance improved (precision: 0.9470, recall: 0.5600, F1-score: 0.7010, AUROC: 0.9909, and AUPRC: 0.7711). Conclusions: Our models reliably predicted 14-day unplanned readmissions and were explainable. They can be used to identify patients with a high risk of unplanned readmission based on influential features, particularly features related to diagnoses. The operation of the models with physiological indicators also corresponded to clinical experience and literature. Identifying patients at high risk with these models can enable early discharge planning and transitional care to prevent readmissions. Further studies should include additional features that may enable further sensitivity in identifying patients at risk of early unplanned readmissions.
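The F1-score is the harmonic mean of precision and recall. Plugging the reported average precision and recall into that formula gives values slightly above the reported F1-scores (about 0.680 versus 0.6780, and about 0.704 versus 0.7010). One plausible explanation, which the description does not confirm, is that each metric was averaged across the five folds separately. The sketch below shows the calculation and uses two hypothetical folds to illustrate why the two orders of averaging can differ.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# F1 recomputed from the reported average precision and recall.
f1_all_features = f1(0.9377, 0.5333)
f1_top21 = f1(0.9470, 0.5600)

# Hypothetical per-fold (precision, recall) pairs, not taken from the study.
folds = [(1.0, 0.2), (0.8, 0.8)]
mean_of_f1 = sum(f1(p, r) for p, r in folds) / len(folds)
mean_precision = sum(p for p, _ in folds) / len(folds)
mean_recall = sum(r for _, r in folds) / len(folds)
f1_of_means = f1(mean_precision, mean_recall)

print(round(f1_all_features, 4), round(f1_top21, 4))
print(round(mean_of_f1, 4), round(f1_of_means, 4))
```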

Project description: Importance: The Affordable Care Act has led to US national reductions in hospital 30-day readmission rates for heart failure (HF), acute myocardial infarction (AMI), and pneumonia. Whether readmission reductions have had the unintended consequence of increasing mortality after hospitalization is unknown. Objective: To examine the correlation of paired trends in hospital 30-day readmission rates and hospital 30-day mortality rates after discharge. Design, setting, and participants: Retrospective study of Medicare fee-for-service beneficiaries aged 65 years or older hospitalized with HF, AMI, or pneumonia from January 1, 2008, through December 31, 2014. Exposure: Thirty-day risk-adjusted readmission rate (RARR). Main outcomes and measures: Thirty-day RARRs and 30-day risk-adjusted mortality rates (RAMRs) after discharge were calculated for each condition in each month at each hospital in 2008 through 2014. Monthly trends in each hospital's 30-day RARRs and 30-day RAMRs after discharge were examined for each condition. The weighted Pearson correlation coefficient was calculated for hospitals' paired monthly trends in 30-day RARRs and 30-day RAMRs after discharge for each condition. Results: In 2008 through 2014, 2,962,554 hospitalizations for HF, 1,229,939 for AMI, and 2,544,530 for pneumonia were identified at 5,016, 4,772, and 5,057 hospitals, respectively. In January 2008, mean hospital 30-day RARRs and 30-day RAMRs after discharge were 24.6% and 8.4% for HF, 19.3% and 7.6% for AMI, and 18.3% and 8.5% for pneumonia. Hospital 30-day RARRs declined in the aggregate across hospitals from 2008 through 2014; monthly changes in RARRs were -0.053% (95% CI, -0.055% to -0.051%) for HF, -0.044% (95% CI, -0.047% to -0.041%) for AMI, and -0.033% (95% CI, -0.035% to -0.031%) for pneumonia. In contrast, monthly aggregate changes across hospitals in hospital 30-day RAMRs after discharge varied by condition: HF, 0.008% (95% CI, 0.007% to 0.010%); AMI, -0.003% (95% CI, -0.005% to -0.001%); and pneumonia, 0.001% (95% CI, -0.001% to 0.003%). However, correlation coefficients in hospitals' paired monthly changes in 30-day RARRs and 30-day RAMRs after discharge were weakly positive: HF, 0.066 (95% CI, 0.036 to 0.096); AMI, 0.067 (95% CI, 0.027 to 0.106); and pneumonia, 0.108 (95% CI, 0.079 to 0.137). Findings were similar in secondary analyses, including with alternate definitions of hospital mortality. Conclusions and relevance: Among Medicare fee-for-service beneficiaries hospitalized for heart failure, acute myocardial infarction, or pneumonia, reductions in hospital 30-day readmission rates were weakly but significantly correlated with reductions in hospital 30-day mortality rates after discharge. These findings do not support increasing postdischarge mortality related to reducing hospital readmissions.
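A weighted Pearson correlation replaces the ordinary means, variances, and covariance with their weighted counterparts, so that some observations count more than others. The description does not say what the weights were; the sketch below simply takes a weight per hospital as given. All variable names and numbers are hypothetical.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative observation weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)

# Hypothetical per-hospital trend values; the fourth hospital is an outlier.
readmission_trend = [1, 2, 3, 10]
mortality_trend = [1, 2, 3, -10]

r_equal = weighted_pearson(readmission_trend, mortality_trend, [1, 1, 1, 1])
r_downweighted = weighted_pearson(readmission_trend, mortality_trend, [1, 1, 1, 0])
print(round(r_equal, 4), round(r_downweighted, 4))
```

With equal weights the outlier drives the correlation negative; giving it zero weight recovers the perfectly positive relationship among the other three, which shows how strongly the choice of weights can shape the coefficient.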

Project description: Background and objectives: Diabetes mellitus is a major chronic disease that results in readmissions due to poor disease control. Here we established and compared machine learning (ML)-based readmission prediction methods to predict readmission risks of diabetic patients. Methods: The dataset analyzed in this study was acquired from the Health Facts Database, which includes over 100,000 records of diabetic patients from 1999 to 2008. The basic data distribution characteristics of this dataset were summarized and then analyzed. In this study, 30-day readmission was defined as a readmission period of less than 30 days. After data preprocessing and normalization, multiple risk factors in the dataset were examined for classifier training to predict the probability of readmission using ML models. Different ML classifiers such as random forest (RF), Naive Bayes, and decision tree ensemble were adopted to improve the clinical efficiency of the classification. In this study, the Konstanz Information Miner platform was used to preprocess and model the data, and the performances of the different classifiers were compared. Results: A total of 100,244 records were included in the model construction after the data preprocessing and normalization. A total of 23 attributes, including race, sex, age, admission type, admission location, length of stay, and drug use, were finally identified as modeling risk factors. Comparison of the performance indexes of the three algorithms revealed that the RF model had the best performance, with a higher area under the receiver operating characteristic curve (AUC) than the other two algorithms, suggesting that it is more suitable for making readmission predictions. Conclusion: The factors influencing 30-day readmission predictions in diabetic patients, including number of inpatient admissions, age, diagnosis, number of emergencies, and sex, would help healthcare providers to identify patients who are at high risk of short-term readmission and reduce the probability of 30-day readmission. The RF algorithm, with the highest AUC, is more suitable for making 30-day readmission predictions and deserves further validation in clinical trials.
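Ranking classifiers by AUC amounts to asking, for each model, how often a randomly chosen readmitted patient receives a higher score than a randomly chosen non-readmitted one. The sketch below computes that pairwise probability directly (ties count as half) for two hypothetical models; the model names and scores are invented for illustration and are not results from the study.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = readmitted) and risk scores from two made-up models.
labels = [1, 1, 1, 0, 0, 0, 0]
model_scores = {
    "model_a": [3, 5, 8, 1, 2, 5, 4],
    "model_b": [2, 1, 8, 3, 2, 5, 4],
}
aucs = {name: auc(s, labels) for name, s in model_scores.items()}
best_model = max(aucs, key=aucs.get)
print(aucs, best_model)
```

The pairwise loop is quadratic in cohort size, which is fine for a sketch; a rank-based formulation would be the usual choice for a dataset of 100,000 records.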
