Dataset Information

Machine-Learning vs. Expert-Opinion Driven Logistic Regression Modelling for Predicting 30-Day Unplanned Rehospitalisation in Preterm Babies: A Prospective, Population-Based Study (EPIPAGE 2).

ABSTRACT: Introduction: Preterm babies are a vulnerable population that experience significant short- and long-term morbidity. Rehospitalisations constitute an important, potentially modifiable adverse event in this population. Improving the ability of clinicians to identify those patients at the greatest risk of rehospitalisation has the potential to improve outcomes and reduce costs. Machine-learning algorithms can provide potentially advantageous methods of prediction compared to conventional approaches like logistic regression. Objective: To compare two machine-learning methods (least absolute shrinkage and selection operator (LASSO) and random forest) to expert-opinion driven logistic regression modelling for predicting unplanned rehospitalisation within 30 days in a large French cohort of preterm babies. Design, Setting and Participants: This study used data derived exclusively from the population-based prospective cohort study of French preterm babies, EPIPAGE 2. Only those babies discharged home alive and whose parents completed the 1-year survey were eligible for inclusion in our study. All predictive models used a binary outcome, denoting a baby's status for an unplanned rehospitalisation within 30 days of discharge. Predictors included those quantifying clinical, treatment, maternal and socio-demographic factors. The predictive abilities of models constructed using LASSO and random forest algorithms were compared with a traditional logistic regression model. The logistic regression model comprised 10 predictors, selected by expert clinicians, while the LASSO and random forest included 75 predictors. Performance measures were derived using 10-fold cross-validation. Performance was quantified using area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, Tjur's coefficient of determination and calibration measures.
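As a rough illustration of three of the performance measures named above, the sketch below computes AUROC, sensitivity and specificity from predicted risks. The labels, scores and the 0.5 threshold are made-up toy values, not EPIPAGE 2 data, and the function names are hypothetical. The study derived its measures under 10-fold cross-validation, which is not reproduced here.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify as positive when score >= threshold, then return
    (sensitivity, specificity)."""
    pairs = list(zip(labels, scores))
    tp = sum(y == 1 and s >= threshold for y, s in pairs)
    fn = sum(y == 1 and s < threshold for y, s in pairs)
    tn = sum(y == 0 and s < threshold for y, s in pairs)
    fp = sum(y == 0 and s >= threshold for y, s in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (assumption): 1 = rehospitalised within 30 days, 0 = not.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.80, 0.30, 0.40, 0.50, 0.10, 0.70, 0.40, 0.20]

auc = auroc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUROC={auc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

An AUROC of 0.5 corresponds to chance-level ranking, which is why the values of 0.57 to 0.65 reported for this dataset indicate limited discrimination.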
Results: The rate of 30-day unplanned rehospitalisation in the eligible population used to construct the models was 9.1% (95% CI 8.2-10.1) (350/3,841). The random forest model demonstrated both an improved AUROC (0.65; 95% CI 0.59-0.7; p = 0.03) and specificity vs. logistic regression (AUROC 0.57; 95% CI 0.51-0.62, p = 0.04). The LASSO performed similarly (AUROC 0.59; 95% CI 0.53-0.65; p = 0.68) to logistic regression. Conclusions: Compared to an expert-specified logistic regression model, random forest offered improved prediction of 30-day unplanned rehospitalisation in preterm babies. However, all models offered relatively low levels of predictive ability, regardless of modelling method.
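The reported rehospitalisation rate can be checked from the counts given above (350 of 3,841). The sketch below uses a Wilson score interval, which is an assumption: the abstract does not say which interval method the authors used, and `wilson_interval` is a hypothetical helper name.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


events, n = 350, 3841  # counts reported in the abstract
rate = events / n
low, high = wilson_interval(events, n)
print(f"{rate:.1%} (95% CI {low:.1%}-{high:.1%})")
```

Under this assumption the figures round to 9.1% (95% CI 8.2-10.1), in line with the abstract.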

SUBMITTER: Reed RA

PROVIDER: S-EPMC7886676 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

Machine-Learning vs. Expert-Opinion Driven Logistic Regression Modelling for Predicting 30-Day Unplanned Rehospitalisation in Preterm Babies: A Prospective, Population-Based Study (EPIPAGE 2).

Reed RA, Morgan AS, Zeitlin J, Jarreau PH, Torchin H, Pierrat V, Ancel PY, Khoshnood B

Frontiers in Pediatrics, 2021-02-03

PMID: 33614539

Similar Datasets

Project description: It is expected but unknown whether machine-learning models can outperform regression models, such as a logistic regression (LR) model, especially when the number and types of predictor variables increase in electronic health records (EHRs). We aimed to compare the predictive performance of gradient-boosted decision tree (GBDT), random forest (RF), deep neural network (DNN), and LR with the least absolute shrinkage and selection operator (LR-LASSO) for unplanned readmission. We used EHRs of patients discharged alive from 38 hospitals in 2015-2017 for derivation and in 2018 for validation, including basic characteristics, diagnosis, surgery, procedure, and drug codes, and blood-test results. The outcome was 30-day unplanned readmission. We created six patterns of data tables having different numbers of binary variables (that ≥5% or ≥1% of patients or ≥10 patients had) with and without blood-test results. For each pattern of data tables, we used the derivation data to establish the machine-learning and LR models, and used the validation data to evaluate the performance of each model. The incidence of outcome was 6.8% (23,108/339,513 discharges) and 6.4% (7,507/118,074 discharges) in the derivation and validation datasets, respectively. For the first data table with the smallest number of variables (102 variables that ≥5% of patients had, without blood-test results), the c-statistic was highest for GBDT (0.740), followed by RF (0.734), LR-LASSO (0.720), and DNN (0.664). For the last data table with the largest number of variables (1543 variables that ≥10 patients had, including blood-test results), the c-statistic was highest for GBDT (0.764), followed by LR-LASSO (0.755), RF (0.751), and DNN (0.720), suggesting that the difference between GBDT and LR-LASSO was small and their 95% confidence intervals overlapped. In conclusion, GBDT generally outperformed LR-LASSO in predicting unplanned readmission, but the difference in c-statistic narrowed as the number of variables increased and blood-test results were used.

Project description: Introduction: A common quality indicator for monitoring and comparing hospitals is based on death within 30 days of admission. An important use is to determine whether a hospital has higher or lower mortality than other hospitals. Thus, the ability to identify such outliers correctly is essential. Two approaches for detection are: 1) calculating the ratio of observed to expected number of deaths (OE) per hospital and 2) including all hospitals in a logistic regression (LR) comparing each hospital to a form of average over all hospitals. The aim of this study was to compare OE and LR with respect to correctly identifying 30-day mortality outliers. Modifications of the methods, i.e., variance corrected approach of OE (OE-Faris), bias corrected LR (LR-Firth), and trimmed mean variants of LR and LR-Firth were also studied. Materials and methods: To study the properties of OE and LR and their variants, we performed a simulation study by generating patient data from hospitals with known outlier status (low mortality, high mortality, non-outlier). Data from simulated scenarios with varying number of hospitals, hospital volume, and mortality outlier status, were analysed by the different methods and compared by level of significance (ability to falsely claim an outlier) and power (ability to reveal an outlier). Moreover, administrative data for patients with acute myocardial infarction (AMI), stroke, and hip fracture from Norwegian hospitals for 2012-2014 were analysed. Results: None of the methods achieved the nominal (test) level of significance for both low and high mortality outliers. For low mortality outliers, the levels of significance were increased four- to fivefold for OE and OE-Faris. For high mortality outliers, OE and OE-Faris, LR 25% trimmed and LR-Firth 10% and 25% trimmed maintained approximately the nominal level. The methods agreed with respect to outlier status for 94.1% of the AMI hospitals, 98.0% of the stroke, and 97.8% of the hip fracture hospitals. Conclusion: We recommend, on the balance, LR-Firth 10% or 25% trimmed for detection of both low and high mortality outliers.

Project description: Objective: Unplanned hospital readmissions following surgical interventions are associated with adverse events and contribute to increasing health care costs. Despite numerous studies defining risk factors following lower extremity bypass surgery, evidence regarding readmission after endovascular interventions is limited. This study aimed to identify predictors of 30-day unplanned readmission following infrainguinal endovascular interventions. Methods: We identified all patients undergoing an infrainguinal endovascular intervention in the targeted vascular module of the American College of Surgeons National Surgical Quality Improvement Program between 2012 and 2014. Perioperative outcomes were stratified by symptom status (chronic limb-threatening ischemia [CLI] vs claudication). Patients who died during index admission and those who remained in the hospital after 30 days were excluded. Indications for unplanned readmission related to the index procedure were evaluated. Multivariable logistic regression was used to identify preoperative and in-hospital (during index admission) risk factors of 30-day unplanned readmission. Results: There were 4449 patients who underwent infrainguinal endovascular intervention, of whom 2802 (63%) had CLI (66% tissue loss) and 1647 (37%) had claudication. The unplanned readmission rates for CLI and claudication patients were 16% (n = 447) and 6.5% (n = 107), respectively. Mortality after index admission was higher for readmitted patients compared with those not readmitted (CLI, 3.4% vs 0.7% [P < .001]; claudication, 2.8% vs 0.1% [P < .01]). Approximately 50% of all unplanned readmissions were related to the index procedure. Among CLI patients, the most common indication for readmission related to the index procedure was wound or infection related (42%), whereas patients with claudication were mainly readmitted for recurrent symptoms of peripheral vascular disease (28%). In patients with CLI, predictors of unplanned readmission included diabetes (odds ratio, 1.3; 95% confidence interval, 1.01-1.6), congestive heart failure (1.6; 1.1-2.5), renal insufficiency (1.7; 1.3-2.2), preoperative dialysis (1.4; 1.02-1.9), tibial angioplasty/stenting (1.3; 1.04-1.6), in-hospital bleeding (1.9; 1.04-3.5), in-hospital unplanned return to the operating room (1.9; 1.1-3.5), and discharge other than to home (1.5; 1.1-2.0). Risk factors for those with claudication were dependent functional status (3.5; 1.4-8.7), smoking (1.6; 1.02-2.5), diabetes (1.5; 1.01-2.3), preoperative dialysis (3.6; 1.6-8.3), procedure time exceeding 120 minutes (1.8; 1.1-2.7), in-hospital bleeding (2.9; 1.2-7.4), and in-hospital unplanned return to the operating room (3.4; 1.2-9.4). Conclusions: Unplanned readmission after endovascular treatment is relatively common, especially in patients with CLI, and is associated with substantially increased mortality. Awareness of these risk factors will help providers identify patients at high risk who may benefit from early surveillance, and prophylactic measures focused on decreasing postoperative complications may reduce the rate of readmission.

Project description: Background and objectives: Patients on hemodialysis have high 30-day unplanned readmission rates. Using a national all-payer administrative database, we describe the epidemiology of 30-day unplanned readmissions in patients on hemodialysis, determine concordance of reasons for initial admission and readmission, and identify predictors for readmission. Design, setting, participants, & measurements: This is a retrospective cohort study using the Nationwide Readmission Database from the year 2013 to identify index admissions and readmission in patients with ESRD on hemodialysis. The Clinical Classification Software was used to categorize admission diagnosis into mutually exclusive clinically meaningful categories and determine concordance of reasons for admission on index hospitalizations and readmissions. Survey logistic regression was used to identify predictors of at least one readmission. Results: During 2013, there were 87,302 (22%) index admissions with at least one 30-day unplanned readmission. Although patient and hospital characteristics were statistically different between those with and without readmissions, there were small absolute differences. The highest readmission rate was for acute myocardial infarction (25%), whereas the lowest readmission rate was for hypertension (20%). The primary reasons for initial hospitalization and subsequent 30-day readmission were discordant in 80% of admissions. Comorbidities that were associated with readmissions included depression (odds ratio, 1.10; 95% confidence interval [95% CI], 1.05 to 1.15; P<0.001), drug abuse (odds ratio, 1.41; 95% CI, 1.31 to 1.51; P<0.001), and discharge against medical advice (odds ratio, 1.57; 95% CI, 1.45 to 1.70; P<0.001). A group of high utilizers, which constituted 2% of the population, was responsible for 20% of all readmissions. Conclusions: In patients with ESRD on hemodialysis, nearly one quarter of admissions were followed by a 30-day unplanned readmission. Most readmissions were for primary diagnoses that were different from initial hospitalization. A small proportion of patients accounted for a disproportionate number of readmissions.
