Dataset Information

Machine learning for patient risk stratification for acute respiratory distress syndrome.

ABSTRACT: BACKGROUND: Existing prediction models for acute respiratory distress syndrome (ARDS) require manual chart abstraction and have only fair performance, limiting their suitability for driving clinical interventions. We sought to develop a machine learning approach for the prediction of ARDS that (a) leverages electronic health record (EHR) data, (b) is fully automated, and (c) can be applied at clinically relevant time points throughout a patient's stay. METHODS AND FINDINGS: We trained a risk stratification model for ARDS using a cohort of 1,621 patients with moderate hypoxia from a single center in 2016, of whom 51 developed ARDS. We tested the model in a temporally distinct cohort of 1,122 patients from 2017, of whom 27 developed ARDS. Gold standard diagnosis of ARDS was made by intensive care trained physicians during retrospective chart review. We considered both linear and non-linear approaches to learning the model. The best model used L2-regularized logistic regression with 984 features extracted from the EHR. For patients observed in the hospital for at least six hours who then developed moderate hypoxia, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.81 (95% CI: 0.73-0.88). With a threshold set at the 85th percentile of risk, the model had a sensitivity of 56% (95% CI: 35%, 74%), a specificity of 86% (95% CI: 85%, 87%), and a positive predictive value of 9% (95% CI: 5%, 14%), identifying a population at four times higher risk for ARDS than other patients with moderate hypoxia and 17 times the risk of hospitalized adults. CONCLUSIONS: We developed an ARDS prediction model based on EHR data with good discriminative performance. Our results demonstrate the feasibility of a machine learning approach to risk stratifying patients for ARDS solely from data extracted automatically from the EHR.
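The thresholding step described in the abstract (flag patients whose predicted risk exceeds the 85th percentile, then report sensitivity, specificity, and positive predictive value) can be sketched as below. This is a minimal illustration, not the authors' code: the function names, the linear-interpolation percentile, and the randomly generated risk scores are all assumptions. Only the cohort size (1,122) and the ARDS count (27) come from the abstract, so the printed metrics will not reproduce the published figures.

```python
import random


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in [0, 100])."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def metrics_at_percentile(y_true, risk, q=85.0):
    """Flag patients with risk above the q-th percentile and score the flags."""
    threshold = percentile(risk, q)
    tp = fp = fn = tn = 0
    for is_case, score in zip(y_true, risk):
        flagged = score > threshold
        if flagged and is_case:
            tp += 1
        elif flagged and not is_case:
            fp += 1
        elif is_case:
            fn += 1
        else:
            tn += 1
    nan = float("nan")
    return {
        "threshold": threshold,
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
    }


# Synthetic scores for illustration only (hypothetical, not study data):
# cases are drawn from a distribution shifted upward relative to non-cases.
rng = random.Random(0)
labels = [True] * 27 + [False] * (1122 - 27)
scores = [rng.gauss(1.2 if is_case else 0.0, 1.0) for is_case in labels]
print(metrics_at_percentile(labels, scores))
```

Because the threshold is a percentile of the score distribution, roughly 15% of patients are flagged regardless of how well the model discriminates; with a low event rate, that caps the achievable positive predictive value even when sensitivity is reasonable.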

SUBMITTER: Zeiberg D

PROVIDER: S-EPMC6438573 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature


Publications

Machine learning for patient risk stratification for acute respiratory distress syndrome.

Zeiberg D, Prahlad T, Nallamothu BK, Iwashyna TJ, Wiens J, Sjoding MW

PLoS One, 2019-03-28, issue 3


PMID: 30921400

Similar Datasets

Project description: Background: Acute kidney injury (AKI) can make cases of acute respiratory distress syndrome (ARDS) more complex, and the combination of the two can significantly worsen the prognosis. Our objective is to utilize machine learning (ML) techniques to construct models that can promptly identify the risk of AKI in ARDS patients. Method: We obtained data on ARDS patients from the Medical Information Mart for Intensive Care III (MIMIC-III) and MIMIC-IV databases. Within the MIMIC-III dataset, we developed 11 ML prediction models. After evaluating various metrics, we visualized feature importance using Shapley additive explanations (SHAP). We then created a more concise model using fewer variables and tuned it with hyperparameter optimization (HPO). The model was validated using the MIMIC-IV dataset. Result: A total of 928 ARDS patients without AKI were included in the analysis from the MIMIC-III dataset, and among them, 179 (19.3%) developed AKI after admission to the intensive care unit (ICU). In the MIMIC-IV dataset, 653 ARDS patients were included in the analysis, and among them, 237 (36.3%) developed AKI. A total of 43 features were used to build the model. Among all models, eXtreme gradient boosting (XGBoost) performed the best. We used the top 10 features to build a compact model with an area under the curve (AUC) of 0.850, which improved to an AUC of 0.865 after HPO. In the extra validation set, XGBoost_HPO achieved an AUC of 0.854. The accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score of the XGBoost_HPO model on the test set are 0.865, 0.813, 0.877, 0.578, 0.957, and 0.675, respectively. On the extra validation set, they are 0.724, 0.789, 0.688, 0.590, 0.851, and 0.675, respectively. Conclusion: ML algorithms, especially XGBoost, are reliable for predicting AKI in ARDS patients. The compact model maintains excellent predictive ability, and the web-based calculator improves clinical convenience. This provides valuable guidance in identifying AKI in ARDS, leading to improved patient outcomes.
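As a quick consistency check on the figures reported above, the F1 score can be recomputed from the stated PPV (precision) and sensitivity (recall) using the standard harmonic-mean definition. The function name below is an illustrative assumption, and the inputs are the rounded values from the abstract, so small rounding differences are expected.

```python
def f1_from_ppv_and_sensitivity(ppv, sensitivity):
    """F1 is the harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Rounded values as reported in the abstract above.
test_f1 = f1_from_ppv_and_sensitivity(ppv=0.578, sensitivity=0.813)
validation_f1 = f1_from_ppv_and_sensitivity(ppv=0.590, sensitivity=0.789)

print(f"test set F1: {test_f1:.4f}")
print(f"extra validation set F1: {validation_f1:.4f}")
```

Both recomputed values fall within about 0.001 of the reported 0.675, which suggests the identical F1 scores on the two sets are a coincidence of the underlying PPV and sensitivity values rather than a transcription error.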

Project description: Objectives: To identify differentially expressed genes and networks from the airway cells within 72 hours of intubation of children with and without pediatric acute respiratory distress syndrome. To test the use of a neutrophil transcription reporter assay to identify immunogenic responses to airway fluid from children with and without pediatric acute respiratory distress syndrome. Design: Prospective cohort study. Setting: Thirty-six bed academic PICU. Patients: Fifty-four immunocompetent children, 28 with pediatric acute respiratory distress syndrome, who were between 2 days and 18 years old and within 72 hours of intubation for acute hypoxemic respiratory failure. Interventions: None. Measurements and main results: We applied machine learning methods to Nanostring transcriptomics on primary airway cells and to a neutrophil reporter assay to discover gene networks differentiating pediatric acute respiratory distress syndrome from no pediatric acute respiratory distress syndrome. An analysis of moderate or severe pediatric acute respiratory distress syndrome versus no or mild pediatric acute respiratory distress syndrome was performed. Pathway network visualization was used to map pathways from 62 genes selected by ElasticNet associated with pediatric acute respiratory distress syndrome. The Janus kinase/signal transducer and activator of transcription pathway emerged. Support vector machine performed best for the primary airway cells and the neutrophil reporter assay using leave-one-out cross-validation, with an area under the receiver operating characteristic curve and 95% CI of 0.75 (0.63-0.87) and 0.80 (0.70-1.0), respectively. Conclusions: We identified gene networks important to the pediatric acute respiratory distress syndrome airway immune response using semitargeted transcriptomics from primary airway cells and a neutrophil reporter assay. These pathways will drive mechanistic investigations into pediatric acute respiratory distress syndrome. Further studies are needed to validate our findings and to test our models.

Project description: OBJECTIVES: The original Pediatric Sepsis Biomarker Risk Model and revised (Pediatric Sepsis Biomarker Risk Model-II) biomarker-based risk prediction models have demonstrated utility for estimating baseline 28-day mortality risk in pediatric sepsis. Given the paucity of prediction tools in pediatric acute respiratory distress syndrome, and given the overlapping pathophysiology between sepsis and acute respiratory distress syndrome, we tested the utility of Pediatric Sepsis Biomarker Risk Model and Pediatric Sepsis Biomarker Risk Model-II for mortality prediction in a cohort of pediatric acute respiratory distress syndrome, with an a priori plan to revise the model if these existing models performed poorly. DESIGN: Prospective observational cohort study. SETTING: University affiliated PICU. PATIENTS: Mechanically ventilated children with acute respiratory distress syndrome. INTERVENTIONS: Blood collection within 24 hours of acute respiratory distress syndrome onset and biomarker measurements. MEASUREMENTS AND MAIN RESULTS: In 152 children with acute respiratory distress syndrome, Pediatric Sepsis Biomarker Risk Model performed poorly and Pediatric Sepsis Biomarker Risk Model-II performed modestly (areas under receiver operating characteristic curve of 0.61 and 0.76, respectively). Therefore, we randomly selected 80% of the cohort (n = 122) to rederive a risk prediction model for pediatric acute respiratory distress syndrome. We used classification and regression tree methodology, considering the Pediatric Sepsis Biomarker Risk Model biomarkers in addition to variables relevant to acute respiratory distress syndrome. The final model comprised three biomarkers and age, and more accurately estimated baseline mortality risk (area under receiver operating characteristic curve 0.85, p < 0.001 and p = 0.053 compared with Pediatric Sepsis Biomarker Risk Model and Pediatric Sepsis Biomarker Risk Model-II, respectively). The model was tested in the remaining 20% of subjects (n = 30) and demonstrated similar test characteristics. CONCLUSIONS: A validated, biomarker-based risk stratification tool designed for pediatric sepsis was adapted for use in pediatric acute respiratory distress syndrome. The newly derived Pediatric Acute Respiratory Distress Syndrome Biomarker Risk Model demonstrates good test characteristics internally and requires external validation in a larger cohort. Tools such as Pediatric Acute Respiratory Distress Syndrome Biomarker Risk Model have the potential to provide improved risk stratification and prognostic enrichment for future trials in pediatric acute respiratory distress syndrome.

Project description: Rationale: Two distinct phenotypes of acute respiratory distress syndrome (ARDS) with differential clinical outcomes and responses to randomly assigned treatment have consistently been identified in randomized controlled trial cohorts using latent class analysis. Plasma biomarkers, key components in phenotype identification, currently lack point-of-care assays and represent a barrier to the clinical implementation of phenotypes. Objectives: The objective of this study was to develop models to classify ARDS phenotypes using readily available clinical data only. Methods: Three randomized controlled trial cohorts served as the training data set (ARMA [High vs. Low Vt], ALVEOLI [Assessment of Low Vt and Elevated End-Expiratory Pressure to Obviate Lung Injury], and FACTT [Fluids and Catheter Treatment Trial]; n = 2,022), and a fourth served as the validation data set (SAILS [Statins for Acutely Injured Lungs from Sepsis]; n = 745). A gradient-boosted machine algorithm was used to develop classifier models using 24 variables (demographics, vital signs, laboratory, and respiratory variables) at enrollment. In two secondary analyses, the ALVEOLI and FACTT cohorts each, individually, served as the validation data set, and the remaining combined cohorts formed the training data set for each analysis. Model performance was evaluated against the latent class analysis-derived phenotype. Measurements and Main Results: For the primary analysis, the model accurately classified the phenotypes in the validation cohort (area under the receiver operating characteristic curve [AUC], 0.95; 95% confidence interval [CI], 0.94-0.96). Using a probability cutoff of 0.5 to assign class, inflammatory biomarkers (IL-6, IL-8, and sTNFR-1; P < 0.0001) and 90-day mortality (38% vs. 24%; P = 0.0002) were significantly higher in the hyperinflammatory phenotype as classified by the model. Model accuracy was similar when ALVEOLI (AUC, 0.94; 95% CI, 0.92-0.96) and FACTT (AUC, 0.94; 95% CI, 0.92-0.95) were used as the validation cohorts. Significant treatment interactions were observed with the clinical classifier model-assigned phenotypes in both ALVEOLI (P = 0.0113) and FACTT (P = 0.0072) cohorts. Conclusions: ARDS phenotypes can be accurately identified using machine learning models based on readily available clinical data and may enable rapid phenotype identification at the bedside.
