Dataset Information

Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations.

ABSTRACT:

Objective

This study aimed to develop and validate a claims-based, machine learning algorithm to predict clinical outcomes across both medical and surgical patient populations.

Methods

This retrospective, observational cohort study used a random 5% sample of 770,777 fee-for-service Medicare beneficiaries with an inpatient hospitalization between 2009 and 2011. The machine learning algorithms tested were support vector machine, random forest, multilayer perceptron, extreme gradient boosted tree, and logistic regression. The extreme gradient boosted tree algorithm outperformed the alternatives and was used for the final risk model. The primary outcome was 30-day mortality. Secondary outcomes were rehospitalization and any of 23 adverse clinical events occurring within 30 days of the index admission date.

Results

Model performance was evaluated by both the area under the receiver operating characteristic curve (AUROC) and the Brier Score. The risk model demonstrated high performance for prediction of 30-day mortality (AUROC = 0.88; Brier Score = 0.06) and 17 of the 23 adverse events (AUROC range: 0.80-0.86; Brier Score range: 0.01-0.05). The risk model demonstrated moderate performance for prediction of rehospitalization within 30 days (AUROC = 0.73; Brier Score = 0.07) and six of the 23 adverse events (AUROC range: 0.74-0.79; Brier Score range: 0.01-0.02). The risk model performed comparably on a second, independent validation dataset, confirming that it was not overfit.
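
As a rough illustration of the two metrics reported above, the sketch below computes AUROC and the Brier Score in plain Python. It is not the authors' code: the function names `auroc` and `brier_score` and the toy outcome and probability lists are assumptions made only for this example.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random positive case is scored above a random
    negative case, counting ties as one half (pairwise definition)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = event within 30 days, 0 = no event.
outcomes = [0, 0, 1, 0, 1, 0, 0, 1]
predicted = [0.05, 0.10, 0.80, 0.20, 0.60, 0.30, 0.70, 0.40]

print(f"AUROC = {auroc(outcomes, predicted):.3f}")
print(f"Brier Score = {brier_score(outcomes, predicted):.3f}")
```

The two metrics answer different questions. AUROC measures ranking (discrimination) only, while the Brier Score also reflects calibration and depends on how common the outcome is: predicting the base rate p for every patient already yields a Brier Score of p(1 - p), so small values for rare events should be read with the event rate in mind.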

Conclusions and relevance

We have developed and validated a robust, claims-based, machine learning risk model that is applicable to both medical and surgical patient populations and demonstrates comparable predictive accuracy to existing risk models.

SUBMITTER: MacKay EJ

PROVIDER: S-EPMC8174683 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature


Publications

Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations.

MacKay, Emily J; Stubna, Michael D; Chivers, Corey; Draugelis, Michael E; Hanson, William J; Desai, Nimesh D; Groeneveld, Peter W

PLoS One, 3 June 2021, issue 6


PMID: 34081720

Similar Datasets

Project description:
Introduction: Body mass index (BMI) is inadequately recorded in US administrative claims databases. We aimed to validate the sensitivity and positive predictive value (PPV) of BMI-related diagnosis codes using an electronic medical records (EMR) claims-linked database. Additionally, we applied machine learning (ML) to identify features in US claims databases to predict obesity status.
Research design and methods: This observational, retrospective analysis included 692,119 people ≥18 years of age, with ≥1 BMI reading in MarketScan Explorys Claims-EMR data (January 2013-December 2019). Claims-based obesity status was compared with EMR-based BMI (gold standard) to assess BMI-related diagnosis code sensitivity and PPV. Logistic regression (LR), penalized LR with L1 penalty (Least Absolute Shrinkage and Selection Operator), extreme gradient boosting (XGBoost) and random forest, with features drawn from insurance claims, were trained to predict obesity status (BMI ≥30 kg/m²) from EMR as the gold standard. Model performance was compared using several metrics, including the area under the receiver operating characteristic curve. The best-performing model was applied to assess feature importance. Obesity risk scores were computed from the best model generated from the claims database and compared against the BMI recorded in the EMR.
Results: The PPV of diagnosis codes from claims alone remained high over the study period (85.4-89.2%); sensitivity was low (16.8-44.8%). XGBoost performed the best at predicting obesity, with the highest area under the curve (AUC; 79.4%) and the lowest Brier score. The number of obesity diagnoses and obesity diagnoses from inpatient settings were the most important predictors of obesity. XGBoost showed an AUC of 74.1% when trained without an obesity diagnosis.
Conclusions: Obesity prevalence is under-reported in claims databases. ML models, with or without explicit obesity diagnoses, show promise in improving obesity prediction accuracy compared with obesity codes alone. Improved obesity status prediction may assist practitioners and payors to estimate the burden of obesity and investigate the potential unmet needs of current treatments.
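
The sensitivity and PPV comparison described in this project can be sketched as follows. This is only an illustration under stated assumptions: the function `sensitivity_and_ppv`, the variable names, and the ten-person toy cohort are hypothetical and do not come from the study.

```python
def sensitivity_and_ppv(claims_flag, gold_flag):
    """Compare a claims-based flag against a gold-standard flag.

    sensitivity = TP / (TP + FN): share of truly obese people the codes find.
    PPV         = TP / (TP + FP): share of coded people who are truly obese.
    """
    pairs = list(zip(claims_flag, gold_flag))
    tp = sum(1 for c, g in pairs if c and g)
    fp = sum(1 for c, g in pairs if c and not g)
    fn = sum(1 for c, g in pairs if not c and g)
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("need at least one gold-standard positive and one claims positive")
    return tp / (tp + fn), tp / (tp + fp)


# Hypothetical toy cohort (not study data): 1 = obese, 0 = not obese.
emr_obese = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]     # gold standard from EMR BMI
claims_obese = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]  # BMI-related diagnosis code present

sens, ppv = sensitivity_and_ppv(claims_obese, emr_obese)
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
```

In the toy cohort the codes miss most obese people (low sensitivity) while most coded people are truly obese (higher PPV), which is the same qualitative pattern the project reports for claims codes.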

Project description:
Importance: Accurate risk stratification of patients with heart failure (HF) is critical to deploy targeted interventions aimed at improving patients' quality of life and outcomes.
Objectives: To compare machine learning approaches with traditional logistic regression in predicting key outcomes in patients with HF and evaluate the added value of augmenting claims-based predictive models with electronic medical record (EMR)-derived information.
Design, Setting, and Participants: A prognostic study with a 1-year follow-up period was conducted including 9502 Medicare-enrolled patients with HF from 2 health care provider networks in Boston, Massachusetts ("providers" includes physicians, clinicians, other health care professionals, and their institutions that comprise the networks). The study was performed from January 1, 2007, to December 31, 2014; data were analyzed from January 1 to December 31, 2018.
Main Outcomes and Measures: All-cause mortality, HF hospitalization, top cost decile, and home days loss greater than 25% were modeled using logistic regression, least absolute shrinkage and selection operator regression, classification and regression trees, random forests, and gradient-boosted modeling (GBM). All models were trained using data from network 1 and tested in network 2. After selecting the most efficient modeling approach based on discrimination, Brier score, and calibration, area under precision-recall curves (AUPRCs) and net benefit estimates from decision curves were calculated to focus on the differences when using claims-only vs claims + EMR predictors.
Results: A total of 9502 patients with HF with a mean (SD) age of 78 (8) years were included: 6113 from network 1 (training set) and 3389 from network 2 (testing set). Gradient-boosted modeling consistently provided the highest discrimination, lowest Brier scores, and good calibration across all 4 outcomes; however, logistic regression had generally similar performance (C statistics for logistic regression based on claims-only predictors: mortality, 0.724; 95% CI, 0.705-0.744; HF hospitalization, 0.707; 95% CI, 0.676-0.737; high cost, 0.734; 95% CI, 0.703-0.764; and home days loss, 0.781; 95% CI, 0.764-0.798; C statistics for GBM: mortality, 0.727; 95% CI, 0.708-0.747; HF hospitalization, 0.745; 95% CI, 0.718-0.772; high cost, 0.733; 95% CI, 0.703-0.763; and home days loss, 0.790; 95% CI, 0.773-0.807). Higher AUPRCs were obtained for claims + EMR vs claims-only GBMs predicting mortality (0.484 vs 0.423), HF hospitalization (0.413 vs 0.403), and home time loss (0.575 vs 0.521) but not cost (0.249 vs 0.252). The net benefit for claims + EMR vs claims-only GBMs was higher at various threshold probabilities for mortality and home time loss outcomes but similar for the other 2 outcomes.
Conclusions and Relevance: Machine learning methods offered only limited improvement over traditional logistic regression in predicting key HF outcomes. Inclusion of additional predictors from EMRs to claims-based models appeared to improve prediction for some, but not all, outcomes.
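
For readers unfamiliar with decision-curve analysis, the sketch below shows the standard net benefit calculation at a single threshold probability: true positives per patient minus false positives per patient, weighted by the odds of the threshold. The function name `net_benefit` and the toy data are assumptions for illustration only, not the study's implementation.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone with predicted risk >= threshold.

    net benefit = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data (not study data): 1 = outcome occurred, 0 = it did not.
events = [0, 0, 1, 0, 1, 0, 0, 1]
risks = [0.05, 0.10, 0.80, 0.20, 0.60, 0.30, 0.70, 0.40]

model_nb = net_benefit(events, risks, threshold=0.5)
# "Treat all" reference strategy: every patient is flagged regardless of risk.
treat_all_nb = net_benefit(events, [1.0] * len(events), threshold=0.5)
print(f"model: {model_nb:.3f}, treat all: {treat_all_nb:.3f}")
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the "treat all" and "treat none" (net benefit 0) strategies, so a model is useful at a given threshold only if its net benefit exceeds both.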

Project description:
Objectives: Administrative claims data sets are often used for emergency care research and policy investigations of healthcare resource utilization, acute care practices, and evaluation of quality improvement interventions. Despite the high profile of emergency department (ED) visits in analyses using administrative claims, little work has evaluated the degree to which existing definitions based on claims data accurately capture conventionally defined hospital-based ED services. We sought to construct an operational definition for ED visitation using a comprehensive Medicare data set and to compare this definition to existing operational definitions used by researchers and policymakers.
Methods: We examined four operational definitions of an ED visit commonly used by researchers and policymakers using a 20% sample of the 2012 Medicare Chronic Condition Warehouse (CCW) data set. The CCW data set included all Part A (hospital) and Part B (hospital outpatient, physician) claims for a nationally representative sample of continuously enrolled Medicare fee-for-service beneficiaries. Three definitions were based on published research or existing quality metrics: 1) a provider claims-based definition, 2) a facility claims-based definition, and 3) the CMS Research Data Assistance Center (ResDAC) definition. In addition, we developed a fourth operational definition (Yale definition) that sought to incorporate additional coding rules for identifying ED visits. We report levels of agreement and disagreement among the four definitions.
Results: Of 10,717,786 beneficiaries included in the sample data set, 22% had evidence of ED use during the study year under any of the ED visit definitions. The definition using provider claims identified a total of 4,199,148 ED visits, the facility definition 4,795,057 visits, the ResDAC definition 5,278,980 ED visits, and the Yale definition 5,192,235 ED visits. The Yale definition identified a statistically different (p < 0.05) collection of ED visits than all other definitions, including 17% more ED visits than the provider definition and 2% fewer visits than the ResDAC definition. Differences in ED visitation counts between definitions occurred for several reasons, including the inclusion of critical care or observation services in the ED, discrepancies between facility and provider billing regulations, and operational decisions of each definition.
Conclusion: Current operational definitions of ED visitation using administrative claims produce different estimates of ED visitation based on the underlying assumptions applied to billing data and data set availability. Future analyses using administrative claims data should seek to validate specific definitions and inform the development of a consistent, consensus ED visitation definition to standardize research reporting and the interpretation of policy interventions.
