Dataset Information

Machine learning optimization of an electronic health record audit for heart failure in primary care.

ABSTRACT:

Aims

The diagnosis of heart failure (HF) is an important problem in primary care. We previously demonstrated a 74% increase in registered HF diagnoses in primary care electronic health records (EHRs) following an extended audit procedure. What remains unclear is the accuracy of registered HF pre-audit and which EHR variables are most important in the extended audit strategy. This study aims to describe the diagnostic HF classification sequence at different stages, assess general practitioner (GP) HF misclassification, and test the predictive performance of an optimized audit.

Methods and results

This is a secondary analysis of the OSCAR-HF study, a prospective observational trial including 51 participating GPs. OSCAR used an extended audit based on typical HF risk factors, signs, symptoms, and medications in GPs' EHR. This resulted in a list of possible HF patients, which participating GPs had to classify as HF or non-HF. We compared registered HF diagnoses before and after GPs' assessment. For our analysis of audit performance, we used GPs' assessment of HF as primary outcome and audit queries as dichotomous predictor variables for a gradient boosted machine (GBM) decision tree algorithm and logistic regression model. Of the 18 011 patients eligible for the audit intervention, 4678 (26.0%) were identified as possible HF patients and submitted for GPs' assessment in the audit stage. There were 310 patients with registered HF before GP assessment, of whom 146 (47.1%) were judged not to have HF by their GP (over-registration). There were 538 patients with registered HF after GP assessment, of whom 374 (69.5%) did not have registered HF before GP assessment (under-registration). The GBM and logistic regression model had a comparable predictive performance (area under the curve of 0.70 [95% confidence interval 0.65-0.77] and 0.69 [95% confidence interval 0.64-0.75], respectively). This was not significantly impacted by reducing the set of predictor variables to the 10 most important variables identified in the GBM model (free-text and coded cardiomyopathy, ischaemic heart disease and atrial fibrillation, digoxin, mineralocorticoid receptor antagonists, and combinations of renin-angiotensin system inhibitors and beta-blockers with diuretics). This optimized query set was enough to identify 86% (n = 461/538) of GPs' self-assessed HF population with a 33% reduction (n = 1537/4678) in screening caseload.

Conclusions

Diagnostic coding of HF in primary care health records is inaccurate with a high degree of under-registration and over-registration. An optimized query set enabled identification of more than 80% of GPs' self-assessed HF population.
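The percentages quoted in the abstract can be recomputed from the reported counts. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not taken from the study, and the final consistency check assumes that every pre-audit HF registration confirmed by the GP remained registered after assessment.

```python
# Counts as reported in the abstract (variable names are illustrative).
eligible = 18_011            # patients eligible for the audit intervention
flagged = 4_678              # possible HF patients submitted for GP assessment
hf_before = 310              # registered HF before GP assessment
hf_before_rejected = 146     # of those, judged not to have HF
hf_after = 538               # registered HF after GP assessment
hf_after_new = 374           # of those, not registered before assessment
found_by_optimized = 461     # HF patients identified by the optimized query set
caseload_saved = 1_537       # patients no longer needing screening


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal."""
    return round(100 * part / whole, 1)


summary = {
    "flagged_share": pct(flagged, eligible),
    "over_registration": pct(hf_before_rejected, hf_before),
    "under_registration": pct(hf_after_new, hf_after),
    "optimized_query_yield": pct(found_by_optimized, hf_after),
    "caseload_reduction": pct(caseload_saved, flagged),
    "increase_in_registered_hf": pct(hf_after - hf_before, hf_before),
}

# Assumption: confirmed pre-audit registrations plus new registrations
# should add up to the post-assessment total.
confirmed_before = hf_before - hf_before_rejected
counts_consistent = confirmed_before + hf_after_new == hf_after

for name, value in summary.items():
    print(f"{name}: {value}%")
print("counts consistent:", counts_consistent)
```

Under these counts, the optimized query set's yield comes to 85.7% (reported as 86%) and the caseload reduction to 32.9% (reported as 33%).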

SUBMITTER: Raat W

PROVIDER: S-EPMC8787980 | biostudies-literature | 2022 Feb

REPOSITORIES: biostudies-literature


Publications

Machine learning optimization of an electronic health record audit for heart failure in primary care.

Willem Raat, Miek Smeets, Severine Henrard, Bert Aertgeerts, Joris Penders, Walter Droogne, Wilfried Mullens, Stefan Janssens, Bert Vaes

ESC Heart Failure, 2021-11-23, issue 1


PMID: 34816632

Similar Datasets

Project description: Reduction of preventable hospital readmissions that result from chronic or acute conditions like stroke, heart failure, myocardial infarction and pneumonia remains a significant challenge for improving the outcomes and decreasing the cost of healthcare delivery in the United States. Patient readmission rates are relatively high for conditions like heart failure (HF) despite the implementation of high-quality healthcare delivery operation guidelines created by regulatory authorities. Multiple predictive models are currently available to evaluate potential 30-day readmission rates of patients. Most of these models are hypothesis driven and repetitively assess the predictive abilities of the same set of biomarkers as predictive features. In this manuscript, we discuss our attempt to develop a data-driven, electronic-medical record-wide (EMR-wide) feature selection approach and subsequent machine learning to predict readmission probabilities. We have assessed a large repertoire of variables from electronic medical records of heart failure patients in a single center. The cohort included 1,068 patients, of whom 178 were readmitted within a 30-day interval (16.66% readmission rate). A total of 4,205 variables were extracted from EMR including diagnosis codes (n=1,763), medications (n=1,028), laboratory measurements (n=846), surgical procedures (n=564) and vital signs (n=4). We designed a multistep modeling strategy using the Naïve Bayes algorithm. In the first step, we created individual models to classify the cases (readmitted) and controls (non-readmitted). In the second step, features contributing to predictive risk from independent models were combined into a composite model using a correlation-based feature selection (CFS) method. All models were trained and tested using a 5-fold cross-validation method, with 70% of the cohort used for training and the remaining 30% for testing. Compared to existing predictive models for HF readmission rates (AUCs in the range of 0.6-0.7), results from our EMR-wide predictive model (AUC=0.78; Accuracy=83.19%) and phenome-wide feature selection strategies are encouraging and reveal the utility of such data-driven machine learning. Fine tuning of the model, replication using multi-center cohorts and a prospective clinical trial to evaluate the clinical utility would help the adoption of the model as a clinical decision system for evaluating readmission status.

Project description: OBJECTIVES: The goal of this study was to use machine learning to more accurately predict survival after echocardiography. BACKGROUND: Predicting patient outcomes (e.g., survival) following echocardiography is primarily based on ejection fraction (EF) and comorbidities. However, there may be significant predictive information within additional echocardiography-derived measurements combined with clinical electronic health record data. METHODS: Mortality was studied in 171,510 unselected patients who underwent 331,317 echocardiograms in a large regional health system. The authors investigated the predictive performance of nonlinear machine learning models compared with that of linear logistic regression models using 3 different inputs: 1) clinical variables, including 90 cardiovascular-relevant International Classification of Diseases, Tenth Revision, codes, and age, sex, height, weight, heart rate, blood pressures, low-density lipoprotein, high-density lipoprotein, and smoking; 2) clinical variables plus physician-reported EF; and 3) clinical variables and EF, plus 57 additional echocardiographic measurements. Missing data were imputed using multivariate imputation by chained equations (MICE). The authors compared models versus each other and baseline clinical scoring systems by using a mean area under the curve (AUC) over 10 cross-validation folds and across 10 survival durations (6 to 60 months). RESULTS: Machine learning models achieved significantly higher prediction accuracy (all AUC >0.82) over common clinical risk scores (AUC = 0.61 to 0.79), with the nonlinear random forest models outperforming logistic regression (p < 0.01). The random forest model including all echocardiographic measurements yielded the highest prediction accuracy (p < 0.01 across all models and survival durations). Only 10 variables were needed to achieve 96% of the maximum prediction accuracy, with 6 of these variables being derived from echocardiography. Tricuspid regurgitation velocity was more predictive of survival than LVEF. In a subset of studies with complete data for the top 10 variables, multivariate imputation by chained equations yielded slightly reduced predictive accuracies (difference in AUC of 0.003) compared with the original data. CONCLUSIONS: Machine learning can fully utilize large combinations of disparate input variables to predict survival after echocardiography with superior accuracy.

Project description: Rationale: Patients transferred from the intensive care unit to the wards who are later readmitted to the intensive care unit have increased length of stay, healthcare expenditure, and mortality compared with those who are never readmitted. Improving risk stratification for patients transferred to the wards could have important benefits for critically ill hospitalized patients. Objectives: We aimed to use a machine-learning technique to derive and validate an intensive care unit readmission prediction model with variables available in the electronic health record in real time and compare it to previously published algorithms. Methods: This observational cohort study was conducted at an academic hospital in the United States with approximately 600 inpatient beds. A total of 24,885 intensive care unit transfers to the wards were included, with 14,962 transfers (60%) in the training cohort and 9,923 transfers (40%) in the internal validation cohort. Patient characteristics, nursing assessments, International Classification of Diseases, Ninth Revision codes from prior admissions, medications, intensive care unit interventions, diagnostic tests, vital signs, and laboratory results were extracted from the electronic health record and used as predictor variables in a gradient-boosted machine model. Accuracy for predicting intensive care unit readmission was compared with the Stability and Workload Index for Transfer score and Modified Early Warning Score in the internal validation cohort and also externally using the Medical Information Mart for Intensive Care database (n = 42,303 intensive care unit transfers). Results: Eleven percent (2,834) of discharges to the wards were later readmitted to the intensive care unit. The machine-learning-derived model had significantly better performance (area under the receiver operating curve, 0.76) than either the Stability and Workload Index for Transfer score (area under the receiver operating curve, 0.65), or Modified Early Warning Score (area under the receiver operating curve, 0.58; P value < 0.0001 for all comparisons). At a specificity of 95%, the derived model had a sensitivity of 28% compared with 15% for Stability and Workload Index for Transfer score and 7% for the Modified Early Warning Score. Accuracy improvements with the derived model over Modified Early Warning Score and Stability and Workload Index for Transfer were similar in the Medical Information Mart for Intensive Care-III cohort. Conclusions: A machine learning approach to predicting intensive care unit readmission was significantly more accurate than previously published algorithms in both our internal validation and the Medical Information Mart for Intensive Care-III cohort. Implementation of this approach could target patients who may benefit from additional time in the intensive care unit or more frequent monitoring after transfer to the hospital ward.
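Reporting sensitivity at a fixed specificity, as in the abstract above, means choosing the score threshold that keeps the required share of true negatives below it and then counting how many true positives score above it. The sketch below illustrates that calculation on made-up scores; the function name, the data, and the tie-handling convention (a case is predicted positive only when its score is strictly above the threshold) are assumptions for illustration, not the study's method or data.

```python
import math


def sensitivity_at_specificity(scores, labels, target_specificity):
    """Return (threshold, sensitivity, specificity) for the lowest threshold
    at which specificity is at least `target_specificity`.

    A case is predicted positive when its score is strictly above the threshold.
    """
    negatives = sorted(s for s, y in zip(scores, labels) if y == 0)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative case")

    # Number of negatives that must fall at or below the threshold.
    # The small tolerance guards against floating-point overshoot.
    k = max(1, math.ceil(target_specificity * len(negatives) - 1e-9))
    threshold = negatives[k - 1]

    specificity = sum(s <= threshold for s in negatives) / len(negatives)
    sensitivity = sum(s > threshold for s in positives) / len(positives)
    return threshold, sensitivity, specificity


# Hypothetical risk scores: 1 = readmitted, 0 = not readmitted.
scores = [0.10, 0.20, 0.30, 0.40, 0.90, 0.35, 0.50, 0.60, 0.80]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]

threshold, sens, spec = sensitivity_at_specificity(scores, labels, 0.80)
print(f"threshold={threshold}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Comparing models at the same specificity, rather than by area under the curve alone, shows how many true cases each one catches for a given false-alarm burden.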

Project description: Objective: Electronic health records (EHR) offer medical and pharmacogenomics research unprecedented opportunities to identify and classify patients at risk. EHRs are collections of highly inter-dependent records that include biological, anatomical, physiological, and behavioral observations. They comprise a patient's clinical phenome, where each patient has thousands of date-stamped records distributed across many relational tables. Development of EHR computer-based phenotyping algorithms requires time and medical insight from clinical experts, who most often can only review a small patient subset representative of the total EHR records, to identify phenotype features. In this research we evaluate whether relational machine learning (ML) using inductive logic programming (ILP) can contribute to addressing these issues as a viable approach for EHR-based phenotyping. Methods: Two relational learning ILP approaches and three well-known WEKA (Waikato Environment for Knowledge Analysis) implementations of non-relational approaches (PART, J48, and JRIP) were used to develop models for nine phenotypes. International Classification of Diseases, Ninth Revision (ICD-9) coded EHR data were used to select training cohorts for the development of each phenotypic model. Accuracy, precision, recall, F-Measure, and Area Under the Receiver Operating Characteristic (AUROC) curve statistics were measured for each phenotypic model based on independent manually verified test cohorts. A two-sided binomial distribution test (sign test) compared the five ML approaches across phenotypes for statistical significance. Results: We developed an approach to automatically label training examples using ICD-9 diagnosis codes for the ML approaches being evaluated. Nine phenotypic models for each ML approach were evaluated, resulting in better overall model performance in AUROC using ILP when compared to PART (p=0.039), J48 (p=0.003) and JRIP (p=0.003). Discussion: ILP has the potential to improve phenotyping by independently delivering clinically expert interpretable rules for phenotype definitions, or intuitive phenotypes to assist experts. Conclusion: Relational learning using ILP offers a viable approach to EHR-driven phenotyping.

Project description: Aims: We aimed to create a predictive model utilizing machine learning (ML) to identify new cases of congestive heart failure (CHF) in individuals with diabetes in primary health care (PHC) through the analysis of diagnostic data. Methods: We used a sex- and age-matched case-control design. Cases of new CHF were identified across all outpatient care settings 2015-2022 (n = 9098). We included individuals 30 years and above, by sex and age groups of 30-65 years and >65 years. The controls (five per case) were sampled from the individuals in 2015-2022 without CHF at any time between 2010 and 2022, in total 45 490. From the stochastic gradient boosting (SGB) technique model, we obtained a rank of the 10 most important factors related to newly diagnosed CHF in individuals with diabetes, with the normalized relative influence (NRI) score and a corresponding odds ratio of marginal effects (ORME). Area under curve (AUC) was calculated. Results: For women 30-65 years and >65 years, we identified 488 and 3240 new cases of CHF, respectively, and for men 30-65 years and >65 years, 1196 and 4174 new cases. Among the 10 most important factors in the four groups (divided by sex and lower and higher age) for newly diagnosed CHF, we found that the number of visits 12 months before diagnosis (NRI 44.3%-55.9%), coronary artery disease (NRI 2.9%-7.8%), atrial fibrillation and flutter (NRI 6.6%-12.2%) and 'abnormalities of breathing' (ICD-10 code R06) (NRI 2.6%-4.4%) were predictive in all groups. For younger women, a diagnosis of COPD (NRI 2.7%) contributed to the predictive effect, while for older women, oedema (NRI 3.1%) and number of years with diabetes (NRI 3.5%) contributed to the predictive effect. For men in both age groups, chronic renal disease had a predictive effect (NRI 3.9%-5.1%). The model prediction of CHF among patients with diabetes was high, with AUC around 0.85 for the four groups, and with sensitivity over 0.783 and specificity over 0.708 for all four groups. Conclusions: An SGB model using routinely collected data about diagnoses and number of visits in primary care can accurately predict risk for diagnosis of heart failure in individuals with diabetes. Age and sex differences in predictive factors warrant further examination.

Project description: Background: Poor functional status is a key marker of morbidity, yet is not routinely captured in clinical encounters. We developed and evaluated the accuracy of a machine learning algorithm that leveraged electronic health record (EHR) data to provide a scalable process for identification of functional impairment. Methods: We identified a cohort of patients with an electronically captured screening measure of functional status (Older Americans Resources and Services ADL/IADL) between 2018 and 2020 (N = 6484). Patients were classified using unsupervised learning K means and t-distributed Stochastic Neighbor Embedding into normal function (NF), mild to moderate functional impairment (MFI), and severe functional impairment (SFI) states. Using 11 EHR clinical variable domains (832 variable input features), we trained an Extreme Gradient Boosting supervised machine learning algorithm to distinguish functional status states, and measured prediction accuracies. Data were randomly split into training (80%) and test (20%) sets. The SHapley Additive Explanations (SHAP) feature importance analysis was used to list the EHR features in rank order of their contribution to the outcome. Results: Median age was 75.3 years, 62% female, 60% White. Patients were classified as 53% NF (n = 3453), 30% MFI (n = 1947), and 17% SFI (n = 1084). Summary of model performance for identifying functional status state (NF, MFI, SFI) was AUROC (area under the receiver operating characteristic curve) 0.92, 0.89, and 0.87, respectively. Age, falls, hospitalization, home health use, labs (e.g., albumin), comorbidities (e.g., dementia, heart failure, chronic kidney disease, chronic pain), and social determinants of health (e.g., alcohol use) were highly ranked features in predicting functional status states. Conclusion: A machine learning algorithm run on EHR clinical data has potential utility for differentiating functional status in the clinical setting. Through further validation and refinement, such algorithms can complement traditional screening methods and result in a population-based strategy for identifying patients with poor functional status who need additional health resources.
