Dataset Information

Machine Learning Model for Risk Prediction of Community-Acquired Acute Kidney Injury Hospitalization From Electronic Health Records: Development and Validation Study.

ABSTRACT: BACKGROUND: Community-acquired acute kidney injury (CA-AKI)-associated hospitalizations impose significant health care needs and contribute to in-hospital mortality. However, most risk prediction models developed to date have focused on AKI in a specific group of patients during hospitalization, and there is limited knowledge on the baseline risk in the general population for preventing CA-AKI-associated hospitalization. OBJECTIVE: To gain further insight into risk exploration, the aim of this study was to develop, validate, and establish a scoring system to facilitate health professionals in enabling early recognition and intervention of CA-AKI to prevent permanent kidney damage using different machine-learning techniques. METHODS: A nested case-control study design was employed using electronic health records derived from a group of Chang Gung Memorial Hospitals in Taiwan from 2010 to 2017 to identify 234,867 adults with at least two measures of serum creatinine at hospital admission. Patients were classified into a derivation cohort (2010-2016) and a temporal validation cohort (2017). Patients with the first episode of CA-AKI at hospital admission were classified into the case group and those without CA-AKI were classified in the control group. A total of 47 potential candidate variables, including age, gender, prior use of nephrotoxic medications, Charlson comorbid conditions, commonly measured laboratory results, and recent use of health services, were tested to develop a CA-AKI hospitalization risk model. Permutation-based selection with both the extreme gradient boost (XGBoost) and least absolute shrinkage and selection operator (LASSO) algorithms was performed to determine the top 10 important features for scoring function development. RESULTS: The discriminative ability of the risk model was assessed by the area under the receiver operating characteristic curve (AUC), and the predictive CA-AKI risk model derived by the logistic regression algorithm achieved an AUC of 0.767 (95% CI 0.764-0.770) on derivation and 0.761 on validation for any stage of AKI, with positive and negative predictive values of 19.2% and 96.1%, respectively. The risk model for prediction of CA-AKI stages 2 and 3 had an AUC value of 0.818 for the validation cohort with positive and negative predictive values of 13.3% and 98.4%, respectively. These metrics were evaluated at a cut-off value of 7.993, which was determined as the threshold to discriminate the risk of AKI. CONCLUSIONS: A machine learning-generated risk score model can identify patients at risk of developing CA-AKI-related hospitalization through a routine care data-driven approach. The validated multivariate risk assessment tool could help clinicians to stratify patients in primary care, and to provide monitoring and early intervention for preventing AKI while improving the quality of AKI care in the general population.
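
The positive and negative predictive values quoted in the abstract depend on where the risk score is cut. As a rough illustration only, the sketch below computes both values at a cut-off from a list of scores and outcomes. The function name `ppv_npv`, the toy data, and the convention that a score at or above the cut-off counts as high risk are all assumptions made for this example; none of them are taken from the study.

```python
from typing import Sequence, Tuple


def ppv_npv(scores: Sequence[float], labels: Sequence[int], cutoff: float) -> Tuple[float, float]:
    """Return (PPV, NPV), flagging scores >= cutoff as high risk (assumed convention)."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= cutoff
        if flagged and label == 1:
            tp += 1  # flagged and truly a case
        elif flagged:
            fp += 1  # flagged but not a case
        elif label == 1:
            fn += 1  # missed case
        else:
            tn += 1  # correctly not flagged
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Made-up toy data (NOT from the study): 2 cases among 10 patients.
toy_scores = [9.1, 8.4, 8.0, 7.5, 6.9, 6.0, 5.2, 4.8, 3.3, 2.1]
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

ppv, npv = ppv_npv(toy_scores, toy_labels, cutoff=7.993)
print(f"PPV={ppv:.3f}, NPV={npv:.3f}")
```

With few cases relative to controls, even a reasonably discriminating score tends to give a modest PPV alongside a high NPV, which is the pattern the abstract reports.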

SUBMITTER: Hsu CN

PROVIDER: S-EPMC7435690 | biostudies-literature | 2020 Aug

REPOSITORIES: biostudies-literature

Publications

Machine Learning Model for Risk Prediction of Community-Acquired Acute Kidney Injury Hospitalization From Electronic Health Records: Development and Validation Study.

Hsu Chien-Ning, Liu Chien-Liang, Tain You-Lin, Kuo Chin-Yu, Lin Yun-Chun

Journal of Medical Internet Research, 2020 Aug 4, issue 8

PMID: 32749223

Similar Datasets

Project description: Background: Cohort studies identifying the incidence, complications and co-morbidities associated with community acquired pneumonia (CAP) are largely based on administrative datasets and rely on International Classification of Diseases (ICD) codes; however, the reliability of ICD codes for hospital admissions for CAP in people with HIV (PWH) has not been systematically assessed. Methods: We used data from the Veterans Aging Cohort Study survey sample (N = 6824; 3410 PWH and 3414 uninfected) to validate the use of electronic health records (EHR) data to identify CAP hospitalizations when compared to chart review and to compare the performance in PWH vs. uninfected patients. We used different EHR algorithms that included a broad set of CAP ICD-9 codes, a set restricted to bacterial and viral CAP codes, and algorithms that included pharmacy data and/or other ICD-9 diagnoses frequently associated with CAP. We also compared microbiologic workup and etiologic diagnosis by HIV status among those with CAP. Results: Five hundred forty-nine patients were identified as having an ICD-9 code compatible with a CAP diagnosis (13% of PWH and 4% of the uninfected, p < 0.01). The EHR algorithm with the best overall positive predictive value (82%) was obtained by using the restricted set of ICD-9 codes (480-487) in primary position or secondary only to selected codes as primary (HIV disease, respiratory failure, sepsis or bacteremia) with the addition of EHR pharmacy data; this algorithm yielded PPVs of 83% in PWH and 73% in uninfected (P = 0.1) groups. Adding aspiration pneumonia (ICD-9 code 507) to any of the ICD-9 code/pharmacy combinations increased the number of cases but decreased the overall PPV. Allowing COPD exacerbation in the primary position improved the PPV among the uninfected group only (to 76%). More PWH than uninfected patients underwent microbiologic evaluation or had respiratory samples submitted. Conclusions: ICD-9 code-based algorithms perform similarly to identify CAP in PLWH and uninfected individuals. Adding antimicrobial use data and allowing as primary diagnoses ICD-9 codes frequently used in patients with CAP improved the performance of the algorithms in both groups of patients. The algorithms consistently performed better among PWH.

Project description: Background: Colorectal polyps are the main source of precancerous lesions in colorectal cancer. To increase the early diagnosis of tumors and improve their screening, we aimed to develop a simple and non-invasive diagnostic prediction model for colorectal polyps based on machine learning (ML) and using accessible health examination records. Methods: We conducted a single-center observational retrospective study in China. The derivation cohort, consisting of 5426 individuals who underwent colonoscopy screening from January 2021 to January 2024, was separated for training (cohort 1) and validation (cohort 2). The variables considered in this study included demographic data, vital signs, and laboratory results recorded by health examination records. With features selected by univariate analysis and Lasso regression analysis, nine machine learning methods were utilized to develop a colorectal polyp diagnostic model. Several evaluation indexes, including the area under the receiver-operating-characteristic curve (AUC), were used to compare the predictive performance. The SHapley additive explanation method (SHAP) was used to rank the feature importance and explain the final model. Results: 14 independent predictors were identified as the most valuable features to establish the models. The adaptive boosting machine (AdaBoost) model exhibited the best performance among the 9 ML models in cohort 1, with accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and AUC (95% CI) of 0.632 (0.618-0.646), 0.635 (0.550-0.721), 0.674 (0.591-0.758), 0.593 (0.576-0.611), 0.673 (0.654-0.691), 0.608 (0.560-0.655) and 0.687 (0.626-0.749), respectively. The final model gave an AUC of 0.675 in cohort 2. Additionally, the precision recall (PR) curve for the AdaBoost model reached the highest AUPR of 0.648, positioning it nearest to the upper right corner. SHAP analysis provided visualized explanations, reaffirming the critical factors associated with the risk of colorectal polyps in the asymptomatic population. Conclusions: This study integrated the clinical and laboratory indicators with machine learning techniques to establish the predictive model for colorectal polyps, providing non-invasive, cost-effective screening strategies for asymptomatic individuals and guiding decisions for further examination and treatment.

Project description: Aims: Hypokalemia is a common complication following traumatic brain injury, which may complicate treatment and lead to unfavorable outcomes. Identifying patients at risk of hypokalemia on the first day of admission helps to implement prophylactic treatment, reduce complications, and improve prognosis. Methods: This multicenter retrospective study was performed between January 2017 and December 2020 using the electronic medical records of patients admitted due to traumatic brain injury. A propensity score matching approach was adopted with a ratio of 1:1 to overcome overfitting and data imbalance during subgroup analyses. Five machine learning algorithms were applied to generate a best-performing prediction model for in-hospital hypokalemia. The internal fivefold cross-validation and external validation were performed to demonstrate the interpretability and generalizability. Results: A total of 4445 TBI patients were recruited for analysis and model generation. Hypokalemia occurred in 46.55% of recruited patients and the incidences of mild, moderate, and severe hypokalemia were 32.06%, 12.69%, and 1.80%, respectively. Hypokalemia was associated with increased mortality, while severe hypokalemia had a greater impact. The logistic regression algorithm had the best performance in predicting decreased serum potassium and moderate-to-severe hypokalemia, with an AUC of 0.73 ± 0.011 and 0.74 ± 0.019, respectively. The prediction model was further verified using two external datasets, including our previous published data and the open-access Medical Information Mart for Intensive Care database. Linearized calibration curves showed no statistical difference (p > 0.05) with perfect predictions. Conclusions: The occurrence of hypokalemia following traumatic brain injury can be predicted by first hospitalization day records and machine learning algorithms. The logistic regression algorithm showed optimal predictive performance, verified by both internal and external validation.

Project description: Background: A major problem in treating acute kidney injury (AKI) is that clinical criteria for recognition are markers of established kidney damage or impaired function; treatment before such damage manifests is desirable. Clinicians could intervene during what may be a crucial stage for preventing permanent kidney injury if patients with incipient AKI and those at high risk of developing AKI could be identified. Objective: In this study, we evaluate a machine learning algorithm for early detection and prediction of AKI. Design: We used a machine learning technique, boosted ensembles of decision trees, to train an AKI prediction tool on retrospective data taken from more than 300 000 inpatient encounters. Setting: Data were collected from inpatient wards at Stanford Medical Center and intensive care unit patients at Beth Israel Deaconess Medical Center. Patients: Patients older than the age of 18 whose hospital stays lasted between 5 and 1000 hours and who had at least one documented measurement of heart rate, respiratory rate, temperature, serum creatinine (SCr), and Glasgow Coma Scale (GCS). Measurements: We tested the algorithm's ability to detect AKI at onset and to predict AKI 12, 24, 48, and 72 hours before onset. Methods: We tested AKI detection and prediction using the National Health Service (NHS) England AKI Algorithm as a gold standard. We additionally tested the algorithm's ability to detect AKI as defined by the Kidney Disease: Improving Global Outcomes (KDIGO) guidelines. We compared the algorithm's 3-fold cross-validation performance to the Sequential Organ Failure Assessment (SOFA) score for AKI identification in terms of area under the receiver operating characteristic (AUROC). Results: The algorithm demonstrated high AUROC for detecting and predicting NHS-defined AKI at all tested time points. The algorithm achieves AUROC of 0.872 (95% confidence interval [CI], 0.867-0.878) for AKI detection at time of onset. For prediction 12 hours before onset, the algorithm achieves an AUROC of 0.800 (95% CI, 0.792-0.809). For 24-hour predictions, the algorithm achieves AUROC of 0.795 (95% CI, 0.785-0.804). For 48-hour and 72-hour predictions, the algorithm achieves AUROC values of 0.761 (95% CI, 0.753-0.768) and 0.728 (95% CI, 0.719-0.737), respectively. Limitations: Because of the retrospective nature of this study, we cannot draw any conclusions about the impact the algorithm's predictions will have on patient outcomes in a clinical setting. Conclusions: The results of these experiments suggest that a machine learning-based AKI prediction tool may offer important prognostic capabilities for determining which patients are likely to suffer AKI, potentially allowing clinicians to intervene before kidney damage manifests.

Project description: Background: Acute kidney injury (AKI), particularly community-acquired AKI (CA-AKI), is a major health concern globally. The International Society of Nephrology's "0 by 25" initiative to reduce preventable deaths from AKI to zero by 2025 is not achievable in low and middle income countries, such as India, possibly due to a lack of data and measures to tackle this urgent public health issue. In India, CA-AKI predisposes younger patients to hospitalization, morbidity, and mortality. This is the first multicenter, prospective, cohort study investigating CA-AKI and its consequences in India. Methods: This study included data from patients with CA-AKI (>12 years of age) housed in the Indian Society of Nephrology-AKI registry, involving 9 participating tertiary care centers in India, for the period between November 2016 and October 2019. The etiological spectrum and renal and patient outcomes of CA-AKI at the index visit and at 1-month and 3-month follow-ups were analyzed. The impact of socioeconomic status (SES) on outcomes was also analyzed. Findings: Data from 3711 patients (mean [±SD] age 44.7 ± 16.5 years; 66.6% male) were analyzed. The most common comorbidities included hypertension (21.1%) and diabetes (19.1%). AKI occurred in medical, surgical, and obstetrical settings in 86.7%, 7.3%, and 6%, respectively. The most common causes of AKI were associated with sepsis (34.7%) and tropical fever (9.8%). Mortality at the index admission was 10.8%. Complete recovery (CR), partial recovery (PR), and dialysis dependency among survivors at the time of discharge were 22.1%, 57.7%, and 9.4%, respectively. Overall, at 3 months of follow-up, mortality rate, CR, PR, and dialysis dependency rates were 11.4%, 72.2%, 7.2%, and 1%, respectively. Multivariate analysis revealed that age >65 years, alcoholism, anuria, hypotension at presentation, thrombocytopenia, vasopressor use, transaminitis, and low SES were associated with mortality at the index admission. Interpretation: Sepsis and tropical fever were the most common causes of CA-AKI. Presentation of CA-AKI to tertiary care units was associated with high mortality, and a significant number of patients progressed to CKD. Individuals with a low SES had increased risk of mortality and require immediate attention and intervention. Funding: This study was funded by the Indian Society of Nephrology.
