Dataset Information

Early prediction of end-stage kidney disease using electronic health record data: a machine learning approach with a 2-year horizon.


ABSTRACT:

Objectives

In the United States, end-stage kidney disease (ESKD) is responsible for high mortality and significant healthcare costs, and the number of cases has increased sharply over the past 2 decades. In this study, we aimed to reduce these impacts by developing a model that predicts the occurrence of ESKD within a 2-year horizon.

Materials and methods

We developed a machine learning (ML) pipeline to test different models for the prediction of ESKD. The electronic health record was used to capture several kidney disease-related variables. Various imputation methods, feature selection, and sampling approaches were tested. We compared the performance of multiple ML models using area under the ROC curve (AUCROC), area under the Precision-Recall curve (PR-AUC), and Brier scores for discrimination, precision, and calibration, respectively. Explainability methods were applied to the final model.
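The three evaluation metrics named above can be illustrated with a minimal sketch in plain Python. The labels and scores below are invented toy values, not study data, and average precision is used here as one common estimator of PR-AUC; the abstract does not say which estimator the authors used.

```python
# Toy example (hypothetical data): 10 patients, 2 of whom develop the outcome.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
y_score = [0.05, 0.10, 0.02, 0.20, 0.01, 0.03, 0.08, 0.90, 0.60, 0.40]


def auc_roc(labels, scores):
    """Discrimination: chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pr_auc(labels, scores):
    """Average precision: mean precision at the rank of each positive case.

    Assumes no tied scores; tied scores would need to be grouped.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def brier_score(labels, probs):
    """Calibration: mean squared error between predicted risk and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


print(auc_roc(y_true, y_score), pr_auc(y_true, y_score), brier_score(y_true, y_score))
```

When the outcome is rare, a constant prediction equal to the prevalence p already achieves a Brier score of p(1 - p), so a small Brier score is best read against that baseline rather than in isolation.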

Results

Our best model was a gradient-boosting machine with feature selection and imputation methods as additional components. The model exhibited an AUCROC of 0.97, a PR-AUC of 0.33, and a Brier score of 0.002 on a holdout test set. A chart review analysis by expert physicians indicated clinical utility.
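A pipeline of this shape (imputation, then feature selection, then a gradient-boosting classifier, scored on a holdout set) might look like the following scikit-learn sketch. Everything specific in it is an assumption made for illustration: the synthetic data, median imputation, `SelectKBest` with `k=10`, and `GradientBoostingClassifier` with default settings. The abstract does not state which imputation method, selection method, or boosting implementation the authors chose, and the resampling approaches they tested are omitted here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.metrics import average_precision_score, brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic, imbalanced stand-in for EHR features (assumption: not study data).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)
# Knock out roughly 10% of values to mimic missing measurements.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.10] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Imputation and feature selection are fitted on the training split only,
# because they live inside the pipeline.
model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("select", SelectKBest(f_classif, k=10)),
    ("gbm", GradientBoostingClassifier(random_state=0)),
])
model.fit(X_train, y_train)

# Predicted risk of the positive class on the holdout set.
risk = model.predict_proba(X_test)[:, 1]
metrics = {
    "auc_roc": roc_auc_score(y_test, risk),
    "pr_auc": average_precision_score(y_test, risk),
    "brier": brier_score_loss(y_test, risk),
}
print(metrics)
```

Keeping the preprocessing steps inside the `Pipeline` is a design choice that prevents information from the holdout set leaking into imputation or feature selection.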

Discussion and conclusion

An ESKD prediction model can identify individuals at risk for ESKD and has been successfully deployed within our health system.

SUBMITTER: Petousis P 

PROVIDER: S-EPMC10898824 | biostudies-literature | 2024 Apr

REPOSITORIES: biostudies-literature

Publications

Early prediction of end-stage kidney disease using electronic health record data: a machine learning approach with a 2-year horizon.

Petousis P, Wilson JM, Gelvezon AV, Alam S, Jain A, Prichard L, Elashoff DA, Raja N, Bui AAT

JAMIA Open, 2024-02-27, issue 1


Similar Datasets

S-EPMC8965763 | biostudies-literature
S-EPMC8861926 | biostudies-literature
S-EPMC6080076 | biostudies-literature
S-EPMC9120106 | biostudies-literature
S-EPMC9020923 | biostudies-literature
S-EPMC6950922 | biostudies-literature
S-EPMC8722098 | biostudies-literature
S-EPMC9756119 | biostudies-literature
S-EPMC5975145 | biostudies-literature
S-EPMC7756814 | biostudies-literature