
Dataset Information


Machine Learning-Based Cardiovascular Disease Prediction Model: A Cohort Study on the Korean National Health Insurance Service Health Screening Database.


ABSTRACT: This study proposes a cardiovascular disease (CVD) prediction model using machine learning (ML) algorithms based on the National Health Insurance Service-Health Screening datasets. We extracted 4699 patients aged over 45 as the CVD group, diagnosed according to the International Classification of Diseases system (codes I20-I25). In addition, 4699 randomly selected subjects without a CVD diagnosis were enrolled as a non-CVD group. The two groups were matched by age and gender. Various ML algorithms were applied to CVD prediction, and the performance of all prediction models was then compared. The extreme gradient boosting, gradient boosting, and random forest algorithms exhibited the best average prediction accuracy (area under the receiver operating characteristic curve (AUROC): 0.812, 0.812, and 0.811, respectively) among all algorithms validated in this study. Based on AUROC, the ML algorithms improved CVD prediction performance compared to previously proposed prediction models. Preexisting CVD history was the most important factor contributing to the accuracy of the prediction model, followed by total cholesterol, low-density lipoprotein cholesterol, waist-height ratio, and body mass index. Our results indicate that the proposed health screening dataset-based CVD prediction model using ML algorithms is readily applicable, produces validated results, and outperforms previous CVD prediction models.
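The abstract compares models by AUROC. As a rough illustration of what that number measures, the sketch below computes AUROC from the rank-sum (Mann-Whitney U) formulation: the probability that a randomly chosen CVD case receives a higher predicted score than a randomly chosen non-CVD subject, with ties counted as one half. This is not the authors' pipeline, which the record does not show. The function name `auroc` and the toy labels and scores are assumptions made for illustration, not data from the study.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic (ties count as 0.5).

    labels: iterable of 0/1 (1 = CVD case, 0 = non-CVD subject)
    scores: iterable of predicted risk scores, same length as labels
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)

    # Assign 1-based ranks, averaging the rank across tied scores.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    # U statistic for the positive class, normalised by the number of pos/neg pairs.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example (hypothetical values, not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

Read this way, an AUROC of 0.812 means the model ranks a case above a matched non-case in about 81% of such pairs. In a balanced, matched design like the one described (4699 vs. 4699), a model with no discriminative ability would score about 0.5.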

SUBMITTER: Kim JOR 

PROVIDER: S-EPMC8229422 | biostudies-literature | 2021 May

REPOSITORIES: biostudies-literature


Publications

Machine Learning-Based Cardiovascular Disease Prediction Model: A Cohort Study on the Korean National Health Insurance Service Health Screening Database.

Kim Joung Ouk Ryan, Jeong Yong-Suk, Kim Jin Ho, Lee Jong-Weon, Park Dougho, Kim Hyoung-Seop

Diagnostics (Basel, Switzerland), 2021-05-25, issue 6



Similar Datasets

S-EPMC8060521 | biostudies-literature
S-EPMC7537455 | biostudies-literature
S-EPMC10065252 | biostudies-literature
S-EPMC9308178 | biostudies-literature
S-EPMC7646339 | biostudies-literature
S-EPMC7441000 | biostudies-literature
S-EPMC10415647 | biostudies-literature
S-EPMC8041859 | biostudies-literature
S-EPMC8879589 | biostudies-literature
S-EPMC8320502 | biostudies-literature