Dataset Information

Machine Learning for Predicting the 3-Year Risk of Incident Diabetes in Chinese Adults.

ABSTRACT: Purpose: We aimed to establish and validate a risk assessment system that combines demographic and clinical variables to predict the 3-year risk of incident diabetes in Chinese adults. Methods: A 3-year cohort study was performed on 15,928 Chinese adults without diabetes at baseline. All participants were randomly divided into a training set (n = 7,940) and a validation set (n = 7,988). XGBoost, a machine learning technique, was used to select the most important variables from the candidate variables, and a stepwise model was then built on the predictors chosen by the XGBoost model. The area under the receiver operating characteristic curve (AUC), decision curve analysis, and calibration analysis were used to assess the discrimination, clinical use, and calibration of the model, respectively. External validation was performed on a cohort of 11,113 Japanese participants. Results: In the training and validation sets, 148 and 145 incident diabetes cases occurred, respectively. XGBoost selected the 10 most important variables from 15 candidate variables; fasting plasma glucose (FPG), body mass index (BMI), and age were the top 3. A stepwise model and a prediction nomogram were then established. The AUCs of the stepwise model were 0.933 and 0.910 in the training and validation sets, respectively. The Hosmer-Lemeshow test showed no statistically significant difference between the predicted and observed diabetes risk (p = 0.068 for the training set, p = 0.165 for the validation set). Decision curve analysis supported the clinical use of the stepwise model across a wide range of threshold probabilities. There were almost no interactions between the predictors (most P-values for interaction >0.05). Furthermore, the AUC for the external validation set was 0.830, and the Hosmer-Lemeshow test for the external validation set showed no statistically significant difference between the predicted and observed diabetes risk (P = 0.824). Conclusion: We established and validated a risk assessment system for characterizing the 3-year risk of incident diabetes.
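The three evaluation measures named in the abstract (AUC for discrimination, the Hosmer-Lemeshow test for calibration, and net benefit for decision curve analysis) can be illustrated with a minimal sketch. It assumes binary outcome labels (1 = incident diabetes) and predicted risks strictly between 0 and 1. The function names `auc`, `hosmer_lemeshow`, and `net_benefit` are hypothetical and are not taken from the paper, which does not state how these quantities were computed.

```python
import math


def auc(y_true, y_score):
    """Probability that a random case scores higher than a random non-case,
    counting ties as one half (equivalent to the area under the ROC curve)."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def hosmer_lemeshow(y_true, y_score, groups=10):
    """Hosmer-Lemeshow statistic and p-value over equal-sized risk groups.

    Assumption of this sketch: the closed-form chi-square tail below is
    valid only for an even number of degrees of freedom (groups - 2),
    which holds for the conventional choice of 10 groups.
    """
    df = groups - 2
    if df < 2 or df % 2:
        raise ValueError("this sketch needs an even number of groups >= 4")
    pairs = sorted(zip(y_score, y_true))  # order participants by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)  # observed cases in the group
        expected = sum(s for s, _ in chunk)  # sum of predicted risks
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    half = stat / 2
    # Survival function of a chi-square distribution with even df.
    p_value = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )
    return stat, p_value


def net_benefit(y_true, y_score, threshold):
    """Net benefit of intervening on everyone whose predicted risk is at or
    above the threshold probability (one point on a decision curve)."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of threshold probabilities and comparing the result with the "treat all" and "treat none" strategies. A Hosmer-Lemeshow p-value above 0.05, as reported for all three cohorts, means the test found no evidence of miscalibration; it does not by itself demonstrate a perfect fit.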

SUBMITTER: Wu Y

PROVIDER: S-EPMC8275929 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

Machine Learning for Predicting the 3-Year Risk of Incident Diabetes in Chinese Adults.

Wu Yang, Hu Haofei, Cai Jinlin, Chen Runtian, Zuo Xin, Cheng Heng, Yan Dewen

Frontiers in Public Health, 2021-06-29

PMID: 34268283

Similar Datasets

Project description: Early identification is crucial for effective intervention in individuals at high risk of developing prediabetes. This study aimed to create a personalized nomogram to determine the 5-year risk of prediabetes among Chinese adults. This retrospective cohort study included 184,188 participants without prediabetes at baseline. Participants were randomly assigned to a training cohort (n = 92,177) and a validation cohort (n = 92,011). We compared five prediction models on the training cohort: a full Cox proportional hazards model, a stepwise Cox proportional hazards model, multivariable fractional polynomials (MFP), machine learning, and least absolute shrinkage and selection operator (LASSO) models. We also validated these five models on the validation cohort and chose the LASSO model as the final risk prediction model for prediabetes. We presented the model as a nomogram. The model's discriminative ability, clinical utility, and calibration were evaluated on the training cohort using the area under the receiver operating characteristic (ROC) curve, decision curve analysis, and calibration analysis, respectively. The nomogram was evaluated in the same way on the validation cohort. The 5-year incidence of prediabetes was 10.70% and 10.69% in the training and validation cohorts, respectively. We developed a simple nomogram that predicted the risk of prediabetes from age, body mass index (BMI), fasting plasma glucose (FPG), triglycerides (TG), systolic blood pressure (SBP), and serum creatinine (Scr). The nomogram's area under the receiver operating characteristic curve (AUC) was 0.7341 (95% CI 0.7290-0.7392) for the training cohort and 0.7336 (95% CI 0.7285-0.7387) for the validation cohort, indicating good discriminative ability. The calibration curve showed close agreement between the predicted and observed prediabetes risk. Decision curve analysis presented the clinical application of the nomogram across a range of threshold probabilities. A personalized prediabetes prediction nomogram was developed and validated among Chinese adults to identify high-risk individuals. Doctors and others can easily and efficiently use our prediabetes prediction model when assessing prediabetes risk.

Project description: Importance: Spouses share common socioeconomic, environmental, and lifestyle factors, and multiple studies have found that spousal diabetes status was associated with diabetes prevalence. But the association of spousal diabetes status and ideal cardiovascular health metrics (ICVHMs) assessed by the American Heart Association's Life's Essential 8 measures with incident diabetes has not been comprehensively characterized, especially in large-scale cohort studies. Objective: To explore the association of spousal diabetes status and cardiovascular health metrics with risk of incident diabetes in Chinese adults. Design, setting, and participants: This cohort study included individuals in the China Cardiovascular Disease and Cancer Cohort without diabetes who underwent baseline and follow-up glucose measurements and had spouses with baseline glucose measurements. The data were collected in January 2011 to December 2012 and March 2014 to December 2016. The spousal study had a mean (SD) follow-up of 3.6 (0.9) years (median [IQR], 3.2 [2.9-4.5] years). Statistical analysis was performed from July to November 2022. Exposure: Spousal diabetes status was diagnosed according to the 2010 American Diabetes Association (ADA) criteria. All participants provided detailed clinical, sociodemographic, and lifestyle information included in cardiovascular health metrics. Main outcomes and measures: Incident diabetes, diagnosed according to 2010 ADA criteria. Results: Overall, 34 821 individuals were included, with a mean (SD) age of 56.4 (8.3) years and 16 699 (48.0%) male participants. Spousal diabetes diagnosis was associated with an increased risk of incident diabetes (hazard ratio [HR], 1.15; 95% CI, 1.03-1.30). Furthermore, participants whose spouses had uncontrolled glycated hemoglobin (HbA1c) had a higher risk of diabetes (HR, 1.20; 95% CI, 1.04-1.39) but the risk of diabetes in participants whose spouses had controlled HbA1c did not increase significantly (HR, 1.10; 95% CI, 0.92-1.30). Moreover, this association varied with composite cardiovascular health status. Diabetes risk in individuals who had poor cardiovascular health status (<4 ICVHMs) was associated with spousal diabetes status (3 ICVHMs: HR, 1.50; 95% CI, 1.15-1.97), while diabetes risk in individuals who had intermediate to ideal cardiovascular health status (≥4 ICVHMs) was not associated with it (4 ICVHMs: HR, 1.01; 95% CI, 0.69-1.50). Conclusions and relevance: In this study, spousal diabetes diagnosis with uncontrolled HbA1c level was associated with increased risk of incident diabetes, but strict management of spousal HbA1c level and improving ICVHM profiles may attenuate the association of spousal diabetes status with diabetes risk. These findings suggest the potential benefit of couple-based lifestyle or pharmaceutical interventions for diabetes.

Project description: Objectives: The purpose of this scoping review is (1) to identify existing supervised machine learning (ML) approaches to the prediction of cancer in asymptomatic adults; (2) to compare the performance of ML models with each other; and (3) to identify potential gaps in research. Design: Scoping review using the population, concept and context approach. Search strategy: The PubMed search engine was used from inception to 10 November 2020 to identify literature meeting the following inclusion criteria: (1) a general adult (≥18 years) population, either sex, asymptomatic (population); (2) any study using ML techniques to derive predictive models for future cancer risk using clinical and/or demographic and/or basic laboratory data (concept); and (3) original research articles conducted in all settings in any region of the world (context). Results: The search returned 627 unique articles, of which 580 were excluded because they did not meet the inclusion criteria, were duplicates or were related to benign neoplasm. Full-text reviews were conducted for 47 articles and a final set of 10 articles was included in this scoping review. These 10 very heterogeneous studies used ML to predict future cancer risk in asymptomatic individuals. All studies reported area under the receiver operating characteristic curve (AUC) values as metrics of model performance, but no study reported measures of model calibration. Conclusions: Research gaps that must be addressed in order to deliver validated ML-based models to assist clinical decision-making include: (1) establishing model generalisability through validation in independent cohorts, including those from low-income and middle-income countries; (2) establishing models for all cancer types; (3) thorough comparisons of ML models with best available clinical tools to ensure transparency of their potential clinical utility; (4) reporting of model calibration performance; and (5) comparisons of different methods on the same cohort to reveal important information about model generalisability and performance.
