Dataset Information

Prediction of type 2 diabetes mellitus onset using logistic regression-based scorecards.

SUBMITTER: Edlitz Y

PROVIDER: S-EPMC9255967 | biostudies-literature | 2022 Jun

REPOSITORIES: biostudies-literature

Publications

Prediction of type 2 diabetes mellitus onset using logistic regression-based scorecards.

Yochai Edlitz, Eran Segal

eLife, 2022 Jun 22

Background: Type 2 diabetes (T2D) accounts for ~90% of all cases of diabetes, resulting in an estimated 6.7 million deaths in 2021, according to the International Diabetes Federation. Early detection of patients at high risk of developing T2D can reduce the incidence of the disease through a change in lifestyle, diet, or medication. Since populations of lower socio-demographic status are more susceptible to T2D and might have limited resources or access to sophisticated computational re ...

PMID: 35731045

Similar Datasets

Project description: BACKGROUND: HPV-16-positive HNSCC and HPV-16-negative HNSCC have different clinical factors, representing distinct forms of cancer. The study aimed to identify patient-specific factors for HPV-16-positive HNSCC based on baseline clinical data. METHOD: Factors associated with HPV-16-positive HNSCC were identified using data from 210 patients diagnosed with HNSCC at University College of London Hospital between January 1, 2003, and April 30, 2015, inclusive. A series of models were developed using logistic regression methods, and the overall model fit was compared using the Akaike Information Criterion. Survival analysis was carried out with a Cox proportional hazards model for survival-time outcomes. The survival time for individual patients was defined as the time from diagnosis of HNSCC to the date of death from any cause. Patients who did not die were censored at the end of the study on April 30, 2015. RESULTS: Of the 210 patients, 151 (72%) were found to have HPV-16-positive HNSCC. The logistic regression model showed that the odds of HPV-16-positive HNSCC were 3.79 times higher in patients with Type 2 Diabetes Mellitus (T2DM) (odds ratio [OR], 3.79; 95% CI, 1.70-8.44) than in those without T2DM, and 8.84 times higher in patients with a history of primary HNSCC (OR, 8.84; 95% CI, 2.30-33.88) than in those without a history of primary HNSCC. HPV-16-positive HNSCC was also observed more often in the tonsils (OR, 4.02; 95% CI, 1.56-10.36) and less often in the oral cavity of non-alcohol drinkers (OR, 0.14; 95% CI, 0.03-0.56). Furthermore, individual patients were followed up for 1 to 13 years (median of 1 year). Patients with HPV-positive HNSCC had a median survival of 5 years (95% CI, 2.6-7.3 years). Among the HPV-16-positive HNSCC cohort, T2DM was a risk factor for poorer prognosis (hazard ratio, 2.57; 95% CI, 1.09-6.07) and was associated with a lower median survival of 3 years (95% CI, 1.8-4.1 years), as compared to 6 years (95% CI, 2.8-9.1 years) in non-T2DM patients. CONCLUSIONS: Patient-specific factors for HPV-positive HNSCC are T2DM, history of primary HNSCC, and tonsillar site. T2DM is associated with poorer prognosis. These findings suggest that routine HPV-16 screening in T2DM patients might be beneficial and could support better therapeutic and management strategies.

Project description: Objective: To predict preterm birth in nulliparous women using logistic regression and machine learning. Design: Population-based retrospective cohort. Participants: Nulliparous women (N = 112,963) with a singleton gestation who gave birth between 20-42 weeks gestation in Ontario hospitals from April 1, 2012 to March 31, 2014. Methods: We used data during the first and second trimesters to build logistic regression and machine learning models in a "training" sample to predict overall and spontaneous preterm birth. We assessed model performance using various measures of accuracy including sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) in an independent "validation" sample. Results: During the first trimester, logistic regression identified 13 variables associated with preterm birth, of which the strongest predictors were diabetes (Type I: adjusted odds ratio (AOR): 4.21; 95% confidence interval (CI): 3.23-5.42; Type II: AOR: 2.68; 95% CI: 2.05-3.46) and abnormal pregnancy-associated plasma protein A concentration (AOR: 2.04; 95% CI: 1.80-2.30). During the first trimester, the maximum AUC was 60% (95% CI: 58-62%) with artificial neural networks in the validation sample. During the second trimester, 17 variables were significantly associated with preterm birth, among which complications during pregnancy had the highest AOR (13.03; 95% CI: 12.21-13.90). During the second trimester, the AUC increased to 65% (95% CI: 63-66%) with artificial neural networks in the validation sample. Including complications during the pregnancy yielded an AUC of 80% (95% CI: 79-81%) with artificial neural networks. All models yielded 94-97% negative predictive values for spontaneous PTB during the first and second trimesters. Conclusion: Although artificial neural networks provided slightly higher AUC than logistic regression, prediction of preterm birth in the first trimester remained elusive. However, including data from the second trimester improved prediction to a moderate level by both logistic regression and machine learning approaches.

Project description: Type 1 diabetes mellitus (T1DM) patients are at significant risk of developing chronic kidney disease (CKD) during their life. However, there is a high chance of delay in CKD detection because CKD can be asymptomatic, and T1DM patients bypass traditional CKD tests during their routine checkups. This study aims to develop and validate a prediction model and nomogram of CKD in T1DM patients using readily available routine checkup data for early CKD detection. This research utilized sixteen years of longitudinal data on 1375 T1DM patients from the multi-center Epidemiology of Diabetes Interventions and Complications (EDIC) clinical trials conducted at 28 sites in the USA and Canada and considered 17 routinely available features. Three feature ranking algorithms, extreme gradient boosting (XGB), random forest (RF), and extremely randomized trees classifier (ERT), were applied to create three feature ranking lists, and logistic regression analyses were performed to develop CKD prediction models using these ranked feature lists to identify the best-performing combination of top-ranked features. Finally, the most significant features were selected to develop a multivariate logistic regression-based CKD prediction model for T1DM patients. This model was evaluated using sensitivity, specificity, accuracy, precision, and F1 score on train and test data. A nomogram of the final model was further generated for easy application in clinical practice. Hypertension, duration of diabetes, drinking habit, triglycerides, ACE inhibitors, low-density lipoprotein (LDL) cholesterol, age, and smoking habit were the top-8 features ranked by the XGB model and identified as the most important features for predicting CKD in T1DM patients. These eight features were selected to develop the final prediction model using multivariate logistic regression, which showed 90.04% and 88.59% accuracy in internal and test data validation. The proposed model showed excellent performance and can be used for CKD identification in T1DM patients during routine checkups.
