Dataset Information

Assessing reproducibility and utility of clustering of patients with type 2 diabetes and established CV disease (SAVOR-TIMI 53 trial).

ABSTRACT:

Objective

To assess the reproducibility and clinical utility of clustering-based subtyping of patients with type 2 diabetes (T2D) and established cardiovascular (CV) disease.

Methods

The cardiovascular outcome trial SAVOR-TIMI 53 (n = 16,492) was used. Analyses focused on T2D patients with established CV disease. An unsupervised machine learning technique called "k-means clustering" was used to divide patients into subtypes. K-means clustering on HbA1c, age at diagnosis, BMI, HOMA2-IR and HOMA2-B was used to assign clusters to the following diabetes subtypes: severe insulin-deficient diabetes (SIDD); severe insulin-resistant diabetes (SIRD); mild obesity-related diabetes (MOD); and mild age-related diabetes (MARD). We refer to these subtypes as "clustering-based diabetes subtypes". A simulation study using randomly generated data was conducted to understand how correlations between the above variables influence the formation of the clustering-based diabetes subtypes. The predictive utility of clustering-based diabetes subtypes for CV events (3-point MACE), renal function reduction (eGFR decrease >30%) and diabetic disease progression (introduction of additional anti-diabetic medication) was compared with that of conventional risk scores. Hazard ratios (HR) were estimated using Cox proportional hazards models.
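As a rough illustration of the simulation idea described above (not the authors' code), the sketch below draws five correlated log-normal variables and partitions them into four groups with a plain k-means loop. The function names, the single shared correlation `rho`, the log-scale spread of 0.3, and the sample size are all assumptions made for the example; the study used known correlations between the actual clinical variables, which this simplification does not reproduce.

```python
import math
import random

# Variable names follow the abstract; the simulated values are synthetic.
VARIABLES = ["HbA1c", "age_at_diagnosis", "BMI", "HOMA2_IR", "HOMA2_B"]


def simulate_patients(n, rho, rng):
    """Draw n rows of log-normal values whose logs share pairwise correlation rho.

    Assumption: one shared latent factor gives every pair the same correlation.
    """
    rows = []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)
        row = []
        for _ in VARIABLES:
            z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            row.append(math.exp(0.3 * z))  # 0.3 is an arbitrary log-scale spread
        rows.append(row)
    return rows


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, rng, max_iter=100):
    """Lloyd's algorithm: assign each row to its nearest centroid, then update."""
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


rng = random.Random(0)
data = standardize(simulate_patients(500, rho=0.4, rng=rng))
labels, centroids = kmeans(data, k=4, rng=rng)
shares = [labels.count(j) / len(labels) for j in range(4)]
print([round(s, 2) for s in shares])
```

Rerunning with `rho=0.0` gives a sense of how the cluster shares and centroids shift once the correlation is removed. Mapping clusters to named subtypes such as SIDD or SIRD would require inspecting the centroids, which this sketch leaves out.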

Results

In the SAVOR-TIMI 53 trial-based dataset, the percentages of the clustering-based T2D subtypes were: SIDD (18%), SIRD (17%), MOD (29%) and MARD (37%). Using the simulated dataset, the diabetes subtypes could be largely reproduced from a log-normal distribution when known correlations between variables were included. The clustering-based diabetes subtypes did not show an advantage over conventional risk scores in predicting CV events, renal function reduction, or diabetic disease progression.

Conclusions

The consistent reproduction of four clustering-based T2D subtypes can be explained by the correlations between the variables used for clustering. Subtypes of T2D based on clustering had limited advantage over conventional risk scores in predicting clinical outcomes in patients with T2D and established CV disease.

SUBMITTER: Aoki Y

PROVIDER: S-EPMC8604302 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

Assessing reproducibility and utility of clustering of patients with type 2 diabetes and established CV disease (SAVOR-TIMI 53 trial).

Aoki Y, Hamrén B, Clegg LE, Stahre C, Bhatt DL, Raz I, Scirica BM, Oscarsson J, Carlsson B

PLoS One, 19 November 2021, issue 11

PMID: 34797832

Similar Datasets

Project description:

Importance: An elevated urinary albumin to creatinine ratio (UACR) is a marker of renal dysfunction and a predictor of kidney failure/death in patients with type 2 diabetes. The prognostic use of UACR in established cardiac biomarkers is not well described.

Objective: To evaluate whether UACR offers incremental prognostic benefit beyond risk factors and established plasma cardiovascular biomarkers.

Design, setting, and participants: The Saxagliptin Assessment of Vascular Outcomes Recorded in Patients With Diabetes Mellitus-Thrombolysis in Myocardial Infarction (SAVOR-TIMI) 53 study was performed from May 2010 to May 2013 and evaluated the safety of saxagliptin vs placebo in patients with type 2 diabetes with overt cardiovascular disease or multiple risk factors. Median follow-up was 2.1 years (interquartile range, 1.8-2.3 years).

Interventions: Patients were randomized to saxagliptin vs placebo plus standard care.

Main outcomes and measures: Baseline UACR was measured in 15 760 patients (95.6% of the trial population) and categorized into thresholds.

Results: Of 15 760 patients, 5205 were female (33.0%). The distribution of UACR categories was: 5805 patients (36.8%) at less than 10 mg/g, 3891 patients (24.7%) at 10 to 30 mg/g, 4426 patients (28.1%) at 30 to 300 mg/g, and 1638 patients (10.4%) at more than 300 mg/g. When evaluated without cardiac biomarkers, there was a stepwise increase with each higher UACR category in the incidence of the primary composite end point (cardiovascular death, myocardial infarction, or ischemic stroke) (3.9%, 6.9%, 9.2%, and 14.3%); cardiovascular death (1.4%, 2.6%, 4.1%, and 6.9%); and hospitalization for heart failure (1.5%, 2.5%, 4.0%, and 8.3%) (adjusted P < .001 for trend). The net reclassification improvement at the event rate for each end point was 0.081 (95% CI, 0.025 to 0.161), 0.129 (95% CI, 0.029 to 0.202), and 0.056 (95% CI, -0.005 to 0.141), respectively. The stepwise increased cardiovascular risk associated with a UACR of more than 10 mg/g was also present within each chronic kidney disease category. The UACR was associated with outcomes after including cardiac biomarkers. However, the improvement in discrimination and reclassification was attenuated; net reclassification improvement at the event rate was 0.022 (95% CI, -0.022 to 0.067), -0.008 (-0.034 to 0.053), and 0.043 (-0.030 to 0.052) for the primary end point, cardiovascular death, and hospitalization for heart failure, respectively.

Conclusions and relevance: In patients with type 2 diabetes, UACR was independently associated with increased risk for a spectrum of adverse cardiovascular outcomes. However, the incremental cardiovascular prognostic value of UACR was minimal when evaluated together with contemporary cardiac biomarkers.

Trial registration: clinicaltrials.gov Identifier: NCT01107886.

Project description:

Introduction: Since the 2008 United States (US) Food and Drug Administration mandate, several newer anti-diabetic drugs (ADD) have undergone a mandatory cardiovascular (CV) outcome trial (CVOT) in type 2 diabetes (T2DM) patients with high CV risk. These include CVOTs done with dipeptidyl-peptidase-4 inhibitors, sodium-glucose co-transporter-2 inhibitors and glucagon-like peptide-1 receptor agonists (GLP-1RAs). Several double-blind, randomized, placebo-controlled CVOTs have been presented and published in the last decade (2008-2018).

Aims and objectives: We systematically searched the databases of PubMed and ClinicalTrials.gov from January 1, 2008 to December 31, 2018 using specific key words. Subsequently, we pooled the data of different cardiovascular endpoints and made a comparative forest plot using GraphPad Prism version 8 (GraphPad Software Inc., US).

Results and conclusion: Saxagliptin, alogliptin, sitagliptin and linagliptin are CV neutral drugs. Saxagliptin showed significantly higher hospitalization due to heart failure (HHF). Empagliflozin and canagliflozin have shown a significant reduction in the composite of 3-point major cardiac adverse events (3P-MACE). Additionally, empagliflozin, canagliflozin and dapagliflozin significantly reduced HHF and the composite of CV death or HHF. Moreover, empagliflozin showed a significant reduction in CV- and all-cause death in patients with T2DM with established CV disease. While both exendin-backbone-based GLP-1RAs, such as lixisenatide and extended-release exenatide, were CV neutral, GLP-1-backbone-based GLP-1RAs such as liraglutide, semaglutide and albiglutide showed a significant reduction in the composite of 3P-MACE. Additionally, liraglutide showed a significant reduction in CV- and all-cause death. Moreover, semaglutide reduced non-fatal stroke and albiglutide reduced myocardial infarction, while extended-release exenatide reduced all-cause death; however, the P values for these outcomes should be considered nominal.

Project description: Data are the foundation of empirical research, yet all too often the datasets underlying published papers are unavailable, incorrect, or poorly curated. This is a serious issue, because future researchers are then unable to validate published results or reuse data to explore new ideas and hypotheses. Even if data files are securely stored and accessible, they must also be accompanied by accurate labels and identifiers. To assess how often problems with metadata or data curation affect the reproducibility of published results, we attempted to reproduce Discriminant Function Analyses (DFAs) from the field of organismal biology. DFA is a commonly used statistical analysis that has changed little since its inception almost eight decades ago, and therefore provides an opportunity to test reproducibility among datasets of varying ages. Out of 100 papers we initially surveyed, fourteen were excluded because they did not present the common types of quantitative result from their DFA or gave insufficient details of their DFA. Of the remaining 86 datasets, there were 15 cases for which we were unable to confidently relate the dataset we received to the one used in the published analysis. The reasons ranged from incomprehensible or absent variable labels, the DFA being performed on an unspecified subset of the data, or the dataset we received being incomplete. We focused on reproducing three common summary statistics from DFAs: the percent variance explained, the percentage correctly assigned and the largest discriminant function coefficient. The reproducibility of the first two was fairly high (20 of 26, and 44 of 60 datasets, respectively), whereas our success rate with the discriminant function coefficients was lower (15 of 26 datasets). When considering all three summary statistics, we were able to completely reproduce 46 (65%) of 71 datasets. While our results show that a majority of studies are reproducible, they highlight the fact that many studies still are not the carefully curated research that the scientific community and public expect.
