Dataset Information

In simulated data and health records, latent class analysis was the optimum multimorbidity clustering algorithm.


ABSTRACT:

Background and objectives

To investigate the reproducibility and validity of four algorithms for multimorbidity clustering: latent class analysis (LCA), hierarchical cluster analysis (HCA), multiple correspondence analysis followed by k-means (MCA-kmeans), and k-means (kmeans).

Methods

We first investigated the clustering algorithms in simulated datasets with 26 diseases of varying prevalence in predetermined clusters, comparing the derived clusters to the known clusters using the adjusted Rand Index (aRI). We then investigated them in the medical records of male patients, aged 65 to 84 years, from 50 UK general practices, with 49 long-term health conditions. We compared within-cluster morbidity profiles using the Pearson correlation coefficient and assessed cluster stability in 400 bootstrap samples.
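As an illustration of the agreement measure named above, the following is a minimal sketch of the adjusted Rand Index computed from two label vectors. It is not the authors' code; the function name `adjusted_rand_index` and the toy label vectors are assumptions made for this example only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same items.

    Hypothetical helper for illustration; labels may be any hashable values.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions agree trivially

    # Pairs of items placed together in both partitions (contingency cells).
    together_in_both = sum(
        comb(count, 2) for count in Counter(zip(labels_a, labels_b)).values()
    )
    # Pairs placed together within each partition separately.
    together_in_a = sum(comb(count, 2) for count in Counter(labels_a).values())
    together_in_b = sum(comb(count, 2) for count in Counter(labels_b).values())

    expected = together_in_a * together_in_b / total_pairs  # chance agreement
    maximum = (together_in_a + together_in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (together_in_both - expected) / (maximum - expected)


# Toy data (made up): known clusters versus clusters derived by an algorithm.
known = [0, 0, 0, 1, 1, 1]
derived_relabelled = [1, 1, 1, 0, 0, 0]  # same grouping, different label names
derived_crossed = [0, 1, 0, 1, 0, 1]     # grouping unrelated to the known one

ari_relabelled = adjusted_rand_index(known, derived_relabelled)
ari_crossed = adjusted_rand_index(known, derived_crossed)
```

Because the index depends only on which items are grouped together, relabelling the clusters does not change it, and partitions that agree no better than chance score near or below zero.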

Results

In the simulated datasets, the closest agreement with the known clusters (largest aRI) was achieved by LCA, followed by MCA-kmeans. In the medical records dataset, all four algorithms identified one cluster comprising 20-25% of the dataset, with about 82% of the same patients across all four algorithms. LCA and MCA-kmeans both found a second cluster comprising 7% of the dataset. Other clusters were found by only one algorithm. LCA and MCA-kmeans clustering gave the most similar partitioning (aRI 0.54).
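One way to judge whether clusters found by different algorithms describe the same kind of patients is to compare their morbidity profiles with the Pearson correlation coefficient, as the Methods describe. The sketch below assumes a profile is the within-cluster prevalence of each condition; that definition, the helper names `morbidity_profile` and `pearson`, and the toy data are assumptions for illustration, not details taken from the study.

```python
from math import sqrt


def morbidity_profile(patients):
    """Prevalence of each condition within one cluster.

    `patients` is a list of equal-length 0/1 lists, one per patient
    (hypothetical encoding: 1 means the condition is recorded).
    """
    n = len(patients)
    return [sum(column) / n for column in zip(*patients)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same conditions")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / (spread_x * spread_y)


# Toy clusters (made up): rows are patients, columns are four conditions.
cluster_from_algorithm_a = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1]]
cluster_from_algorithm_b = [[1, 1, 0, 0], [1, 1, 0, 0]]

profile_a = morbidity_profile(cluster_from_algorithm_a)
profile_b = morbidity_profile(cluster_from_algorithm_b)
profile_similarity = pearson(profile_a, profile_b)
```

A correlation close to 1 suggests the two clusters have similar patterns of conditions, even when they do not contain exactly the same patients.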

Conclusion

LCA achieved higher aRI than other clustering algorithms.

SUBMITTER: Nichols L 

PROVIDER: S-EPMC7613854 | biostudies-literature | 2022 Oct

REPOSITORIES: biostudies-literature

Publications

In simulated data and health records, latent class analysis was the optimum multimorbidity clustering algorithm.

Linda Nichols, Tom Taverner, Francesca Crowe, Sylvia Richardson, Christopher Yau, Steven Kiddle, Paul Kirk, Jessica Barrett, Krishnarajah Nirantharakumar, Simon Griffin, Duncan Edwards, Tom Marshall

Journal of clinical epidemiology, 2022-10-11


Similar Datasets

| S-EPMC10810435 | biostudies-literature
| S-EPMC10686427 | biostudies-literature
| S-EPMC9875075 | biostudies-literature
| S-EPMC11909183 | biostudies-literature
| S-EPMC6652024 | biostudies-literature
| S-EPMC7665942 | biostudies-literature
| S-EPMC7242436 | biostudies-literature
| S-EPMC8160293 | biostudies-literature
| S-EPMC4165436 | biostudies-literature
| S-EPMC9277367 | biostudies-literature