Dataset Information

Clustering of temporal gene expression data with mixtures of mixed effects models with a penalized likelihood.

ABSTRACT: MOTIVATION: Clustering algorithms like K-Means and standard Gaussian mixture models (GMM) fail to account for the structure of variability of replicated data or repeated measures over time. Additionally, a priori cluster number assumptions add further complexity to the process. Current methods to optimize cluster labels and number can be inaccurate or computationally intensive for temporal gene expression data with this additional variability. RESULTS: An extension to a model-based clustering algorithm is proposed using mixtures of mixed effects polynomial regression models and the EM algorithm with an entropy penalized log-likelihood function (EPEM). The EPEM is used to cluster temporal gene expression data with this additional variability. The addition of random effects in our model decreased the misclassification error when compared to mixtures of fixed effects models or other methods such as K-Means and GMM. Applying our method to microarray data from a fracture healing study revealed distinct temporal patterns of gene expression. AVAILABILITY AND IMPLEMENTATION: https://github.com/darlenelu72/EPEM-GMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
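
The abstract does not spell out the form of the entropy penalty, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a plain univariate Gaussian mixture (not the mixed effects polynomial regression model of the paper) and a penalty of the form beta * n * sum_k w_k * log(w_k), a form commonly used in entropy-penalized EM. The function names, the `beta` parameter, and the toy data are all hypothetical. The sketch shows why such a penalty discourages superfluous clusters: splitting each component into two identical halves leaves the ordinary log-likelihood unchanged but lowers the penalized objective.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weights, means, sds):
    """Ordinary log-likelihood of a univariate Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(
            w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total


def entropy_penalized_loglik(data, weights, means, sds, beta):
    """Log-likelihood plus beta * n * sum_k w_k * log(w_k).

    The penalty form is an assumption made for illustration; it is
    non-positive and becomes more negative as weight is spread over
    more components.
    """
    n = len(data)
    penalty = beta * n * sum(w * math.log(w) for w in weights if w > 0.0)
    return mixture_loglik(data, weights, means, sds) + penalty


# Toy data with two well-separated groups (hypothetical values).
data = [0.1, -0.2, 0.3, 4.9, 5.2, 5.0]

# Two components versus the same fit with each component duplicated.
two = entropy_penalized_loglik(
    data, [0.5, 0.5], [0.0, 5.0], [1.0, 1.0], beta=1.0
)
four = entropy_penalized_loglik(
    data, [0.25] * 4, [0.0, 0.0, 5.0, 5.0], [1.0] * 4, beta=1.0
)

print(two > four)  # the redundant four-component fit scores lower
```

In this toy case the two fits describe exactly the same density, so the difference between the two objectives comes entirely from the penalty term. In a full algorithm the penalty would enter the update of the mixing proportions inside the EM iterations; that step is omitted here.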

SUBMITTER: Lu D

PROVIDER: S-EPMC6394398 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature

Publications

Clustering of temporal gene expression data with mixtures of mixed effects models with a penalized likelihood.

Darlene Lu, Yorghos Tripodis, Louis C Gerstenfeld, Serkalem Demissie

Bioinformatics (Oxford, England), 2019 Mar 1, issue 5

PMID: 30101356

Similar Datasets

Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data. | S-EPMC2852217 | biostudies-literature

Generalized linear mixed models for binary data: are matching results from penalized quasi-likelihood and numerical integration less biased? | S-EPMC3886992 | biostudies-literature

Clustering gene expression data with a penalized graph-based metric. | S-EPMC3023695 | biostudies-literature

Clustering microbiome data using mixtures of logistic normal multinomial models. | S-EPMC10484970 | biostudies-literature

One-step Sparse Estimates in Nonconcave Penalized Likelihood Models. | S-EPMC2759727 | biostudies-literature

Bayesian penalized spline models for the analysis of spatio-temporal count data. | S-EPMC4959802 | biostudies-literature

Penalized model-based clustering of fMRI data. | S-EPMC9293048 | biostudies-literature

Laplace approximation, penalized quasi-likelihood, and adaptive Gauss-Hermite quadrature for generalized linear mixed models: towards meta-analysis of binary outcome with sparse data. | S-EPMC7296731 | biostudies-literature

Information criteria for Firth's penalized partial likelihood approach in Cox regression models. | S-EPMC6084330 | biostudies-literature

Penalized mediation models for multivariate data. | S-EPMC8900147 | biostudies-literature