Dataset Information

Clustering microbiome data using mixtures of logistic normal multinomial models.


ABSTRACT: Discrete data such as counts of microbiome taxa resulting from next-generation sequencing are routinely encountered in bioinformatics. Taxa count data in microbiome studies are typically high-dimensional and over-dispersed, and because they can only reveal relative abundance, they are treated as compositional. Analyzing compositional data presents many challenges because such data are restricted to a simplex. In a logistic normal multinomial model, the relative abundance is mapped from the simplex to a latent variable in real Euclidean space using the additive log-ratio transformation. While a logistic normal multinomial approach brings flexibility for modeling the data, it comes with a heavy computational cost because parameter estimation typically relies on Bayesian techniques. In this paper, we develop a novel mixture of logistic normal multinomial models for clustering microbiome data. Additionally, we utilize an efficient framework for parameter estimation using variational Gaussian approximations (VGA). Adopting a variational Gaussian approximation for the posterior of the latent variable reduces the computational overhead substantially. The proposed method is illustrated on simulated and real datasets.
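The additive log-ratio mapping mentioned in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `alr` and `alr_inverse`, the choice of the last taxon as the reference, and the values of `mu`, `sigma`, and the sequencing depth of 1000 are all assumptions made for the example.

```python
import numpy as np

def alr(p):
    """Additive log-ratio: map a composition with K+1 parts to R^K.
    The last part is used as the reference (an assumption of this sketch)."""
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])

def alr_inverse(y):
    """Map a point in R^K back to the simplex with K+1 parts."""
    y = np.asarray(y, dtype=float)
    # Append a zero for the reference part, then apply a stable softmax.
    z = np.concatenate([y, np.zeros(y.shape[:-1] + (1,))], axis=-1)
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Draw one sample from a logistic normal multinomial model:
# latent Gaussian -> composition on the simplex -> multinomial counts.
rng = np.random.default_rng(0)
mu = np.array([0.5, -0.2])                   # hypothetical latent mean
sigma = np.array([[0.3, 0.1], [0.1, 0.2]])   # hypothetical latent covariance
y = rng.multivariate_normal(mu, sigma)       # latent variable in R^2
p = alr_inverse(y)                           # relative abundance of 3 taxa
counts = rng.multinomial(1000, p)            # observed taxa counts
```

Applying `alr` to `p` recovers `y`, which is what lets the model place a Gaussian on the latent variable while the observed counts remain compositional.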

SUBMITTER: Fang Y 

PROVIDER: S-EPMC10484970 | biostudies-literature | 2023 Sep

REPOSITORIES: biostudies-literature

Publications

Clustering microbiome data using mixtures of logistic normal multinomial models.

Fang, Yuan; Subedi, Sanjeena

Scientific Reports, 2023 Sep 7, issue 1

Similar Datasets

| S-EPMC9484567 | biostudies-literature
| S-EPMC5860108 | biostudies-literature
| S-EPMC6590172 | biostudies-literature
| S-EPMC7736568 | biostudies-literature
| S-EPMC11869466 | biostudies-literature
| S-EPMC7475459 | biostudies-literature
| S-EPMC6432796 | biostudies-literature
| S-EPMC9364382 | biostudies-literature
| S-EPMC10012398 | biostudies-literature
| S-EPMC10953401 | biostudies-literature