ABSTRACT:
SUBMITTER: Li Y
PROVIDER: S-EPMC9766880 | biostudies-literature | 2022 Dec
REPOSITORIES: biostudies-literature
Li Yujia, Rahman Tanbin, Ma Tianzhou, Tang Lu, Tseng George C
Biostatistics (Oxford, England), 2022-12-01, 1
Clustering with variable selection is a challenging yet critical task for modern small-n-large-p data. Existing methods based on sparse Gaussian mixture models or sparse $K$-means provide solutions to continuous data. With the prevalence of RNA-seq technology and lack of count data modeling for clustering, the current practice is to normalize count expression data into continuous measures and apply existing models with a Gaussian assumption. In this article, we develop a negative binomial mixtur ...