Dataset Information

Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data.


ABSTRACT:

Background

Single-cell RNA sequencing (scRNA-seq) is a powerful profiling technique at single-cell resolution. Appropriate analysis of scRNA-seq data can characterize molecular heterogeneity and shed light on the underlying cellular processes, helping to explain development and disease mechanisms. A distinctive analytic challenge is modeling highly over-dispersed scRNA-seq count data with prevalent dropouts (zero counts), which has made zero-inflated dimensionality reduction techniques popular for scRNA-seq data analysis. Employing zero-inflated distributions, however, may place extra emphasis on zero counts, potentially biasing the identification of the latent structure of the data.

Results

In this paper, we propose a fully generative hierarchical gamma-negative binomial (hGNB) model of scRNA-seq data, obviating the need for explicitly modeling zero inflation. At the same time, hGNB can naturally account for covariate effects at both the gene and cell levels to identify complex latent representations of scRNA-seq data, without the need for commonly adopted pre-processing steps such as normalization. Efficient Bayesian model inference is derived by exploiting conditional conjugacy via novel data augmentation techniques.
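As a rough illustration of why a gamma-negative binomial hierarchy can produce many zeros without an explicit zero-inflation component, the following minimal sketch simulates counts from a toy two-level model. It is not the authors' hGNB model or inference procedure: the function names, the hyperparameters `e0` and `h`, the single shared logit `psi`, and the choice of a gene-level dispersion are all assumptions made for illustration, and no covariates or latent factors are included.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count by counting rate-1 arrivals in [0, lam]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_nb(rng, r, p):
    """Draw from NB(r, p) as a gamma-Poisson mixture.

    Parameterization assumed in this sketch:
    mean = r * p / (1 - p) and P(count = 0) = (1 - p) ** r.
    """
    if r <= 0.0:
        return 0  # degenerate dispersion: all mass at zero
    lam = rng.gammavariate(r, p / (1.0 - p))  # shape r, scale p / (1 - p)
    return sample_poisson(rng, lam)


def simulate_counts(n_genes, n_cells, e0=1.0, h=2.0, psi=1.0, seed=0):
    """Simulate a genes-by-cells count matrix from a toy gamma-NB hierarchy.

    e0, h, and psi are hypothetical values chosen for illustration only.
    """
    rng = random.Random(seed)
    p = 1.0 / (1.0 + math.exp(-psi))  # logistic link from logit psi to p
    counts = []
    for _ in range(n_genes):
        r = rng.gammavariate(e0, 1.0 / h)  # gamma prior on the NB dispersion
        counts.append([sample_nb(rng, r, p) for _ in range(n_cells)])
    return counts


def zero_fraction(counts):
    """Fraction of entries in the count matrix that are exactly zero."""
    flat = [c for row in counts for c in row]
    return sum(1 for c in flat if c == 0) / len(flat)
```

Under this parameterization, a small dispersion `r` gives a large zero probability `(1 - p) ** r` even when the mean is well above zero. For example, `r = 0.5` and `p = 0.8` give a mean of 2 with a zero probability of about 0.45, whereas a Poisson distribution with the same mean has a zero probability of about 0.14.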

Conclusion

Experimental results on both simulated data and several real-world scRNA-seq datasets suggest that hGNB is a powerful tool for cell cluster discovery as well as cell lineage inference.

SUBMITTER: Dadaneh SZ 

PROVIDER: S-EPMC7487589 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature

Publications

Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data.

Dadaneh SZ, de Figueiredo P, Sze SH, Zhou M, Qian X

BMC Genomics, 2020 Sep 9, Suppl 9


Similar Datasets

| S-EPMC8052637 | biostudies-literature
| S-EPMC7880198 | biostudies-literature
| S-EPMC3683603 | biostudies-literature
| S-EPMC8168892 | biostudies-literature
| S-EPMC4365073 | biostudies-literature
| S-EPMC7195715 | biostudies-literature
| S-EPMC4480965 | biostudies-literature
| S-EPMC4180062 | biostudies-literature
| S-EPMC5022247 | biostudies-literature