Dataset Information

scIGANs: single-cell RNA-seq imputation using generative adversarial networks.


ABSTRACT: Single-cell RNA-sequencing (scRNA-seq) enables the characterization of transcriptomic profiles at single-cell resolution with increasingly high throughput. However, it suffers from many sources of technical noise, including insufficient mRNA molecules that lead to excess false zero values, termed dropouts. Computational approaches have been proposed to recover the biologically meaningful expression by borrowing information from similar cells in the observed dataset. However, these methods suffer from oversmoothing and remove natural cell-to-cell stochasticity in gene expression. Here, we propose scIGANs, generative adversarial networks (GANs) for scRNA-seq imputation, which uses generated cells rather than observed cells to avoid these limitations and balances performance between major and rare cell populations. Evaluations on a variety of simulated and real scRNA-seq datasets show that scIGANs is effective for dropout imputation and enhances various downstream analyses. scIGANs is robust to small datasets that have very few genes with low expression and/or cell-to-cell variance. scIGANs works equally well on datasets from different scRNA-seq protocols and is scalable to datasets with over 100 000 cells. We demonstrate that scIGANs is not only an application of GANs to omics data but also a competitive imputation method for scRNA-seq data.
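
The idea of imputing dropouts from a reference set of cells, instead of from the observed neighbours, can be illustrated with a minimal sketch. This is not the scIGANs implementation: no GAN is trained here, the "generated" cells are hand-written stand-ins, and the function name `impute_from_reference` and the parameter `k` are hypothetical. It only shows, under those assumptions, how zero entries in an observed cell could be filled with the per-gene mean of its k nearest reference cells.

```python
import math


def impute_from_reference(observed, reference, k=5):
    """Fill zero entries of each observed cell with the per-gene mean of
    its k nearest reference cells (hypothetical helper, for illustration).

    observed:  list of cells, each a list of expression values
    reference: list of reference cells (stand-ins for generated cells)
    """
    imputed = []
    for cell in observed:
        # Rank reference cells by Euclidean distance to this observed cell.
        # Simplification: zero (possibly dropout) entries also enter the distance.
        ranked = sorted(reference, key=lambda ref: math.dist(cell, ref))
        nearest = ranked[:k]
        # Per-gene mean over the nearest reference cells.
        means = [sum(col) / len(nearest) for col in zip(*nearest)]
        # Replace only the zeros; non-zero observed values are left untouched.
        imputed.append([m if v == 0 else v for v, m in zip(cell, means)])
    return imputed


# Stand-ins for generated cells (assumed values, NOT output of a trained generator).
generated = [[1.0, 2.0, 3.0], [1.0, 4.0, 3.0], [9.0, 9.0, 9.0]]
# One observed cell with a zero in the second gene.
observed = [[1.0, 0.0, 3.0]]

result = impute_from_reference(observed, generated, k=2)
print(result)
```

Only zero entries are replaced, so observed non-zero values are preserved. Every zero is treated as a dropout, which a real method would not assume.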

SUBMITTER: Xu Y 

PROVIDER: S-EPMC7470961 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature

Publications

scIGANs: single-cell RNA-seq imputation using generative adversarial networks.

Xu Yungang, Zhang Zhigang, You Lei, Liu Jiajia, Fan Zhiwei, Zhou Xiaobo

Nucleic Acids Research, 2020-09-01, issue 15


Similar Datasets

| S-EPMC6952370 | biostudies-literature
| S-EPMC11020228 | biostudies-literature
| S-EPMC8139054 | biostudies-literature
| S-EPMC10500083 | biostudies-literature
| S-EPMC9125575 | biostudies-literature
| S-EPMC7924467 | biostudies-literature
| S-EPMC7796974 | biostudies-literature
| S-EPMC8418522 | biostudies-literature
| S-EPMC6501316 | biostudies-literature
| S-EPMC10673642 | biostudies-literature