
Dataset Information


Detection of correlated hidden factors from single cell transcriptomes using Iteratively Adjusted-SVA (IA-SVA).


ABSTRACT: Single cell RNA-sequencing (scRNA-seq) precisely characterizes gene expression levels and dissects variation in expression associated with the state (technical or biological) and the type of the cell, which is averaged out in bulk measurements. Multiple and correlated sources contribute to gene expression variation in single cells, which makes their estimation difficult with the existing methods developed for batch correction (e.g., surrogate variable analysis (SVA)) that estimate orthogonal transformations of these sources. We developed iteratively adjusted surrogate variable analysis (IA-SVA) that can estimate hidden factors even when they are correlated with other sources of variation by identifying a set of genes associated with each hidden factor in an iterative manner. Analysis of scRNA-seq data from human cells showed that IA-SVA could accurately capture hidden variation arising from technical (e.g., stacked doublet cells) or biological sources (e.g., cell type or cell-cycle stage). Furthermore, IA-SVA delivers a set of genes associated with the detected hidden source to be used in downstream data analyses. As a proof of concept, IA-SVA recapitulated known marker genes for islet cell subsets (e.g., alpha, beta), which improved the grouping of subsets into distinct clusters. Taken together, IA-SVA is an effective and novel method to dissect multiple and correlated sources of variation in scRNA-seq data.

SUBMITTER: Lee D 

PROVIDER: S-EPMC6242813 | biostudies-other | 2018 Nov

REPOSITORIES: biostudies-other


Publications

Detection of correlated hidden factors from single cell transcriptomes using Iteratively Adjusted-SVA (IA-SVA).

Donghyung Lee, Anthony Cheng, Nathan Lawlor, Mohan Bolisetty, Duygu Ucar

Scientific Reports, 2018 Nov 19; issue 1



Similar Datasets

S-EPMC6551256 | biostudies-literature
S-EPMC8011753 | biostudies-literature
S-EPMC5707349 | biostudies-literature
S-EPMC6280782 | biostudies-literature
S-EPMC9616830 | biostudies-literature
S-EPMC7649823 | biostudies-literature
S-EPMC9458468 | biostudies-literature
S-EPMC2770071 | biostudies-literature
S-EPMC5800078 | biostudies-literature
S-EPMC6306246 | biostudies-literature