Transcriptomics

Dataset Information

Signal recovery in single cell batch integration


ABSTRACT: Data integration to align cells across batches has become a cornerstone of single cell data analysis, critically affecting downstream results. Yet, how much biological signal is erased during integration? Currently, there are no guidelines for when the biological differences between samples are separable from batch effects, and thus, data integration involves a lot of guesswork: cells across batches should be aligned to be “appropriately” mixed, while preserving “main cell type clusters”. We show evidence that current paradigms for single cell data integration are unnecessarily aggressive, removing biologically meaningful variation and introducing distortion. To remedy this, we present a novel statistical model and computationally scalable algorithm, CellANOVA, that harnesses experimental design to explicitly recover biological signals that are erased during single cell data integration. CellANOVA utilizes a “pool-of-controls” design concept, applicable across diverse settings, to separate unwanted variation from biological variation of interest. When applied with existing integration methods, CellANOVA allows the recovery of subtle biological signals and corrects, to a large extent, the data distortion introduced by integration. Further, CellANOVA explicitly estimates cell- and gene-specific batch effect terms, which can be used to identify the cell types and pathways exhibiting the largest batch variations, providing clarity as to which biological signals can be recovered. These concepts are illustrated on studies of diverse designs, where the biological signals recovered by CellANOVA are validated by orthogonal assays. In particular, we show that CellANOVA is effective in the challenging case of single-cell and single-nuclei data integration, where it recovers subtle biological signals that can be validated and replicated by external data.

ORGANISM(S): Mus musculus

PROVIDER: GSE280883 | GEO | 2024/11/08

REPOSITORIES: GEO

Similar Datasets

2015-06-01 | GSE54275 | GEO
2021-01-31 | E-MTAB-9916 | biostudies-arrayexpress
2016-12-12 | GSE53355 | GEO
2021-07-05 | MTBLS2483 | MetaboLights
2015-06-30 | GSE44900 | GEO
2023-04-12 | GSE189788 | GEO
2023-11-25 | GSE223041 | GEO
2021-06-15 | GSE153071 | GEO
2024-03-18 | PXD036799 | Pride
2019-05-16 | GSE129865 | GEO