Dataset Information

Deterministic column subset selection for single-cell RNA-Seq.


ABSTRACT: Analysis of single-cell RNA sequencing (scRNA-Seq) data often involves filtering out uninteresting or poorly measured genes and dimensionality reduction to reduce noise and simplify data visualization. However, techniques such as principal components analysis (PCA) fail to preserve non-negativity and sparsity structures present in the original matrices, and the coordinates of projected cells are not easily interpretable. Commonly used thresholding methods to filter genes avoid those pitfalls, but ignore collinearity and covariance in the original matrix. We show that a deterministic column subset selection (DCSS) method possesses many of the favorable properties of common thresholding methods and PCA, while avoiding pitfalls from both. We derive new spectral bounds for DCSS. We apply DCSS to two measures of gene expression from two scRNA-Seq experiments with different clustering workflows, and compare to three thresholding methods. In each case study, the clusters based on the small subset of the complete gene expression profile selected by DCSS are similar to clusters produced from the full set. The resulting clusters are informative for cell type.
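The abstract does not spell out the selection rule, so the following is only a minimal sketch of one common deterministic formulation of column subset selection: rank genes (columns) by their rank-k leverage scores and keep the top-scoring ones until the retained scores sum to more than k - epsilon. The function name `leverage_score_column_subset`, the parameters `k` and `epsilon`, the stopping rule, and the toy Poisson count matrix are all assumptions made for illustration, not details taken from this record.

```python
import numpy as np


def leverage_score_column_subset(X, k, epsilon):
    """Select columns of X by descending rank-k leverage score.

    Assumed rule (not stated in the abstract): keep the highest-scoring
    columns until their scores sum to more than k - epsilon, and never
    keep fewer than k columns.
    """
    # Rows of vt are right singular vectors; columns of vt index genes.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    # Leverage score of each column: squared norm over the top-k vectors.
    scores = np.sum(vt[:k, :] ** 2, axis=0)
    order = np.argsort(-scores, kind="stable")
    cumulative = np.cumsum(scores[order])
    # Smallest count whose cumulative score exceeds k - epsilon.
    n_keep = int(np.searchsorted(cumulative, k - epsilon, side="right")) + 1
    n_keep = min(max(n_keep, k), X.shape[1])
    return np.sort(order[:n_keep]), scores


# Toy cells-by-genes count matrix (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
counts = rng.poisson(0.3, size=(50, 200)).astype(float)
cols, scores = leverage_score_column_subset(counts, k=5, epsilon=0.5)
subset = counts[:, cols]  # original columns, so still non-negative and sparse
```

Because the output is a subset of the original columns rather than a projection, the selected matrix keeps the non-negativity and sparsity of the input, and each retained coordinate is still a single gene.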

SUBMITTER: McCurdy SR 

PROVIDER: S-EPMC6347249 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

Deterministic column subset selection for single-cell RNA-Seq.

McCurdy Shannon R, Ntranos Vasilis, Pachter Lior

PLoS ONE, 2019-01-25, 1


Similar Datasets

| S-EPMC8644062 | biostudies-literature
| S-EPMC6544759 | biostudies-literature
| S-EPMC8498858 | biostudies-literature
| S-EPMC5499114 | biostudies-other
| S-ECPF-GEOD-57872 | biostudies-other
| S-EPMC6927135 | biostudies-literature
| S-EPMC8568278 | biostudies-literature
| S-EPMC6501316 | biostudies-literature
| S-EPMC5782816 | biostudies-literature
| S-EPMC5985341 | biostudies-literature