Dataset Information

A Tool for Visualization and Analysis of Single-Cell RNA-Seq Data Based on Text Mining.


ABSTRACT: Gene expression in individual cells can now be measured for thousands of cells in a single experiment thanks to innovative sample-preparation and sequencing technologies. State-of-the-art computational pipelines for single-cell RNA-sequencing data, however, still employ computational methods that were developed for traditional bulk RNA-sequencing data, thus not accounting for the peculiarities of single-cell data, such as sparseness and zero-inflated counts. Here, we present a ready-to-use pipeline named gf-icf (gene frequency-inverse cell frequency) for normalization of raw counts, feature selection, and dimensionality reduction of scRNA-seq data for their visualization and subsequent analyses. Our work is based on a data transformation model named term frequency-inverse document frequency (TF-IDF), which has been extensively used in the field of text mining where extremely sparse and zero-inflated data are common. Using benchmark scRNA-seq datasets, we show that the gf-icf pipeline outperforms existing state-of-the-art methods in terms of improved visualization and ability to separate and distinguish different cell types.
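The TF-IDF idea described in the abstract can be illustrated with a minimal sketch in which cells play the role of documents and genes the role of terms. This is not the authors' gf-icf implementation: the function name `gf_icf`, the smoothed logarithmic weighting, and the final L2 scaling of each cell are assumptions chosen for illustration, and the published pipeline may use different formulas and additional steps (feature selection, dimensionality reduction).

```python
import numpy as np

def gf_icf(counts):
    """Illustrative TF-IDF-style transform of a cells x genes count matrix.

    Hypothetical helper for illustration only; not the published gf-icf code.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]

    # Gene frequency (analogue of term frequency): each count divided by
    # the total counts of its cell. Empty cells are left as all zeros.
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0
    gf = counts / totals

    # Inverse cell frequency (analogue of inverse document frequency):
    # genes detected in few cells receive a larger weight. The +1 smoothing
    # is an assumption that avoids division by zero for undetected genes.
    cells_with_gene = (counts > 0).sum(axis=0)
    icf = np.log((n_cells + 1) / (cells_with_gene + 1))

    weighted = gf * icf

    # Scale each cell to unit L2 norm so cells with different sequencing
    # depth are comparable; cells with zero norm stay all zeros.
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return weighted / norms

# Toy matrix: 3 cells x 3 genes. Gene 0 is detected in every cell.
toy_counts = [[10, 0, 5],
              [3, 2, 0],
              [1, 0, 0]]
toy_scores = gf_icf(toy_counts)
```

In this toy matrix, the gene detected in every cell receives zero weight under the assumed smoothing, so only genes that distinguish cells contribute to the transformed profile.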

SUBMITTER: Gambardella G 

PROVIDER: S-EPMC6696874 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

A Tool for Visualization and Analysis of Single-Cell RNA-Seq Data Based on Text Mining.

Gennaro Gambardella, Diego di Bernardo

Frontiers in Genetics, 2019-08-09



Similar Datasets

S-EPMC6402590 | biostudies-literature
S-EPMC7260831 | biostudies-literature
S-EPMC5870842 | biostudies-other
S-EPMC7235421 | biostudies-literature
S-EPMC6304778 | biostudies-other
S-EPMC8742092 | biostudies-literature
S-EPMC8215916 | biostudies-literature
S-EPMC4995760 | biostudies-literature
S-EPMC6218547 | biostudies-literature
S-EPMC8379379 | biostudies-literature