Dataset Information

D-EE: Distributed software for visualizing intrinsic structure of large-scale single-cell data.


ABSTRACT: BACKGROUND: Dimensionality reduction and visualization play vital roles in single-cell RNA sequencing (scRNA-seq) data analysis. Although they have been studied extensively, state-of-the-art dimensionality reduction algorithms often fail to preserve the global structure underlying the data. Elastic embedding (EE), a nonlinear dimensionality reduction method, has shown promise in revealing both the local and the global intrinsic structure of data in low dimensions. However, the current implementation of the EE algorithm does not scale to large-scale scRNA-seq data. RESULTS: We present a distributed optimization implementation of the EE algorithm, termed distributed elastic embedding (D-EE). D-EE reveals the low-dimensional intrinsic structure of data with accuracy equal to that of elastic embedding, and it scales to large-scale scRNA-seq data. It leverages distributed storage and distributed computation, achieving memory efficiency and high-performance computing simultaneously. In addition, an extended version of D-EE, termed distributed optimization implementation of time-series elastic embedding (D-TSEE), enables the user to visualize large-scale time-series scRNA-seq data by incorporating experimental temporal information. Results with large-scale scRNA-seq data indicate that D-TSEE can uncover oscillatory gene expression patterns by using this experimental temporal information. CONCLUSIONS: D-EE is a distributed dimensionality reduction and visualization tool. Its distributed storage and distributed computation techniques allow us to efficiently analyze large-scale single-cell data at the cost of constant time speedup. The source code for the D-EE algorithm, written in C with MPI and tailored to high-performance computing clusters, is available at https://github.com/ShaokunAn/D-EE.

SUBMITTER: An S 

PROVIDER: S-EPMC7657844 | biostudies-literature | 2020 Nov

REPOSITORIES: biostudies-literature

Publications

D-EE: Distributed software for visualizing intrinsic structure of large-scale single-cell data.

Shaokun An, Jizu Huang, Lin Wan

GigaScience, 2020 Nov 1, issue 11



Similar Datasets

| S-EPMC8478610 | biostudies-literature
| S-EPMC1428800 | biostudies-literature
| S-EPMC5290626 | biostudies-literature
| S-EPMC5850912 | biostudies-literature
| S-EPMC6179193 | biostudies-literature
| S-EPMC5802054 | biostudies-other
| S-EPMC8098009 | biostudies-literature
| S-EPMC5870549 | biostudies-literature