
Dataset Information


CellAtlasSearch: a scalable search engine for single cells.


ABSTRACT: Owing to the advent of high-throughput single-cell transcriptomics, the past few years have seen exponential growth in the production of gene expression data. Recently, various research groups have made efforts to homogenize and store single-cell expression data from a large number of studies. The true value of this ever-increasing data deluge can be unlocked by making it searchable. To this end, we propose CellAtlasSearch, a novel search architecture for high-dimensional expression data, which is massively parallel as well as lightweight, and thus infinitely scalable. In CellAtlasSearch, we use a Graphical Processing Unit (GPU) friendly version of Locality Sensitive Hashing (LSH) for unmatched speedup in data processing and query. Currently, CellAtlasSearch features over 300 000 reference expression profiles, including both bulk and single-cell data. It enables the user to query individual single-cell transcriptomes and finds matching samples from the database along with the necessary meta information. CellAtlasSearch aims to assist researchers and clinicians in characterizing unannotated single cells. It also facilitates noise-free, low-dimensional representation of single-cell expression profiles by projecting them onto a wide variety of reference samples. The web server is accessible at: http://www.cellatlassearch.com.
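
The abstract names Locality Sensitive Hashing as the core of the search. As a rough illustration of the general idea only, the sketch below uses random-hyperplane (sign-of-projection) signatures and ranks reference profiles by Hamming distance to the query signature. It is not the CellAtlasSearch implementation and does not run on a GPU; the function names, the toy five-gene profiles, and the choice of random hyperplanes as the hash family are all assumptions made for this example.

```python
import random


def make_hyperplanes(n_bits, n_genes, seed=0):
    """Draw n_bits random Gaussian hyperplanes, one per signature bit (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_bits)]


def signature(profile, hyperplanes):
    """Bit signature of one expression profile: the sign of each projection."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, profile)) >= 0 else 0
        for plane in hyperplanes
    )


def build_signatures(reference, hyperplanes):
    """Precompute a signature for every reference sample."""
    return {name: signature(p, hyperplanes) for name, p in reference.items()}


def search(profile, signatures, hyperplanes, k=5):
    """Return up to k (name, Hamming distance) pairs, closest signatures first."""
    q = signature(profile, hyperplanes)
    scored = [
        (name, sum(a != b for a, b in zip(q, sig)))
        for name, sig in signatures.items()
    ]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Toy data: two made-up reference profiles over five genes.
planes = make_hyperplanes(n_bits=16, n_genes=5, seed=1)
reference = {
    "sample_a": [1.0, 0.0, 3.0, 2.0, 0.5],
    "sample_b": [0.0, 4.0, 0.0, 1.0, 2.0],
}
sigs = build_signatures(reference, planes)

# The query is sample_a scaled by 2; sign-based signatures ignore the scale.
hits = search([2.0, 0.0, 6.0, 4.0, 1.0], sigs, planes, k=2)
```

Because each bit depends only on the sign of a projection, profiles that differ by a positive scale factor share a signature, and comparing short bit strings is far cheaper than comparing full expression vectors. That cheapness is what makes this family of methods attractive for large reference collections.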

SUBMITTER: Srivastava D 

PROVIDER: S-EPMC6030823 | biostudies-literature | 2018 Jul

REPOSITORIES: biostudies-literature


Publications

CellAtlasSearch: a scalable search engine for single cells.

Srivastava Divyanshu D, Iyer Arvind A, Kumar Vibhor V, Sengupta Debarka D

Nucleic Acids Research, 2018-07-01, issue W1



Similar Datasets

| S-EPMC5525214 | biostudies-literature
| S-EPMC11165154 | biostudies-literature
| PRJEB4647 | ENA
| S-EPMC8187950 | biostudies-literature
2022-12-27 | GSE192644 | GEO
| S-EPMC4301746 | biostudies-literature
2019-07-23 | PXD014705 |
| S-EPMC6546127 | biostudies-literature
| S-EPMC3310226 | biostudies-literature
| S-EPMC6513154 | biostudies-literature