Dataset Information

Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples.


ABSTRACT:

Motivation

As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain.

Results

Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries.
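The query model described above (constrain junctions by region and sample count, constrain samples by metadata, then score samples by the relative frequency of two splicing patterns) can be illustrated with a small in-memory sketch. This is not Snaptron's implementation or API: the class `ToyJunctionIndex`, the function `relative_frequency`, the sample data, and the normalized-difference score are all hypothetical, and a sorted list with binary search stands in for the R-tree and B-tree indexes named in the abstract.

```python
from bisect import bisect_right
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Junction:
    chrom: str
    start: int
    end: int
    coverage: dict = field(default_factory=dict)  # sample id -> read count


class ToyJunctionIndex:
    """Hypothetical stand-in for an indexed junction store (not Snaptron's code)."""

    def __init__(self, junctions, sample_metadata):
        # Junctions sorted by start per chromosome: a simple substitute
        # for the interval (R-tree) index mentioned in the abstract.
        self.by_chrom = defaultdict(list)
        for j in sorted(junctions, key=lambda j: (j.chrom, j.start)):
            self.by_chrom[j.chrom].append(j)
        self.starts = {c: [j.start for j in js] for c, js in self.by_chrom.items()}
        # Inverted index: metadata term -> set of sample ids.
        self.term_to_samples = defaultdict(set)
        for sample_id, text in sample_metadata.items():
            for term in text.lower().split():
                self.term_to_samples[term].add(sample_id)

    def query(self, chrom, start, end, min_samples=1, term=None):
        """Junctions overlapping [start, end], restricted to matching samples."""
        allowed = self.term_to_samples.get(term.lower(), set()) if term else None
        # Only junctions starting at or before `end` can overlap the region.
        hi = bisect_right(self.starts.get(chrom, []), end)
        hits = []
        for j in self.by_chrom.get(chrom, [])[:hi]:
            if j.end < start:
                continue
            cov = {s: n for s, n in j.coverage.items()
                   if allowed is None or s in allowed}
            if len(cov) >= min_samples:
                hits.append(Junction(j.chrom, j.start, j.end, cov))
        return hits


def relative_frequency(group_a, group_b):
    """Per-sample score (B - A) / (A + B + 1); an assumed, illustrative formula."""
    total_a, total_b = defaultdict(int), defaultdict(int)
    for group, totals in ((group_a, total_a), (group_b, total_b)):
        for j in group:
            for sample_id, n in j.coverage.items():
                totals[sample_id] += n
    samples = set(total_a) | set(total_b)
    return {s: (total_b[s] - total_a[s]) / (total_a[s] + total_b[s] + 1)
            for s in samples}


# Made-up data for illustration only.
junctions = [
    Junction("chr1", 100, 200, {"s1": 9, "s2": 1}),
    Junction("chr1", 150, 400, {"s2": 8}),
    Junction("chr2", 50, 90, {"s1": 3}),
]
metadata = {"s1": "brain cortex", "s2": "liver"}
index = ToyJunctionIndex(junctions, metadata)

brain_hits = index.query("chr1", 120, 160, term="brain")
scores = relative_frequency([junctions[0]], [junctions[1]])
print(len(brain_hits), scores)
```

In this sketch a score near -1 means a sample mostly supports the first junction group and a score near +1 means it mostly supports the second; the added 1 in the denominator keeps samples with no coverage at a score of 0.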

Availability and implementation

Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license.

Contact

chris.wilks@jhu.edu or langmea@cs.jhu.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Wilks C 

PROVIDER: S-EPMC5870547 | biostudies-literature | 2018 Jan

REPOSITORIES: biostudies-literature

Publications

Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples.

Christopher Wilks, Phani Gaddipati, Abhinav Nellore, Ben Langmead

Bioinformatics (Oxford, England), 2018-01-01, issue 1


Similar Datasets

| S-EPMC6194578 | biostudies-literature
| S-EPMC5444801 | biostudies-literature
| S-ECPF-ERAD-61 | biostudies-other
| S-EPMC7851401 | biostudies-literature
| S-EPMC3892928 | biostudies-literature
| S-EPMC4150760 | biostudies-literature
| S-EPMC4132698 | biostudies-literature
| S-EPMC11336631 | biostudies-literature
| S-EPMC8871965 | biostudies-literature
| S-EPMC9825497 | biostudies-literature