Dataset Information

A memory-efficient algorithm to obtain splicing graphs and de novo expression estimates from de Bruijn graphs of RNA-Seq data.

ABSTRACT:

BACKGROUND: The recent advance of high-throughput sequencing makes it feasible to study entire transcriptomes through the application of de novo sequence assembly algorithms. While a popular strategy is to first construct an intermediate de Bruijn graph structure to represent the transcriptome, an additional step is needed to construct predicted transcripts from the graph.

RESULTS: Since the de Bruijn graph contains all branching possibilities, we develop a memory-efficient algorithm to recover alternative splicing information and library-specific expression information directly from the graph without prior genomic knowledge. We implement the algorithm as a postprocessing module of the Velvet assembler. We validate our algorithm by simulating the transcriptome assembly of Drosophila using its known genome, and by performing Drosophila transcriptome assembly using publicly available RNA-Seq libraries. Under a range of conditions, our algorithm recovers sequences and alternative splicing junctions with higher specificity than Oases or Trans-ABySS.

CONCLUSIONS: Since our postprocessing algorithm does not consume as much memory as Velvet and is less memory-intensive than Oases, it allows biologists to assemble large libraries with limited computational resources. Our algorithm has been applied to perform transcriptome assembly of the non-model blow fly Lucilia sericata that was reported in a previous article, which shows that the assembly is of high quality and it facilitates comparison of the Lucilia sericata transcriptome to Drosophila and two mosquitoes, prediction and experimental validation of alternative splicing, investigation of differential expression among various developmental stages, and identification of transposable elements.
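The abstract's remark that a de Bruijn graph "contains all branching possibilities" can be illustrated with a toy sketch. This is not the authors' algorithm and does not use Velvet's data structures; the function names `build_de_bruijn` and `branching_nodes`, the two example reads, and the choice of k = 4 are all assumptions made for illustration. It shows only how two sequences that share a prefix and then diverge produce a node with more than one successor, the kind of branch a splicing-aware postprocessing step would need to examine.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to its successor (k-1)-mers with k-mer counts."""
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix node to its suffix node.
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor (candidate divergence points)."""
    return {node: dict(succ) for node, succ in graph.items() if len(succ) > 1}


# Hypothetical toy reads: identical up to "ACGTT", then they diverge.
reads = ["ACGTTGCA", "ACGTTCCA"]
graph = build_de_bruijn(reads, k=4)
branches = branching_nodes(graph)
print(branches)
```

In this sketch the edge counts stand in, very loosely, for coverage; estimating library-specific expression as the abstract describes would require per-library counts and is not attempted here.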

SUBMITTER: Sze SH

PROVIDER: S-EPMC4120145 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

A memory-efficient algorithm to obtain splicing graphs and de novo expression estimates from de Bruijn graphs of RNA-Seq data.

Sze Sing-Hoi, Tarone Aaron M

BMC Genomics, 2014-07-14

PMID: 25082000

Similar Datasets

Identifying splicing regulatory elements with de Bruijn graphs.
| S-EPMC4253301 | biostudies-literature

De novo assembly and genotyping of variants using colored de Bruijn graphs.
| S-EPMC3272472 | biostudies-literature

Velvet: algorithms for de novo short read assembly using de Bruijn graphs.
| S-EPMC2336801 | biostudies-literature

Succinct colored de Bruijn graphs.
| S-EPMC5872255 | biostudies-literature

Compacting de Bruijn graphs from sequencing data quickly and in low memory.
| S-EPMC4908363 | biostudies-literature

Lossless indexing with counting de Bruijn graphs.
| S-EPMC9528980 | biostudies-literature

Building large updatable colored de Bruijn graphs via merging.
| S-EPMC6612864 | biostudies-other

Buffering updates enables efficient dynamic de Bruijn graphs.
| S-EPMC8326735 | biostudies-literature

Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2.
| S-EPMC9454175 | biostudies-literature

Recycler: an algorithm for detecting plasmids from de novo assembly graphs.
| S-EPMC5408804 | biostudies-literature