Dataset Information

Compacting and correcting Trinity and Oases RNA-Seq de novo assemblies.


ABSTRACT:

Background

De novo transcriptome assembly of short reads is now a common step in expression analysis of organisms lacking a reference genome sequence. Several software packages are available to perform this task. Although their results are of good quality, they can still be improved in several ways, including redundancy reduction and error correction. Trinity and Oases are two commonly used de novo transcriptome assemblers. The contig sets they produce are generally good, but their compaction (the number of contigs needed to represent the transcriptome) and their correctness (chimera and nucleotide error rates) can still be improved.

Results

We built a de novo RNA-Seq Assembly Pipeline (DRAP) that wraps these two assemblers (Trinity and Oases) to improve their results with respect to the above-mentioned criteria. DRAP reduces the number of resulting contigs by 1.3- to 15-fold, depending on the read set and the assembler used. This article presents seven assembly comparisons showing, in some cases, drastic improvements when using DRAP. DRAP does not significantly impair assembly quality metrics such as read realignment rate or protein reconstruction counts.
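The fold reduction quoted above is a ratio of contig counts before and after compaction. As a minimal sketch of that arithmetic (not DRAP's own code), the following Python counts FASTA records in two contig sets and divides one count by the other. The function names `count_contigs` and `fold_reduction` and the toy sequences are hypothetical, chosen only for illustration.

```python
import io


def count_contigs(fasta_handle):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in fasta_handle if line.startswith(">"))


def fold_reduction(n_before, n_after):
    """Return how many times fewer contigs the compacted assembly contains."""
    if n_after <= 0:
        raise ValueError("compacted assembly must contain at least one contig")
    return n_before / n_after


# Toy in-memory FASTA data, made up for illustration (not taken from the paper).
raw = io.StringIO(">c1\nACGT\n>c2\nACGA\n>c3\nTTGA\n")
compacted = io.StringIO(">c1\nACGT\n>c3\nTTGA\n")

fold = fold_reduction(count_contigs(raw), count_contigs(compacted))
print(f"{fold:.1f}-fold reduction")
```

With real assemblies, the two handles would be opened FASTA files; a lower contig count is only an improvement if metrics such as read realignment rate are preserved.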

Conclusion

Transcriptome assembly is a challenging computational task. Although good solutions are already available to end users, they can still be improved while conserving the overall representation and quality of the assembly. The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package that produces compact and corrected transcript sets. DRAP is free, open-source, and available under the GPL v3 license at http://www.sigenae.org/drap.

SUBMITTER: Cabau C 

PROVIDER: S-EPMC5316280 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

Publications

Compacting and correcting Trinity and Oases RNA-Seq de novo assemblies.

Cédric Cabau, Frédéric Escudié, Anis Djari, Yann Guiguen, Julien Bobe, Christophe Klopp

PeerJ, 2017-02-16



Similar Datasets

| S-EPMC3324515 | biostudies-literature
| S-EPMC4298084 | biostudies-literature
| S-EPMC3875132 | biostudies-literature
| S-EPMC8070742 | biostudies-literature
| S-EPMC4670531 | biostudies-literature
| S-EPMC8170690 | biostudies-literature
| S-EPMC3493127 | biostudies-literature
| S-EPMC4316799 | biostudies-literature
| S-EPMC4595904 | biostudies-other
| S-EPMC3358658 | biostudies-literature