
Dataset Information


De novo transcriptome assembly: A comprehensive cross-species comparison of short-read RNA-Seq assemblers.


ABSTRACT: BACKGROUND: In recent years, massively parallel complementary DNA sequencing (RNA sequencing [RNA-Seq]) has emerged as a fast, cost-effective, and robust technology to study entire transcriptomes in various ways. In particular, for non-model organisms and in the absence of an appropriate reference genome, RNA-Seq is used to reconstruct the transcriptome de novo. Although de novo transcriptome assembly of non-model organisms has been on the rise recently and new tools are frequently being developed, there is still a knowledge gap about which assembly software should be used to build a comprehensive de novo assembly. RESULTS: Here, we present a large-scale comparative study in which 10 de novo assembly tools are applied to 9 RNA-Seq data sets spanning different kingdoms of life. Overall, we built >200 single assemblies and evaluated their performance on a combination of 20 biological-based and reference-free metrics. Our study is accompanied by a comprehensive and extensible Electronic Supplement that summarizes all data sets, assembly execution instructions, and evaluation results. Trinity, SPAdes, and Trans-ABySS, followed by Bridger and SOAPdenovo-Trans, generally outperformed the other tools compared. Moreover, we observed species-specific differences in the performance of each assembler. No tool delivered the best results for all data sets. CONCLUSIONS: We recommend a careful choice and normalization of evaluation metrics to select the best assembly results as a critical step in the reconstruction of a comprehensive de novo transcriptome assembly.
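
The abstract's recommendation to normalize evaluation metrics before comparing assemblies can be illustrated with a minimal sketch. It assumes min-max scaling to [0, 1] per metric and an unweighted sum of the scaled values; the record does not state which normalization the authors used, so this is one common choice and not necessarily theirs. The assembler names, metric names, and numbers below are hypothetical placeholders, not results from the study.

```python
from typing import Dict, List, Tuple


def min_max_normalize(values: List[float], higher_is_better: bool = True) -> List[float]:
    """Scale values to [0, 1]; invert the scale when lower raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All assemblies tie on this metric, so give each the same score.
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_assemblies(
    metrics: Dict[str, Dict[str, float]],
    higher_is_better: Dict[str, bool],
) -> List[Tuple[str, float]]:
    """Sum the normalized metric scores per assembly and sort best-first."""
    names = list(metrics)
    totals = {name: 0.0 for name in names}
    for metric, direction in higher_is_better.items():
        column = [metrics[name][metric] for name in names]
        for name, score in zip(names, min_max_normalize(column, direction)):
            totals[name] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values for illustration only (not taken from the study).
example_metrics = {
    "assembler_a": {"read_mapping_rate": 0.90, "misassemblies": 10},
    "assembler_b": {"read_mapping_rate": 0.80, "misassemblies": 5},
    "assembler_c": {"read_mapping_rate": 0.85, "misassemblies": 20},
}
example_directions = {"read_mapping_rate": True, "misassemblies": False}

ranking = rank_assemblies(example_metrics, example_directions)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Without normalization, a metric on a large scale (such as a raw count) would dominate one on a small scale (such as a fraction), which is why each metric is rescaled before the scores are summed.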

SUBMITTER: Holzer M 

PROVIDER: S-EPMC6511074 | biostudies-literature | 2019 May

REPOSITORIES: biostudies-literature


Publications

De novo transcriptome assembly: A comprehensive cross-species comparison of short-read RNA-Seq assemblers.

Hölzer M, Marz M

GigaScience, 2019-05-01, issue 5



Similar Datasets

| S-EPMC3485621 | biostudies-literature
| S-EPMC3287467 | biostudies-literature
| S-EPMC3663818 | biostudies-literature
| S-EPMC3749127 | biostudies-literature
| S-EPMC4124492 | biostudies-literature
| S-EPMC3223268 | biostudies-literature
| S-EPMC3091720 | biostudies-literature
| S-EPMC2336801 | biostudies-literature
| S-EPMC4824410 | biostudies-literature
| S-EPMC3227110 | biostudies-literature