Dataset Information

Enhancing de novo transcriptome assembly by incorporating multiple overlap sizes.


ABSTRACT: Background. The emergence of next-generation sequencing platforms has given rise to a new generation of assembly algorithms. Compared with Sanger sequencing data, next-generation sequencing data have shorter reads, higher coverage depth, and different error profiles. These features pose new challenges for de novo transcriptome assembly. Methodology. To explore how these features affect assembly algorithms, we studied the relationship between read overlap size, coverage depth, and error rate using simulated data. Based on this relationship, we propose a de novo transcriptome assembly procedure called Euler-mix and demonstrate its performance on a real mouse transcriptome dataset. The simulation and evaluation tools are freely available as open source. Significance. Euler-mix is a straightforward pipeline that focuses on handling the variation in coverage depth in short-read datasets. The experimental results show that Euler-mix improves the performance of de novo transcriptome assembly.

SUBMITTER: Chen CC 

PROVIDER: S-EPMC4417554 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

Enhancing de novo transcriptome assembly by incorporating multiple overlap sizes.

Chen Chien-Chih, Lin Wen-Dar, Chang Yu-Jung, Chen Chuen-Liang, Ho Jan-Ming

ISRN Bioinformatics, 2012-04-23


Similar Datasets

| S-EPMC5411778 | biostudies-literature
| S-EPMC5200869 | biostudies-literature
| S-EPMC4892416 | biostudies-literature
| S-EPMC4664767 | biostudies-literature
| S-EPMC6549443 | biostudies-literature
| S-EPMC4778644 | biostudies-literature
| S-EPMC4878842 | biostudies-literature
| S-EPMC4878839 | biostudies-literature
| S-EPMC8417280 | biostudies-literature
| S-EPMC2824677 | biostudies-literature