Dataset Information

Exploiting sparseness in de novo genome assembly.


ABSTRACT: BACKGROUND: The very large memory requirements for the construction of assembly graphs for de novo genome assembly limit current algorithms to super-computing environments. METHODS: In this paper, we demonstrate that constructing a sparse assembly graph which stores only a small fraction of the observed k-mers as nodes and the links between these nodes allows the de novo assembly of even moderately-sized genomes (~500 M) on a typical laptop computer. RESULTS: We implement this sparse graph concept in a proof-of-principle software package, SparseAssembler, utilizing a new sparse k-mer graph structure evolved from the de Bruijn graph. We test our SparseAssembler with both simulated and real data, achieving ~90% memory savings and retaining high assembly accuracy, without sacrificing speed in comparison to existing de novo assemblers.
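The sparse-graph idea in the abstract can be illustrated with a minimal sketch. This is not SparseAssembler's implementation: the function names, the fixed sampling stride `g`, and the link representation are all assumptions made for illustration. The sketch keeps only every g-th k-mer of each read as a node and records, for each kept k-mer, the bases that bridge it to the next kept k-mer.

```python
from collections import defaultdict


def sparse_kmer_graph(reads, k, g):
    """Hypothetical sketch: keep every g-th k-mer of each read as a node
    and link consecutive kept k-mers by the g bases that bridge them."""
    nodes = set()
    links = defaultdict(set)  # node -> {(bridging bases, next node)}
    for read in reads:
        prev = None
        for i in range(0, len(read) - k + 1, g):
            kmer = read[i:i + k]
            nodes.add(kmer)
            if prev is not None:
                # Bases that extend the previous node to reach this one.
                links[prev].add((read[i + k - g:i + k], kmer))
            prev = kmer
    return nodes, links


def dense_kmer_count(reads, k):
    """Number of distinct k-mers a dense graph would store as nodes."""
    return len({r[i:i + k] for r in reads for i in range(len(r) - k + 1)})


if __name__ == "__main__":
    reads = ["ACGTACGGTCA"]
    nodes, links = sparse_kmer_graph(reads, k=3, g=2)
    print(len(nodes), "sparse nodes vs", dense_kmer_count(reads, 3), "dense nodes")
```

A fixed stride per read is a simplification: reads that cover the same region at different offsets would select different k-mers, so a practical assembler has to reconcile its sampling with nodes that are already stored. The sketch ignores that, along with reverse complements and sequencing errors.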

SUBMITTER: Ye C 

PROVIDER: S-EPMC3369186 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

Exploiting sparseness in de novo genome assembly.

Ye C, Ma ZS, Cannon CH, Pop M, Yu DW

BMC Bioinformatics, 2012-04-19


Similar Datasets

| S-EPMC6533699 | biostudies-literature
| S-EPMC8289375 | biostudies-literature
| S-EPMC10997618 | biostudies-literature
| S-EPMC5681816 | biostudies-other
| PRJEB20588 | ENA
| PRJEB58627 | ENA
| PRJEB59410 | ENA
| S-EPMC3158087 | biostudies-literature
| S-EPMC6326164 | biostudies-literature
| S-EPMC6362891 | biostudies-literature