
Dataset Information


Employing whole genome mapping for optimal de novo assembly of bacterial genomes.


ABSTRACT: BACKGROUND: De novo genome assembly can be challenging due to inherent properties of the reads, even when using current state-of-the-art assembly tools based on de Bruijn graphs. Often users are not bio-informaticians and, in a black box approach, utilise assembly parameters such as contig length and N50 to generate whole genome sequences, potentially resulting in mis-assemblies. FINDINGS: Utilising several assembly tools based on de Bruijn graphs like Velvet, SPAdes and IDBA, we demonstrate that at the optimal N50, mis-assemblies do occur, even when using the multi-k-mer approaches of SPAdes and IDBA. We demonstrate that whole genome mapping can be used to identify these mis-assemblies and can guide the selection of the best k-mer size which yields the highest N50 without mis-assemblies. CONCLUSIONS: We demonstrate the utility of whole genome mapping (WGM) as a tool to identify mis-assemblies and to guide k-mer selection and higher quality de novo genome assembly of bacterial genomes.
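The selection rule described in the abstract (take the k-mer size with the highest N50 among assemblies that show no mis-assemblies) can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the function names `n50` and `pick_kmer`, the input layout, and the example numbers are all hypothetical, and the mis-assembly counts are assumed to come from an external check such as comparison against a whole genome map.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length of the contig at which the running total of
    contig lengths, taken from longest to shortest, first reaches
    half of the total assembly length. Returns 0 for an empty list.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


def pick_kmer(assemblies):
    """Pick the k-mer size with the highest N50 among clean assemblies.

    `assemblies` maps k-mer size -> (contig_lengths, misassembly_count).
    The mis-assembly count is assumed to be supplied by the caller
    (hypothetical input, e.g. derived from whole genome mapping).
    Returns None if every assembly contains mis-assemblies.
    """
    clean = {
        k: n50(lengths)
        for k, (lengths, misassemblies) in assemblies.items()
        if misassemblies == 0
    }
    if not clean:
        return None
    return max(clean, key=clean.get)


# Hypothetical toy data: k=51 has the highest raw N50 but is mis-assembled.
assemblies = {
    31: ([400, 300, 200, 100], 0),
    51: ([700, 200, 100], 2),
    71: ([500, 300, 200], 0),
}
best_k = pick_kmer(assemblies)
print(best_k)
```

In this toy input the assembly with the largest N50 is excluded because it carries mis-assemblies, which mirrors the abstract's point that the optimal N50 alone is not a safe selection criterion.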

SUBMITTER: Xavier BB 

PROVIDER: S-EPMC4118782 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

Employing whole genome mapping for optimal de novo assembly of bacterial genomes.

Basil Britto Xavier, Julia Sabirova, Pieter Moons, Jean-Pierre Hernalsteens, Henri de Greve, Herman Goossens, Surbhi Malhotra-Kumar

BMC Research Notes, 2014-07-30



Similar Datasets

| S-EPMC5389512 | biostudies-literature
| S-EPMC2731171 | biostudies-literature
| S-EPMC8306402 | biostudies-literature
| S-EPMC3629165 | biostudies-literature
| S-EPMC4120091 | biostudies-literature
| S-EPMC4779561 | biostudies-literature
| S-EPMC6052550 | biostudies-literature
| S-EPMC3558281 | biostudies-literature
| S-EPMC3767511 | biostudies-literature
| S-EPMC8727959 | biostudies-literature