
Dataset Information


Evaluation and validation of de novo and hybrid assembly techniques to derive high-quality genome sequences.


ABSTRACT:

MOTIVATION: To assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences.

RESULTS: Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies.

AVAILABILITY AND IMPLEMENTATION: All assembly tools except CLC Genomics Workbench are freely available under GNU General Public License.

CONTACT: brownsd@ornl.gov

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
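The summary statistics named in the abstract (number of contigs, N50) can be illustrated with a minimal sketch. It assumes the common definition of N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function names and the example contig lengths are hypothetical and are not taken from the study or from any of its assembly tools.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must contain at least one contig")


def assembly_summary(contig_lengths):
    """Collect the basic statistics often used to compare assemblies."""
    lengths = list(contig_lengths)
    return {
        "num_contigs": len(lengths),
        "total_length": sum(lengths),
        "largest_contig": max(lengths),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    example = [80, 70, 50, 40, 30, 20]
    print(assembly_summary(example))
```

Fewer contigs and a larger N50 generally indicate a more contiguous assembly, but neither measures correctness, which is why the study also relies on evaluation tools and experimental validation.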

SUBMITTER: Utturkar SM 

PROVIDER: S-EPMC4173024 | biostudies-literature | 2014 Oct

REPOSITORIES: biostudies-literature


Publications

Evaluation and validation of de novo and hybrid assembly techniques to derive high-quality genome sequences.

Utturkar SM, Klingeman DM, Land ML, Schadt CW, Doktycz MJ, Pelletier DA, Brown SD

Bioinformatics (Oxford, England), 2014-06-14, issue 19



Similar Datasets

| S-EPMC3983118 | biostudies-literature
| S-EPMC4999880 | biostudies-other
| S-EPMC7417191 | biostudies-literature
| S-EPMC6741814 | biostudies-literature
| S-EPMC10996661 | biostudies-literature
| S-EPMC8041623 | biostudies-literature
| S-EPMC4403674 | biostudies-literature
| S-EPMC4101353 | biostudies-literature
| S-EPMC8590762 | biostudies-literature
2016-06-07 | MSV000079801 | MassIVE