Dataset Information

A pipeline for completing bacterial genomes using in silico and wet lab approaches.


ABSTRACT:

BACKGROUND: Despite the large volume of data produced by next-generation sequencing (NGS) technologies and the sophisticated software dedicated to handling these data, gaps are commonly found in draft genome assemblies. These gaps compromise our ability to take full advantage of the genome data. This study aims to identify a practical approach that biologists can use to complete their own genome assemblies with commonly available tools and resources.

RESULTS: A pipeline was developed to assemble complete genomes primarily from NGS data. The input of the pipeline is paired-end Illumina sequence reads, and the output is a high-quality complete genome sequence. The pipeline alternates between computational and biological methods over seven steps. It combines the strengths of de novo assembly, reference-based assembly, customized programming, use of public databases, and wet lab experimentation. The application of the pipeline is demonstrated by the completion of a bacterial genome, that of Thermotoga sp. strain RQ7, a hydrogen-producing strain.

CONCLUSIONS: The developed pipeline provides an example of effective integration of computational and biological principles. It highlights the complementary roles that in silico and wet lab methodologies play in bioinformatics studies. The underlying principles and methods are applicable to similar studies on both prokaryotic and eukaryotic genomes.
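To make the notion of a gap in a draft assembly concrete, the following is a minimal Python sketch that lists gap coordinates in scaffold sequences. It is not part of the published pipeline. It assumes that gaps are written as runs of `N` in the scaffold sequences, a common convention that the abstract does not state, and the names `parse_fasta`, `find_gaps`, and the example scaffolds are hypothetical.

```python
import re


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def find_gaps(sequence, min_len=1):
    """Return 0-based, half-open (start, end) coordinates of each run of N/n.

    Assumption: gaps are encoded as runs of N at least min_len long.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


# Hypothetical two-scaffold draft assembly, for illustration only.
draft = ">scaffold_1 hypothetical\nACGTNNNNACGT\nNNACG\n>scaffold_2\nACGTACGT\n"
gaps = {name: find_gaps(seq) for name, seq in parse_fasta(draft).items()}

for name, spans in gaps.items():
    print(name, spans)
```

A gap list of this kind could be one way to decide which regions need further in silico or wet lab work, such as designing primers that flank each gap. How the authors actually located and closed gaps is described in the paper, not in this record.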

SUBMITTER: Puranik R 

PROVIDER: S-EPMC4331810 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

A pipeline for completing bacterial genomes using in silico and wet lab approaches.

Rutika Puranik, Guangri Quan, Jacob Werner, Rong Zhou, Zhaohui Xu

BMC Genomics, 2015-01-29


Similar Datasets

| S-EPMC2660880 | biostudies-other
| S-EPMC7367185 | biostudies-literature
| S-EPMC11234778 | biostudies-literature
| S-EPMC6737777 | biostudies-literature
| S-EPMC6522066 | biostudies-literature
| S-EPMC7406220 | biostudies-literature
| S-EPMC8441997 | biostudies-literature
| S-EPMC9973855 | biostudies-literature
| S-EPMC8373214 | biostudies-literature
| S-EPMC10385524 | biostudies-literature