Dataset Information

GFinisher: a new strategy to refine and finish bacterial genome assemblies.


ABSTRACT: Despite advances in DNA sequencing technology that have increased the number and length of reads, the reconstruction of complete genome sequences, so-called genome assembly, remains complex. Only 13% of prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented into contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process for bacterial genomes. The biological patterns observed in genomic sequences, together with a priori information, allow the identification of misassembled regions and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating a fuzzy GC skew graph for each contig in an assembly, then breaks the contigs at critical points in order to reassemble and close them using jFGap. It has been successfully applied to datasets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.
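
The abstract refers to GC skew graphs computed per contig. As a rough illustration only, the sketch below computes the classic windowed GC skew, (G - C) / (G + C), and flags windows where the skew changes sign. This is not GFinisher's fuzzy GC skew method, which the abstract does not specify; the function names, window size, and the use of a sign change as a stand-in for a "critical point" are all assumptions made for this example.

```python
from typing import List


def gc_skew(seq: str, window: int = 1000, step: int = 1000) -> List[float]:
    """Classic GC skew, (G - C) / (G + C), for each window of a sequence.

    Illustrative only: window and step sizes are arbitrary assumptions.
    """
    seq = seq.upper()
    skews = []
    # A sequence shorter than one window is treated as a single window.
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows without G or C get a skew of 0.0 to avoid division by zero.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew(skews: List[float]) -> List[float]:
    """Running sum of the windowed skew values."""
    total = 0.0
    out = []
    for s in skews:
        total += s
        out.append(total)
    return out


def sign_changes(skews: List[float]) -> List[int]:
    """Indices of windows whose skew sign differs from the previous nonzero window.

    Hypothetical stand-in for candidate break points; not the paper's criterion.
    """
    changes = []
    prev = 0.0
    for i, s in enumerate(skews):
        if s == 0.0:
            continue
        if prev != 0.0 and (s > 0) != (prev > 0):
            changes.append(i)
        prev = s
    return changes
```

In a circular bacterial chromosome the skew typically changes sign near the replication origin and terminus, so an unexpected sign change inside a contig is one plausible hint of a misjoin. A real finishing tool would need far more evidence than this toy check.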

SUBMITTER: Guizelini D 

PROVIDER: S-EPMC5056350 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature

Publications

GFinisher: a new strategy to refine and finish bacterial genome assemblies.

Dieval Guizelini, Roberto T. Raittz, Leonardo M. Cruz, Emanuel M. Souza, Maria B. R. Steffens, Fabio O. Pedrosa

Scientific Reports, 2016-10-10


Similar Datasets

| S-EPMC4348652 | biostudies-literature
| S-EPMC5695209 | biostudies-literature
| S-EPMC8812927 | biostudies-literature
| S-EPMC5481147 | biostudies-literature
| S-EPMC156095 | biostudies-literature
| S-EPMC4779615 | biostudies-literature
| S-EPMC8248862 | biostudies-literature
| S-EPMC9846640 | biostudies-literature
2023-04-20 | GSE149028 | GEO
| S-EPMC11261834 | biostudies-literature