Dataset Information

Versatile genome assembly evaluation with QUAST-LG.


ABSTRACT:

Motivation

The emergence of high-throughput sequencing technologies revolutionized genomics in the early 2000s. The next revolution came with the era of long-read sequencing. These technological advances, along with novel computational approaches, became the next step towards automatic pipelines capable of assembling nearly complete mammalian-size genomes.

Results

In this manuscript, we demonstrate the performance of state-of-the-art genome assembly software on six eukaryotic datasets sequenced using different technologies. To evaluate the results, we developed QUAST-LG, a tool that compares large genomic de novo assemblies against reference sequences and computes relevant quality metrics. Since genomes generally cannot be reconstructed completely due to complex repeat patterns and low-coverage regions, we introduce the concept of an upper bound assembly for a given genome and set of reads, and compute theoretical limits on assembly correctness and completeness. Using QUAST-LG, we show how close the assemblies are to the theoretical optimum, and how far this optimum is from the finished reference.

Availability and implementation

http://cab.spbu.ru/software/quast-lg.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Mikheenko A 

PROVIDER: S-EPMC6022658 | biostudies-literature | 2018 Jul

REPOSITORIES: biostudies-literature

Publications

Versatile genome assembly evaluation with QUAST-LG.

Alla Mikheenko, Andrey Prjibelski, Vladislav Saveliev, Dmitry Antipov, Alexey Gurevich

Bioinformatics (Oxford, England), 2018-07-01, issue 13



Similar Datasets

2013-07-15 | E-MTAB-1730 | biostudies-arrayexpress
2003-11-30 | GSE844 | GEO
2013-10-28 | E-GEOD-51619 | biostudies-arrayexpress
| S-EPMC3798757 | biostudies-literature
2013-10-28 | GSE51619 | GEO
| S-EPMC5749717 | biostudies-literature
| S-EPMC3290791 | biostudies-literature
| S-EPMC9045610 | biostudies-literature
| S-EPMC8877152 | biostudies-literature
| S-EPMC5563063 | biostudies-other