Dataset Information

REAPR: a universal tool for genome assembly evaluation.


ABSTRACT: Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.

SUBMITTER: Hunt M 

PROVIDER: S-EPMC3798757 | biostudies-literature | 2013 May

REPOSITORIES: biostudies-literature

Publications

REAPR: a universal tool for genome assembly evaluation.

Hunt M, Kikuchi T, Sanders M, Newbold C, Berriman M, Otto TD

Genome Biology, 2013-05-27, issue 5


Similar Datasets

| S-EPMC10556450 | biostudies-literature
| S-EPMC4184257 | biostudies-literature
| S-EPMC8627028 | biostudies-literature
2013-07-15 | E-MTAB-1730 | biostudies-arrayexpress
| S-EPMC6022658 | biostudies-literature
| S-EPMC6203371 | biostudies-literature
| S-EPMC9825286 | biostudies-literature
2009-01-19 | GSE14435 | GEO
2010-05-06 | E-GEOD-14435 | biostudies-arrayexpress
| S-EPMC5749717 | biostudies-literature