Dataset Information

ELECTOR: evaluator for long reads correction methods.


ABSTRACT: The error rates of third-generation sequencing data have remained above 5%, with errors consisting mainly of insertions and deletions. Consequently, a growing number of diverse long-read correction methods have been proposed. The quality of the correction has a major impact on downstream processes, so methods that evaluate error correction tools with precise and reliable statistics are crucially needed. Such evaluation methods rely on costly alignments to assess the quality of the corrected reads. Key features must therefore allow fast comparison of different tools and scale to the increasing length of long reads. Our tool, ELECTOR, evaluates long-read correction and is directly compatible with a wide range of error correction tools. Because it is based on multiple sequence alignment, we introduce a new algorithmic strategy for alignment segmentation, which enables it to scale to large instances using reasonable resources. To our knowledge, ours is the only method that can produce reproducible correction benchmarks on the latest ultra-long reads (>100 kb). It is also faster than the current state of the art on other datasets and provides a wider set of metrics to assess the improvement in read quality after correction. ELECTOR is available on GitHub (https://github.com/kamimrcht/ELECTOR) and Bioconda.

SUBMITTER: Marchet C 

PROVIDER: S-EPMC7671326 | biostudies-literature | 2020 Mar

REPOSITORIES: biostudies-literature

Publications

ELECTOR: evaluator for long reads correction methods.

Camille Marchet, Pierre Morisse, Lolita Lecompte, Arnaud Lefebvre, Thierry Lecroq, Pierre Peterlongo, Antoine Limasset

NAR Genomics and Bioinformatics, 2019-11-14 (issue 1)


Similar Datasets

| S-EPMC6362602 | biostudies-literature
| S-EPMC6966875 | biostudies-literature
| S-EPMC10690975 | biostudies-literature
| S-EPMC8557608 | biostudies-literature
| S-EPMC10245045 | biostudies-literature
| S-EPMC4073643 | biostudies-literature
| S-EPMC8882733 | biostudies-literature
| S-EPMC7671305 | biostudies-literature
| S-EPMC6265270 | biostudies-literature
| S-EPMC5351550 | biostudies-literature