Dataset Information

Efficient alignment of pyrosequencing reads for re-sequencing applications.

ABSTRACT:

BACKGROUND: Over the past few years, new massively parallel DNA sequencing technologies have emerged. These platforms generate massive amounts of data per run, greatly reducing the cost of DNA sequencing. However, they also raise important computational difficulties, mostly because of the huge volume of data produced, but also because of platform-specific characteristics such as read length and sequencing errors. Among the most critical problems is that of efficiently and accurately mapping reads to a reference genome in the context of re-sequencing projects.

RESULTS: We present an efficient method for the local alignment of pyrosequencing reads produced by the GS FLX (454) system against a reference sequence. Our approach exploits the characteristics of the data in these re-sequencing applications and combines state-of-the-art indexing techniques with a flexible seed-based approach, leading to a fast and accurate algorithm that needs very little user parameterization. An evaluation performed using real and simulated data shows that the proposed method outperforms a number of mainstream tools on the quantity and quality of successful alignments, as well as on execution time.

CONCLUSIONS: The proposed methodology was implemented in a software tool called TAPyR (Tool for the Alignment of Pyrosequencing Reads), which is publicly available from http://www.tapyr.net.
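The abstract names a seed-based approach on top of an index but gives no algorithmic detail. As a rough illustration of the general idea only, the sketch below indexes the reference by k-mers, lets each seed from a read vote for a candidate start position, and verifies the best-supported candidate. This is not TAPyR's algorithm: the function names `build_kmer_index` and `map_read`, the hash-based index, the seed length, and the ungapped verification are all assumptions made for this example. In particular, it ignores insertions and deletions, which a real aligner for pyrosequencing reads would have to handle.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the positions where it starts.

    Hypothetical helper: a plain hash index stands in for whatever
    index structure a production aligner would use.
    """
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k):
    """Return (reference_start, mismatches) for the best-supported
    ungapped placement of the read, or None if no seed matches."""
    votes = Counter()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset  # implied start of the whole read
            if 0 <= start <= len(reference) - len(read):
                votes[start] += 1
    if not votes:
        return None
    start, _ = votes.most_common(1)[0]
    window = reference[start:start + len(read)]
    mismatches = sum(a != b for a, b in zip(read, window))
    return start, mismatches


reference = "ACGTTGCATGCCGATAGCTAGGCTTACG"
index = build_kmer_index(reference, k=4)
read = "GCCGATTGCTAG"  # reference[9:21] with one substituted base
print(map_read(read, reference, index, k=4))  # (9, 1)
```

Voting on the implied start position lets several short exact seeds agree on one placement even when a sequencing error breaks some of them, which is the basic reason seed-based methods tolerate errors.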

SUBMITTER: Fernandes F

PROVIDER: S-EPMC3118166 | biostudies-literature | 2011

REPOSITORIES: biostudies-literature

Publications

Efficient alignment of pyrosequencing reads for re-sequencing applications.

Francisco Fernandes, Paulo G. S. da Fonseca, Luis M. S. Russo, Arlindo L. Oliveira, Ana T. Freitas

BMC Bioinformatics, 2011-05-16

PMID: 21672185

Similar Datasets

Re-alignment of the unmapped reads with base quality score. | S-EPMC4402702 | biostudies-literature

Accurate spliced alignment of long RNA sequencing reads. | S-EPMC8665758 | biostudies-literature

FANSe2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications. | S-EPMC3990525 | biostudies-literature

Next generation sequencing reads comparison with an alignment-free distance. | S-EPMC4265526 | biostudies-literature

Alignment-free sequence comparison based on next-generation sequencing reads. | S-EPMC3581251 | biostudies-literature

BatAlign: an incremental method for accurate alignment of sequencing reads. | S-EPMC4652746 | biostudies-literature

BIGrat: a repeat resolver for pyrosequencing-based re-sequencing with Newbler. | S-EPMC3599625 | biostudies-literature

smsMap: mapping single molecule sequencing reads by locating the alignment starting positions. | S-EPMC7430848 | biostudies-literature

Rapid and accurate alignment of nucleotide conversion sequencing reads with HISAT-3N. | S-EPMC8256862 | biostudies-literature

Filtering duplicate reads from 454 pyrosequencing data. | S-EPMC3605598 | biostudies-literature