Dataset Information

RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA.

ABSTRACT: The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance.
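
The abstract's distinction between codon-sized indels and frameshifts can be illustrated with a minimal sketch. This is not RAMICS's method, which the abstract says relies on profile hidden Markov models. It is a simple length-based rule applied to an alignment that has already been computed, assuming `-` marks a gap column. The function name `classify_gaps` and the tuple layout are hypothetical, chosen only for illustration.

```python
from itertools import groupby


def classify_gaps(aligned_ref: str, aligned_read: str):
    """Label each gap run in a pairwise alignment as frame-preserving or not.

    Hypothetical helper for illustration only: both inputs are assumed to be
    equal-length aligned strings in which '-' marks a gap.
    Returns a list of (column, op, length, kind) tuples, where op is 'I' for
    an insertion in the read and 'D' for a deletion from the read.
    """
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned sequences must have equal length")

    # Reduce each alignment column to a single operation code.
    ops = []
    for ref_base, read_base in zip(aligned_ref, aligned_read):
        if ref_base == "-" and read_base != "-":
            ops.append("I")  # base present only in the read
        elif read_base == "-" and ref_base != "-":
            ops.append("D")  # base present only in the reference
        else:
            ops.append("M")  # aligned column (match or mismatch)

    results = []
    column = 0
    for op, run in groupby(ops):
        length = len(list(run))
        if op != "M":
            # A gap whose length is a multiple of 3 keeps the reading frame.
            kind = "codon-sized" if length % 3 == 0 else "frameshift"
            results.append((column, op, length, kind))
        column += length
    return results


if __name__ == "__main__":
    print(classify_gaps("ATGAAACCC", "ATG---CCC"))
    print(classify_gaps("ATGA-ACCC", "ATGATACCC"))
```

A length-only rule like this cannot tell a genuine one-base mutation from a sequencing error at a homopolymer, which is the case the abstract says the tool's error model is meant to address.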

SUBMITTER: Wright IA

PROVIDER: S-EPMC4117746 | biostudies-literature | 2014 Jul

REPOSITORIES: biostudies-literature

Publications

RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA.

Imogen A Wright, Simon A Travers

Nucleic Acids Research, 2014 May 26, issue 13

PMID: 24861618

Similar Datasets

Identifying micro-inversions using high-throughput sequencing reads. | S-EPMC4895285 | biostudies-literature

HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment. | S-EPMC4948903 | biostudies-literature

Fulcrum: condensing redundant reads from high-throughput sequencing studies. | S-EPMC3348557 | biostudies-literature

Statistical Approach for Biologically Relevant Gene Selection from High-Throughput Gene Expression Data. | S-EPMC7712650 | biostudies-literature

BLESS: bloom filter-based error correction solution for high-throughput sequencing reads. | S-EPMC6365934 | biostudies-literature

SparkBWA: Speeding Up the Alignment of High-Throughput DNA Sequencing Data. | S-EPMC4868289 | biostudies-literature

Efficient alignment of pyrosequencing reads for re-sequencing applications. | S-EPMC3118166 | biostudies-literature

Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. | S-EPMC3119603 | biostudies-literature

CASPER: context-aware scheme for paired-end reads from high-throughput amplicon sequencing. | S-EPMC4168710 | biostudies-literature

Centroid based clustering of high throughput sequencing reads based on n-mer counts. | S-EPMC3848435 | biostudies-literature