Dataset Information

Specific alignment of structured RNA: stochastic grammars and sequence annealing.


ABSTRACT: Whole-genome screens suggest that eukaryotic genomes are dense with non-coding RNAs (ncRNAs). We introduce a novel approach to RNA multiple alignment which couples a generative probabilistic model of sequence and structure with an efficient sequence annealing approach for exploring the space of multiple alignments. This leads to a new software program, Stemloc-AMA, that is both accurate and specific in the alignment of multiple related RNA sequences. When tested on the benchmark datasets BRalibase II and BRalibase 2.1, Stemloc-AMA has comparable sensitivity to and better specificity than the best competing methods. We use a large-scale random sequence experiment to show that while most alignment programs maximize sensitivity at the expense of specificity, even to the point of giving complete alignments of non-homologous sequences, Stemloc-AMA aligns only sequences with detectable homology and leaves unrelated sequences largely unaligned. Such accurate and specific alignments are crucial for comparative-genomics analysis, from inferring phylogeny to estimating substitution rates across different lineages. Stemloc-AMA is available from http://biowiki.org/StemLocAMA as part of the dart software package for sequence analysis.

SUBMITTER: Bradley RK 

PROVIDER: S-EPMC2732270 | biostudies-literature | 2008 Dec

REPOSITORIES: biostudies-literature

Publications

Specific alignment of structured RNA: stochastic grammars and sequence annealing.

Bradley RK, Pachter L, Holmes I

Bioinformatics (Oxford, England), 2008-09-16, issue 23


Similar Datasets

| S-EPMC169020 | biostudies-literature
| S-EPMC3102635 | biostudies-literature
| S-EPMC2709569 | biostudies-literature
| S-EPMC1635699 | biostudies-literature
| S-EPMC1579236 | biostudies-literature
| S-EPMC2677745 | biostudies-literature
| S-EPMC147093 | biostudies-other
| S-EPMC2734164 | biostudies-literature
| S-EPMC8735865 | biostudies-literature
2024-10-10 | PXD050548 | Pride