
Dataset Information


Computing the probability of RNA hairpin and multiloop formation.


ABSTRACT: We describe four novel algorithms, RNAhairpin, RNAmloopNum, RNAmloopOrder, and RNAmloopHP, which compute the Boltzmann partition function for global structural constraints: respectively, the number of hairpins, the number of multiloops, the maximum order (or depth) of multiloops, and the simultaneous number of hairpins and multiloops. Given an RNA sequence of length n and a user-specified integer 0 ≤ K ≤ n, RNAhairpin (resp. RNAmloopNum and RNAmloopOrder) computes the partition functions Z(k) for each 0 ≤ k ≤ K in time O(K²n³) and space O(Kn²), while RNAmloopHP computes the partition functions Z(m, h) for 0 ≤ m ≤ M multiloops and 0 ≤ h ≤ H hairpins, with run time O(M²H²n³) and space O(MHn²). In addition, programs such as RNAhairpin (resp. RNAmloopHP) sample from the low-energy ensemble of structures having h hairpins (resp. m multiloops and h hairpins), for given h, m. Moreover, by using the fast Fourier transform (FFT), RNAhairpin and RNAmloopNum have been improved to run in time O(n⁴) and space O(n²), although this improvement is not possible for RNAmloopOrder. We present two applications of the novel algorithms. First, we show that for many Rfam families of RNA, structures sampled from RNAmloopHP are more accurate than the minimum free-energy structure; for instance, sensitivity improves by almost 24% for transfer RNA, while for certain ribozyme families, there is an improvement of around 5%. Second, we show that the probabilities p(k) = Z(k)/Z of forming k hairpins (resp. multiloops) provide discriminating novel features for a support vector machine or relevance vector machine binary classifier for Rfam families of RNA. Our data suggest that multiloop order does not provide any significant discriminatory power over that of hairpin and multiloop number, and since these probabilities can be efficiently computed using the FFT, hairpin and multiloop formation probabilities could be added to other features in existing noncoding RNA gene finders. Our programs, written in C/C++, are publicly available online at http://bioinformatics.bc.edu/clotelab/RNAparametric.
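
The quantities Z(k) and p(k) = Z(k)/Z from the abstract can be illustrated with a small sketch. This is not the authors' C/C++ code and does not use their energy model. It assumes a toy model in which every base pair carries the same Boltzmann factor (the hypothetical parameter `pair_weight`), canonical and GU pairs only, and at least 3 unpaired bases in a hairpin loop. The function names `hairpin_partition` and `hairpin_probabilities` are invented for this illustration. Each subsequence stores a polynomial whose k-th coefficient is Z(k); multiplying two truncated polynomials costs O(K²) per split point, which is in line with the O(K²n³) time quoted above.

```python
# Toy illustration only: uniform weight per base pair, not a realistic energy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
THETA = 3  # assumed minimum number of unpaired bases inside a hairpin loop


def poly_mul(a, b, K):
    """Multiply two coefficient lists, keeping only degrees 0..K."""
    out = [0.0] * (K + 1)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for j, bj in enumerate(b):
            if i + j > K:
                break
            out[i + j] += ai * bj
    return out


def hairpin_partition(seq, K, pair_weight=1.0):
    """Return [Z(0), ..., Z(K)], where Z(k) sums pair_weight**(number of pairs)
    over all secondary structures of seq with exactly k hairpins."""
    n = len(seq)
    one = [1.0] + [0.0] * K  # only the empty structure, which has 0 hairpins
    # Z[i][j] is the polynomial for the half-open slice seq[i:j].
    Z = [[one[:] for _ in range(n + 1)] for _ in range(n + 1)]
    for d in range(THETA + 2, n + 1):
        for i in range(0, n - d + 1):
            j = i + d
            total = Z[i][j - 1][:]  # case 1: last base (index j-1) is unpaired
            for r in range(i, j - THETA - 1):  # case 2: r pairs with j-1
                if (seq[r], seq[j - 1]) not in PAIRS:
                    continue
                inside = Z[r + 1][j - 1][:]
                inside[0] = 0.0  # an empty interior means (r, j-1) closes a hairpin ...
                if K >= 1:
                    inside[1] += 1.0  # ... so it is counted with exactly 1 hairpin
                term = poly_mul(Z[i][r], inside, K)
                for k in range(K + 1):
                    total[k] += pair_weight * term[k]
            Z[i][j] = total
    return Z[0][n]


def hairpin_probabilities(seq, pair_weight=1.0):
    """Return p(k) = Z(k)/Z for every k that is possible for this sequence."""
    K = len(seq) // (THETA + 2)  # each hairpin needs at least THETA + 2 bases
    zk = hairpin_partition(seq, K, pair_weight)
    total = sum(zk)  # full partition function Z
    return [z / total for z in zk]


if __name__ == "__main__":
    print(hairpin_partition("GAAACGAAAC", 2))  # [1.0, 3.0, 1.0]
    print(hairpin_probabilities("GAAACGAAAC"))
```

With `pair_weight=1.0` the sketch simply counts structures: for GAAACGAAAC there is one structure with no hairpin, three with one hairpin, and one with two. A realistic implementation would replace the uniform weight with loop-dependent free energies.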

SUBMITTER: Ding Y 

PROVIDER: S-EPMC3948487 | biostudies-literature | 2014 Mar

REPOSITORIES: biostudies-literature


Publications

Computing the probability of RNA hairpin and multiloop formation.

Ding Y, Lorenz WA, Dotu I, Senter E, Clote P

Journal of Computational Biology: a journal of computational molecular cell biology, 2014-02-21, issue 3



Similar Datasets

| S-EPMC5982174 | biostudies-literature
| S-EPMC17733 | biostudies-literature
| S-EPMC5572644 | biostudies-literature
| S-EPMC3388226 | biostudies-literature
| S-EPMC5113201 | biostudies-literature
| S-EPMC7203009 | biostudies-literature
| S-EPMC6923921 | biostudies-literature
| S-EPMC3141925 | biostudies-literature
| S-EPMC15965 | biostudies-literature
| S-EPMC1805602 | biostudies-literature