Dataset Information

Identifying novel sequence variants of RNA 3D motifs.


ABSTRACT: Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson-Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved, and it is available free for download.
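The thresholding step mentioned in the abstract (scoring randomly generated sequences to control the false positive rate) can be illustrated with a minimal Python sketch. This is not the JAR3D implementation: `toy_score` is a hypothetical stand-in for a motif model score, `random_sequence` and `acceptance_threshold` are names invented here, and only an acceptance cutoff is shown. The idea is simply to pick a cutoff from the empirical distribution of background scores so that at most a chosen fraction of random sequences would be accepted.

```python
import random


def random_sequence(length, rng):
    """Draw a uniformly random RNA sequence (assumed background model)."""
    return "".join(rng.choice("ACGU") for _ in range(length))


def toy_score(seq):
    """Hypothetical stand-in for a motif model score: the G/C fraction."""
    return sum(1 for base in seq if base in "GC") / len(seq)


def acceptance_threshold(background_scores, false_positive_rate):
    """Return a cutoff such that at most `false_positive_rate` of the
    background scores are strictly greater than it."""
    if not background_scores:
        raise ValueError("need at least one background score")
    if not 0 <= false_positive_rate < 1:
        raise ValueError("false_positive_rate must be in [0, 1)")
    ordered = sorted(background_scores)
    n = len(ordered)
    # Number of background scores we tolerate above the cutoff.
    allowed = min(int(false_positive_rate * n), n - 1)
    return ordered[n - allowed - 1]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    background = [toy_score(random_sequence(12, rng)) for _ in range(1000)]
    cutoff = acceptance_threshold(background, 0.05)
    accepted = sum(1 for s in background if s > cutoff)
    print(f"cutoff={cutoff:.3f}, background sequences accepted={accepted}")
```

A candidate sequence would then be accepted only if its score is strictly greater than the cutoff. In this sketch ties at the cutoff are rejected, which keeps the empirical false positive rate at or below the target.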

SUBMITTER: Zirbel CL 

PROVIDER: S-EPMC4551918 | biostudies-literature | 2015 Sep

REPOSITORIES: biostudies-literature

Publications

Identifying novel sequence variants of RNA 3D motifs.

Craig L. Zirbel, James Roll, Blake A. Sweeney, Anton I. Petrov, Meg Pirrung, Neocles B. Leontis

Nucleic Acids Research, 2015 Jun 29, issue 15


Similar Datasets

| S-EPMC3854523 | biostudies-literature
| S-EPMC2651809 | biostudies-literature
| S-EPMC168941 | biostudies-literature
| S-EPMC5289855 | biostudies-literature
| S-EPMC2887949 | biostudies-literature
| S-EPMC2703912 | biostudies-literature
| S-EPMC7242328 | biostudies-literature
| S-EPMC3965019 | biostudies-literature
| S-EPMC7001330 | biostudies-literature
| S-EPMC8532352 | biostudies-literature