Dataset Information

Uncovering deeply conserved motif combinations in rapidly evolving noncoding sequences.


ABSTRACT:

Background

Animal genomes contain thousands of long noncoding RNA (lncRNA) genes, a growing subset of which are thought to be functionally important. This functionality is often mediated by short sequence elements scattered throughout the RNA sequence that correspond to binding sites for small RNAs and RNA-binding proteins. Throughout vertebrate evolution, the sequences of lncRNA genes have changed extensively, so that it is often impossible to obtain significant alignments between lncRNA sequences from evolutionarily distant species, even when synteny is evident. This often makes it impossible to identify conserved lncRNAs that are likely to be functional or to prioritize constrained regions for experimental interrogation.

Results

We introduce here LncLOOM, a novel algorithmic framework for the discovery and evaluation of syntenic combinations of short motifs. LncLOOM is based on a graph representation of the input sequences and uses integer linear programming to efficiently compare dozens of sequences that have thousands of bases each and to evaluate the significance of the recovered motifs. We show that LncLOOM is capable of identifying specific, biologically relevant motifs which are conserved throughout vertebrates and beyond in lncRNAs and 3'UTRs, including novel functional RNA elements in the CHASERR lncRNA that are required for regulation of CHD2 expression.
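To make the idea of "syntenic combinations of short motifs" concrete, the following is a minimal sketch and not the LncLOOM implementation. It assumes a strongly simplified setting: only k-mers that occur exactly once in every input sequence are considered, and the longest chain of such k-mers that appear in the same order, without overlapping, in all sequences is found by dynamic programming over a precedence graph rather than by integer linear programming. The function names `unique_kmer_positions` and `conserved_ordered_kmers` and the example sequences are hypothetical.

```python
from collections import Counter


def unique_kmer_positions(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start index."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {m: i for i, m in enumerate(kmers) if counts[m] == 1}


def conserved_ordered_kmers(seqs, k):
    """Longest chain of k-mers found in the same order in all sequences.

    Simplifying assumptions (not part of the source): exact k-mer matches
    only, each k-mer must be unique within every sequence, and consecutive
    k-mers in the chain must not overlap in any sequence.
    """
    tables = [unique_kmer_positions(s, k) for s in seqs]
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    # Nodes of the graph, ordered by position in the first sequence.
    nodes = sorted(shared, key=lambda m: tables[0][m])
    if not nodes:
        return []

    def precedes(a, b):
        # Edge a -> b only if a ends before b starts in every sequence.
        return all(t[a] + k <= t[b] for t in tables)

    # Longest path in the resulting directed acyclic graph.
    best = [1] * len(nodes)
    prev = [-1] * len(nodes)
    for j in range(len(nodes)):
        for i in range(j):
            if precedes(nodes[i], nodes[j]) and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
                prev[j] = i

    j = max(range(len(nodes)), key=best.__getitem__)
    chain = []
    while j != -1:
        chain.append(nodes[j])
        j = prev[j]
    return chain[::-1]


if __name__ == "__main__":
    toy = ["AAAACGTCCCCTGCAGGGG", "TTACGTTTTTGCATT"]
    print(conserved_ordered_kmers(toy, 4))  # ['ACGT', 'TGCA']
```

The two toy sequences share no alignable flanking sequence, yet the chain ACGT followed by TGCA is recovered in both. Handling repeated motifs, motifs of varying length, and significance estimation is where a formulation such as the integer linear program described above becomes necessary; none of that is attempted here.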

Conclusions

We expect that LncLOOM will become a broadly used approach for the discovery of functionally relevant elements in the noncoding genome.

SUBMITTER: Ross CJ 

PROVIDER: S-EPMC7798263 | biostudies-literature | 2021 Jan

REPOSITORIES: biostudies-literature

Publications

Uncovering deeply conserved motif combinations in rapidly evolving noncoding sequences.

Ross CJ, Rom A, Spinrad A, Gelbard-Solodkin D, Degani N, Ulitsky I

Genome Biology, 2021-01-11, issue 1


Similar Datasets

2021-01-18 | PXD023093 | Pride
| S-EPMC2942038 | biostudies-literature
| S-EPMC4159006 | biostudies-literature
| S-EPMC2394770 | biostudies-literature
| S-EPMC403677 | biostudies-literature
| S-EPMC2992528 | biostudies-literature
| S-EPMC7240152 | biostudies-literature
| S-EPMC8233505 | biostudies-literature
| S-EPMC8042745 | biostudies-literature
| S-EPMC4111549 | biostudies-literature