Dataset Information

Efficient motif search in ranked lists and applications to variable gap motifs.


ABSTRACT: Sequence elements at all levels (DNA, RNA and protein) play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs (two half sites with a flexible-length gap in between) and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation.
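
To make the idea of scoring a variable gap motif against a ranked list concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the statistic, and the suffix-tree machinery is replaced here by a plain regular-expression scan. The function names, the fixed top-of-list cutoff, and the use of a hypergeometric tail as the enrichment score are all assumptions made for illustration.

```python
import re
from math import comb


def gapped_motif_regex(left, right, min_gap, max_gap):
    """Hypothetical helper: left half site, then min_gap..max_gap
    arbitrary characters, then right half site."""
    return re.compile(
        f"{re.escape(left)}.{{{min_gap},{max_gap}}}{re.escape(right)}"
    )


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n items are drawn without replacement
    from N items, B of which are hits."""
    upper = min(n, B)
    favourable = sum(
        comb(B, i) * comb(N - B, n - i) for i in range(b, upper + 1)
    )
    return favourable / comb(N, n)


def top_enrichment(ranked_seqs, pattern, n_top):
    """Count motif occurrences in the top n_top sequences of a ranked
    list and score them against the whole list.

    Assumption: a single fixed cutoff is used; a real tool would
    typically search over cutoffs and correct for that search.
    """
    hits = [bool(pattern.search(s)) for s in ranked_seqs]
    N, B = len(hits), sum(hits)
    b = sum(hits[:n_top])
    return b, B, hypergeom_tail(N, B, n_top, b)
```

Calling `top_enrichment(ranked, gapped_motif_regex("GGTCA", "TGACC", 0, 5), n_top)` on a list `ranked` sorted by measured binding strength would report how many top-ranked sequences contain the two half sites separated by zero to five nucleotides, together with a tail probability for that count. The half-site strings here are illustrative placeholders, not motifs taken from the record.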

SUBMITTER: Leibovich L 

PROVIDER: S-EPMC3401424 | biostudies-literature | 2012 Jul

REPOSITORIES: biostudies-literature

Publications

Efficient motif search in ranked lists and applications to variable gap motifs.

Limor Leibovich, Zohar Yakhini

Nucleic Acids Research, 2012-03-13, issue 13

Similar Datasets

S-EPMC1829477 | biostudies-literature
S-EPMC4021615 | biostudies-literature
S-EPMC6286601 | biostudies-literature
S-EPMC6235700 | biostudies-literature
S-EPMC6357556 | biostudies-literature
S-EPMC2872879 | biostudies-other
S-EPMC2896709 | biostudies-literature
S-EPMC4326265 | biostudies-literature
S-EPMC1679804 | biostudies-literature
S-EPMC3667425 | biostudies-literature