Dataset Information

NestedMICA as an ab initio protein motif discovery tool.


ABSTRACT:

Background

Discovering overrepresented patterns in amino acid sequences is an important step in protein functional element identification. We adapted and extended NestedMICA, an ab initio motif finder originally developed for finding transcription factor binding site motifs, to find short protein signals, and compared its performance with another popular protein motif finder, MEME. NestedMICA, an open source protein motif discovery tool written in Java, is driven by a Monte Carlo technique called Nested Sampling. It uses multi-class sequence background models to represent different "uninteresting" parts of sequences that do not contain motifs of interest. To assess NestedMICA as a protein motif finder, we tested it on synthetic datasets produced by spiking instances of known motifs into a randomly selected set of protein sequences. NestedMICA was also tested on a biologically authentic test set, where we evaluated its performance with respect to varying sequence length.
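
The spiking step mentioned above can be illustrated with a minimal sketch. This is not the authors' procedure or any part of NestedMICA. It assumes a uniform random background, a hypothetical toy weight matrix (`toy_pwm`), and that a motif instance overwrites a window of the host sequence so that sequence length is preserved. The function names `sample_from_pwm` and `spike_motif` are illustrative only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def sample_from_pwm(pwm, rng):
    """Draw one motif instance from a weight matrix.

    `pwm` is a list of dicts (one per motif column) mapping
    residue -> probability.
    """
    instance = []
    for column in pwm:
        residues = list(column.keys())
        weights = list(column.values())
        instance.append(rng.choices(residues, weights=weights, k=1)[0])
    return "".join(instance)


def spike_motif(sequences, pwm, fraction, rng):
    """Overwrite a random window in about `fraction` of the sequences.

    Assumption: spiking replaces residues rather than inserting them,
    so every sequence keeps its length. Returns the spiked sequences
    and, per sequence, the motif start offset (or None if untouched).
    """
    width = len(pwm)
    n_spiked = round(fraction * len(sequences))
    chosen = set(rng.sample(range(len(sequences)), n_spiked))
    spiked, positions = [], []
    for i, seq in enumerate(sequences):
        if i in chosen and len(seq) >= width:
            start = rng.randrange(len(seq) - width + 1)
            instance = sample_from_pwm(pwm, rng)
            seq = seq[:start] + instance + seq[start + width:]
            positions.append(start)
        else:
            positions.append(None)
        spiked.append(seq)
    return spiked, positions


# Toy example: 20 random sequences of length 50, motif in a quarter of them.
rng = random.Random(0)
background = [
    "".join(rng.choice(AMINO_ACIDS) for _ in range(50)) for _ in range(20)
]
# Hypothetical 4-column motif; column 2 is degenerate (D or E).
toy_pwm = [{"K": 1.0}, {"D": 0.5, "E": 0.5}, {"E": 1.0}, {"L": 1.0}]
spiked, positions = spike_motif(background, toy_pwm, 0.25, rng)
```

Recording the offsets in `positions` gives a ground truth against which a motif finder's predictions could be scored. Lowering `fraction` mimics motifs that occur in only a small share of the sequences.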

Results

In general, NestedMICA recovered most of the short (3-9 amino acids long) test protein motifs spiked into a test set of sequences at different frequencies. We also showed that it can find multiple motifs at the same time. In all the assessment experiments we carried out, its overall motif discovery performance was better than that of MEME.

Conclusion

NestedMICA proved itself to be a robust and sensitive ab initio protein motif finder, even for relatively short motifs that exist in only a small fraction of sequences.

Availability

NestedMICA is available under the Lesser GPL open-source license from: http://www.sanger.ac.uk/Software/analysis/nmica/

SUBMITTER: Dogruel M 

PROVIDER: S-EPMC2267705 | biostudies-literature | 2008 Jan

REPOSITORIES: biostudies-literature

Publications

NestedMICA as an ab initio protein motif discovery tool.

Mutlu Doğruel, Thomas A. Down, Tim J. P. Hubbard

BMC Bioinformatics, 2008-01-14


Similar Datasets

| S-EPMC7184783 | biostudies-literature
| S-EPMC10448985 | biostudies-literature
| S-EPMC6226323 | biostudies-literature
| S-EPMC6084434 | biostudies-literature
| S-EPMC4509844 | biostudies-literature
| S-EPMC3081830 | biostudies-literature
| S-EPMC8723153 | biostudies-literature
| S-EPMC3551984 | biostudies-literature
| S-EPMC4555859 | biostudies-literature
| S-EPMC10998567 | biostudies-literature