
Dataset Information


ToPS: a framework to manipulate probabilistic models of sequence data.


ABSTRACT: Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) independent and identically distributed process; (ii) variable-length Markov chain; (iii) inhomogeneous Markov chain; (iv) hidden Markov model; (v) profile hidden Markov model; (vi) pair hidden Markov model; (vii) generalized hidden Markov model (GHMM); and (viii) similarity-based sequence weighting. The framework includes functionality for training, simulation, and decoding of the models. Additionally, it provides two methods to help with parameter setting: the Akaike and Bayesian information criteria (AIC and BIC). The models can be used stand-alone, combined in Bayesian classifiers, or included in more complex, multi-model probabilistic architectures using GHMMs. In particular, the framework provides a novel, flexible implementation of decoding in GHMMs that detects when the architecture can be traversed efficiently.
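The role of AIC and BIC in parameter setting can be illustrated with a minimal sketch that picks the order of a fixed-order Markov chain for a DNA sequence. This is not the ToPS API: the function names (`markov_log_likelihood`, `num_parameters`, `aic`, `bic`, `select_order`) are hypothetical, the model is a plain maximum-likelihood Markov chain rather than any of the ToPS model implementations, and the four-letter alphabet is an assumption.

```python
import math
from collections import Counter


def markov_log_likelihood(seq, order, start):
    """Log-likelihood of seq[start:] under a maximum-likelihood Markov chain.

    Every candidate order is scored on the same positions (start..end),
    so the likelihoods of different orders are comparable.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = seq[i - order:i]  # empty string when order == 0
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1
    return sum(
        n * math.log(n / context_counts[context])
        for (context, _symbol), n in transition_counts.items()
    )


def num_parameters(alphabet_size, order):
    """Free parameters: one distribution per context, minus one per row."""
    return (alphabet_size ** order) * (alphabet_size - 1)


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


def select_order(seq, max_order, alphabet_size=4):
    """Return {order: (AIC, BIC)} for orders 0..max_order (lower is better)."""
    n = len(seq) - max_order  # number of scored positions
    scores = {}
    for order in range(max_order + 1):
        log_lik = markov_log_likelihood(seq, order, start=max_order)
        k = num_parameters(alphabet_size, order)
        scores[order] = (aic(log_lik, k), bic(log_lik, k, n))
    return scores


if __name__ == "__main__":
    scores = select_order("ACGT" * 50, max_order=2)
    for order, (aic_value, bic_value) in scores.items():
        print(order, round(aic_value, 1), round(bic_value, 1))
```

Both criteria penalize the log-likelihood by the number of free parameters; BIC's penalty also grows with the number of observations, so it tends to favor smaller models than AIC on long sequences.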

SUBMITTER: Kashiwabara AY 

PROVIDER: S-EPMC3789777 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature


Publications

ToPS: a framework to manipulate probabilistic models of sequence data.

Kashiwabara AY, Bonadio I, Onuchic V, Amado F, Mathias R, Durham AM

PLoS Computational Biology, 2013-10-03; 10



Similar Datasets

| S-EPMC8563988 | biostudies-literature
| S-EPMC10124968 | biostudies-literature
| S-EPMC2916723 | biostudies-literature
| S-EPMC3044293 | biostudies-literature
| S-EPMC8415599 | biostudies-literature
| S-EPMC7243991 | biostudies-literature
| S-EPMC2570367 | biostudies-literature
| S-EPMC3900378 | biostudies-literature
| S-EPMC7846514 | biostudies-literature
| S-EPMC7898129 | biostudies-literature