Dataset Information

Detection of dispersed short tandem repeats using reversible jump Markov chain Monte Carlo.


ABSTRACT: Tandem repeats occur frequently in biological sequences. They are important for studying genome evolution and human disease. A number of methods have been designed to detect a single tandem repeat in a sliding window. In this article, we focus on the case in which an unknown number of tandem repeat segments sharing the same pattern are dispersed throughout a sequence. We construct a probabilistic generative model for the tandem repeats, where the sequence pattern is represented by a motif matrix. A Bayesian approach is adopted for inference on this model. Markov chain Monte Carlo (MCMC) algorithms are used to explore the posterior distribution in order to infer both the motif matrix of the tandem repeats and the locations of the repeat segments. Reversible jump Markov chain Monte Carlo (RJMCMC) algorithms are used to address the transdimensional model selection problem raised by the variable number of repeat segments. Experiments on both synthetic and real data show that this new approach is powerful in detecting dispersed short tandem repeats. As far as we know, this is the first work to adopt RJMCMC algorithms for the detection of tandem repeats.
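
The generative view described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a uniform background over A/C/G/T, independent motif columns, and repeat segments given as hypothetical `(start, copies)` pairs, and it only simulates data without performing any MCMC or RJMCMC inference. The names `sample_unit` and `generate_sequence` are illustrative.

```python
import random

ALPHABET = "ACGT"

def sample_unit(motif, rng):
    """Draw one repeat unit, one base per motif column.

    `motif` is a list of columns; each column holds four weights
    (for A, C, G, T). This layout is an assumption of the sketch.
    """
    return "".join(rng.choices(ALPHABET, weights=col, k=1)[0] for col in motif)

def generate_sequence(motif, length, segments, rng):
    """Simulate a sequence with dispersed tandem repeat segments.

    `segments` is a list of (start, copies) pairs that must not overlap.
    Positions outside the segments come from a uniform background.
    """
    seq = [rng.choice(ALPHABET) for _ in range(length)]
    width = len(motif)
    for start, copies in segments:
        if start < 0 or start + copies * width > length:
            raise ValueError("segment does not fit inside the sequence")
        for c in range(copies):
            pos = start + c * width
            # Each copy is drawn independently from the same motif matrix.
            seq[pos:pos + width] = sample_unit(motif, rng)
    return "".join(seq)

if __name__ == "__main__":
    # A noisy 3-column motif favouring the pattern "ACG".
    motif = [[0.85, 0.05, 0.05, 0.05],
             [0.05, 0.85, 0.05, 0.05],
             [0.05, 0.05, 0.85, 0.05]]
    print(generate_sequence(motif, 40, [(5, 4), (25, 3)], random.Random(1)))
```

In this toy setting the number of segments is fixed by the caller. In the setting the abstract describes, that number is unknown, which is what makes the inference problem transdimensional.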

SUBMITTER: Liang T 

PROVIDER: S-EPMC3479165 | biostudies-literature | 2012 Oct

REPOSITORIES: biostudies-literature

Publications

Detection of dispersed short tandem repeats using reversible jump Markov chain Monte Carlo.

Tong Liang, Xiaodan Fan, Qiwei Li, Shuo-Yen R. Li

Nucleic Acids Research, 2012-06-29, issue 19


Similar Datasets

| S-EPMC6760159 | biostudies-literature
| S-EPMC2607421 | biostudies-literature
| S-EPMC548300 | biostudies-literature
| S-EPMC10491952 | biostudies-literature
| S-EPMC7224357 | biostudies-literature
| S-EPMC5354282 | biostudies-literature
| S-EPMC10564381 | biostudies-literature
| S-EPMC4578810 | biostudies-literature
| S-EPMC3464018 | biostudies-literature
| S-EPMC6894579 | biostudies-literature