Dataset Information

A phylogenetic Gibbs sampler that yields centroid solutions for cis-regulatory site prediction.


ABSTRACT: Identification of functionally conserved regulatory elements in sequence data from closely related organisms is becoming feasible, due to the rapid growth of public sequence databases. Closely related organisms are most likely to have common regulatory motifs; however, the recent speciation of such organisms results in a high degree of correlation in their genome sequences, confounding the detection of functional elements. Additionally, alignment algorithms that use optimization techniques are limited to the detection of a single alignment that may not be representative. Comparative-genomics studies must be able to address the phylogenetic correlation in the data and efficiently explore the alignment space, in order to make specific and biologically relevant predictions. We describe here a Gibbs sampler that employs a full phylogenetic model and reports an ensemble centroid solution. We describe regulatory motif detection using both simulated and real data, and demonstrate that this approach achieves improved specificity, sensitivity, and positive predictive value over non-phylogenetic algorithms, and over phylogenetic algorithms that report a maximum likelihood solution. The software is freely available at http://bayesweb.wadsworth.org/gibbs/gibbs.html. Supplementary data are available at Bioinformatics online.

SUBMITTER: Newberg LA 

PROVIDER: S-EPMC2268014 | biostudies-literature | 2007 Jul

REPOSITORIES: biostudies-literature

Publications

A phylogenetic Gibbs sampler that yields centroid solutions for cis-regulatory site prediction.

Newberg LA, Thompson WA, Conlan S, Smith TM, McCue LA, Lawrence CE

Bioinformatics (Oxford, England), 2007-05-08, issue 14


Similar Datasets

| S-EPMC5358773 | biostudies-literature
| S-EPMC3400952 | biostudies-literature
| S-EPMC2639663 | biostudies-literature
| S-EPMC7545134 | biostudies-literature
| S-EPMC2571992 | biostudies-literature
| S-EPMC8945543 | biostudies-literature
| S-EPMC5828534 | biostudies-literature
| S-EPMC3167047 | biostudies-literature
| S-EPMC5809088 | biostudies-literature
| S-EPMC2996316 | biostudies-literature