Dataset Information

A Bayesian hidden Markov model for motif discovery through joint modeling of genomic sequence and ChIP-chip data.


ABSTRACT: We propose a unified framework for the analysis of chromatin (Ch) immunoprecipitation (IP) microarray (ChIP-chip) data for detecting transcription factor binding sites (TFBSs) or motifs. ChIP-chip assays are used to focus the genome-wide search for TFBSs by isolating a sample of DNA fragments with TFBSs and applying this sample to a microarray with probes corresponding to tiled segments across the genome. Present analytical methods use a two-step approach: (i) analyze array data to estimate IP-enrichment peaks then (ii) analyze the corresponding sequences independently of intensity information. The proposed model integrates peak finding and motif discovery through a unified Bayesian hidden Markov model (HMM) framework that accommodates the inherent uncertainty in both measurements. A Markov chain Monte Carlo algorithm is formulated for parameter estimation, adapting recursive techniques used for HMMs. In simulations and applications to a yeast RAP1 dataset, the proposed method has favorable TFBS discovery performance compared to currently available two-stage procedures in terms of both sensitivity and specificity.
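The "recursive techniques used for HMMs" mentioned in the abstract are typically the forward and backward recursions. The following is a minimal sketch of the scaled forward recursion only, assuming a hypothetical two-state chain (background versus IP-enriched) with Gaussian emissions on probe intensities. The function name, the state layout, and all numeric values are illustrative assumptions and are not taken from the paper; the published model additionally incorporates sequence information and is fitted by MCMC, which this sketch does not attempt.

```python
import numpy as np

def forward_log_likelihood(intensities, trans, init, means, sds):
    """Scaled forward recursion for a Gaussian-emission HMM.

    Returns the log-likelihood of the observations and the filtered
    state probabilities P(state_t | x_1..x_t) as a (T, K) array.
    """
    x = np.asarray(intensities, dtype=float)
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    # Gaussian density of each observation under each state: shape (T, K)
    dens = np.exp(-0.5 * ((x[:, None] - means) / sds) ** 2) / (sds * np.sqrt(2 * np.pi))
    T, K = dens.shape
    alpha = np.zeros((T, K))
    log_lik = 0.0
    for t in range(T):
        # One-step-ahead state distribution, then weight by the emission density
        prior = init if t == 0 else alpha[t - 1] @ trans
        unnorm = prior * dens[t]
        scale = unnorm.sum()          # P(x_t | x_1..x_{t-1})
        alpha[t] = unnorm / scale     # normalize to avoid numerical underflow
        log_lik += np.log(scale)
    return log_lik, alpha

# Hypothetical parameters: state 0 = background, state 1 = enriched
trans = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
init = np.array([0.9, 0.1])
means, sds = [0.0, 2.0], [1.0, 1.0]

# Made-up log-ratio intensities along five adjacent tiled probes
intensities = [0.1, -0.3, 2.2, 1.9, 0.0]
log_lik, alpha = forward_log_likelihood(intensities, trans, init, means, sds)
print(log_lik)
print(alpha[:, 1])  # filtered probability of the enriched state per probe
```

In a Bayesian treatment, a recursion of this kind is commonly paired with backward sampling of the hidden states inside each MCMC iteration; that pairing is a general technique and is stated here as background, not as a description of the authors' algorithm.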

SUBMITTER: Gelfond JA 

PROVIDER: S-EPMC2794970 | biostudies-literature | 2009 Dec

REPOSITORIES: biostudies-literature

Publications

A Bayesian hidden Markov model for motif discovery through joint modeling of genomic sequence and ChIP-chip data.

Jonathan A. L. Gelfond, Mayetri Gupta, Joseph G. Ibrahim

Biometrics, 2009 Dec 1, issue 4



Similar Datasets

| S-EPMC3286622 | biostudies-literature
| S-EPMC2732365 | biostudies-literature
| S-EPMC7451993 | biostudies-literature
| S-EPMC8830650 | biostudies-literature
| S-EPMC2857806 | biostudies-other
| S-EPMC7455056 | biostudies-literature
| S-EPMC3114652 | biostudies-literature
| S-EPMC1994961 | biostudies-literature
| S-EPMC9038756 | biostudies-literature
| S-EPMC6818740 | biostudies-literature