Ontology highlight
ABSTRACT: Motivation
Sequence-specific transcription factors (TFs) regulate the expression of their target genes through interactions with specific DNA-binding sites in the genome. Data on TF-DNA binding specificities are essential for understanding how regulatory specificity is achieved.Results
Numerous studies have used universal protein-binding microarray (PBM) technology to determine the in vitro binding specificities of hundreds of TFs for all possible 8 bp sequences (8mers). We have developed a Bayesian analysis of variance (ANOVA) model that decomposes these 8mer data into background noise, TF familywise effects and effects due to the particular TF. Adjusting for background noise improves PBM data quality and concordance with in vivo TF binding data. Moreover, our model provides simultaneous identification of TF subclasses and their shared sequence preferences, and also of 8mers bound preferentially by individual members of TF subclasses. Such results may aid in deciphering cis-regulatory codes and determinants of protein-DNA binding specificity.Availability and implementation
Source code, compiled code and R and Python scripts are available from http://thebrain.bwh.harvard.edu/hierarchicalANOVA.Supplementary information
Supplementary data are available at Bioinformatics online.
SUBMITTER: Jiang B
PROVIDER: S-EPMC3661050 | biostudies-literature | 2013 Jun
REPOSITORIES: biostudies-literature
Jiang Bo B Liu Jun S JS Bulyk Martha L ML
Bioinformatics (Oxford, England) 20130404 11
<h4>Motivation</h4>Sequence-specific transcription factors (TFs) regulate the expression of their target genes through interactions with specific DNA-binding sites in the genome. Data on TF-DNA binding specificities are essential for understanding how regulatory specificity is achieved.<h4>Results</h4>Numerous studies have used universal protein-binding microarray (PBM) technology to determine the in vitro binding specificities of hundreds of TFs for all possible 8 bp sequences (8mers). We have ...[more]