Project description: The sequence specificity of DNA-binding proteins is the primary mechanism by which the cell recognizes genomic features. Here, we describe systematic determination of yeast transcription factor DNA-binding specificities. We obtained binding specificities for 112 DNA-binding proteins representing 19 distinct structural classes. One-third of the binding specificities have not been previously reported. Several binding sequences have striking genomic distributions relative to transcription start sites, supporting their biological relevance and suggesting a role in promoter architecture. Among these are Rsc3 binding sequences, containing the core CGCG, which are found preferentially ~100 bp upstream of transcription start sites. Mutation of RSC3 results in a dramatic increase in nucleosome occupancy in hundreds of proximal promoters containing an Rsc3 binding element, but has little impact on promoters lacking Rsc3 binding sequences, indicating that Rsc3 plays a broad role in targeting nucleosome exclusion at yeast promoters.

Keywords: Protein binding microarrays, DNA, proteins

Protein binding microarray (PBM), ChIP-chip and DIP-chip experiments of yeast transcription factor DNA-binding domains were performed. Briefly, the PBMs involved binding GST-tagged DNA-binding proteins to custom-designed, double-stranded 44K Agilent microarrays in order to determine their sequence preferences. The method is described in Berger et al., Nature Biotechnology 2006. A key feature is that the microarrays are composed of de Bruijn sequences that contain each 10-base sequence once and only once, providing an evenly balanced sequence distribution. Individual de Bruijn sequences have different properties, including representation of gapped patterns. Here we provide the data transformed into median intensities for all 32,896 8-base sequences, Z-scores for these intensities, and E-scores.
E-scores are a modified version of AUC, and describe how well each 8-mer ranks the intensities of the spots. In general, the E-scores are slightly more reproducible than Z-scores, but contain less information about relative binding affinity. Additional experimental details are found in Berger et al., Nature Biotechnology 2006, Berger et al., Cell 2008, and the accompanying Supplementary information. Raw 35-mer array data are available at the web link provided.

For the DIP-chip experiments [GSM345371, GSM345403, GSM345414-GSM345421, GSM345429-GSM345432], genomic DNA isolated from S288C yeast was incubated for 30 minutes with 40 nM of the MBP-tagged DNA-binding domain (DBD) of either Cbf1, Pho2, Pho4, Leu3, Rap1, or Swi5 prior to purification of protein-DNA complexes. The bound DNA was then isolated, amplified via Invitrogen's WGA protocol, and hybridized against input DNA on NimbleGen 385k 32 bp tiling whole-genome arrays. ChIPOTle was used to identify peaks of binding from the data. Motifs were identified by BioProspector and MDScan and then scored by GOMER for their ability to predict the identified peaks. Motifs with the best ROC AUC are reported in the paper.

For the ChIP-chip experiments [GSM346493 and GSM346494], isogenic wild-type and rsc3-1 strains carrying Rsc8-TAP were grown in parallel under rsc3-1 restrictive growth conditions (37°C). Following formaldehyde crosslinking, cells were homogenized and extracts were sonicated to shear the chromatin to an average size of ~500 bp. A single pulldown was then performed with IgG sepharose beads. After decrosslinking and LM-PCR amplification of the purified IP DNA, samples were labeled and hybridized on NimbleGen 32 bp whole-genome tiling arrays, comparing the pulled-down DNA to input genomic DNA.
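
The two sequence-design facts quoted for the PBM data (32,896 distinct 8-base sequences, and de Bruijn sequences containing each k-base sequence exactly once) can be checked with a short sketch. It assumes that the 32,896 figure comes from pooling each 8-mer with its reverse complement, as is natural for double-stranded probes. The function names are illustrative and not taken from the study's software, and the de Bruijn generator uses a textbook Lyndon-word construction at a small k rather than the array's actual order-10 design.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(seq: str) -> str:
    """Pick one representative for a k-mer and its reverse complement."""
    return min(seq, reverse_complement(seq))


def count_canonical_kmers(k: int) -> int:
    """Count k-mers after pooling each one with its reverse complement."""
    return len({canonical("".join(p)) for p in product("ACGT", repeat=k)})


def de_bruijn(alphabet: str, k: int) -> str:
    """Cyclic de Bruijn sequence of order k (textbook Lyndon-word method)."""
    n = len(alphabet)
    a = [0] * (n * k)
    out = []

    def db(t: int, p: int) -> None:
        if t > k:
            if k % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


if __name__ == "__main__":
    # 4**8 = 65,536 8-mers; 4**4 = 256 are their own reverse complement,
    # so (65,536 + 256) / 2 = 32,896 remain after pooling strands.
    print(count_canonical_kmers(8))  # 32896

    # Small illustrative order (k = 4); the arrays described above use k = 10.
    k = 4
    cyclic = de_bruijn("ACGT", k)
    linear = cyclic + cyclic[:k - 1]  # unroll the cycle into a linear string
    windows = [linear[i:i + k] for i in range(len(cyclic))]
    print(len(windows) == len(set(windows)) == 4 ** k)  # True
```

On an actual array, the order-10 sequence would be cut into overlapping probe-length segments; that step is not shown here.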
2008-12-26 | E-GEOD-12349 | biostudies-arrayexpress