ABSTRACT: Binding of transcription factors (TFs) to regulatory sequences is a pivotal step in the control of gene expression. Despite many advances in the characterization of sequence motifs recognized by TFs, our ability to quantitatively predict TF binding to different regulatory sequences is still limited. Here, we present a novel experimental assay termed BunDLE-seq that provides quantitative measurements of TF binding to thousands of fully designed sequences of 200 bp in length within a single experiment. Applying this binding assay to two yeast TFs, we demonstrate that sequences outside the core TF binding site profoundly affect TF binding. We show that TF-specific models based on the sequence or DNA shape of the regions flanking the core binding site are highly predictive of the measured differential TF binding. We further characterize the dependence of TF binding, accounting for measurements of single and co-occurring binding events, on the number and location of binding sites and on the TF concentration. Finally, by coupling our in vitro TF binding measurements, and another application of our method probing nucleosome formation, to in vivo expression measurements carried out with the same template sequences now serving as promoters, we offer insights into mechanisms that may determine the different expression outcomes observed. Our assay thus paves the way to a more comprehensive understanding of TF binding to regulatory sequences, and allows the characterization of TF binding determinants within and outside of core binding sites.

Binding experiments were carried out with one of two input libraries: Lib1 (also referred to as “main lib”) includes designed sequence variants of length 150 bp, and Lib2 (also referred to as “constant flanks lib”) includes designed sequence variants of length 200 bp. The sequences of the variants of interest can be found in the second column of the processed data file (BunDLE-seq_data, see the 3 Excel sheets).
EXP1, EXP2 and EXP3 were carried out using Lib1, and EXP4 was carried out using Lib2. Binding experiments were carried out with different proteins (in different concentrations). DNA was extracted from the bound bands that formed on the gel and amplified with a primer carrying a unique 5 bp upstream tail that specifies the identity of the originating band (also referred to as “band barcode”).

Data from the following bands were used in the analysis:

EXP1_band1, no-protein band, with band barcode “AACGA”
EXP1_band2, single Gcn4 (conc 1:8) bound band*, with band barcode “TCGTA”
EXP1_band3, histones bound band, with band barcode “AGATA”
EXP2_band1, no-protein band, with band barcode “AACGA”
EXP2_band2, single Gcn4 (conc 1:8) bound band*, with band barcode “AGATA”
EXP2_band2, single Gcn4 (conc 1:7) bound band, with band barcode “CACTT”
EXP2_band2, single Gcn4 (conc 1:6) bound band, with band barcode “TCGTA”
EXP2_band2, single Gcn4 (conc 1:5) bound band, with band barcode “CATCA”
EXP2_band2, single Gcn4 (conc 1:4) bound band, with band barcode “TATTA”
EXP2_band2, single Gcn4 (conc 1:3) bound band, with band barcode “AGTCA”
EXP2_band2, single Gcn4 (conc 1:2) bound band, with band barcode “ACGAA”
EXP2_band2, single Gcn4 (conc 1:1) bound band, with band barcode “ACAGA”
EXP2_band2, two Gcn4 (conc 1:1) bound band, with band barcode “GTAGA”
EXP3_band1, no-protein band, with band barcode “GCATA”
EXP3_band2, single Gal4 (conc 1:1) bound band, with band barcode “GATGT”
EXP3_band3, single Gal4 (conc 3:1) bound band, with band barcode “GTTCA”
EXP3_band4, two Gal4 (conc 3:1) bound band, with band barcode “CTCAA”
EXP4_band1, no-protein band, with band barcode “GAGTA”
EXP4_band2, single Gcn4 (conc 1:3) bound band, with band barcode “TACCA”
EXP4_band3, no-protein band, with band barcode “TCGTT”
EXP4_band3, single Gal4 (conc 3:1) bound band, with band barcode “ACATC”

*biological replicates
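
The band barcodes listed above can be used to assign sequencing reads back to their originating gel band. The following is a minimal Python sketch of such a lookup, not the processing pipeline used for this dataset. It assumes that the 5 bp band barcode is the first five bases of each read, which the description above does not state explicitly. The names `BAND_BARCODES`, `assign_band` and `count_bands`, and the example reads, are hypothetical. Only EXP1 and EXP3 are included to keep the sketch short. Because several barcodes recur across experiments (for example “AACGA”, “TCGTA” and “AGATA” appear in both EXP1 and EXP2), the lookup is keyed by experiment first.

```python
from collections import Counter

# Band barcodes per experiment, copied from the band list above
# (only EXP1 and EXP3 are shown to keep the sketch short).
# Barcodes repeat across experiments, so the lookup is keyed by experiment.
BAND_BARCODES = {
    "EXP1": {
        "AACGA": "no-protein",
        "TCGTA": "single Gcn4 (conc 1:8)",
        "AGATA": "histones",
    },
    "EXP3": {
        "GCATA": "no-protein",
        "GATGT": "single Gal4 (conc 1:1)",
        "GTTCA": "single Gal4 (conc 3:1)",
        "CTCAA": "two Gal4 (conc 3:1)",
    },
}

BARCODE_LEN = 5  # length of the band barcode in bp


def assign_band(read, experiment):
    """Return the band label for a read, or None if its barcode is not listed.

    ASSUMPTION: the band barcode is the first 5 bases of the read.
    """
    return BAND_BARCODES[experiment].get(read[:BARCODE_LEN].upper())


def count_bands(reads, experiment):
    """Count reads per band; reads with an unlisted barcode are 'unassigned'."""
    counts = Counter()
    for read in reads:
        counts[assign_band(read, experiment) or "unassigned"] += 1
    return counts


# Hypothetical reads, for illustration only.
example_reads = ["AACGATTGACTCA", "AGATACCGTGACTCA", "NNNNNACGT"]
print(count_bands(example_reads, "EXP1"))
```

Keeping unlisted barcodes in a separate "unassigned" bin, rather than discarding them silently, makes it easier to spot a wrong barcode position or read orientation.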