Project description:Throughout life, humans experience repeated exposure to viral antigens through infection and vaccination, resulting in the generation of diverse, and largely unique, antigen-specific antibody repertoires. A paramount feature of antibodies that enables their critical contributions in counteracting recurrent and novel pathogens, and consequently fosters their utility as valuable targets for therapeutic and vaccine development, is the exquisite specificity displayed against their target antigens. Yet, there is still limited understanding of the determinants of antibody-antigen specificity, particularly as a function of antibody sequence. In recent years, experimental characterization of antibody repertoires has led to novel insights into fundamental properties of antibody sequences, but has been largely decoupled from at-scale antigen specificity analysis. Here, using the LIBRA-seq technology, we generated a large dataset mapping antibody sequence to antigen specificity for thousands of B cells, by screening the repertoires of a set of healthy individuals against twenty viral antigens representing diverse pathogens of biomedical significance. Analysis uncovered virus-specific patterns in variable gene usage, gene pairing, and somatic hypermutation, as well as the presence of convergent antiviral signatures across multiple individuals, including the presence of public antibody clonotypes. Notably, our results showed that, for B cell receptors originating from different individuals but leveraging an identical combination of heavy and light chain variable genes, there is a specific CDRH3/CDRL3 identity threshold that defines whether these B cells may share the same antigen specificity. This finding provides a quantifiable measure of the relationship between antibody sequence and antigen specificity and further establishes experimentally grounded criteria for defining public antibody clonality.
Understanding the fundamental rules of antibody-antigen interactions can lead to transformative new approaches for the development of antibody therapeutics and vaccines against current and emerging viruses.
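The identity-threshold criterion described above can be illustrated with a minimal sketch. Several things here are assumptions made for illustration only and are not taken from the study: the record field names (`donor`, `vh`, `vl`, `cdrh3`, `cdrl3`), the use of position-wise identity on equal-length amino acid sequences, and the placeholder threshold of 0.7 (the description does not state the actual value).

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Fraction of matching positions; unequal-length or empty sequences score 0.0.

    Assumption: identity is computed position-wise without alignment.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def candidate_public_pairs(records, threshold=0.7):
    """Yield (id1, id2, cdrh3_identity, cdrl3_identity) for receptor pairs that
    come from different donors, use the same heavy and light chain V genes,
    and meet `threshold` on both CDRH3 and CDRL3 identity.

    `threshold` is a hypothetical placeholder, not the value from the study.
    """
    for r1, r2 in combinations(records, 2):
        if r1["donor"] == r2["donor"]:
            continue  # public clonotypes span different individuals
        if (r1["vh"], r1["vl"]) != (r2["vh"], r2["vl"]):
            continue  # require an identical heavy/light V gene combination
        h = percent_identity(r1["cdrh3"], r2["cdrh3"])
        l = percent_identity(r1["cdrl3"], r2["cdrl3"])
        if h >= threshold and l >= threshold:
            yield r1["id"], r2["id"], h, l
```

Each pair returned is only a candidate for shared antigen specificity; in practice the threshold would be set from paired sequence and specificity data such as the dataset described here.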
Project description:Eukaryotic cells express transcription factor (TF) paralogues that bind to nearly identical DNA sequences in vitro but bind at different genomic loci and perform different functions in vivo. Predicting how two paralogous TFs bind in vivo using DNA sequence alone is an important open problem. Here, we analyzed two yeast bHLH TFs, Cbf1p and Tye7p, which have highly similar binding preferences in vitro, yet bind at almost completely non-overlapping target loci in vivo. We dissected the determinants of specificity for these two proteins by making a number of chimeric TFs in which we swapped different domains of Cbf1p and Tye7p and determined the effects on in vivo binding and cellular function. From these experiments, we learned that the Cbf1p dimer achieves its specificity by binding cooperatively with other Cbf1p dimers bound nearby. In contrast, we found that Tye7p achieves its specificity by binding cooperatively with three other DNA-binding proteins, Gcr1p, Gcr2p, and Rap1p. Remarkably, most promoters (63%) that are bound by Tye7p do not contain a consensus Tye7p binding site. Using this information, we were able to build simple models to accurately discriminate bound and unbound genomic loci for both Cbf1p and Tye7p. We then successfully reprogrammed the human bHLH NPAS2 to bind Cbf1p in vivo targets, and reprogrammed a Tye7p target intergenic region to be bound by Cbf1p. These results demonstrate that the genome-wide binding targets of paralogous TFs can be discriminated using sequence information, and provide lessons about TF specificity that can be applied across the phylogenetic tree.
Project description:Recombinant adenovirus vectors were used to express wild-type or domain-swap mutants of A-Myb and c-Myb transcription factors in MCF-7 cells, primary lung epithelial cells, or fibroblasts. The results show that Myb proteins have extreme context specificity and identify sub-domains responsible for the activation of specific sets of target genes. Keywords = Myb proteins Keywords = oncogenes Keywords = transcription Keywords = gene activation Keywords: ordered
Project description:To elucidate how genomic sequences build transcriptional control networks, we need to understand the connection between DNA sequence and transcription factor binding and function. Binding predictions based solely on consensus predictions are limited because a single factor can use degenerate sequence motifs and related transcription factors often prefer identical sequences. The ETS family transcription factor, ETS1, exemplifies these challenges. Unexpected, redundant occupancy of ETS1 and other ETS proteins is observed at promoters of housekeeping genes in T cells due to common sequence preferences and the presence of strong consensus motifs. However, ETS1 exhibits a specific function in T cell activation; thus, unique transcriptional targets are predicted. To uncover the sequence motifs that mediate specific functions of ETS1, chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-seq) identified both promoter and enhancer binding events in Jurkat T cells. A comparison with DNase I sensitivity both validated the dataset and improved accuracy. Redundant occupancy of ETS1 with the ETS protein GABPA occurred primarily in promoters of housekeeping genes, whereas ETS1-specific occupancy occurred in the enhancers of T-cell-specific genes. Two routes to ETS1 specificity were identified: an intrinsic preference of ETS1 for a variant of the ETS family consensus sequence and the presence of a composite sequence that can support cooperative binding with a RUNX transcription factor. Genome-wide occupancy of RUNX factors corroborated the importance of this partnership. Furthermore, genome-wide occupancy of co-activator CBP indicated tight co-localization with ETS1 at specific enhancers, but not redundant promoters. The distinct sequences associated with redundant versus specific ETS1 occupancy were predictive of promoter or enhancer location and the ontology of nearby genes.
These findings demonstrate that diversity of binding motifs may enable variable transcription factor function at different genomic sites. Each ChIP sample was pooled from three independent immunoprecipitation experiments.
Project description:Gene transcription in animals involves the assembly of the RNA polymerase II complex at core promoters and its cell type-specific activation by genomic enhancers that can be located more distally. However, how ubiquitous expression of housekeeping genes is achieved has remained less clear. In particular, it is unknown whether ubiquitously active enhancers exist and how developmental and housekeeping gene regulation is separated. An attractive hypothesis is that different types of core promoters might exhibit an intrinsic specificity towards certain types of enhancers. Here, we show that thousands of enhancers in D. melanogaster S2 cells and ovarian somatic cells (OSCs) exhibit a marked specificity towards one of two core promoters: one derived from a ubiquitously expressed ribosomal protein gene and another from a developmentally regulated transcription factor. Enhancers that activate the housekeeping core promoter are functional across the two different cell types, while developmental enhancers exhibit strong cell type specificity. The two enhancer classes differ in their overall genomic distribution, the functions of neighbouring genes, these genes’ core promoter elements, as well as the associated factors. Our results provide evidence for a sequence-encoded enhancer-core promoter specificity that separates developmental and housekeeping gene regulatory programs for thousands of enhancers and their target genes across the entire genome. STARR-seq was performed in S2 cells and OSCs using two core promoters, each representing housekeeping and developmental transcription programs. Data for housekeeping promoters (hkCP) are presented in this series; data for developmental core promoter (dCP) samples are presented in GSE40739.