Project description:Specific DNA-protein interactions mediate physiologic gene regulation and may be altered by DNA variants linked to polygenic disease. To enhance the speed and signal-to-noise ratio (SNR) of identifying and quantifying proteins associating with specific DNA sequences in living cells, we developed proximal biotinylation by episomal recruitment (PROBER). PROBER uses high-copy episomes to amplify SNR along with proximity proteomics (BioID) to identify the transcription factors (TFs) and additional gene regulators associated with DNA sequences of interest. PROBER quantified steady-state and inducible association of TFs and associated chromatin regulators with target DNA sequences, as well as binding quantitative trait loci (bQTLs) due to single nucleotide variants. PROBER identified alterations in gene regulator associations due to cancer hotspot mutations in the hTERT promoter, indicating these mutations increase promoter association with specific gene activators. PROBER offers an approach to rapidly identify proteins associated with specific DNA sequences and their variants in living cells.
Project description:DNA-protein interactions mediate physiologic gene regulation and may be altered by DNA variants linked to polygenic disease. To enhance the speed and signal-to-noise ratio (SNR) of identifying and quantifying proteins that associate with specific DNA sequences in living cells, we developed proximal biotinylation by episomal recruitment (PROBER). PROBER uses high-copy episomes to amplify SNR along with proximity proteomics (BioID) to identify the transcription factors (TFs) and additional gene regulators associated with short DNA sequences of interest. PROBER quantified steady-state and inducible association of TFs and corresponding chromatin regulators with target DNA sequences, as well as binding quantitative trait loci (bQTLs) due to single nucleotide variants. PROBER identified alterations in regulator associations due to cancer hotspot mutations in the hTERT promoter, indicating these mutations increase promoter association with specific gene activators. PROBER provides an approach to rapidly identify proteins associated with specific DNA sequences and their variants in living cells.
Project description:DNA-binding proteins are generally thought to bind specific DNA sequences through selective interactions with DNA bases. However, it is now becoming more widely appreciated that DNA shape, which may not be specified by a unique base sequence, also contributes to site-specific binding. Here we elucidate how DNA sequence and shape confer site specificity on a genomic scale, and relate this to specificity imparted indirectly through occlusion of sequences by the in vivo environment. For simplicity, we focus on the set of General Regulatory Factors (GRFs) that do not rely on other factors for binding. They also serve a related function in organizing chromatin. Remarkably, we find that GRFs will not bind to their cognate motif if the DNA surrounding that sequence lacks a specific shape. While proper DNA sequence/shape properties tend to be restricted to promoter regions, weaker sites that are still binding-competent reside in gene bodies, but binding to them is prevented by resident chromatin. Thus, site-specificity is achieved across a genome in vivo by the combined action of favorable DNA sequence and shape interactions, and occlusion by chromatin.
Project description:The transmission of information from DNA to RNA is a critical process. It is assumed that DNA is faithfully copied into RNA. However, when we compared RNA sequences from human B cells of 27 individuals to the corresponding DNA sequences from the same individuals, we uncovered more than 20,000 sites where the RNA sequences do not match those of the DNA. Validations using RNA sequences from another laboratory and re-sequencing of the DNA and RNA samples confirmed these findings. All 12 possible categories of discordances were found, with A-to-G and C-to-U being the most common. About 50% of the differences involved conversions between purines and pyrimidines. These differences were non-random, as many sites were found in multiple individuals. The same differences were also found in primary skin cells in a separate set of 20 individuals. In addition, when these differences were found, they were seen in nearly all transcripts. Thus, these widespread RNA-DNA differences in the human genome provide an as-yet unexplored aspect of genome variation that affects gene expression and therefore phenotypic and disease manifestations.
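As an illustrative aside (a minimal sketch, not the authors' analysis pipeline), the "12 possible categories" follow from simple counting: each of the four DNA bases can be read as any of the three other bases in RNA. The function names below are hypothetical, and a single four-letter alphabet is assumed, with T standing in for U on the RNA side.

```python
from itertools import permutations

PURINES = {"A", "G"}  # C and T (U in RNA) are pyrimidines


def discordance_categories():
    """Return all ordered (DNA base, RNA base) pairs in which the two differ."""
    # Assumption: T represents U on the RNA side, so one alphabet suffices.
    return list(permutations("ACGT", 2))


def crosses_purine_pyrimidine(dna_base, rna_base):
    """True when the change converts a purine to a pyrimidine or vice versa."""
    return (dna_base in PURINES) != (rna_base in PURINES)


categories = discordance_categories()
crossing = [pair for pair in categories if crosses_purine_pyrimidine(*pair)]

print(len(categories))  # number of possible discordance categories
print(len(crossing))    # categories that swap purine and pyrimidine
```

Note that this counts categories only; the "about 50%" figure in the description refers to observed sites, which cannot be derived from the enumeration.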
Project description:Binding of transcription factors to DNA is mediated by the recognition of the chemical signatures of the DNA bases and the three-dimensional shape of the DNA molecule. The direct contribution of DNA shape to DNA-binding specificity has been difficult to assess, as DNA shape is a consequence of its sequence. Here, we teased apart these two modes of recognition in the context of Hox-DNA binding. We made a series of mutations in Hox residues that, in a co-crystal structure, only recognize DNA shape, and tested the effect on DNA binding preferences using SELEX-seq. Analysis of shape features of selected sequences revealed that these residues are both necessary and sufficient for selection of sequences with distinct shape features. We used statistical machine learning to show that the accuracy of binding specificity predictions improves by adding shape features to a model that only depends on sequence. We conclude that shape readout is a direct and critical component of binding site selection by Hox proteins.
Project description:To investigate how exogenous DNA concatemerizes to form episomal artificial chromosomes (ACs), acquires equal segregation ability and maintains stable holocentromeres, we injected DNA sequences with different features, including sequences that are repetitive or complex, and sequences with different AT-contents, into the gonad of Caenorhabditis elegans to form ACs in embryos, and monitored AC mitotic segregation. We demonstrated that AT-poor sequences (26% AT-content) delayed the acquisition of segregation competency of newly formed ACs. We also co-injected fragmented Saccharomyces cerevisiae genomic DNA, differentially expressed fluorescent markers and a ubiquitously expressed selectable marker to construct a less repetitive, more complex AC. We sequenced the whole genome of a strain that propagates this AC through multiple generations, and de novo assembled the AC sequences. We discovered that CENP-A(HCP-3) domains/peaks are distributed along the AC, as in endogenous chromosomes, suggesting a holocentric architecture. We found that CENP-A(HCP-3) binds to the unexpressed marker genes and many fragmented yeast sequences, but is excluded from the extremely high-AT-content yeast centromeric and mitochondrial DNA (> 83% AT-content) on the AC. We identified A-rich motifs in CENP-A(HCP-3) domains/peaks on the AC and on endogenous chromosomes, which have some similarity with each other and with some non-germline transcription factor binding sites.
Project description:The specificity of humoral immune responses depends on the functional rearrangement and expression of only one allele of immunoglobulin (Ig) genes. Here, we analyzed the comprehensive proteome of the murine Ig Emu enhancer, which governs the rearrangement and expression of the Ig mu heavy chain allele. By mass spectrometry of proteins bound at wild-type versus mutant Emu enhancers, we identified Emu-binding proteins and associated multi-protein complexes. We found that the MSL/MOF complex, a regulator of gene dosage compensation in flies, binds Emu via transcription factor YY1 and facilitates Emu-driven chromatin looping and promoter interaction. Msl2 gene knockout in primary pre-B cells or Mof heterozygosity in mice reduced mu gene expression. In this data set, we compare proteins binding to the wild-type Emu versus a DNA bait control for which the entire Emu sequence was switched to its reverse polarity sequence. The latter conserves the DNA GC content but destroys virtually all sequence-specific transcription factor binding sites. Of note, DNA repetitive sequences that can also be bound by DNA-interacting proteins are kept functional by this control bait. SILAC quantitative proteomics was employed in a label-swap approach, incubating wild-type and control DNA with labeled and non-labeled protein extracts, respectively.
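The logic of the reverse-polarity control can be illustrated with a toy example (a minimal sketch under stated assumptions, not the authors' bait design): reversing a sequence leaves its base composition, and hence its GC content, unchanged, while a directional motif generally no longer matches. The sequence and motif below are made up for illustration and are not taken from the Emu enhancer.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_polarity(seq):
    """Reverse the base order (not the reverse complement)."""
    return seq[::-1]


# Hypothetical toy bait and motif, chosen only for illustration.
wild_type = "TTGCCATCTTGA"
motif = "GCCATCTT"

control = reverse_polarity(wild_type)

print(gc_content(wild_type) == gc_content(control))  # composition is preserved
print(motif in wild_type, motif in control)           # motif match is lost
```

A motif that reads the same forwards and backwards would survive this control, so the loss of binding sites is expected to be near-complete rather than absolute, consistent with the "virtually all" wording above.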
Project description:DNA methylation is essential for embryonic development and implicated in the regulation of genomic imprinting. Genomic imprinting is established in the germline through parent-specific methylation of distinct cis-regulatory DNA sequences, called imprinting control regions (ICRs). Which factors bind to the opposing chromatin states at ICRs within the same nuclear environment has not been systematically addressed. By using a proximity labelling approach with the methylation-sensitive transcription factor ZFP57, we identified ATF7IP and other major components of the epigenetic maintenance machinery at ICRs.