Project description: DNA determines where and when genes are expressed, but the full set of sequence determinants that control gene expression is not known. Here, we measured the transcriptional activity of DNA sequences that together represent a sequence space ~100 times larger than the human genome, using massively parallel reporter assays. Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar, and that enhancers increase expression from a promoter by a mechanism that does not involve specific TF-TF interactions. The enhancers themselves can be classified into three distinct types: classical, closed chromatin and chromatin-dependent enhancers. We also show that few TFs are strongly active in a cell, with most activities similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening, enhancing, promoting and transcription start site (TSS)-determining activity, consistent with the view that the TF binding motif is the only atomic unit of gene expression.
Project description: Promoters and enhancers are key cis-regulatory elements, but how they operate to generate cell-type-specific transcriptomes is not fully understood. We developed a simple and robust approach to sensitively detect 5’-ends of nascent RNAs (NET-CAGE) in diverse cells and tissues, including unstable transcripts such as enhancer-derived RNAs. We studied RNA synthesis and degradation at the transcription start site (TSS) level, characterizing the impact of differential promoter usage on transcript stability. We quantified transcription from cis-regulatory elements without the influence of RNA turnover, and show that enhancer-promoter pairs are generally activated simultaneously upon stimulation. By integrating NET-CAGE data with chromatin interaction maps, we show that cis-regulatory elements are topologically connected according to their cell-type specificity. We identified new enhancers with high sensitivity, and delineated primary locations of transcription within super-enhancers. Our NET-CAGE dataset derived from human and mouse cells expands the FANTOM5 catalogue of transcribed enhancers, with broad applicability to biomedical research.