Project description:We present GuideScan2 for memory-efficient, parallelizable construction of high-specificity CRISPR guide RNA (gRNA) databases and user-friendly design and analysis of individual gRNAs and gRNA libraries for targeting coding and non-coding regions in custom genomes. GuideScan2 analysis identifies widespread confounding effects of low-specificity gRNAs in published CRISPR screens and enables construction of a gRNA library that reduces off-target effects in a gene essentiality screen. GuideScan2 also enables the design and experimental validation of allele-specific gRNAs in a hybrid mouse genome. GuideScan2 will facilitate CRISPR experiments across a wide range of applications.
Project description:CRISPR-Cas9 has been widely used to functionally interrogate multiple aspects of cellular physiology and pathophysiology from single gene studies to genome-wide screens. Proper design of highly efficient guide RNAs directing the CRISPR genome editing process is critical for success in these types of experiments. Here, we present a pipeline for designing highly efficient loss-of-function guide RNA (gRNA) libraries with improved rates of knock-out efficiency compared to previous guide RNA library designs. We provide pre-computed and triaged gRNAs from our pipeline for all human and mouse transcripts through a fully searchable online portal as a resource to the community.
Project description:Background: The CRISPR/Cas9 toolbox has recently been expanded to include approaches for modulating gene expression. To build on this work and apply it to biological questions, it is important to establish it in a broad range of circumstances. Genome-scale CRISPR interference (CRISPRi) has been used in human cell lines; however, the rules for designing effective guide RNAs (gRNAs) in different organisms are not well known. We sought to identify the rules that determine gRNA effectiveness at transcriptional repression in Saccharomyces cerevisiae. Results: We created an inducible single-plasmid CRISPRi system for gene repression in yeast, and used it to analyze fitness effects of gRNAs under 18 small molecule treatments. Our approach correctly identified previously described chemical-genetic interactions, as well as a new mechanism of suppressing fluconazole toxicity by repression of the ERG25 gene. Assessment of multiple target loci across treatments allowed us to determine generalizable features associated with gRNA efficacy. Guides that target regions with low nucleosome occupancy and high chromatin accessibility were clearly more effective. We also found that the best region to target with gRNAs was between the transcription start site (TSS) and 200 bp upstream of the TSS. Finally, in contrast to nuclease-proficient Cas9 in human cells, point mutations were tolerated equally well by truncated (18 nt specificity sequence) and full-length (20 nt) gRNAs; however, 18 nt gRNAs were generally less potent than full-length gRNAs. Conclusions: Our results establish a powerful functional genomics screening method, provide rules for designing effective gRNAs for gene repression, and show that 18 nt and 20 nt gRNAs exhibit similar tolerance to mismatches in the target sequence. These findings will enable effective library design and genome-wide screening in many genetic backgrounds. An expression construct was created for inducible CRISPRi in yeast. 
Key features include ORFs expressing dCas9-Mxi1 and the tetracycline repressor (TetR), as well as a tetracycline-inducible gRNA locus containing the RPR1 promoter with a TetO site, a NotI site for cloning new gRNA specificity sequences, and the constant part of the gRNA. When yeast containing this plasmid are grown in the absence of anhydrotetracycline (ATc), TetR binds the gRNA promoter and prevents PolIII from binding and transcribing the gRNA. This in turn prevents dCas9-Mxi1 from binding the target site. In the presence of ATc, TetR dissociates and the gRNA is expressed, allowing dCas9-Mxi1 to bind its target locus and repress gene expression. gRNA libraries were cloned into this construct and transformed into yeast to create pools. Experiments were conducted in which yeast pools were grown in inducing (+ATc) and non-inducing (-ATc) conditions in the presence of different drugs. After multiple generations of growth in these conditions, plasmids were isolated from the yeast by miniprep, and the gRNA locus was PCR-amplified and sequenced on a MiSeq. Counts of each gRNA were compared across conditions.
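The count comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the normalization (read fractions with a pseudocount), the function name log2_fold_changes, and the read counts are all assumptions made for illustration. A negative value indicates a gRNA that is depleted when its expression is induced.

```python
import math


def log2_fold_changes(induced, uninduced, pseudocount=1.0):
    """Compare per-gRNA read counts between +ATc and -ATc samples.

    Counts are converted to pseudocount-smoothed fractions of each
    library so that differences in sequencing depth cancel out.
    This scheme is an illustrative assumption, not the study's method.
    """
    guides = sorted(set(induced) | set(uninduced))
    n_guides = len(guides)
    total_induced = sum(induced.values()) + pseudocount * n_guides
    total_uninduced = sum(uninduced.values()) + pseudocount * n_guides

    result = {}
    for guide in guides:
        frac_induced = (induced.get(guide, 0) + pseudocount) / total_induced
        frac_uninduced = (uninduced.get(guide, 0) + pseudocount) / total_uninduced
        result[guide] = math.log2(frac_induced / frac_uninduced)
    return result


# Hypothetical read counts for three gRNAs (not data from the study).
plus_atc = {"gRNA_A": 50, "gRNA_B": 400, "gRNA_C": 550}
minus_atc = {"gRNA_A": 400, "gRNA_B": 400, "gRNA_C": 200}

lfc = log2_fold_changes(plus_atc, minus_atc)
for guide, value in lfc.items():
    print(f"{guide}\t{value:+.2f}")
```

In this toy input, gRNA_A drops out under induction, gRNA_B is unchanged, and gRNA_C is enriched. A real analysis would also need replicates and a statistical model for count noise.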
Project description:Clustered regularly interspaced short palindromic repeat (CRISPR) RNA-guided nucleases have gathered considerable excitement as a tool for genome engineering. However, questions remain about the specificity of their target site recognition. Most previous studies have examined predicted off-target binding sites that differ from the perfect target site by one to four mismatches, which represent only a subset of genomic regions. Here, we used ChIP-seq to examine genome-wide CRISPR binding specificity at gRNA-specific and gRNA-independent sites. For two guide RNAs targeting the murine Snurf gene promoter, we observed very high binding specificity at the intended target site while off-target binding was observed at 2- to 6-fold lower intensities. We also identified significant gRNA-independent off-target binding. Interestingly, we found that these regions are highly enriched in the PAM site, a sequence required for target site recognition by CRISPR. To determine the relationship between Cas9 binding and endonuclease activity, we used targeted sequence capture as a high-throughput approach to survey a large number of the potential off-target sites identified by ChIP-seq or computational prediction. A high frequency of indels was observed at both target sites and one off-target site, while no cleavage activity could be detected at other ChIP-bound regions. Our results demonstrate that even a simple configuration of a Cas9:gRNA nuclease can support very specific DNA cleavage activity and that most interactions between the CRISPR nuclease complex and genomic PAM sites do not lead to DNA cleavage. 
ChIP-seq using dCas9 to determine genome-wide binding of CRISPR/Cas9. noED: Cas9 double-mutant protein without an effector domain; KRAB: Cas9 double-mutant protein fused to the KRAB repressor domain; S1 gRNA: guide RNA targeting GCTCCCTACGCATGCGTCCC(AGG) in the mouse genome; S2 gRNA: guide RNA targeting AATGGCTCAGGTTTGTCGCG(CGG) in the mouse genome; VEGFA TS3 gRNA: guide RNA targeting GGTGAGTGAGTGTGTGCGTG(TGG) in the human genome.
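The target-site notation above (a protospacer followed by its PAM in parentheses) can be checked with a short sketch. The helper names are hypothetical, and the NGG PAM rule is the commonly cited requirement for Streptococcus pyogenes Cas9, assumed here rather than stated in the description. The mismatch counter shows how a candidate off-target site could be compared with a guide; it is not the method used in the study.

```python
import re

# Protospacer followed by a 3-nt PAM in parentheses, e.g. "ACGT...(AGG)".
SITE_PATTERN = re.compile(r"^([ACGT]+)\(([ACGT]{3})\)$")


def parse_site(site):
    """Split 'PROTOSPACER(PAM)' into (protospacer, pam)."""
    match = SITE_PATTERN.match(site)
    if match is None:
        raise ValueError(f"unrecognized target site: {site!r}")
    return match.group(1), match.group(2)


def has_ngg_pam(pam):
    """True if the PAM fits NGG (assumed SpCas9 rule)."""
    return len(pam) == 3 and pam[1:] == "GG"


def count_mismatches(guide, candidate):
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(candidate):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(guide, candidate))


# Target sites exactly as listed in the description.
sites = {
    "S1": "GCTCCCTACGCATGCGTCCC(AGG)",
    "S2": "AATGGCTCAGGTTTGTCGCG(CGG)",
    "VEGFA TS3": "GGTGAGTGAGTGTGTGCGTG(TGG)",
}

parsed = {name: parse_site(site) for name, site in sites.items()}
for name, (protospacer, pam) in parsed.items():
    print(name, len(protospacer), pam, has_ngg_pam(pam))

# Hypothetical off-target candidate: S1 with its first base changed.
s1_protospacer = parsed["S1"][0]
candidate = "T" + s1_protospacer[1:]
print(count_mismatches(s1_protospacer, candidate))
```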
Project description:Background: The CRISPR/Cas9 toolbox has recently been expanded to include approaches for modulating gene expression. To build on this work and apply it to biological questions, it is important to establish it in a broad range of circumstances. Genome-scale CRISPR interference (CRISPRi) has been used in human cell lines; however, the rules for designing effective guide RNAs (gRNAs) in different organisms are not well known. We sought to identify the rules that determine gRNA effectiveness at transcriptional repression in Saccharomyces cerevisiae. Results: We created an inducible single-plasmid CRISPRi system for gene repression in yeast, and used it to analyze fitness effects of gRNAs under 18 small molecule treatments. Our approach correctly identified previously described chemical-genetic interactions, as well as a new mechanism of suppressing fluconazole toxicity by repression of the ERG25 gene. Assessment of multiple target loci across treatments allowed us to determine generalizable features associated with gRNA efficacy. Guides that target regions with low nucleosome occupancy and high chromatin accessibility were clearly more effective. We also found that the best region to target with gRNAs was between the transcription start site (TSS) and 200 bp upstream of the TSS. Finally, in contrast to nuclease-proficient Cas9 in human cells, point mutations were tolerated equally well by truncated (18 nt specificity sequence) and full-length (20 nt) gRNAs; however, 18 nt gRNAs were generally less potent than full-length gRNAs. Conclusions: Our results establish a powerful functional genomics screening method, provide rules for designing effective gRNAs for gene repression, and show that 18 nt and 20 nt gRNAs exhibit similar tolerance to mismatches in the target sequence. These findings will enable effective library design and genome-wide screening in many genetic backgrounds.
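The positional rule reported above (guides work best between the TSS and 200 bp upstream of it) could be applied as a simple filter when choosing guides. The sketch below is illustrative only: the function name, the single-coordinate representation of a guide, and the example positions are assumptions. "Upstream" is resolved per strand, so it means lower coordinates for a "+" strand gene and higher coordinates for a "-" strand gene.

```python
def in_preferred_window(guide_pos, tss, strand, window=200):
    """True if a guide lies between the TSS and `window` bp upstream.

    guide_pos and tss are genomic coordinates on the same chromosome;
    strand is the strand of the target gene ("+" or "-").
    """
    if strand == "+":
        offset = guide_pos - tss  # negative values are upstream
    elif strand == "-":
        offset = tss - guide_pos  # orientation is flipped on the minus strand
    else:
        raise ValueError("strand must be '+' or '-'")
    return -window <= offset <= 0


# Hypothetical guide positions around a TSS at coordinate 1000.
print(in_preferred_window(900, 1000, "+"))   # 100 bp upstream of a + gene
print(in_preferred_window(1100, 1000, "+"))  # downstream of a + gene
print(in_preferred_window(1100, 1000, "-"))  # 100 bp upstream of a - gene
print(in_preferred_window(700, 1000, "+"))   # 300 bp upstream, outside window
```

A fuller guide-selection scheme would combine this window with the nucleosome occupancy and chromatin accessibility features that the description also identifies as predictive.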
Project description:Synthetic DNA-binding proteins have found broad application in gene therapies and as tools for interrogating biology. Engineered proteins based on the CRISPR/Cas9 and TALE systems have been used to alter genomic DNA sequences, control transcription of endogenous genes, and modify epigenetic states. Although the activity of these proteins at their intended genomic target sites has been assessed, the genome-wide effects of their action have not been extensively characterized. Additionally, the role of chromatin structure in determining the binding of CRISPR/Cas9 and TALE proteins to their target sites and the regulation of nearby genes is poorly understood. Characterization of the activity of these proteins using modern high-throughput genomic methods would provide valuable insight into the specificity and off-target effects of CRISPR- and TALE-based genome engineering tools. We have analyzed the genome-wide effects of TALE- and CRISPR-based transcriptional activators targeted to the promoters of two different endogenous human genes in HEK293T cells using a variety of high-throughput DNA sequencing methods. In particular, we assayed the DNA-binding specificity of these proteins and their effects on the epigenome. DNA-binding specificity was evaluated by ChIP-seq, and RNA-seq was used to measure the specificity of these activators in perturbing the transcriptome. Additionally, DNase-seq was used to identify the chromatin state at target sites of the synthetic transcriptional activators and the genome-wide chromatin remodeling that occurs as a result of their action. Our results show that these genome engineering technologies are highly specific in both binding to their promoter target sites and inducing expression of downstream genes when multiple activators bind to a single promoter. 
Moreover, we show that these synthetic activators are able to induce the expression of silent genes in heterochromatic regions of the genome by opening regions of closed chromatin and decreasing DNA methylation. Interestingly, the transcriptional activation domain was not necessary for DNA binding or chromatin remodeling in these regions, but was critical to inducing gene expression. This study shows that these CRISPR- and TALE-based transcriptional activators are exceptionally specific. Although we detected limited binding at off-target sites in the genome and changes to genome structure, these off-target events did not lead to any detectable changes in gene regulation. Collectively, these results underscore the potential for these technologies to make precise changes to gene expression for gene and cell therapies or fundamental studies of gene function. HEK293T cells were transfected in triplicate with plasmids expressing synthetic transcription factors. The synthetic TFs were either (a) a dCas9-VP64 fusion protein and a targeting guide RNA (gRNA), or (b) a TALE-VP64 fusion protein engineered to bind to a specific target site in the genome. As a control, cells were transfected with plasmids expressing GFP. After transfection, ChIP-seq was used to identify both on-target and off-target binding sites for the synthetic TFs.
Project description:Synthetic DNA-binding proteins have found broad application in gene therapies and as tools for interrogating biology. Engineered proteins based on the CRISPR/Cas9 and TALE systems have been used to alter genomic DNA sequences, control transcription of endogenous genes, and modify epigenetic states. Although the activity of these proteins at their intended genomic target sites has been assessed, the genome-wide effects of their action have not been extensively characterized. Additionally, the role of chromatin structure in determining the binding of CRISPR/Cas9 and TALE proteins to their target sites and the regulation of nearby genes is poorly understood. Characterization of the activity of these proteins using modern high-throughput genomic methods would provide valuable insight into the specificity and off-target effects of CRISPR- and TALE-based genome engineering tools. We have analyzed the genome-wide effects of TALE- and CRISPR-based transcriptional activators targeted to the promoters of two different endogenous human genes in HEK293T cells using a variety of high-throughput DNA sequencing methods. In particular, we assayed the DNA-binding specificity of these proteins and their effects on the epigenome. DNA-binding specificity was evaluated by ChIP-seq, and RNA-seq was used to measure the specificity of these activators in perturbing the transcriptome. Additionally, DNase-seq was used to identify the chromatin state at target sites of the synthetic transcriptional activators and the genome-wide chromatin remodeling that occurs as a result of their action. Our results show that these genome engineering technologies are highly specific in both binding to their promoter target sites and inducing expression of downstream genes when multiple activators bind to a single promoter. 
Moreover, we show that these synthetic activators are able to induce the expression of silent genes in heterochromatic regions of the genome by opening regions of closed chromatin and decreasing DNA methylation. Interestingly, the transcriptional activation domain was not necessary for DNA binding or chromatin remodeling in these regions, but was critical to inducing gene expression. This study shows that these CRISPR- and TALE-based transcriptional activators are exceptionally specific. Although we detected limited binding at off-target sites in the genome and changes to genome structure, these off-target events did not lead to any detectable changes in gene regulation. Collectively, these results underscore the potential for these technologies to make precise changes to gene expression for gene and cell therapies or fundamental studies of gene function. HEK293T cells were transfected in triplicate with plasmids expressing synthetic transcription factors. The synthetic TFs were either (a) a dCas9-VP64 fusion protein and a targeting guide RNA (gRNA), or (b) a TALE-VP64 fusion protein engineered to bind to a specific target site in the genome. As a control, cells were transfected with plasmids expressing GFP. After transfection, RNA-seq was used to identify both on-target and off-target changes in gene expression induced by the synthetic TFs. The data in this submission were generated using the TALE transfection experiments.
Project description:An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the predictor, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs at higher frequencies than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single-cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show that enhancers as short as 50 bp can maintain specificity.