
Dataset Information

Cas9 target DNA specificity

ABSTRACT: To study target sequence specificity, selectivity, and reaction kinetics of Streptococcus pyogenes Cas9 activity, we challenged libraries of random variant targets with purified Cas9::guide RNA complexes in vitro. Cleavage kinetics were nonlinear, with a burst of initial activity followed by slower sustained cleavage. Consistent with other recent analyses of Cas9 sequence specificity, we observe considerable (albeit incomplete) impairment of cleavage for targets mutated in the PAM sequence or in "seed" sequences matching the proximal 8 bp of the guide. A second target region requiring close homology was located at the other end of the guide::target duplex (positions 13-18 relative to the PAM). Strikingly, a subset of variants which broke homology in the intervening region consistently increased the capacity of Cas9 to cleave in extended reactions. Sequences flanking the guide+PAM region had measurable (albeit modest) effects on cleavage. Taken together, these studies provide both a basis for predicting effective cleavage targets and a basis for potential optimization of guide RNAs to yield efficiency beyond that of the simple perfect-match guides.
118 samples analyzed. Controls have "con" in the sample name. To quantitatively measure the cleavage efficiency of a single gRNA, we created a population of random variant target sequences for each of two gRNA targets. The targets used were "unc-22A", a sequence from the well-characterized unc-22 gene of Caenorhabditis elegans, and "protospacer 4" (ps4), a previously characterized sequence from a natural spacer from S. pyogenes MGAS10750. Using custom mixtures of oligonucleotide precursors for each base during chemical synthesis, a set of polymorphic target libraries ('Random Variant Libraries') was designed to have a baseline variation rate at each position. On each side of the gRNA homology and PAM regions, 6 bp of random sequence were added. The first base of intended gRNA homology is designated base 1.
The entire 35 bp random variant library mixture was cloned into a standard plasmid vector (pHRL-TK). Several thousand colonies from plates were washed in pools and prepared by standard plasmid preparation methods. The complexity of the libraries was estimated based on Illumina sequencing of the uncut libraries and filtering for the minimum representation expected from the pooling. Approximately 1500-3000 unique species were obtained in the unc-22A libraries and 5000 unique sequences in the ps4 library (see Materials and Methods). To assay cleavage, purified Cas9 was first incubated with gRNA, followed by incubation with the variant library for various time points and under various conditions. The DNA template was among the conditions varied in the experiments. After protein removal, flanking sequences outside of the target region were used for PCR amplification, and plasmid cleavage was measured through loss of PCR products that span the region of interest. A set of perfectly matched targets and highly mutated versions present in the random variant library served as internal positive and negative controls, respectively. A log retention score for each sequence in each experiment was calculated by quantifying the representation of each sequence before and after addition of the Cas9 protein. Two approaches were used for normalization: first, we used a population of ps4 targets "spiked" into the library as an uncleaved control; second, we used a population of unc-22A targets with large numbers of variations from the perfect target (between 4 and 7), which were therefore likely to show limited, if any, cleavage. Equivalent results were obtained with these two normalization approaches (see Computational Methods for details). Retention scores are expressed as the log2 of the normalized ratio, so that a more negative retention score indicates efficient cleavage of substrate while a less negative score indicates less cleavage. Templates which are uncleaved will yield a retention score at or near zero.
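The retention score described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record says the score is the log2 of a normalized before/after ratio, but it does not give the exact formula. Here the normalization is assumed to be division by the before/after ratio of the uncleaved control population. The function name and arguments are hypothetical and are not taken from the dataset or the associated publication.

```python
import math


def retention_score(before, after, control_before, control_after):
    """Hypothetical sketch of a log2 retention score.

    before / after: read counts for one target sequence before and after
    Cas9 addition. control_before / control_after: read counts for the
    uncleaved control population in the same two samples.
    ASSUMPTION: normalization divides the sequence's after/before ratio
    by the control's after/before ratio; the record does not spell this out.
    """
    if min(before, after, control_before, control_after) <= 0:
        raise ValueError("all counts must be positive")
    sequence_ratio = after / before
    control_ratio = control_after / control_before
    return math.log2(sequence_ratio / control_ratio)


# A target whose representation falls to a quarter of its starting level,
# with the control unchanged, gets a strongly negative score.
cleaved = retention_score(1000, 250, 500, 500)
# A target that tracks the control gets a score of zero.
uncleaved = retention_score(100, 100, 500, 500)
```

Real count data would likely need a pseudocount or a minimum-representation filter before taking the logarithm; that step is omitted here.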
Comparisons between multiple experiments indicate strong correlation between independent retention measurements.
GSM1410678-GSM1410761: 'AF_SOL*.dat' files contain the calculated final retentions for each experiment. Each experiment is labeled "AF_SOL_###_t###", where "AF_SOL_###" corresponds to the experiment run ID and "t###" corresponds to the incubation time of the experiment. For example, AF_SOL_513_t360 corresponds to experiment 513, run with the protospacer 4 guide and DNA target and an incubation time of 360 min. The experimental conditions and ID can be found in the associated publication.
GSM1544297-GSM1544332: each unc*.dat file is a tab-delimited file of all considered sequences in each experiment. The names of the files and the AF_SOL_# run number can be found in the associated publication (Supplementary Materials) with the details of the conditions. Each filename starts with the type of gRNA used (either unc-22WT or the mutant version unc22C11G). The next number (#min) indicates the incubation time for the experiment, and it is followed either by #pcr_AF_SOL_# or by just AF_SOL_#. If #pcr is present, it indicates the number of PCR cycles used in the experiment. Finally, AF_SOL_# denotes the sequencing run ID number.
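As a small illustration of the AF_SOL labeling scheme, the following Python sketch extracts the run ID and incubation time from a label such as AF_SOL_513_t360. The function name is hypothetical. The pattern assumes that both fields are plain decimal digits, which matches the one example given here but is not confirmed for every file in the series.

```python
import re

# ASSUMPTION: run ID and incubation time are both plain digit strings,
# as in the single documented example "AF_SOL_513_t360".
LABEL_PATTERN = re.compile(r"AF_SOL_(\d+)_t(\d+)")


def parse_af_sol_label(label):
    """Return (run_id, incubation_minutes) parsed from an AF_SOL label."""
    match = LABEL_PATTERN.search(label)
    if match is None:
        raise ValueError(f"not an AF_SOL_###_t### label: {label!r}")
    return int(match.group(1)), int(match.group(2))


run_id, minutes = parse_af_sol_label("AF_SOL_513_t360")
```

Because the pattern is searched for rather than anchored, the same function also accepts a label embedded in a longer file name; the unc*.dat naming scheme uses a different layout and is not handled by this sketch.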

ORGANISM(S): synthetic construct

SUBMITTER: Andrew Fire

PROVIDER: E-GEOD-58426 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress


Similar Datasets

Project description:Clustered regularly interspaced short palindromic repeat (CRISPR) RNA-guided nucleases have gathered considerable excitement as a tool for genome engineering. However, questions remain about the specificity of their target site recognition. Most previous studies have examined predicted off-target binding sites that differ from the perfect target site by one to four mismatches, which represent only a subset of genomic regions. Here, we used ChIP-seq to examine genome-wide CRISPR binding specificity at gRNA-specific and gRNA-independent sites. For two guide RNAs targeting the murine Snurf gene promoter, we observed very high binding specificity at the intended target site while off-target binding was observed at 2- to 6-fold lower intensities. We also identified significant gRNA-independent off-target binding. Interestingly, we found that these regions are highly enriched in the PAM site, a sequence required for target site recognition by CRISPR. To determine the relationship between Cas9 binding and endonuclease activity, we used targeted sequence capture as a high-throughput approach to survey a large number of the potential off-target sites identified by ChIP-seq or computational prediction. A high frequency of indels was observed at both target sites and one off-target site, while no cleavage activity could be detected at other ChIP-bound regions. Our results demonstrate that even a simple configuration of a Cas9:gRNA nuclease can support very specific DNA cleavage activity and that most interactions between the CRISPR nuclease complex and genomic PAM sites do not lead to DNA cleavage. 
ChIP-seq using dCas9 to determine genome-wide binding of CRISPR/Cas9.
noED: Cas9 double-mutant protein without an effector domain
KRAB: Cas9 double-mutant protein fused to the KRAB repressor domain
S1 gRNA: guide RNA targeting GCTCCCTACGCATGCGTCCC(AGG) in the mouse genome
S2 gRNA: guide RNA targeting AATGGCTCAGGTTTGTCGCG(CGG) in the mouse genome
VEGFA TS3 gRNA: guide RNA targeting GGTGAGTGAGTGTGTGCGTG(TGG) in the human genome


