
Dataset Information


CRISPR Arrays Away from cas Genes.


ABSTRACT: CRISPR-Cas systems typically consist of a CRISPR array and cas genes that are organized in one or more operons. However, a substantial fraction of CRISPR arrays are not adjacent to cas genes. Definitive identification of such isolated CRISPR arrays runs into the problem of false positives, with unrelated types of repetitive sequences mimicking CRISPR. We developed a computational pipeline to eliminate false CRISPR predictions and found that up to 25% of the CRISPR arrays in complete bacterial and archaeal genomes are located away from cas genes. Most of the repeats in these isolated arrays are identical to repeats in cas-adjacent CRISPR arrays in the same or closely related genomes, indicating an evolutionary relationship between isolated arrays and arrays in typical CRISPR-cas loci. The spacers in isolated CRISPR arrays show nearly as many matches to viral genomes as spacers from complete CRISPR-cas loci, suggesting that the isolated arrays were either functionally active recently or continue to function. Reconstruction of evolutionary events in closely related bacterial genomes suggests three routes of evolution of isolated CRISPR arrays: (1) loss of cas genes in a CRISPR-cas locus, (2) de novo generation of arrays from off-target spacer integration into sequences resembling the corresponding repeats, and (3) transfer by mobile genetic elements. Both the combination of de novo emerging arrays with cas genes and the regain of cas genes by isolated arrays via recombination likely contribute to functional diversification in CRISPR-Cas evolution.

SUBMITTER: Shmakov SA 

PROVIDER: S-EPMC7757702 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature


Publications

CRISPR Arrays Away from cas Genes.

Sergey A. Shmakov, Irina Utkina, Yuri I. Wolf, Kira S. Makarova, Konstantin V. Severinov, Eugene V. Koonin

The CRISPR Journal, 2020 Dec 1, issue 6



Similar Datasets

| S-EPMC8173669 | biostudies-literature
| S-EPMC6604393 | biostudies-literature
| S-EPMC5294841 | biostudies-literature
| S-EPMC4931913 | biostudies-literature
| S-EPMC10132114 | biostudies-literature
| S-EPMC10520945 | biostudies-literature
| S-EPMC7145573 | biostudies-literature
| S-EPMC6010251 | biostudies-literature
| S-EPMC9782134 | biostudies-literature
| S-EPMC6709367 | biostudies-literature