ABSTRACT:
SUBMITTER: Altae-Tran H
PROVIDER: S-EPMC10910872 | biostudies-literature | 2023 Nov
REPOSITORIES: biostudies-literature
Altae-Tran H, Kannan S, Suberski AJ, Mears KS, Demircioglu FE, Moeller L, Kocalar S, Oshiro R, Makarova KS, Macrae RK, Koonin EV, Zhang F
Science (New York, N.Y.), 2023-11-23, issue 6673
Microbial systems underpin many biotechnologies, including CRISPR, but the exponential growth of sequence databases makes it difficult to find previously unidentified systems. In this work, we develop the fast locality-sensitive hashing-based clustering (FLSHclust) algorithm, which performs deep clustering on massive datasets in linearithmic time. We incorporated FLSHclust into a CRISPR discovery pipeline and identified 188 previously unreported CRISPR-linked gene modules, revealing many additio ...