Project description: The PFGRC has developed a cost-effective alternative to complete genome sequencing for studying the genetic differences between closely related species and/or strains. The comparative genomics approach combines Gene Discovery (GD) and Comparative Genomic Hybridization (CGH) techniques, resulting in the design and production of species microarrays that represent the diversity of a species beyond the sequenced reference strain(s) used in the initial microarray design. These species arrays may then be used to interrogate hundreds of closely related strains in order to further unravel their evolutionary relationships.
Clostridium botulinum produces botulinum neurotoxin (BoNT) and is classified as a “Category A” select agent. BoNT can be classified into seven serotypes, designated A-G. There is considerable genetic variation within these serotypes, as demonstrated by the recognition of at least 47 subtypes. The most studied serotype, BoNT/A, has been found in a large and diverse group of clostridia, most of which express the subtype BoNT/A1. The BoNT/A1-producing C. botulinum strain ATCC 3502, used to obtain an initial annotated genome sequence, is not representative of the diverse group of clostridia that produce BoNT. Nearly 50% of C. botulinum strains producing BoNT/A1 have been shown to also encode unexpressed variants of BoNT/B with a distinct cluster arrangement. This nucleotide cluster is completely absent from the published genome sequence. In addition, a recently identified novel BoNT/A1 strain lacks the gene cluster seen in the genome sequence of ATCC 3502. Furthermore, a strain designated Hall A Hyper differs greatly from the sequenced strain, as indicated by its ability to produce higher quantities of BoNT/A1. The genetic and phenotypic basis for this difference in BoNT expression is currently unknown; the sequences of the BoNT gene and the cluster are identical in both strains.
This observation supports the hypothesis that genes outside the toxin cluster are involved in the regulation and maturation of BoNT. The flow of genetic information within this group motivated us to identify novel genes for the purpose of creating a “species” DNA microarray to better understand the ancestral relationships among its members. Based on preliminary genotyping (MLST and CGH using a single-genome-based array), 20 diverse C. botulinum strains were selected for sequencing. Sequence information obtained from this project, and from other publicly available sources, led to the development of a comprehensive species microarray for C. botulinum group members. The availability of the C. botulinum species DNA microarray has allowed us to carry out a collaborative CGH genotyping project, both to validate this microarray and to understand the phylogenomic relationships among members of the C. botulinum group.