Project description:Background: Enterohemorrhagic Escherichia coli (EHEC) O157 causes severe food-borne illness in humans. The chromosome of O157 consists of 4.1-Mb backbone sequences shared by benign E. coli K-12, and 1.4-Mb O157-specific sequences encoding many virulence determinants such as Shiga toxin genes (stxs) and the locus of enterocyte effacement (LEE). Non-O157 EHECs belonging to clonal lineages distinct from O157 also cause similar illness in humans. According to the parallel evolution model, they have independently acquired the major virulence determinants, stxs and LEE. However, the genomic differences between O157 and non-O157 EHECs have not yet been systematically analyzed. Results: By using the microarray and Whole Genome PCR scanning analyses, we performed a whole genome comparison of 20 EHEC strains of O26, O111, and O103 serotypes with O157. In non-O157 EHEC strains, although genome sizes were similar to or rather larger than that of O157 and the backbone regions were well conserved, O157-specific regions were very poorly conserved. Only around 20% of the O157-specific genes were fully conserved in each non-O157 serotype. However, the non-O157 EHECs contained a significant number of virulence genes found on prophages and plasmids in O157, and also multiple prophages similar to but significantly divergent from those in O157. Conclusion: Although O157 and non-O157 EHECs have independently acquired a huge number of serotype- or strain-specific genes by lateral gene transfer, they share an unexpectedly large number of virulence genes. Independent infections of similar but distinct bacteriophages carrying these virulence determinants appear to be involved in the parallel evolution of EHEC. Keywords: comparative genomic hybridization, CGH
Project description:Using a novel multiplexed reporter assay, we characterize promoter activity of hundreds of thousands of DNA sequences spanning the entire E. coli genome. We use this powerful assay to identify promoters throughout the E. coli genome and systematically dissect their regulatory motifs which encode promoter activity using a series of experiments.
Project description:Background: Based on 32 Escherichia coli and Shigella genome sequences, we have developed an E. coli pan-genome microarray. Publicly available genomes were annotated in a consistent manner to define all currently known genes potentially present in the species. The chip design was evaluated by hybridization of DNA from two sequenced E. coli strains, K-12 MG1655 (a commensal) and O157:H7 EDL933 (an enterotoxigenic E. coli). Dual-channel and single-channel analysis approaches were compared for the comparative genomic hybridization experiments. Moreover, the microarray was used to characterize four unsequenced probiotic E. coli strains, currently marketed for beneficial effects on the human gut flora. Results: Based on the genomes included in this study, we were able to group together 2,041 genes that were present in all 32 genomes. Furthermore, we predict that the size of the E. coli core genome will approach ~1,560 essential genes, considerably fewer than previous estimates. Although any individual E. coli genome contains between 4,000 and 5,000 genes, we identified more than twice as many (11,872) distinct gene groups in the total gene pool (“pan-genome”) examined for microarray design. Benchmarking of the design based on sequenced control strain samples demonstrated a high sensitivity and relatively low false positive rate. Moreover, the array was sufficient to investigate the gene content of apathogenic isolates, despite the strong bias towards pathogenic E. coli strains that have been sequenced so far. Our analysis of four probiotic E. coli strains demonstrates that they share a gene pool very similar to the E. coli K-12 strains but also show significant similarity with enteropathogenic strains. Nonetheless, virulence genes were largely absent. Strain-specific genes found in probiotic E. coli but absent in E. coli K12 were most frequently phage-related genes, transposases and other genes related to mobile DNA, and metabolic enzymes or factors that may offer colonization fitness, which together with their asymptomatic nature may explain their nature. Conclusion: This high-density microarray provides an excellent tool for characterizing either DNA content or gene expression from unknown E. coli strains. Keywords: Comparative genomic hybridizations
Project description:We report the binding sites and the cleavage sites of Topoisomerase IV on the E. coli genome.
Project description:Glutathionylspermidine synthetase/amidase (Gss) and the encoding gene (gss) have only been described in two widely separated groups, namely Escherichia coli and several members of the Kinetoplastida. In the present paper we have studied the species distribution more extensively. It is striking that all of the 75 Enterobacteria species that have been sequenced contain sequences with a very high degree of homology to the E. coli Gss protein. Although homologous sequences are also present in various other bacteria, in contrast to Enterobacteria they are not present in all species of a given phylum. As previously reported, homologous sequences were found in all five species of Kinetoplastids tested (including Trypanosoma cruzi), but it is striking that comparable sequences are not found in a variety of invertebrate and vertebrate species, Archaea and plants. Studies in E. coli show that the highest accumulation of glutathionylspermidine is found in stationary phase cultures, where most of the intracellular spermidine is converted to glutathionylspermidine. However, even in log phase cells there is some formation of glutathionylspermidine, and isotope exchange experiments show that there is a rapid exchange between glutathionylspermidine and intracellular spermidine. We have not been able to define a specific physiologic function for glutathionylspermidine, but microarray studies comparing gss+ and -gss strains of E. coli show that a large number of genes are either upregulated or downregulated by the loss of the gss gene.
Project description:The model prokaryote Escherichia coli can exist as either a commensal or a pathogen in the gut of diverse mammalian hosts. These associations, coupled with its ease of cultivation and genetic variability, have made E. coli a popular indicator organism for tracking the origin of fecal water contamination. Source tracking accuracy is predicated on the assumption that E. coli isolates recovered from contaminated water present a genetic signature characteristic of the host from which they originated. In this study, we compared the accuracy with which E. coli isolated from humans, bear, cattle and deer could be identified by standard fingerprinting methods used for library-based microbial source tracking (repetitive element PCR and pulsed-field gel electrophoresis) in relation to microarray-based analysis of genome content. Our results show that patterns of gene presence or absence were more useful for distinguishing E. coli isolates from different sources than traditional fingerprinting methods, particularly in the case of human strains. Host-associated differences in genome composition included the presence or absence of mobile IS1 elements as well as genes encoding the ferric dicitrate iron transporter (fec), E. coli common pilus (ECP), type 1 fimbriae and the CRISPR-associated cas proteins. Many of these differences occurred in regions of the E. coli chromosome previously shown to be “hot spots” for the integration of horizontally acquired DNA. PCR primers designed to amplify the IS1 and fec loci confirmed array results and demonstrated the ease with which gene presence/absence data can be converted into a diagnostic assay. The data presented here suggest that, despite the high level of genetic diversity observed among isolates by PFGE, human-derived strains may constitute a distinct ecotype distinguished by multiple potential library-independent source tracking markers.
Project description:Reinvestigation of the chromatin immunoprecipitation procedure led us to discover four causes of high background: non-unique sequences, incomplete reversion of crosslinks, washing with spin-columns and insufficient RNase treatment. We compared a published method that gives a high background signal with a modified chromatin immunoprecipitation method that could greatly reduce the false positive rate, and applied the modified method to analyze genome-wide binding of SeqA and σ32 in E. coli.