Characterization of probiotic Escherichia coli isolates using a novel pangenome microarray
ABSTRACT: Background: Based on 32 Escherichia coli and Shigella genome sequences, we have developed an E. coli pan-genome microarray. Publicly available genomes were annotated in a consistent manner to define all currently known genes potentially present in the species. The chip design was evaluated by hybridization of DNA from two sequenced E. coli strains, K-12 MG1655 (a commensal) and O157:H7 EDL933 (an enterotoxigenic E. coli). Dual-channel and single-channel analysis approaches were compared for the comparative genomic hybridization experiments. Moreover, the microarray was used to characterize four unsequenced probiotic E. coli strains, currently marketed for beneficial effects on the human gut flora. Results: Based on the genomes included in this study, we identified 2,041 genes that were present in all 32 genomes. Furthermore, we predict that the size of the E. coli core genome will approach ~1,560 essential genes, considerably fewer than previous estimates. Although any individual E. coli genome contains between 4,000 and 5,000 genes, we identified more than twice as many (11,872) distinct gene groups in the total gene pool ("pan-genome") examined for microarray design. Benchmarking of the design against samples from the sequenced control strains demonstrated high sensitivity and a relatively low false positive rate. Moreover, the array proved well suited to investigating the gene content of apathogenic isolates, despite the strong bias towards pathogenic strains among the E. coli genomes sequenced so far. Our analysis of four probiotic E. coli strains demonstrates that they share a gene pool very similar to that of the E. coli K-12 strains but also show significant similarity to enteropathogenic strains. Nonetheless, virulence genes were largely absent. Strain-specific genes found in probiotic E. coli but absent in E. coli K-12 were most frequently phage-related genes, transposases and other genes related to mobile DNA, and metabolic enzymes or factors that may offer colonization fitness, which together with the strains' asymptomatic nature may explain their probiotic nature. Conclusion: This high-density microarray provides an excellent tool for characterizing either DNA content or gene expression from unknown E. coli strains. Factorial design: Each of the four test samples (G 1/2, G3/10, G 4/9, G5) is co-hybridized with each of the two control strain samples (K-12 MG1655 and O157:H7 EDL933). Additional replicate co-hybridizations of the two control strain samples against each other (O157:H7 EDL933 vs. K-12 MG1655) are also included.
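The core genome, pan-genome, and strain-specific gene sets described in the abstract can be illustrated with plain set operations. The following is only a minimal sketch under stated assumptions: the gene-group identifiers (`g1` to `g7`), the toy presence calls, and the helper names `core_genome`, `pan_genome`, and `strain_specific` are all hypothetical and are not taken from the study or from the deposited data. The actual analysis relied on consistent genome annotation and array hybridization signals, neither of which is reproduced here.

```python
# Hypothetical toy data: gene-group IDs called "present" in each genome.
# These identifiers and calls are illustrative only, not data from E-GEOD-8595.
genomes = {
    "K-12 MG1655": {"g1", "g2", "g3", "g4"},
    "O157:H7 EDL933": {"g1", "g2", "g5", "g6"},
    "probiotic isolate": {"g1", "g2", "g3", "g7"},
}


def core_genome(gene_sets):
    """Gene groups present in every genome (set intersection)."""
    gene_sets = list(gene_sets)
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


def pan_genome(gene_sets):
    """Gene groups present in at least one genome (set union)."""
    return set().union(*gene_sets)


def strain_specific(target, reference):
    """Gene groups found in the target strain but absent from the reference."""
    return target - reference


core = core_genome(genomes.values())
pan = pan_genome(genomes.values())
extra = strain_specific(genomes["probiotic isolate"], genomes["K-12 MG1655"])

print("core size:", len(core))
print("pan-genome size:", len(pan))
print("in probiotic isolate but not K-12:", sorted(extra))
```

In this toy example the core shrinks and the pan-genome grows as genomes are added, which mirrors the pattern the abstract reports for 32 genomes (2,041 shared genes versus 11,872 distinct gene groups).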
ORGANISM(S): Escherichia coli
SUBMITTER: Peter Hallin
PROVIDER: E-GEOD-8595 | biostudies-arrayexpress
REPOSITORIES: biostudies-arrayexpress