Project description:The Gram-negative bacterium Cupriavidus gilardii is an emerging multidrug-resistant pathogen found in many environments. However, little is known about this species or its antibiotic resistance mechanisms. We used biochemical tests, antibiotic susceptibility experiments, and whole-genome sequencing to characterize an environmental C. gilardii isolate. Like clinical isolates, this isolate was resistant to meropenem, gentamicin, and other antibiotics. Resistance to these antibiotics appeared to be related to the large number of intrinsic antibiotic resistance genes found in this isolate. As determined by comparative genomics, this resistome was also well conserved in the only two other C. gilardii strains sequenced to date. The intrinsic resistome of C. gilardii did not include the colistin resistance gene mcr-5, which was in a transposon present only in one strain. The intrinsic resistome of C. gilardii comprised (i) many multidrug efflux pumps, such as a homolog of the Pseudomonas aeruginosa MexAB-OprM pump that may be involved in resistance to meropenem, other β-lactams, and aminoglycosides; (ii) a novel β-lactamase (OXA-837) that decreases susceptibility to ampicillin but not to other β-lactams tested; (iii) a new aminoglycoside 3-N-acetyltransferase [AAC(3)-IVb, AacC10] that decreases susceptibility to gentamicin and tobramycin; and (iv) a novel partially conserved aminoglycoside 3"-adenylyltransferase [ANT(3")-Ib, AadA32] that decreases susceptibility to spectinomycin and streptomycin. These findings provide the first mechanistic insight into the intrinsic resistance of C. gilardii to multiple antibiotics and its ability to become resistant to an increasing number of drugs during therapy. IMPORTANCE: Cupriavidus gilardii is a bacterium that is gaining increasing attention both as an infectious agent and because of its potential use in the detoxification of toxic compounds and other biotechnological applications.
In recent years, however, there has been an increasing number of reported infections, some of them fatal, caused by C. gilardii. These infections are hard to treat because this bacterium is naturally resistant to many antibiotics, including last-resort antibiotics, such as carbapenems. Moreover, this bacterium often becomes resistant to additional antibiotics during therapy. However, little is known about C. gilardii and its antibiotic resistance mechanisms. The significance of our research lies in providing, for the first time, whole-genome information about the natural antibiotic resistance genes found in this bacterium and their conservation among different C. gilardii strains. This information may provide new insights into the appropriate use of antibiotics in combating infections caused by this emerging pathogen.
Project description:We report the first complete genome sequence of a beta-proteobacterial nitrogen-fixing symbiont of legumes, Cupriavidus taiwanensis LMG19424. The genome consists of two chromosomes of 3.42 Mb and 2.50 Mb, and a large symbiotic plasmid of 0.56 Mb. The C. taiwanensis genome displays an unexpectedly high similarity with the genome of the saprophytic bacterium C. eutrophus H16, despite being 0.94 Mb smaller. Both organisms harbor two chromosomes with large regions of synteny interspersed by specific regions. In contrast, the two species host highly divergent plasmids, with the consequence that C. taiwanensis is symbiotically proficient and less metabolically versatile. Altogether, specific regions in C. taiwanensis compared with C. eutrophus cover 1.02 Mb and are enriched in genes associated with symbiosis or virulence in other bacteria. C. taiwanensis reveals characteristics of a minimal rhizobium, including the most compact (35-kb) symbiotic island (nod and nif) identified so far in any rhizobium. The atypical phylogenetic position of C. taiwanensis allowed insightful comparative genomics of all available rhizobium genomes. We did not find any gene that was both common and specific to all rhizobia, thus suggesting that a unique shared genetic strategy does not support symbiosis of rhizobia with legumes. Instead, phylodistribution analysis of more than 200 known Sinorhizobium meliloti symbiotic genes indicated large and complex variations of their occurrence in rhizobia and non-rhizobia. This led us to devise an in silico method to extract genes preferentially associated with rhizobia. We discuss how the novel genes we have identified may contribute to symbiotic adaptation.
Project description:The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
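The oligomer frequency analysis mentioned in the ebolavirus description above can be illustrated with a minimal sketch. The description does not state the oligomer length, the distance measure, or the software used, so the choices here (k = 4 by default, Euclidean distance, a DNA alphabet of A/C/G/T) and the function names `kmer_frequencies` and `profile_distance` are assumptions made purely for illustration; the toy sequences are invented and are not viral genomes.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k=4):
    """Return the relative frequency of every A/C/G/T k-mer in seq.

    Windows containing other characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in all_kmers)
    if total == 0:
        return {m: 0.0 for m in all_kmers}
    return {m: counts[m] / total for m in all_kmers}


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    return math.sqrt(sum((p[m] - q[m]) ** 2 for m in p))


# Invented toy sequences (hypothetical, for illustration only).
toy_a = "ACGTACGTACGTAAGT"
toy_b = "ACGTACGTACGAAAGT"  # differs from toy_a at one position
toy_c = "GGGGCCCCGGGGCCCC"  # very different composition

prof_a = kmer_frequencies(toy_a, k=2)
prof_b = kmer_frequencies(toy_b, k=2)
prof_c = kmer_frequencies(toy_c, k=2)
```

Sequences with similar oligomer usage yield a small distance, so a matrix of such pairwise distances could in principle be clustered to see which genomes group together.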
Project description:Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce GenoSets, a software system for visual analysis of comparative genomics data. The system automates the process of data integration and provides an analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic datasets, and facilitate reasoning about comparisons and features of interest.
Project description:The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics, as their consumption can confer a health benefit to the host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes had sizes varying from 1.8 to 3.3 Mb and other characteristic features, such as G+C content that ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein-encoding genes, while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared in all genomes of one group but absent in all other Lactobacillus genomes, and group-specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for present individual comparisons as well as future analysis of new Lactobacillus genomes.
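The pan genome, core genome, and signature group genes described in the Lactobacillus entry above are set operations over ortholog families, which a short sketch can make concrete. This assumes that genes have already been clustered into ortholog families by some upstream method (not specified in the description); the function names, strain labels, and family identifiers below are hypothetical.

```python
def pan_and_core(genome_to_families):
    """Return (pan, core) for a dict of genome name -> set of family IDs.

    pan  = families present in at least one genome (set union)
    core = families present in every genome (set intersection)
    """
    family_sets = list(genome_to_families.values())
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


def signature_families(genome_to_families, group):
    """Families shared by all genomes in `group` and absent elsewhere."""
    inside = [f for name, f in genome_to_families.items() if name in group]
    outside = [f for name, f in genome_to_families.items() if name not in group]
    if not inside:
        return set()
    shared = set.intersection(*inside)
    return shared - set().union(*outside)


# Hypothetical toy data: three strains, six ortholog families.
toy = {
    "strain_1": {"famA", "famB", "famC", "famX"},
    "strain_2": {"famA", "famB", "famD", "famX"},
    "strain_3": {"famA", "famB", "famE"},
}
pan, core = pan_and_core(toy)
signature = signature_families(toy, group={"strain_1", "strain_2"})
```

In this toy case the core is the two families shared by all three strains, and the signature set for the two-strain group is the single family they share that the third strain lacks.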
Project description:Until recently, the apicomplexan parasites, Cryptosporidium hominis and C. parvum, were considered the same species. However, the two parasites, now considered distinct species, exhibit significant differences in host range, infectivity, and pathogenicity, and their sequenced genomes exhibit only 95-97% identity. The availability of the complete genome sequences of these organisms provides the potential to identify the genetic variations that are responsible for the phenotypic differences between the two parasites. We compared the genome organization and structure, gene composition, the metabolic and other pathways, and the local sequence identity between the genes of these two Cryptosporidium species. Our observations show that the phenotypic differences between C. hominis and C. parvum are not due to gross genome rearrangements, structural alterations, gene deletions or insertions, metabolic capabilities, or other obvious genomic alterations. Rather, the results indicate that these genomes exhibit a remarkable structural and compositional conservation and suggest that the phenotypic differences observed are due to subtle variations in the sequences of proteins that act at the interface between the parasite and its host.
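The sequence identity figures quoted in the Cryptosporidium entry above can be related to a simple calculation on a pairwise alignment. The sketch below assumes the two sequences are already aligned to equal length and that columns containing a gap are excluded from the denominator; identity definitions vary between tools, and the description does not say which one was used. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Invented toy alignment: one gap column and one mismatch.
identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
```

Here 8 of the 9 gap-free columns match, so the toy alignment has roughly 89% identity under this definition.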
Project description:Despite its role as a reference organism in the plant sciences, the green alga Chlamydomonas reinhardtii entirely lacks genomic resources from closely related species. We present highly contiguous and well-annotated genome assemblies for three unicellular C. reinhardtii relatives: Chlamydomonas incerta, Chlamydomonas schloesseri, and the more distantly related Edaphochlamys debaryana. The three Chlamydomonas genomes are highly syntenous with similar gene contents, although the 129.2 Mb C. incerta and 130.2 Mb C. schloesseri assemblies are more repeat-rich than the 111.1 Mb C. reinhardtii genome. We identify the major centromeric repeat in C. reinhardtii as a LINE transposable element homologous to Zepp (the centromeric repeat in Coccomyxa subellipsoidea) and infer that centromere locations and structure are likely conserved in C. incerta and C. schloesseri. We report extensive rearrangements, but limited gene turnover, between the minus mating type loci of these Chlamydomonas species. We produce an eight-species core-Reinhardtinia whole-genome alignment, which we use to identify several hundred false positive and missing genes in the C. reinhardtii annotation and >260,000 evolutionarily conserved elements in the C. reinhardtii genome. In summary, these resources will enable comparative genomics analyses for C. reinhardtii, significantly extending the analytical toolkit for this emerging model system.
Project description:The PFGRC has developed a cost-effective alternative to complete genome sequencing in order to study the genetic differences between closely related species and/or strains. The comparative genomics approach combines Gene Discovery (GD) and Comparative Genomic Hybridization (CGH) techniques, resulting in the design and production of species microarrays that represent the diversity of a species beyond just the sequenced reference strain(s) used in the initial microarray design. These species arrays may then be used to interrogate hundreds of closely related strains in order to further unravel their evolutionary relationships. Clostridium botulinum produces botulinum neurotoxin (BoNT) and is classified as a “Category A” select agent. BoNT can be classified into seven serotypes designated A-G. There is considerable genetic variation within these serotypes, as demonstrated by the recognition of at least 47 subtypes. The most studied serotype, BoNT/A, has been found in a large and diverse group of clostridia, most of which express the subtype BoNT/A1. The BoNT/A1-producing C. botulinum strain ATCC 3502, used to obtain an initial annotated genome sequence, is not representative of the diverse group of clostridia producing BoNT. Nearly 50% of C. botulinum strains producing BoNT/A1 have been shown to also encode unexpressed variants of BoNT/B with a distinct cluster arrangement. This nucleotide cluster is completely absent from the published genome sequence. In addition, a recently identified novel BoNT/A1 strain lacks the gene cluster seen in the genome sequence of ATCC 3502. Furthermore, a strain designated Hall A Hyper differs greatly from the sequenced strain, as indicated by its ability to produce higher quantities of BoNT/A1. The genetic and phenotypic basis for this difference in BoNT expression is currently unknown, and the sequences of the BoNT gene and the cluster are identical in both strains.
This observation supports the hypothesis that genes outside the toxin cluster are involved in the regulation and maturation of BoNT. The flow of genetic information within this group motivated us to identify novel genes for the purpose of creating a “species” DNA microarray to better understand the ancestral relationships among its members. Based on preliminary genotyping (MLST, and CGH using a single-genome-based array), 20 diverse C. botulinum strains were selected for sequencing. Sequence information obtained from this project, and from other publicly available sources, led to the development of a comprehensive species microarray for C. botulinum group members. The availability of the C. botulinum species DNA microarray has allowed us to carry out a collaborative CGH genotyping project to validate this microarray as well as to understand the phylogenomic relationships among members of the C. botulinum group.