Project description:Metagenome-assembled genomes (MAGs) have revealed the existence of novel bacterial and archaeal groups and provided insight into their genetic potential. However, metagenomics and even metatranscriptomics cannot resolve how the genetic potential translates into metabolic functions and physiological activity. Here, we present a novel approach for the quantitative and organism-specific assessment of the carbon flux through microbial communities with stable isotope probing-metaproteomics and integration of temporal dynamics in 13C incorporation by Stable Isotope Cluster Analysis (SIsCA). We used groundwater microcosms labeled with 13CO2 and D2O as model systems and stimulated them with reduced sulfur compounds to determine the ecosystem role of chemolithoautotrophic primary production. Raman microspectroscopy detected rapid deuterium incorporation in microbial cells from 12 days onwards, indicating activity of the groundwater organisms. SIsCA revealed that groundwater microorganisms fell into five distinct carbon assimilation strategies. Only one of these strategies, comprising less than 3.5% of the community, consisted of obligate autotrophs (Thiobacillus), with a 13C incorporation of approximately 95%. Instead, mixotrophic growth was the most successful strategy, and was represented by 12 of the 15 MAGs expressing pathways for autotrophic CO2 fixation, including Hydrogenophaga, Polaromonas and Dechloromonas, with varying 13C incorporation between 5% and 90%. Within 21 days, 43% of carbon in the community was replaced by 13C, increasing to 80% after 70 days. Of the 31 most abundant MAGs, 16 expressed pathways for sulfur oxidation, including strict heterotrophs. We concluded that chemolithoautotrophy drives the recycling of organic carbon and serves as a fill-up function in the groundwater. Mixotrophs preferred the uptake of organic carbon over the fixation of CO2, and heterotrophs oxidize inorganic compounds to preserve organic carbon. 
Our study showcases how next-generation physiology approaches like SIsCA can move beyond metagenomics studies by providing information about the expression of metabolic pathways and elucidating the role of MAGs in ecosystem functioning.
Project description:The ecophysiology of complete ammonia oxidizing Nitrospira (CMX) and their widespread occurrence in groundwater suggests that CMX bacteria have a competitive advantage over ammonia-oxidizing bacteria (AOB) and archaea (AOA) in these environments. However, the relevance of their activity from the ecosystem-level process perspective has remained unclear. We investigated oligotrophic carbonate rock aquifers as a model system to assess the contribution of CMX, AOA and AOB to nitrification and to identify the environmental drivers of their niche differentiation at different levels of ammonium and oxygen. CMX accounted for up to 95% of the ammonia oxidizer communities. Nitrification rates were positively correlated to CMX clade A-associated phylotypes and AOB affiliated with Nitrosomonas ureae. Surprisingly, short-term incubations amended with the nitrification inhibitors allylthiourea and chlorate suggested that AOB contributed more than 90% to overall ammonia oxidation, while metaproteomics analysis confirmed an active role of CMX in both ammonia and nitrite oxidation. Ecophysiological niche differentiation of CMX clades A and B, AOA and AOB was linked to their requirements for ammonium, oxygen tolerance, and metabolic versatility. Our results demonstrate that despite numerical predominance of CMX, the first step of nitrification in oligotrophic groundwater is primarily governed by AOB. Higher growth yields at lower NH4+ turnover rates and energy derived from nitrite oxidation most likely enable CMX to maintain consistently high populations. Activity measurements combined with differential inhibition allowed a refined understanding of ammonia oxidizer coexistence, competition and cooperation beyond the insights from molecular data alone.
Project description:Gene expression microarrays were performed to investigate the molecular effects of exposure to environmentally polluted groundwater. Zebrafish were treated with polluted waters collected from dumps located upstream and downstream of a sanitary landfill. Gene expression profiling of zebrafish liver was performed after acute exposure to the sampled waters.
Project description:Gene expression microarrays were performed to investigate the molecular effects of exposure to environmentally polluted groundwater. Mice were treated with polluted waters collected from dumps located upstream and downstream of a sanitary landfill. Gene expression profiling of mouse liver was performed after acute and chronic exposure to the sampled waters.
Project description:Describing the microbial community within the tumour has been a key aspect in understanding the pathophysiology of the tumour microenvironment. In head and neck cancer (HNC), most studies on tissue samples have only performed 16S rRNA short-read sequencing (SRS) on the V3-V5 region. SRS is mostly limited to genus-level identification. In this study, we compared full-length 16S rRNA long-read sequencing (FL-ONT) from Oxford Nanopore Technologies (ONT) to V3-V4 Illumina SRS (V3V4-Illumina) in 26 HNC tumour tissues. Further validation was performed with culture-based methods on 16 bacterial isolates obtained from 4 patients and identified using MALDI-TOF MS. We observed similar alpha diversity indexes between FL-ONT and V3V4-Illumina. However, beta diversity was significantly different between techniques (PERMANOVA - R2 = 0.131, p < 0.0001). At higher taxonomic levels (Phylum to Family), all metrics were more similar between sequencing techniques, while lower taxonomic levels displayed more discrepancies. Likewise, the correlation in relative abundance between FL-ONT and V3V4-Illumina was higher at higher taxonomic levels and decreased at lower levels. Finally, FL-ONT identified more of the isolates identified by MALDI-TOF MS at the species level than V3V4-Illumina did (75% vs. 18.8%). FL-ONT was able to identify lower taxonomic levels at a better resolution than V3V4-Illumina 16S rRNA sequencing.
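The alpha and beta diversity comparison described above can be illustrated with a minimal Python sketch. It assumes the Shannon index as the alpha diversity measure and Bray-Curtis dissimilarity as the beta diversity measure; the abstract does not state which indexes or distances the study used, and the taxon names and counts below are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity (natural log) of one sample's taxon counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    # Taxa with zero reads contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two {taxon: count} profiles."""
    taxa = set(sample_a) | set(sample_b)
    shared = sum(min(sample_a.get(t, 0), sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1 - 2 * shared / total


# Hypothetical genus-level profiles of the same tissue from two platforms.
long_read = {"Fusobacterium": 40, "Prevotella": 35, "Streptococcus": 25}
short_read = {"Fusobacterium": 55, "Prevotella": 30, "Neisseria": 15}

print(shannon_index(long_read.values()), shannon_index(short_read.values()))
print(bray_curtis(long_read, short_read))
```

Two profiles can have nearly equal Shannon values while still being far apart in Bray-Curtis terms, which is one way similar alpha diversity can coexist with significantly different beta diversity. In practice the counts would usually be rarefied or converted to relative abundances before the distance is computed.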
Project description:Microbial amplicon sequencing studies are an important tool in biological and biomedical research. Widespread 16S rRNA gene microbial surveys have shed light on the structure of many ecosystems inhabited by bacteria, including the human body. However, specialized software and algorithms are needed to convert raw sequencing data into biologically meaningful information (i.e. tables of bacterial counts). While different bioinformatic pipelines are available in a rapidly changing and improving field, users are often unaware of the limitations and biases associated with individual pipelines, and there is a lack of agreement regarding best practices. Here, we compared six bioinformatic pipelines for the analysis of amplicon sequence data: three OTU-level flows (QIIME-uclust, MOTHUR, and USEARCH-UPARSE) and three ASV-level flows (DADA2, Qiime2-Deblur, and USEARCH-UNOISE3). We tested workflows with different quality control options, clustering algorithms, and cutoff parameters on a mock community as well as on a large (N = 2170) recently published fecal sample dataset from the multi-ethnic HELIUS study. We assessed the sensitivity, specificity, and degree of consensus of the different outputs. DADA2 offered the best sensitivity, at the expense of decreased specificity compared to USEARCH-UNOISE3 and Qiime2-Deblur. USEARCH-UNOISE3 showed the best balance between resolution and specificity. OTU-level USEARCH-UPARSE and MOTHUR performed well, but with lower specificity than the ASV-level pipelines. QIIME-uclust produced a large number of spurious OTUs as well as inflated alpha-diversity measures and should be avoided in future studies. This study provides guidance for researchers using amplicon sequencing to gain biological insights.
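A mock community makes this kind of benchmark possible because its true composition is known. The sketch below is a minimal Python illustration, not the study's scoring procedure: it reports sensitivity as the fraction of expected taxa recovered and uses precision (the fraction of reported taxa that are truly present) as a stand-in for specificity, since the abstract does not say how specificity was computed. The function name and taxon labels are hypothetical.

```python
def score_against_mock(expected_taxa, detected_taxa):
    """Compare a pipeline's detected taxa with a mock community's known members."""
    expected = set(expected_taxa)
    detected = set(detected_taxa)
    true_pos = expected & detected   # real members that were recovered
    false_neg = expected - detected  # real members that were missed
    false_pos = detected - expected  # spurious taxa absent from the mock
    return {
        "sensitivity": len(true_pos) / len(expected) if expected else float("nan"),
        "precision": len(true_pos) / len(detected) if detected else float("nan"),
        "missed": sorted(false_neg),
        "spurious": sorted(false_pos),
    }


# Hypothetical mock community and pipeline output.
mock = ["Bacillus", "Escherichia", "Lactobacillus", "Staphylococcus"]
pipeline_output = ["Bacillus", "Escherichia", "Shigella"]

print(score_against_mock(mock, pipeline_output))
```

A pipeline that reports many spurious taxa keeps its sensitivity but loses precision, which is the trade-off the abstract describes between DADA2 and the more conservative ASV-level pipelines.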
Project description:Short-amplicon 16S rRNA gene sequencing is currently the method of choice for studies investigating microbiomes. However, comparative studies on differences in procedures are scarce. We sequenced human stool samples and mock communities with increasing complexity using a variety of commonly used protocols. Short amplicons targeting different variable regions (V-regions) or ranges thereof (V1-V2, V1-V3, V3-V4, V4, V4-V5, V6-V8, and V7-V9) were investigated for differences in the composition outcome due to primer choices. Next, the influence of clustering (operational taxonomic units [OTUs], zero-radius OTUs [zOTUs], and amplicon sequence variants [ASVs]), different databases (GreenGenes, the Ribosomal Database Project, Silva, the genomic-based 16S rRNA Database, and The All-Species Living Tree), and bioinformatic settings on taxonomic assignment were also investigated. We present a systematic comparison across all typically used V-regions using well-established primers. While it is known that the primer choice has a significant influence on the resulting microbial composition, we show that microbial profiles generated using different primer pairs need independent validation of performance. Further, comparing data sets across V-regions using different databases might be misleading due to differences in nomenclature (e.g., Enterorhabdus versus Adlercreutzia) and varying precisions in classification down to genus level. Overall, specific but important taxa are not picked up by certain primer pairs (e.g., Bacteroidetes is missed using primers 515F-944R) or due to the database used (e.g., Acetatifactor in GreenGenes and the genomic-based 16S rRNA Database). We found that appropriate truncation of amplicons is essential and different truncated-length combinations should be tested for each study. 
Finally, specific mock communities of sufficient and adequate complexity are highly recommended. IMPORTANCE: In 16S rRNA gene sequencing, certain bacterial genera were found to be underrepresented or even missing in taxonomic profiles when using unsuitable primer combinations, outdated reference databases, or inadequate pipeline settings. Concerning the last, quality thresholds as well as bioinformatic settings (i.e., clustering approach, analysis pipeline, and specific adjustments such as truncation) are responsible for a number of observed differences between studies. Conclusions drawn by comparing one data set to another (e.g., between publications) appear to be problematic and require independent cross-validation using matching V-regions and uniform data processing. Therefore, we highlight the importance of a thought-out study design including sufficiently complex mock standards and appropriate V-region choice for the sample of interest. The use of processing pipelines and parameters must be tested beforehand.
Project description:Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, and sequencing technologies, as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6-V8, and V7-V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher-quality data and enable the use of more aggressive quality control parameters than 454, resulting in a higher retention rate of high-quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. Beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.
Project description:Background: In the last few years, 16S rRNA gene sequencing (16S rDNA-seq) has seen a surprisingly rapid increase in adoption as a methodology for microbial community studies. Despite the considerable popularity of this technique, very few specific tools are currently available for proper 16S rDNA-seq count data preprocessing and simulation. Indeed, the great majority of tools have been developed by adapting methodologies previously used for bulk RNA-seq data, with poor assessment of their applicability in the metagenomics field. For such tools and the few specifically developed for 16S rDNA-seq data, performance assessment is challenging, mainly due to the complex nature of the data and the lack of realistic simulation models. In fact, to the best of our knowledge, no software designed for data simulation is available to directly obtain synthetic 16S rDNA-seq count tables that properly model the heavy sparsity and compositionality typical of these data. Results: In this paper we present metaSPARSim, a sparse count matrix simulator intended for use in the development of 16S rDNA-seq metagenomic data processing pipelines. metaSPARSim implements a new generative process that models the sequencing process with a Multivariate Hypergeometric distribution in order to realistically simulate 16S rDNA-seq count tables, resembling the compositionality and sparsity of real experimental data. It provides ready-to-use count matrices and comes with the possibility to reproduce different pre-coded scenarios and to estimate simulation parameters from real experimental data. The tool is made available at http://sysbiobig.dei.unipd.it/?q=Software#metaSPARSim and https://gitlab.com/sysbiobig/metasparsim. Conclusion: metaSPARSim is able to generate count matrices resembling real 16S rDNA-seq data.
The availability of count data simulators is extremely valuable both for methods developers, for whom a ground truth for tool validation is needed, and for users who want to assess state-of-the-art analysis tools in order to choose the most accurate one. Thus, we believe that metaSPARSim is a valuable tool for researchers involved in developing, testing and using robust and reliable data analysis methods in the context of 16S rRNA gene sequencing.
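The idea of modelling sequencing as a Multivariate Hypergeometric draw can be illustrated with a short Python sketch. This is not metaSPARSim's code or API; it only shows the sampling step under the assumption that a sample is a finite pool of 16S molecules and that sequencing draws a fixed number of reads from that pool without replacement. The function name, taxon labels and abundances are hypothetical.

```python
import random
from collections import Counter


def simulate_counts(true_abundance, depth, seed=None):
    """Draw `depth` reads without replacement from a finite pool of molecules.

    Sampling without replacement from a pool grouped by taxon yields counts
    that follow a multivariate hypergeometric distribution.
    """
    rng = random.Random(seed)
    # Expand the abundance table into one entry per molecule in the pool.
    pool = [taxon for taxon, n in true_abundance.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sequencing depth exceeds the number of molecules")
    drawn = Counter(rng.sample(pool, depth))
    # Report every taxon, including those that received zero reads.
    return {taxon: drawn.get(taxon, 0) for taxon in true_abundance}


# Hypothetical community with a few dominant taxa and several rare ones.
community = {"taxon_A": 5000, "taxon_B": 3000, "taxon_C": 40,
             "taxon_D": 5, "taxon_E": 2, "taxon_F": 1}

print(simulate_counts(community, depth=200, seed=1))
```

At shallow depth the rare taxa frequently receive zero reads, so the simulated table is sparse, and because the total is fixed at `depth` the counts are compositional. For realistic pool sizes, expanding the pool into a list is wasteful; NumPy's `Generator.multivariate_hypergeometric` draws the same kind of sample directly from the abundance vector.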