Project description: Current developments have led many countries to reconsider their energy policy with the aim of increasing the share of renewable energies in the energy supply, in which the anaerobic digestion (AD) of biomass to produce methane also plays an important role. To improve biomass digestion while ensuring overall process stability, microbiome-based management strategies have become more important. By combining metagenome, metaproteome and metagenome-assembled genome (MAG)-centric analyses, it is possible to determine not only the functional potential but also the expressed functions of the entire microbial community and of individual MAGs. This approach was applied in this study to the production-scale biogas plant 35 (BP35), which consists of three digesters that were operated differently with regard to process temperatures, feedstocks and other process parameters. Different process conditions were hypothesized to result in specific microbiome adaptations and differentially abundant metabolic functions in the digesters. Based on metagenomic single-read analyses, several taxa residing in the three digesters of BP35 were shown to correlate with the corresponding substrates and temperatures. In particular, the genus Defluviitoga showed the strongest correlation with the process temperature, and the genus Acetomicrobium showed a direct correlation with the concentrations of different acids, including acetic acid. Analysis of the functional potential and expressed functions of the entire microbial community of the three digesters revealed that the genes and key enzymes relevant for the biogas process were present and also expressed. Differences between the abundances of certain genes and expressed enzymes could be related to the specific parameters of the corresponding digesters. 
Regarding the biogas-related metabolic pathways, MAG-centric metagenomics and metaproteomics indicated the functional potential and the actually expressed metabolic functions of certain MAGs that are differentially abundant in the three digesters. These MAGs, belonging to the phylum Firmicutes, the class Bacilli and the orders Caldicoprobacterales and Bacteroidales, showed specific metabolic activity within the three digesters and have important roles in the hydrolysis, acidogenesis or acetogenesis steps of the anaerobic digestion process. An archaeal MAG assigned to the species Methanothermobacter wolfeii was the most abundant and a highly active hydrogenotrophic methanogen in digester 3, which had an operating temperature of 54 °C. Besides the MAGs that were differentially abundant in the three digesters, MAGs that were more evenly distributed were also analyzed. The most abundant and highly active MAG belongs to the class Limnochordia; it was shown to be ubiquitous in all three digesters and to exhibit activity in a variety of pathways representing hydrolysis as well as the acidogenesis and acetogenesis steps of the biogas process. Other MAGs assigned to the phylum Firmicutes, the genus Acetomicrobium and the hydrogenotrophic species Methanoculleus thermohydrogenotrophicum were also shown to be more evenly distributed and active in the three digesters. The corresponding taxa appeared to be more resilient to the different process parameters of the three digesters and may therefore support a stable biogas process. Overall, the combined metagenome and metaproteome analysis of biogas digesters helps to gain deeper insights into the composition of the whole microbial community, the biogas-related pathways and their expression, which could contribute to improved microbiome-based process management in the future.
Project description: The recovery of metagenome-assembled genomes (MAGs) from metagenomic data has recently become a common task for microbial studies. The strengths and limitations of the underlying bioinformatics algorithms are well appreciated by now based on performance tests with mock data sets of known composition. However, these mock data sets do not capture the complexity and diversity often observed within natural populations, since their construction typically relies on only a single genome of a given organism. Further, it remains unclear if MAGs can recover population-variable genes (those shared by >10% but <90% of the members of the population) as efficiently as core genes (those shared by >90% of the members). To address these issues, we compared the gene variabilities of pathogenic Escherichia coli isolates from eight diarrheal samples, for which the isolate was the causative agent, against their corresponding MAGs recovered from the companion metagenomic data set. Our analysis revealed that MAGs with completeness estimates near 95% captured only 77% of the population core genes and 50% of the variable genes, on average. Further, about 5% of the genes of these MAGs were conservatively identified as missing in the isolate and were of different (non-Enterobacteriaceae) taxonomic origin, suggesting errors at the genome-binning step, even though contamination estimates based on commonly used pipelines were only 1.5%. Therefore, the quality of MAGs may often be worse than estimated, and we offer examples of how to recognize and improve such MAGs to sufficient quality by (for instance) employing only contigs longer than 1,000 bp for binning. IMPORTANCE: Metagenome assembly and the recovery of metagenome-assembled genomes (MAGs) have recently become common tasks for microbiome studies across environmental and clinical settings. However, the extent to which MAGs can capture the genes of the population they represent remains speculative. 
Current approaches to evaluating MAG quality are limited to the recovery and copy number of universal housekeeping genes, which represent a small fraction of the total genome, leaving the majority of the genome essentially inaccessible. If MAG quality in reality is lower than these approaches would estimate, this could have dramatic consequences for all downstream analyses and interpretations. In this study, we evaluated this issue using an approach that employed comparisons of the gene contents of MAGs to the gene contents of isolate genomes derived from the same sample. Further, our samples originated from a diarrhea case-control study, and thus, our results are relevant for recovering the virulence factors of pathogens from metagenomic data sets.
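The core and variable gene definitions used in this abstract (core: shared by >90% of population members; variable: shared by >10% but <90%) can be illustrated with a minimal Python sketch. The function names, the toy presence/absence data and the "other" label for genes outside both ranges (including exact boundary values, which the abstract does not address) are assumptions made for illustration only; this is not the authors' pipeline, and the numbers below are invented rather than taken from the study.

```python
from collections import Counter


def classify_genes(presence, n_genomes):
    """Label each gene by the fraction of isolate genomes that carry it.

    presence: dict mapping gene id -> set of genome ids carrying the gene.
    Thresholds follow the abstract: core > 90%, variable > 10% and < 90%.
    Anything else (rare genes, exact boundary values) is labeled "other",
    which is an assumption of this sketch.
    """
    classes = {}
    for gene, genomes in presence.items():
        frac = len(genomes) / n_genomes
        if frac > 0.9:
            classes[gene] = "core"
        elif 0.1 < frac < 0.9:
            classes[gene] = "variable"
        else:
            classes[gene] = "other"
    return classes


def recovery_by_class(classes, mag_genes):
    """Fraction of genes in each class that are also present in the MAG."""
    totals = Counter(classes.values())
    found = Counter(cls for gene, cls in classes.items() if gene in mag_genes)
    return {cls: found[cls] / totals[cls] for cls in totals}


# Toy data (invented): 20 hypothetical isolate genomes g0..g19.
genomes = [f"g{i}" for i in range(20)]
presence = {
    "geneA": set(genomes),       # 20/20 genomes
    "geneB": set(genomes[:19]),  # 19/20 genomes
    "geneC": set(genomes[:10]),  # 10/20 genomes
    "geneD": set(genomes[:4]),   # 4/20 genomes
    "geneE": set(genomes[:1]),   # 1/20 genomes
}
mag_genes = {"geneA", "geneC"}   # genes detected in a hypothetical MAG

gene_classes = classify_genes(presence, len(genomes))
recovery = recovery_by_class(gene_classes, mag_genes)
print(gene_classes)
print(recovery)
```

In practice the presence/absence table would come from a gene-clustering or pangenome step applied to the isolate genomes, and MAG gene membership from a sequence search against those clusters; both steps are outside the scope of this sketch.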
Project description: Motivation: Metagenomics is a powerful tool for assaying the DNA from every genome present in an environment. Recent advances in bioinformatics have enabled the rapid assembly of near-complete metagenome-assembled genomes (MAGs), and there is a need for reproducible pipelines that can annotate and characterize thousands of genomes simultaneously, to enable identification and functional characterization. Results: Here we present MAGpy, a scalable and reproducible pipeline that takes multiple genome assemblies as FASTA and compares them to several public databases, checks quality, suggests a taxonomy and draws a phylogenetic tree. Availability and implementation: MAGpy is available on GitHub: https://github.com/WatsonLab/MAGpy. Supplementary information: Supplementary data are available at Bioinformatics online.