Dataset Information

Synthetic Sequencing Standards: A Guide to Database Choice for Rumen Microbiota Amplicon Sequencing Analysis.

ABSTRACT: Our understanding of complex microbial communities, such as those residing in the rumen, has advanced drastically through the use of high-throughput sequencing (HTS) technologies. Indeed, with the use of barcoded amplicon sequencing, it is now cost-effective and computationally feasible to identify individual rumen microbial genera associated with ruminant livestock nutrition, genetics, performance and greenhouse gas production. However, across all disciplines of microbial ecology, there is currently little reporting of the use of internal controls for validating HTS results. Furthermore, there is little consensus on the most appropriate reference database for analyzing rumen microbiota amplicon sequencing data. Therefore, in this study, a synthetic rumen-specific sequencing standard was used to assess the effects of database choice on results obtained from rumen microbial amplicon sequencing. Four DADA2 reference training sets (RDP, SILVA, GTDB, and RefSeq + RDP) were compared to assess their ability to correctly classify sequences included in the rumen-specific sequencing standard. In addition, two thresholds of phylogenetic bootstrapping, 50 and 80, were applied to investigate the effect of increasing stringency. Sequence classification differences were apparent amongst the databases. For example, the classification of Clostridium differed between all databases, highlighting the need for a consistent approach to nomenclature amongst different reference databases. It is hoped that the effect of database on taxonomic classification observed in this study will encourage research groups across various microbial disciplines to develop and routinely use their own microbiome-specific reference standard to validate analysis pipelines and database choice.
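
The comparison described in the abstract (several reference databases, two bootstrap thresholds, one standard of known composition) can be sketched as a small scoring routine. This is only an illustration under stated assumptions, not the authors' pipeline: the sequence IDs, database names, genus calls and bootstrap values below are all hypothetical, and the sketch assumes that a call whose bootstrap support falls below the threshold is reported as unclassified.

```python
from collections import Counter

# Hypothetical expected genus for each sequence in a synthetic standard.
EXPECTED = {
    "seq1": "Prevotella",
    "seq2": "Ruminococcus",
    "seq3": "Clostridium",
}

# Hypothetical classifier output:
# database name -> sequence ID -> (assigned genus, bootstrap support 0-100).
ASSIGNMENTS = {
    "db_A": {
        "seq1": ("Prevotella", 98),
        "seq2": ("Ruminococcus", 72),
        "seq3": ("Clostridium", 55),
    },
    "db_B": {
        "seq1": ("Prevotella", 91),
        "seq2": ("Ruminococcus", 85),
        "seq3": ("Clostridium_sensu_stricto", 88),
    },
}


def score(calls, expected, min_boot):
    """Count correct, incorrect and unclassified calls at one threshold.

    Assumption of this sketch: support below min_boot means the
    sequence is left unclassified at genus level.
    """
    counts = Counter()
    for seq_id, true_genus in expected.items():
        genus, boot = calls[seq_id]
        if boot < min_boot:
            counts["unclassified"] += 1
        elif genus == true_genus:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts


def compare(assignments, expected, thresholds=(50, 80)):
    """Score every database at every bootstrap threshold."""
    return {
        (db, t): score(calls, expected, t)
        for db, calls in assignments.items()
        for t in thresholds
    }


if __name__ == "__main__":
    for (db, t), counts in sorted(compare(ASSIGNMENTS, EXPECTED).items()):
        print(db, t, dict(counts))
```

In this toy data, raising the threshold turns low-support calls into unclassified ones, while a naming difference between databases (the hypothetical `Clostridium_sensu_stricto` label) is scored as incorrect regardless of threshold. That is the kind of nomenclature mismatch a known standard makes visible.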

SUBMITTER: Smith PE

PROVIDER: S-EPMC7752867 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature


Publications

Synthetic Sequencing Standards: A Guide to Database Choice for Rumen Microbiota Amplicon Sequencing Analysis.

Smith PE, Waters SM, Gómez Expósito R, Smidt H, Carberry CA, McCabe MS

Frontiers in Microbiology, 2020-12-08

PMID: 33363527

Similar Datasets

Project description: The chicken microbiota is often analyzed to address questions about the effects of diet or disease on poultry health. To analyze the microbiota, bioinformatic platforms such as QIIME 2 and mothur are used, which incorporate public taxonomic databases such as Greengenes, the Ribosomal Database Project (RDP), and SILVA to assign taxonomies to bacterial sequences. Many chicken microbiota studies continue to incorporate the Greengenes database, which has not been updated since 2013. To determine whether the choice of database could affect results, this study compared the results of bioinformatic analyses obtained using the Greengenes, RDP, and SILVA databases on a cecal luminal microbiome dataset. The QIIME 2 platform was used to process 16S bacterial sequences and assign taxonomies with Greengenes, RDP, and SILVA. Linear discriminant analysis effect size (LEfSe) was performed, allowing for the comparison of taxonomies considered significantly differentially abundant among the three databases. Some notable differences between databases were observed in the results, in particular the ability of the SILVA database to classify members of the family Lachnospiraceae into separate genera, while these members remained in one group of unclassified Lachnospiraceae with Greengenes and RDP. LEfSe analyses showed that the SILVA database produced more differentially abundant genera, in large part due to the classification of these separate Lachnospiraceae genera. Additionally, the relative abundance of unclassified Lachnospiraceae in SILVA results was significantly lower than in RDP results. Our results show that the choice of taxonomic database can influence the results of a microbiota study at the genus level, potentially affecting the interpretation of the results. The use of the SILVA database is recommended over Greengenes in chicken microbiota studies, as more specific classifications at the genus level may provide more accurate interpretations of changes in the microbiota.

Project description: Meat and seafood spoilage ecosystems harbor extensive bacterial genomic diversity that is mainly found within a small number of species but within a large number of strains with different spoilage metabolic potential. To decipher the intraspecies diversity of such microbiota, traditional metagenetic analysis using the 16S rRNA gene is inadequate. We therefore assessed the potential benefit of an alternative genetic marker, gyrB, which encodes subunit B of DNA gyrase, a type II DNA topoisomerase. A comparison between 16S rDNA-based (V3-V4) amplicon sequencing and gyrB-based amplicon sequencing was carried out in five types of meat and seafood products, with five mock communities serving as quality controls. Our results revealed that bacterial richness in these mock communities and food samples was estimated with higher accuracy using gyrB than using 16S rDNA. However, for Firmicutes species, 35% of putative gyrB reads were actually identified as sequences of a gyrB paralog, parE, which encodes subunit B of topoisomerase IV; we therefore constructed a reference database of published sequences of both gyrB and parE for use in all subsequent analyses. Despite this co-amplification, the deviation between relative sequencing quantification and absolute qPCR quantification was comparable to that observed for 16S rDNA for all the tested species. This confirms that gyrB can be used successfully alongside 16S rDNA to determine the species composition (richness and evenness) of food microbiota. The major benefit of gyrB sequencing is its potential for improving taxonomic assignment and for further investigating OTU richness at the subspecies level, thus allowing more accurate discrimination of samples. Indeed, 80% of the reads of the 16S rDNA dataset were represented by thirteen 16S rDNA-based OTUs that could not be assigned at the species level. Instead, these same clades corresponded to 44 gyrB-based OTUs, which differentiated various lineages down to the subspecies level. The increased ability of gyrB-based analyses to track and trace phylogenetically different groups of strains will generate improved resolution and more reliable results for studies of the strains implicated in food processes.
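
The project description above summarizes species composition in terms of richness and evenness. As a rough illustration of how these quantities are commonly computed from per-OTU read counts, the sketch below uses observed richness, the Shannon index and Pielou's evenness. The source does not say which indices the authors used, so these choices, the function names and the example counts are all assumptions.

```python
import math


def richness(counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S), where S is observed richness.

    J' is undefined when fewer than two OTUs are observed; this sketch
    returns 0.0 in that case as a convention, not as a standard.
    """
    s = richness(counts)
    if s < 2:
        return 0.0
    return shannon(counts) / math.log(s)


if __name__ == "__main__":
    # Hypothetical read counts per OTU for two samples.
    even_sample = [10, 10, 10, 10]
    skewed_sample = [97, 1, 1, 1]
    for name, counts in (("even", even_sample), ("skewed", skewed_sample)):
        print(name, richness(counts), round(pielou_evenness(counts), 3))
```

Both hypothetical samples have the same richness (four OTUs), but the skewed sample scores much lower on evenness, which is why the two measures are reported together.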
