Dataset Information

Factors that affect large subunit ribosomal DNA amplicon sequencing studies of fungal communities: classification method, primer choice, and error.

ABSTRACT: Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and, to an increasing extent, in amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, make primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: 1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); 2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and 3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross-validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and NBC ran significantly faster than the other tested methods. All methods performed poorly with the shortest 50-100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short-read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys.
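
The leave-one-out evaluation with simulated sequence error mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy sequences and genus names are invented, the shared k-mer classifier is a simple stand-in and is not BLAST + MEGAN, NBC, or SAP, and the function names are hypothetical rather than taken from the study.

```python
import random

# Toy reference set: (genus, sequence). Invented data, not from the study.
REFERENCES = [
    ("GenusA", "ACGTACGTACGTACGTACGT"),
    ("GenusA", "ACGTACGTACGTACGTACGA"),
    ("GenusB", "TTGGCCAATTGGCCAATTGG"),
    ("GenusB", "TTGGCCAATTGGCCAATTGC"),
]

def kmers(seq, k=4):
    """Return the set of overlapping k-mers (words) in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def classify(query, references, k=4):
    """Assign the genus of the reference sharing the most k-mers with the query."""
    query_words = kmers(query, k)
    best = max(references, key=lambda ref: len(query_words & kmers(ref[1], k)))
    return best[0]

def add_errors(seq, rate, rng):
    """Replace each base with a different base with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

def leave_one_out_error(references, error_rate=0.0, seed=0, k=4):
    """Hold out each sequence in turn, classify it against the rest,
    and return the fraction assigned to the wrong genus."""
    rng = random.Random(seed)
    wrong = 0
    for i, (genus, seq) in enumerate(references):
        training = references[:i] + references[i + 1:]
        query = add_errors(seq, error_rate, rng)
        if classify(query, training, k) != genus:
            wrong += 1
    return wrong / len(references)

if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.3):
        print(rate, leave_one_out_error(REFERENCES, error_rate=rate))
```

Note that in a leave-one-out design a genus represented by a single reference sequence can never be classified correctly, because its only representative is the one held out.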

SUBMITTER: Porter TM

PROVIDER: S-EPMC3338786 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

Factors that affect large subunit ribosomal DNA amplicon sequencing studies of fungal communities: classification method, primer choice, and error.

Teresita M. Porter, G. Brian Golding

PLoS ONE, 2012-04-27, issue 4

PMID: 22558215

Similar Datasets

Project description: The evolution of sequencing technology and multiplexing has rapidly expanded our ability to characterize fungal diversity in the environment. However, obtaining an unbiased assessment of the fungal community using ribosomal markers remains challenging. Longer amplicons were shown to improve taxonomic resolution and resolve ambiguities by reducing the risk of spurious operational taxonomic units. We examined the implications of barcoding strategies by amplifying and sequencing two ribosomal DNA fragments. We analyzed the performance of the full internal transcribed spacer (ITS) and of a longer fragment that also includes part of the 28S ribosomal subunit, replicated on 60 grapevine trunk core samples. Grapevine trunks harbor highly diverse fungal communities with implications for disease development. Using identical handling, amplification, and sequencing procedures, we obtained higher sequencing depths for the shorter ITS amplicon. Despite the more limited access to polymorphism, the overall diversity in amplified sequence variants was higher for the shorter ITS amplicon. We detected no meaningful bias in the phylogenetic composition due to the amplicon choice across analyzed samples. Despite the increased resolution of the longer ITS-28S amplicon, the higher and more consistent yields of the shorter amplicons produced a clearer resolution of the fungal community of grapevine stem samples. Our study highlights that the choice of ribosomal amplicons should be carefully evaluated and adjusted according to specific goals. IMPORTANCE: Surveying fungal communities is key to our understanding of the ecological functions of diverse habitats. Fungal communities can provide information about the resilience of agricultural ecosystems, risks to human health, and impacts of pathogens. Community compositions are typically analyzed using ribosomal DNA sequences. Due to technical limitations, most fungal community surveys were based on amplifying a short but highly variable fragment. Advances in sequencing technology enabled the use of longer fragments that can address some limitations of species identification. In this study, we examined the implications of choosing either a short or long ribosomal sequence fragment by replicating the analyses on 60 grapevine wood core samples. Using highly accurate long-read sequencing, we found that the shorter fragment produced substantially higher yields. The shorter fragment also revealed more sequence and species diversity. Our study highlights that the choice of ribosomal amplicons should be carefully evaluated and adjusted according to specific goals.

Project description: Recent studies highlight the importance of intestinal fungal microbiota in the development of human disease. Infants, in particular, are an important population in which to study intestinal microbiomes because microbial community structure and dynamics during this formative window of life have the potential to influence host immunity and metabolism. When compared to bacteria, much less is known about the early development of human fungal communities, owing partly to their lower abundance and the relative lack of established molecular and taxonomic tools for their study. Herein, we describe the development, validation, and use of complementary amplicon-based genomic strategies to characterize infant fungal communities and provide quantitative information about Candida, an important fungal genus with respect to intestinal colonization and human disease. Fungal communities were characterized from 11 infant fecal samples using primers that target the internal transcribed spacer (ITS) 2 locus, a region that provides taxonomic discrimination of medically relevant fungi. Each sample yielded an average of 27,553 fungal sequences, and Candida albicans was the most abundant species identified by sequencing and quantitative PCR (qPCR). Low numbers of Candida krusei and Candida parapsilosis sequences were observed in several samples, but their presence was detected by species-specific qPCR in only one sample, highlighting a challenge inherent in the study of low-abundance organisms. Overall, the sequencing results revealed that infant fecal samples had fungal diversity comparable to that of bacterial communities in similar-aged infants, which correlated with the relative abundance of C. albicans. We conclude that targeted sequencing of fungal ITS2 amplicons in conjunction with qPCR analyses of specific fungi provides an informative picture of fungal community structure in the human intestinal tract. Our data suggest that the infant intestine harbors diverse fungal species and are consistent with prior culture-based analyses showing that the predominant fungus in the infant intestine is C. albicans.
