
Dataset Information


Improved inference of taxonomic richness from environmental DNA.


ABSTRACT: Accurate estimation of biological diversity in environmental DNA samples using high-throughput amplicon pyrosequencing must account for errors generated by PCR and sequencing. We describe a novel approach to distinguishing the underlying sequence diversity in environmental DNA samples from errors. The approach uses information on the abundance distribution of similar sequences across independent samples, as well as the frequency and diversity of sequences within individual samples. We have further refined this approach into a bioinformatics pipeline, the Amplicon Pyrosequence Denoising Program (APDP), which processes raw sequence datasets into a set of validated sequences in formats compatible with commonly used downstream analysis packages. We demonstrate, by sequencing complex environmental samples and mock communities, that APDP is effective for removing errors from deeply sequenced datasets comprising biological and technical replicates, and can efficiently denoise single-sample datasets. APDP provides more conservative diversity estimates for complex datasets than other approaches; however, for some applications this may provide a more accurate and appropriate level of resolution, and result in greater confidence that returned sequences reflect the diversity of the underlying sample.

SUBMITTER: Morgan MJ 

PROVIDER: S-EPMC3753314 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature


Publications

Improved inference of taxonomic richness from environmental DNA.

Morgan MJ, Chariton AA, Hartley DM, Court LN, Hardy CM

PLoS ONE, 26 August 2013, volume 8



Similar Datasets

S-EPMC4972244 | biostudies-literature
S-EPMC5719527 | biostudies-literature
S-EPMC6500138 | biostudies-literature
S-EPMC4760164 | biostudies-literature
S-EPMC5366867 | biostudies-literature
S-EPMC5632632 | biostudies-literature
S-EPMC3511277 | biostudies-literature
S-EPMC6989699 | biostudies-literature