Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data.
ABSTRACT: Background: Reduced representation genomic datasets are increasingly available from a variety of organisms. These datasets do not target specific genes, and so may contain sequences from parasites and other organisms present in the target tissue sample. In this paper, we demonstrate that (1) RADseq datasets can be used for exploratory analysis of tissue-specific metagenomes, and (2) tissue collections house complete metagenomic communities, which can be investigated and quantified by a variety of techniques. Methods: We present an exploratory method for mining metagenomic "bycatch" sequences from a range of host tissue types. We use a combination of the pyRAD assembly pipeline, NCBI's blastn software, and custom R scripts to isolate metagenomic sequences from RADseq-type datasets. Results: When we focus on sequences that align with existing references in NCBI's GenBank, we find that between three and five percent of identifiable double-digest restriction site associated DNA (ddRAD) sequences from host tissue samples are from phyla that contain known blood parasites. In addition to tissue samples, we examine ddRAD sequences from metagenomic DNA extracted from snake and lizard hind-gut samples. We find that the sequences recovered from these samples match expected bacterial and eukaryotic gut microbiome phyla. Discussion: Our results suggest (1) that museum tissue banks originally collected for host DNA archiving are also preserving valuable parasite and microbiome communities, (2) that publicly available RADseq datasets may include metagenomic sequences that could be explored, and (3) that restriction site approaches are a useful exploratory technique to identify microbiome lineages that could be missed by primer-based approaches.
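The abstract describes assigning blastn hits to phyla and reporting the share of identifiable sequences that fall in each phylum. The Python sketch below illustrates that tallying step only; it is not the authors' R code. It assumes blastn results in the standard 12-column tabular format (`-outfmt 6`), and the function names, the `subject_to_phylum` lookup table, and the toy read and subject IDs are all hypothetical placeholders, not data from the study.

```python
from collections import Counter


def best_hits(blast_lines):
    """Keep the highest-bitscore subject per query from BLAST outfmt 6 lines."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        # outfmt 6 columns: qseqid, sseqid, ..., evalue, bitscore (12th column)
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def phylum_fractions(hits, subject_to_phylum):
    """Fraction of identifiable reads (reads with a mapped best hit) per phylum."""
    counts = Counter(
        subject_to_phylum[subject]
        for subject in hits.values()
        if subject in subject_to_phylum  # reads without a phylum are not counted
    )
    total = sum(counts.values())
    return {phylum: n / total for phylum, n in counts.items()} if total else {}


# Toy input: invented IDs and scores, for illustration only.
toy_lines = [
    "read1\tsubjA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-40\t180",
    "read1\tsubjB\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-20\t120",
    "read2\tsubjB\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-30\t150",
    "read3\tsubjA\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-45\t190",
    "read4\tsubjA\t97.0\t100\t3\t0\t1\t100\t1\t100\t1e-38\t175",
]
toy_taxonomy = {"subjA": "Chordata", "subjB": "Apicomplexa"}  # hypothetical lookup

toy_hits = best_hits(toy_lines)
toy_fractions = phylum_fractions(toy_hits, toy_taxonomy)
```

Taking only the single best hit per read is a simplifying choice made here for brevity; a real analysis would likely also filter on e-value or alignment length before counting.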
SUBMITTER: Holmes I
PROVIDER: S-EPMC5907781 | biostudies-literature | 2018
REPOSITORIES: biostudies-literature