
Dataset Information


The Pacific Ocean virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology.


ABSTRACT: Bacteria and their viruses (phage) are fundamental drivers of many ecosystem processes, including global biogeochemistry and horizontal gene transfer. While databases and resources for studying function in uncultured bacterial communities are relatively advanced, far fewer exist for their viral counterparts. The issue is largely technical in that the majority (often 90%) of viral sequences are functionally 'unknown', making viruses a virtually untapped resource of functional and physiological information. Here, we provide a community resource that organizes this unknown sequence space into 27K high-confidence protein clusters using 32 viral metagenomes from four biogeographic regions in the Pacific Ocean that vary by season, depth, and proximity to land, and include some of the first deep pelagic ocean viral metagenomes. These protein clusters more than double the currently available viral protein clusters, including those from environmental datasets. Further, a protein cluster-guided analysis of functional diversity revealed that richness decreased (i) from deep to surface waters, (ii) from winter to summer, and (iii) with distance from shore in surface waters only. These data provide a framework on which to draw for future metadata-enabled functional inquiries of the vast viral unknown.

SUBMITTER: Hurwitz BL 

PROVIDER: S-EPMC3585363 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature


Publications

The Pacific Ocean virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology.

Hurwitz BL, Sullivan MB

PLoS ONE, 2013-02-28, issue 2



Similar Datasets

| S-EPMC4303639 | biostudies-literature
| S-EPMC1887562 | biostudies-literature
| S-EPMC5221459 | biostudies-literature
| S-EPMC6741140 | biostudies-literature
| S-EPMC3346489 | biostudies-literature
| S-EPMC7236503 | biostudies-literature
| S-EPMC6145878 | biostudies-literature
| S-EPMC4639796 | biostudies-other
| S-EPMC5894113 | biostudies-literature
| S-EPMC3662754 | biostudies-literature