Dataset Information

Expansion of known ssRNA phage genomes: From tens to over a thousand.


ABSTRACT: The first sequenced genome was that of the 3569-nucleotide single-stranded RNA (ssRNA) bacteriophage MS2. Despite the recent accumulation of vast amounts of DNA and RNA sequence data, only 12 representative ssRNA phage genome sequences are available from the NCBI Genome database (June 2019). The difficulty in detecting RNA phages in metagenomic datasets raises questions as to their abundance, taxonomic structure, and ecological importance. In this study, we iteratively applied profile hidden Markov models to detect conserved ssRNA phage proteins in 82 publicly available metatranscriptomic datasets generated from activated sludge and aquatic environments. We identified 15,611 nonredundant ssRNA phage sequences, including 1015 near-complete genomes. This expansion in the number of known sequences enabled us to complete a phylogenetic assessment of both sequences identified in this study and known ssRNA phage genomes. Our expansion of these viruses from two environments suggests that they have been overlooked within microbiome studies.

SUBMITTER: Callanan J

PROVIDER: S-EPMC7007245 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC9753621 | biostudies-literature
S-EPMC7568308 | biostudies-literature
S-EPMC5100052 | biostudies-literature
PRJEB75610 | ENA
S-EPMC7851401 | biostudies-literature
S-EPMC9104289 | biostudies-literature
S-EPMC3515743 | biostudies-literature
S-EPMC5604317 | biostudies-literature
S-EPMC6718518 | biostudies-literature