
Dataset Information


Seeker: alignment-free identification of bacteriophage genomes by deep learning.


ABSTRACT: Recent advances in metagenomic sequencing have enabled discovery of diverse, distinct microbes and viruses. Bacteriophages, the most abundant biological entity on Earth, evolve rapidly, and therefore, detection of unknown bacteriophages in sequence datasets is a challenge. Most of the existing detection methods rely on sequence similarity to known bacteriophage sequences, impeding the identification and characterization of distinct, highly divergent bacteriophage families. Here we present Seeker, a deep-learning tool for alignment-free identification of phage sequences. Seeker allows rapid detection of phages in sequence datasets and differentiation of phage sequences from bacterial ones, even when those phages exhibit little sequence similarity to established phage families. We comprehensively validate Seeker's ability to identify previously unidentified phages, and employ this method to detect unknown phages, some of which are highly divergent from the known phage families. We provide a web portal (seeker.pythonanywhere.com) and a user-friendly Python package (github.com/gussow/seeker) allowing researchers to easily apply Seeker in metagenomic studies, for the detection of diverse unknown bacteriophages.
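As a rough illustration of what "alignment-free" input can look like, the sketch below reads FASTA-style records and one-hot encodes each nucleotide, so that a sequence is represented without being compared against any reference database. This is an assumption-laden example, not Seeker's code or API: the helper names `read_fasta` and `one_hot`, the A/C/G/T column order, and the all-zero handling of ambiguous bases are all hypothetical choices made for this sketch.

```python
# Illustrative sketch only: NOT Seeker's implementation or API.
# Helper names, column order, and ambiguous-base handling are assumptions.
from typing import Iterable, Iterator, List, Tuple

NUCLEOTIDES = "ACGT"  # assumed column order of the one-hot encoding


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def one_hot(seq: str) -> List[List[int]]:
    """Encode each base as a 4-element row; ambiguous bases (e.g. N) become all zeros."""
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq.upper()]


# Toy input: two short contigs, one containing ambiguous bases.
records = list(read_fasta([">contig_1", "ACGT", "NN", ">contig_2", "ggca"]))
encoded = {name: one_hot(seq) for name, seq in records}
```

A matrix of this kind is one common input format for sequence models; which encoding and architecture Seeker actually uses should be checked against the paper and the package at github.com/gussow/seeker.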

SUBMITTER: Auslander N 

PROVIDER: S-EPMC7708075 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature


Publications

Seeker: alignment-free identification of bacteriophage genomes by deep learning.

Auslander N, Gussow AB, Benler S, Wolf YI, Koonin EV

Nucleic Acids Research, 2020 Dec 1, issue 21



Similar Datasets

| S-EPMC8743544 | biostudies-literature
| S-EPMC3549825 | biostudies-literature
| PRJEB19480 | ENA
2014-06-10 | E-GEOD-45684 | biostudies-arrayexpress
| S-EPMC4524009 | biostudies-literature
2014-06-10 | GSE45684 | GEO
| S-EPMC7432689 | biostudies-literature
| S-EPMC6374904 | biostudies-literature
| S-EPMC2922894 | biostudies-other
| S-EPMC4791545 | biostudies-literature