Dataset Information

PZLAST: an ultra-fast amino acid sequence similarity search server against public metagenomes.

ABSTRACT: Similarity searches of amino acid sequences against public metagenomic data can give users insight into the function of sequences based on the environmental distribution of similar sequences. However, because public metagenomic data exceed terabytes in size, conducting sequence similarity searches against them has required a considerable reduction in either the amount of data or the accuracy of the results. Here, we present PZLAST, an ultra-fast service for highly accurate amino acid sequence similarity searches, which can search a user's amino acid sequences against several terabytes of public metagenomic sequences in approximately 10-20 minutes. PZLAST achieves its search speed by using PEZY-SC2, a MIMD many-core processor. PZLAST results are summarized by the ontology-based environmental distribution of similar sequences. PZLAST can be used to predict the function of sequences and to mine for homologs of functionally important gene sequences. PZLAST is freely accessible at https://pzlast.riken.jp/meta without requiring registration. Supplementary data are available at Bioinformatics online.

SUBMITTER: Mori H

PROVIDER: S-EPMC8570820 | biostudies-literature | 2021 Jul

REPOSITORIES: biostudies-literature

Publications

PZLAST: an ultra-fast amino acid sequence similarity search server against public metagenomes.

Mori Hiroshi, Ishikawa Hitoshi, Higashi Koichi, Kato Yoshiaki, Ebisuzaki Toshikazu, Kurokawa Ken

Bioinformatics (Oxford, England), 2021-11-01, issue 21

PMID: 34240105

Similar Datasets

CUDASW++4.0: ultra-fast GPU-based Smith-Waterman protein sequence database search. | S-EPMC11531700 | biostudies-literature

HMMER web server: interactive sequence similarity searching. | S-EPMC3125773 | biostudies-literature

PSimScan: algorithm and utility for fast protein similarity search. | S-EPMC3591303 | biostudies-literature

Protein domain embeddings for fast and accurate similarity search. | S-EPMC11529836 | biostudies-literature

Minimally-overlapping words for sequence similarity search. | S-EPMC8016470 | biostudies-literature

A fast word search algorithm for the representation of sequence similarity in genomic DNA. | S-EPMC523596 | biostudies-other

BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server. | S-EPMC3098077 | biostudies-literature

RAPSearch: a fast protein similarity search tool for short reads. | S-EPMC3113943 | biostudies-literature

SW#db: GPU-Accelerated Exact Sequence Similarity Database Search. | S-EPMC4699916 | biostudies-literature