
Dataset Information


mBLAST: Keeping up with the sequencing explosion for (meta)genome analysis.


ABSTRACT: Recent advances in next-generation sequencing technologies require alignment algorithms and software that can keep pace with the heightened data production. Standard algorithms, especially protein similarity searches, represent significant bottlenecks in analysis pipelines. For metagenomic approaches in particular, it is now often necessary to search hundreds of millions of sequence reads against large databases. Here we describe mBLAST, an accelerated search algorithm for translated and/or protein alignments to large datasets that is based on the Basic Local Alignment Search Tool (BLAST) and retains the high sensitivity of BLAST. The mBLAST algorithms achieve substantial speedup over the National Center for Biotechnology Information (NCBI) programs BLASTX, TBLASTX and BLASTP for large datasets, allowing analysis within reasonable timeframes on standard computer architectures. In this article, the impact of mBLAST is demonstrated with sequences originating from the microbiota of healthy humans from the Human Microbiome Project. mBLAST is designed as a plug-in replacement for BLAST for any study that involves short-read sequences and includes high-throughput analysis. The mBLAST software is freely available to academic users at www.multicorewareinc.com.

SUBMITTER: Davis C 

PROVIDER: S-EPMC4612494 | biostudies-literature | 2015 Aug

REPOSITORIES: biostudies-literature


Publications

mBLAST: Keeping up with the sequencing explosion for (meta)genome analysis.

Curtis Davis, Karthik Kota, Venkat Baldhandapani, Wei Gong, Sahar Abubucker, Eric Becker, John Martin, Kristine M Wylie, Radhika Khetani, Matthew E Hudson, George M Weinstock, Makedonka Mitreva

Journal of Data Mining in Genomics & Proteomics, 2013-07-31, 3



Similar Datasets

| S-EPMC8996892 | biostudies-literature
| S-EPMC1116302 | biostudies-literature
| S-EPMC1116301 | biostudies-literature
| S-EPMC5383236 | biostudies-literature
| S-EPMC3634726 | biostudies-literature
| PRJEB41312 | ENA
| S-EPMC2176183 | biostudies-literature
| S-EPMC3023631 | biostudies-literature
| S-EPMC5658414 | biostudies-literature
| PRJNA554609 | ENA