Dataset Information

GPU-FS-kNN: a software tool for fast and scalable kNN computation using GPUs.

ABSTRACT:

Background

The analysis of biological networks has become a major challenge because recent high-throughput techniques are rapidly producing very large data sets. These exploding volumes of biological data demand extreme computational power and special computing facilities (i.e. supercomputers). An inexpensive alternative, General Purpose computation on Graphics Processing Units (GPGPU), can be adapted to tackle this challenge, but the limited internal memory of the device poses a new scalability problem. Efficient data and computational parallelism with partitioning is required to provide a fast and scalable solution to this problem.

Results

We propose an efficient parallel formulation of the k-Nearest Neighbour (kNN) search problem. kNN search is a popular method for classifying objects in several fields of research, such as pattern recognition, machine learning and bioinformatics. Although the method is simple and straightforward, its performance degrades dramatically for large data sets, since the task is computationally intensive. The proposed approach is not only fast but also scalable to large-scale instances. Based on our approach, we implemented a software tool, GPU-FS-kNN (GPU-based Fast and Scalable k-Nearest Neighbour), for CUDA-enabled GPUs. The basic approach is simple and adaptable to other available GPU architectures. We observed speed-ups of 50-60 times compared with a CPU implementation on a well-known breast microarray study and its associated data sets.
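The partitioning idea mentioned above can be illustrated with a small CPU-only sketch. This is not the authors' CUDA implementation. It only shows, under stated assumptions, how a brute-force kNN search can process candidates in fixed-size chunks so that only one chunk of distances is held at a time, as would be needed when device memory is limited. The function name `knn_chunked`, the parameter `chunk_size`, and the choice of Euclidean distance are all hypothetical and chosen for illustration.

```python
import heapq
import math


def knn_chunked(points, k, chunk_size):
    """Brute-force kNN for every point, visiting candidates chunk by chunk.

    Illustrative sketch only: on a GPU, each chunk would correspond to a
    block of distances small enough to fit in device memory.
    """
    n = len(points)
    # Running list of (distance, index) pairs, one list per query point.
    best = [[] for _ in range(n)]
    for start in range(0, n, chunk_size):
        chunk = range(start, min(start + chunk_size, n))
        for q in range(n):
            # Distances from query q to the candidates in this chunk only.
            cand = [(math.dist(points[q], points[j]), j) for j in chunk if j != q]
            # Merge with the running result and keep the k closest so far.
            best[q] = heapq.nsmallest(k, best[q] + cand)
    # Drop the distances and return neighbour indices, nearest first.
    return [[j for _, j in row] for row in best]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbours = knn_chunked(points, k=2, chunk_size=2)
```

Because each chunk's candidates are merged into a running top-k list, the result does not depend on `chunk_size`; the chunk size only bounds how many distances are held at once.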

Conclusion

Our GPU-based Fast and Scalable k-Nearest Neighbour search technique (GPU-FS-kNN) provides a significant performance improvement for nearest neighbour computation in large-scale networks. The source code and the software tool are available under the GNU General Public License (GPL) at https://sourceforge.net/p/gpufsknn/.

SUBMITTER: Arefin AS

PROVIDER: S-EPMC3429408 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

GPU-FS-kNN: a software tool for fast and scalable kNN computation using GPUs.

Arefin AS, Riveros C, Berretta R, Moscato P

PLoS ONE, 2012-08-28, issue 8

PMID: 22937144

Similar Datasets

WFA-GPU: gap-affine pairwise read-alignment using GPUs.
| S-EPMC10697739 | biostudies-literature

Fast Screening of Inhibitor Binding/Unbinding Using Novel Software Tool CaverDock.
| S-EPMC6828983 | biostudies-literature

Pharokka: a fast scalable bacteriophage annotation tool.
| S-EPMC9805569 | biostudies-literature

FastGCN: a GPU accelerated tool for fast gene co-expression networks.
| S-EPMC4300192 | biostudies-literature

A python library for the fast and scalable computation of biologically meaningful individual specific networks.
| S-EPMC11303555 | biostudies-literature

Fast parallel tandem mass spectral library searching using GPU hardware acceleration.
| S-EPMC3107871 | biostudies-literature

SJARACNe: a scalable software tool for gene network reverse engineering from big data.
| S-EPMC6581437 | biostudies-literature

al3c: high-performance software for parameter inference using Approximate Bayesian Computation.
| S-EPMC4626746 | biostudies-literature

A fast, scalable and versatile tool for analysis of single-cell omics data.
| S-EPMC10864184 | biostudies-literature

GPU-I-TASSER: a GPU accelerated I-TASSER protein structure prediction tool.
| S-EPMC8896630 | biostudies-literature