
Dataset Information


A distributed system for fast alignment of next-generation sequencing data.


ABSTRACT: We developed a scalable distributed computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. Despite these benefits, NGS datasets can be prohibitively large, and obtaining sequence alignment results requires significant computing resources. Moreover, as NGS data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to those of a microarray sample. Results indicate that the distributed alignment system achieves an approximately linear speed-up and correctly distributes sequence data to, and gathers alignment results from, multiple compute clients.
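The scatter/gather pattern and the speed-up measure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the BOINC API: a local thread pool stands in for the compute clients, and `split_reads`, `align_unit`, `scatter_gather`, and `speedup` are hypothetical names introduced only for illustration. The "alignment" is a placeholder that returns each read with its length.

```python
from concurrent.futures import ThreadPoolExecutor


def split_reads(reads, n_units):
    """Split reads into at most n_units contiguous work units (n_units >= 1)."""
    if not reads:
        return []
    size = -(-len(reads) // n_units)  # ceiling division
    return [reads[i:i + size] for i in range(0, len(reads), size)]


def align_unit(unit):
    """Hypothetical stand-in for running an aligner on one client."""
    return [(read, len(read)) for read in unit]


def scatter_gather(reads, n_clients):
    """Distribute work units to workers and gather results in input order."""
    units = split_reads(reads, n_clients)
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        results = list(pool.map(align_unit, units))  # map preserves unit order
    return [hit for unit_hits in results for hit in unit_hits]


def speedup(t_single, t_distributed):
    """Speed-up = single-machine time / distributed time."""
    return t_single / t_distributed
```

Under this definition, a linear speed-up means that `speedup(t_1, t_n)` is close to `n` when `n` clients are used.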

SUBMITTER: Srimani JK 

PROVIDER: S-EPMC4984844 | biostudies-literature | 2010 Dec

REPOSITORIES: biostudies-literature


Publications

A distributed system for fast alignment of next-generation sequencing data.

Srimani, Jaydeep K; Wu, Po-Yen; Phan, John H; Wang, May D

IEEE International Conference on Bioinformatics and Biomedicine Workshops. 2010 Dec 1



Similar Datasets

| S-EPMC3792961 | biostudies-literature
| S-EPMC4053384 | biostudies-literature
| S-EPMC5567265 | biostudies-literature
| S-EPMC4501066 | biostudies-literature
| S-EPMC9891242 | biostudies-literature
| S-EPMC9234764 | biostudies-literature
| S-EPMC6580563 | biostudies-literature
| S-EPMC9338934 | biostudies-literature
| S-EPMC5907718 | biostudies-literature
| S-EPMC4265526 | biostudies-literature