
Dataset Information


Biospark: scalable analysis of large numerical datasets from biological simulations and experiments using Hadoop and Spark.


ABSTRACT: Data-parallel programming techniques can dramatically decrease the time needed to analyze large datasets. While these methods have provided significant improvements for sequencing-based analyses, other areas of biological informatics have not yet adopted them. Here, we introduce Biospark, a new framework for performing data-parallel analysis on large numerical datasets. Biospark builds upon the open source Hadoop and Spark projects, bringing domain-specific features for biology.

AVAILABILITY AND IMPLEMENTATION: Source code is licensed under the Apache 2.0 open source license and is available at the project website: https://www.assembla.com/spaces/roberts-lab-public/wiki/Biospark

CONTACT: eroberts@jhu.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

SUBMITTER: Klein M 

PROVIDER: S-EPMC6276899 | biostudies-literature | 2017 Jan

REPOSITORIES: biostudies-literature


Publications

Biospark: scalable analysis of large numerical datasets from biological simulations and experiments using Hadoop and Spark.

Max Klein, Rati Sharma, Chris H. Bohrer, Cameron M. Avelis, Elijah Roberts

Bioinformatics (Oxford, England), 2016-09-22, issue 2



Similar Datasets

| S-EPMC3866557 | biostudies-literature
| S-EPMC8315543 | biostudies-literature
| S-EPMC6805285 | biostudies-literature
| S-EPMC3458522 | biostudies-literature
| S-EPMC4238823 | biostudies-other
| S-EPMC7390408 | biostudies-literature
| S-EPMC6952362 | biostudies-literature
| S-EPMC8218388 | biostudies-literature
| S-EPMC5499724 | biostudies-literature
| S-EPMC2717332 | biostudies-literature