ABSTRACT:
SUBMITTER: Klein M
PROVIDER: S-EPMC6276899 | biostudies-literature | 2017 Jan
REPOSITORIES: biostudies-literature
Max Klein, Rati Sharma, Chris H. Bohrer, Cameron M. Avelis, Elijah Roberts
Bioinformatics (Oxford, England), 2016-09-22, issue 2
Data-parallel programming techniques can dramatically decrease the time needed to analyze large datasets. While these methods have provided significant improvements for sequencing-based analyses, other areas of biological informatics have not yet adopted them. Here, we introduce Biospark, a new framework for performing data-parallel analysis on large numerical datasets. Biospark builds upon the open source Hadoop and Spark projects, bringing domain-specific features for biology.