ABSTRACT:
SUBMITTER: Dirmeier S
PROVIDER: S-EPMC6849186 | biostudies-literature | 2019 Nov
REPOSITORIES: biostudies-literature
Simon Dirmeier, Mario Emmenlauer, Christoph Dehio, Niko Beerenwinkel
BMC Bioinformatics, 2019-11-12, issue 1
Background: Analysing large and high-dimensional biological data sets poses significant computational difficulties for bioinformaticians due to a lack of accessible tools that scale to hundreds of millions of data points.
Results: We developed a novel machine learning command-line tool called PyBDA for automated, distributed analysis of big biological data sets. By using Apache Spark in the backend, PyBDA scales to data sets beyond the size of current applications. It uses Snakemake i ...