
Dataset Information


CloudForest: A Scalable and Efficient Random Forest Implementation for Biological Data.


ABSTRACT: Random Forest has become a standard data analysis tool in computational biology. However, extensions to existing implementations are often necessary to handle the complexity of biological datasets and their associated research questions. The growing size of these datasets requires high-performance implementations. We describe CloudForest, a Random Forest package written in Go, which is particularly well suited for large, heterogeneous genetic and biomedical datasets. CloudForest includes several extensions, such as handling of unbalanced classes and missing values. Its flexible design enables users to easily implement additional extensions. CloudForest achieves fast running times through effective use of the CPU cache, optimizations for different classes of features, and efficient multi-threading. CloudForest is available at https://github.com/ilyalab/CloudForest.

SUBMITTER: Bressler R 

PROVIDER: S-EPMC4692062 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature


Publications

CloudForest: A Scalable and Efficient Random Forest Implementation for Biological Data.

Ryan Bressler, Richard B. Kreisberg, Brady Bernard, John E. Niederhuber, Joseph G. Vockley, Ilya Shmulevich, Theo A. Knijnenburg

PLoS One, 2015-12-17, issue 12



Similar Datasets

| S-EPMC3218317 | biostudies-literature
| S-EPMC4783407 | biostudies-literature
| S-EPMC3163175 | biostudies-literature
| S-EPMC4957112 | biostudies-literature
| S-EPMC1363357 | biostudies-literature
| S-EPMC5920646 | biostudies-literature
| S-EPMC6022547 | biostudies-literature
| S-EPMC3766035 | biostudies-literature
| S-EPMC2894507 | biostudies-literature
| S-EPMC8305490 | biostudies-literature