Dataset Information

A comprehensive evaluation of multicategory classification methods for microbiomic data.


ABSTRACT: Recent advances in next-generation DNA sequencing enable rapid high-throughput quantitation of microbial community composition in human samples, opening up a new field of microbiomics. One of the promises of this field is linking abundances of microbial taxa to phenotypic and physiological states, which can inform development of new diagnostic, personalized medicine, and forensic modalities. Prior research has demonstrated the feasibility of applying machine learning methods to perform body site and subject classification with microbiomic data. However, it is currently unknown which classifiers perform best among the many available alternatives for classification with microbiomic data. In this work, we performed a systematic comparison of 18 major classification methods, 5 feature selection methods, and 2 accuracy metrics using 8 datasets spanning 1,802 human samples and various classification tasks: body site and subject classification and diagnosis. We found that random forests, support vector machines, kernel ridge regression, and Bayesian logistic regression with Laplace priors are the most effective machine learning techniques for performing accurate classification from these microbiomic data.

SUBMITTER: Statnikov A 

PROVIDER: S-EPMC3960509 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature

Publications

A comprehensive evaluation of multicategory classification methods for microbiomic data.

Alexander Statnikov, Mikael Henaff, Varun Narendra, Kranti Konganti, Zhiguo Li, Liying Yang, Zhiheng Pei, Martin J Blaser, Constantin F Aliferis, Alexander V Alekseyenko

Microbiome, 2013-04-05, 1


Similar Datasets

| S-EPMC4629508 | biostudies-literature
| S-EPMC5899419 | biostudies-literature
2013-08-20 | E-GEOD-49712 | biostudies-arrayexpress
2013-08-20 | GSE49712 | GEO
2021-08-28 | GSE175772 | GEO
| S-EPMC5854612 | biostudies-literature
| S-EPMC8345583 | biostudies-literature
| S-EPMC9333262 | biostudies-literature
| S-EPMC7647064 | biostudies-literature
| S-EPMC4054597 | biostudies-literature