Dataset Information

Microbiome meta-analysis and cross-disease comparison enabled by the SIAMCAT machine learning toolbox.


ABSTRACT: The human microbiome is increasingly mined for diagnostic and therapeutic biomarkers using machine learning (ML). However, metagenomics-specific software is scarce, and overoptimistic evaluation and limited cross-study generalization are prevailing issues. To address these, we developed SIAMCAT, a versatile R toolbox for ML-based comparative metagenomics. We demonstrate its capabilities in a meta-analysis of fecal metagenomic studies (10,803 samples). When naively transferred across studies, ML models lost accuracy and disease specificity, which could however be resolved by a novel training set augmentation strategy. This reveals some biomarkers to be disease-specific, with others shared across multiple conditions. SIAMCAT is freely available from siamcat.embl.de.

SUBMITTER: Wirbel J 

PROVIDER: S-EPMC8008609 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC10704913 | biostudies-literature
S-EPMC6235622 | biostudies-literature
GSE267438 | GEO | 2024-05-17
S-EPMC10486516 | biostudies-literature
S-EPMC10015860 | biostudies-literature
S-EPMC6827106 | biostudies-literature
S-EPMC9649010 | biostudies-literature
S-EPMC9952031 | biostudies-literature
S-EPMC8041204 | biostudies-literature
S-EPMC8539211 | biostudies-literature