Machine Learning Strategy for Gut Microbiome-Based Diagnostic Screening of Cardiovascular Disease.
ABSTRACT: Cardiovascular disease (CVD) is the leading cause of human mortality. In recent years, gut microbiota has emerged as a factor influencing CVD, alongside genetics and environmental factors. Although cause-effect relationships are not clearly established, the reported associations between alterations in gut microbiota and CVD are prominent. We therefore hypothesized that machine learning (ML) could be used for gut microbiome-based diagnostic screening of CVD. To test this hypothesis, fecal 16S ribosomal RNA sequencing data from 478 CVD and 473 non-CVD human subjects, collected through the American Gut Project, were analyzed using 5 supervised ML algorithms: random forest, support vector machine, decision tree, elastic net, and neural networks. Thirty-nine differential bacterial taxa were identified between the CVD and non-CVD groups. ML modeling using these taxonomic features achieved a testing area under the receiver operating characteristic curve (AUC; 0.0, perfect antidiscrimination; 0.5, random guessing; 1.0, perfect discrimination) of ≈0.58 (random forest and neural networks). Next, the ML models were trained with the top 500 high-variance operational taxonomic unit (OTU) features instead of bacterial taxa, which improved the testing AUC to ≈0.65 (random forest). Limiting the selection to only the top 25 highly contributing OTU features significantly improved the AUC further, to ≈0.70. Overall, our study is the first to identify dysbiosis of gut microbiota in CVD patients as a group and to apply this knowledge to develop a gut microbiome-based ML approach for diagnostic screening of CVD.
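The AUC scale and the variance-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `roc_auc` and `top_variance_features` and all toy values are hypothetical, and the AUC is computed with the standard pairwise-ranking definition (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half).

```python
from statistics import pvariance


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_variance_features(table, k):
    """Return the column indices of the k highest-variance features.

    `table` is a list of samples, each a list of feature abundances.
    """
    n_cols = len(table[0])
    variances = [pvariance([row[j] for row in table]) for j in range(n_cols)]
    order = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(order[:k])


# Toy labels and scores, hypothetical values only (1 = CVD, 0 = non-CVD).
labels = [1, 1, 0, 0]
print(roc_auc(labels, [0.9, 0.8, 0.2, 0.1]))  # every positive outranks every negative
print(roc_auc(labels, [0.1, 0.2, 0.8, 0.9]))  # every positive is outranked
print(roc_auc(labels, [0.5, 0.5, 0.5, 0.5]))  # all ties, equivalent to guessing

# Toy abundance table with 3 samples and 3 features; keep the 2 most variable.
print(top_variance_features([[1, 0, 5], [1, 10, 6], [1, 20, 7]], 2))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for illustration; a rank-based implementation from an established library would be the usual choice for a dataset of the size described here.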
SUBMITTER: Aryal S
PROVIDER: S-EPMC7577586 | biostudies-literature
REPOSITORIES: biostudies-literature