
Dataset Information


Using Decision Tree Aggregation with Random Forest Model to Identify Gut Microbes Associated with Colorectal Cancer.


ABSTRACT: The imbalance of human gut microbiota has been associated with colorectal cancer. In recent years, metagenomics research has provided a large amount of scientific data enabling us to study the dedicated roles of gut microbes in the onset and progression of cancer. We removed unrelated and redundant features during feature selection by mutual information. We then trained a random forest classifier on a large metagenomics dataset of colorectal cancer patients and healthy people assembled from published reports and extracted and analysed the information from the learned decision trees. We identified key microbial species associated with colorectal cancers. These microbes included Porphyromonas asaccharolytica, Peptostreptococcus stomatis, Fusobacterium, Parvimonas sp., Streptococcus vestibularis and Flavonifractor plautii. We obtained the optimal splitting abundance thresholds for these species to distinguish between healthy and colorectal cancer samples. This extracted consensus decision tree may be applied to the diagnosis of colorectal cancers.
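The idea of reading a consensus abundance threshold out of many learned trees can be illustrated with a much simpler stand-in. The sketch below is not the authors' pipeline: it skips the mutual-information feature selection, replaces the random forest with depth-one decision stumps fitted on bootstrap resamples of a single feature, and takes the median of the learned cut-offs. The function names (best_threshold, consensus_threshold) and the toy abundance values are hypothetical and chosen only for illustration.

```python
import random
from statistics import median


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 = healthy, 1 = cancer)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return the cut-off that minimises the weighted Gini impurity.

    Candidate cut-offs are midpoints between consecutive distinct values.
    Returns None when all values are identical (no split is possible).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no cut-off can separate equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_cut, best_score = (low + high) / 2.0, score
    return best_cut


def consensus_threshold(values, labels, n_stumps=200, seed=0):
    """Median cut-off over stumps fitted on bootstrap resamples.

    A simplified stand-in for aggregating split thresholds across the
    trees of a random forest; a real forest also subsamples features
    and grows deeper trees.
    """
    rng = random.Random(seed)
    n = len(values)
    cuts = []
    for _ in range(n_stumps):
        idx = [rng.randrange(n) for _ in range(n)]
        cut = best_threshold([values[i] for i in idx],
                             [labels[i] for i in idx])
        if cut is not None:
            cuts.append(cut)
    return median(cuts)


# Hypothetical relative abundances of one species (illustrative only).
abundance = [0.0, 0.01, 0.02, 0.03, 0.05, 0.20, 0.25, 0.30, 0.40, 0.50]
status = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

single_cut = best_threshold(abundance, status)
consensus_cut = consensus_threshold(abundance, status)
print(single_cut, consensus_cut)
```

Taking the median rather than the mean keeps the consensus cut-off robust to the occasional resample in which one class is missing or nearly missing.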

SUBMITTER: Ai D 

PROVIDER: S-EPMC6410271 | biostudies-literature | 2019 Feb

REPOSITORIES: biostudies-literature


Publications

Using Decision Tree Aggregation with Random Forest Model to Identify Gut Microbes Associated with Colorectal Cancer.

Ai Dongmei, Pan Hongfei, Han Rongbao, Li Xiaoxin, Liu Gang, Xia Li C

Genes, 2019 Feb 1, issue 2



Similar Datasets

S-EPMC7988984 | biostudies-literature
S-EPMC9202951 | biostudies-literature
S-EPMC9448733 | biostudies-literature
S-EPMC4615251 | biostudies-other
S-EPMC4271564 | biostudies-literature
S-EPMC6143023 | biostudies-literature
S-EPMC8271086 | biostudies-literature
S-EPMC3796272 | biostudies-literature
S-EPMC6802825 | biostudies-literature
S-EPMC9209521 | biostudies-literature