Dataset Information

TaxoNN: ensemble of neural networks on stratified microbiome data for disease prediction.


ABSTRACT:

Motivation

Research supports the potential use of the microbiome as a predictor of some diseases. Motivated by findings that microbiome data are complex in nature and that the hierarchical taxonomy of microbial Operational Taxonomic Units (OTUs) induces an inherent correlation among them, we propose a novel machine learning method that uses a stratified approach to group OTUs into phylum clusters. A Convolutional Neural Network (CNN) was trained within each cluster individually. Through an ensemble learning approach, the features obtained from each cluster were then concatenated to improve prediction accuracy. This two-step approach, stratification followed by the combination of multiple CNNs, captured the relationships between OTUs sharing a phylum more efficiently than a single CNN that ignores OTU correlations.
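The stratify-then-concatenate flow described above can be illustrated with a minimal Python sketch. It is not the TaxoNN implementation: the function names, the toy OTU-to-phylum mapping and the abundance values are hypothetical, and the per-cluster CNN is replaced by a trivial summary (mean and maximum abundance) so that the example runs with the standard library alone.

```python
from collections import defaultdict


def stratify_by_phylum(otu_ids, phylum_of):
    """Group OTU column indices by phylum, preserving column order."""
    clusters = defaultdict(list)
    for col, otu in enumerate(otu_ids):
        clusters[phylum_of[otu]].append(col)
    return dict(clusters)


def cluster_features(sample, cols):
    """Stand-in for a per-cluster CNN (assumption): mean and max abundance."""
    values = [sample[c] for c in cols]
    return [sum(values) / len(values), max(values)]


def ensemble_features(sample, clusters):
    """Concatenate the per-cluster features in a fixed (sorted) phylum order."""
    feats = []
    for phylum in sorted(clusters):
        feats.extend(cluster_features(sample, clusters[phylum]))
    return feats


# Hypothetical toy data: five OTUs spread over three phyla.
otu_ids = ["otu1", "otu2", "otu3", "otu4", "otu5"]
phylum_of = {
    "otu1": "Firmicutes",
    "otu2": "Bacteroidetes",
    "otu3": "Firmicutes",
    "otu4": "Proteobacteria",
    "otu5": "Bacteroidetes",
}
sample = [0.30, 0.10, 0.20, 0.05, 0.35]  # relative abundances for one subject

clusters = stratify_by_phylum(otu_ids, phylum_of)
features = ensemble_features(sample, clusters)
print(clusters)
print(features)
```

In a real pipeline the concatenated vector would feed a final classifier. Iterating over phyla in sorted order keeps the feature layout identical across samples.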

Results

We used simulated datasets containing 168 OTUs in 200 cases and 200 controls for model testing. Thirty-two OTUs potentially associated with risk of disease were randomly selected, and interactions between three OTUs were used to introduce non-linearity. We also applied the method to two human microbiome studies to demonstrate its effectiveness: (i) cirrhosis, with 118 cases and 114 controls; and (ii) type 2 diabetes (T2D), with 170 cases and 174 controls. Extensive experimentation and comparison against conventional machine learning techniques yielded encouraging results. We obtained mean AUC values of 0.88, 0.92 and 0.75 in the simulations, cirrhosis and T2D data, respectively, a consistent increment (5%, 3% and 7%) over the next best-performing method, Random Forest.
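The simulation design and the AUC metric can be sketched as follows. Only the dimensions (168 OTUs, 200 cases, 200 controls, 32 risk OTUs, one three-way interaction) come from the abstract. The gamma-distributed abundances, the additive risk score, the `interaction_weight` and `noise_sd` parameters and the top-half case assignment are assumptions made for illustration, not the authors' generative model.

```python
import random


def simulate(n_samples=400, n_otus=168, n_risk=32,
             interaction_weight=1e5, noise_sd=0.02, seed=0):
    """Simulate relative abundances with 32 risk OTUs and a 3-OTU interaction."""
    rng = random.Random(seed)
    risk = rng.sample(range(n_otus), n_risk)   # randomly selected risk OTUs
    trio = risk[:3]                            # OTUs in the interaction term
    X, scores = [], []
    for _ in range(n_samples):
        raw = [rng.gammavariate(0.5, 1.0) for _ in range(n_otus)]
        total = sum(raw)
        x = [v / total for v in raw]           # normalise to relative abundance
        linear = sum(x[j] for j in risk)
        interaction = x[trio[0]] * x[trio[1]] * x[trio[2]]  # non-linear term
        scores.append(linear + interaction_weight * interaction
                      + rng.gauss(0.0, noise_sd))
        X.append(x)
    # Label the top half by risk score as cases: exactly 200 cases, 200 controls.
    order = sorted(range(n_samples), key=lambda i: scores[i], reverse=True)
    y = [0] * n_samples
    for i in order[: n_samples // 2]:
        y[i] = 1
    return X, y, risk


def auc(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


X, y, risk = simulate()
# A naive linear score that ignores the interaction term.
naive = [sum(x[j] for j in risk) for x in X]
print(f"AUC of the naive linear score: {auc(y, naive):.3f}")
```

Because the labels depend partly on the interaction term and on noise, a purely linear score cannot fully recover them. That gap is the kind of non-linearity the simulated datasets are meant to test.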

Availability and implementation

https://github.com/divya031090/TaxoNN_OTU

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Sharma D 

PROVIDER: S-EPMC7750934 | biostudies-literature | 2020 Nov

REPOSITORIES: biostudies-literature

Publications

TaxoNN: ensemble of neural networks on stratified microbiome data for disease prediction.

Divya Sharma, Andrew D. Paterson, Wei Xu

Bioinformatics (Oxford, England), 2020-11-01, issue 17


Similar Datasets

| S-EPMC5587804 | biostudies-other
| S-EPMC7068856 | biostudies-literature
| S-EPMC6777738 | biostudies-literature
| S-EPMC2832827 | biostudies-literature
| S-EPMC7285987 | biostudies-literature
| S-EPMC7697539 | biostudies-literature
| S-EPMC6881452 | biostudies-literature
| S-EPMC6896157 | biostudies-literature
| S-EPMC8792440 | biostudies-literature
| S-EPMC7832895 | biostudies-literature