Dataset Information

GenomeForest: An Ensemble Machine Learning Classifier for Endometriosis.


ABSTRACT: Endometriosis is a complex, high-impact disease affecting 176 million women worldwide, with a diagnostic latency of 4 to 11 years due to the lack of a definitive clinical symptom or a minimally invasive diagnostic method. In this study, we developed a new ensemble machine learning classifier based on chromosomal partitioning, named GenomeForest, and applied it to classify endometriosis vs. control patients using 38 RNA-seq and 80 enrichment-based DNA-methylation (MBD-seq) datasets, assessing performance in six different experiments. The ensemble machine learning models provided an avenue for identifying several candidate biomarker genes and achieved a near-perfect F1 score (0.968) for the transcriptomics dataset and a very high F1 score (0.918) for the methylomics dataset. We hope that in the future a less invasive biopsy can be used to diagnose endometriosis using the findings from such ensemble machine learning classifiers, as demonstrated in this study.
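
For readers unfamiliar with the metric reported in the abstract, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition only; the function name and the confusion-matrix counts are hypothetical illustrations and are not taken from the study or its data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score from true-positive, false-positive and
    false-negative counts (harmonic mean of precision and recall)."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)  # fraction of predicted cases that are real cases
    recall = tp / (tp + fn)     # fraction of real cases that were predicted
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the study).
example_f1 = f1_score(tp=9, fp=1, fn=2)
print(f"F1 = {example_f1:.3f}")
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when the case and control groups are of unequal size.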

SUBMITTER: Akter S 

PROVIDER: S-EPMC7233069 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

GenomeForest: An Ensemble Machine Learning Classifier for Endometriosis.

Akter S, Xu D, Nagel SC, Bromfield JJ, Pelch KE, Wilshire GB, Joshi T

AMIA Joint Summits on Translational Science Proceedings, 2020-05-30


Similar Datasets

| S-EPMC9575864 | biostudies-literature
2019-07-18 | GSE134056 | GEO
2019-07-18 | GSE134052 | GEO
| S-EPMC11308548 | biostudies-literature
| S-EPMC7244534 | biostudies-literature
| S-EPMC9797088 | biostudies-literature
2017-02-01 | GSE85033 | GEO
| S-EPMC11373136 | biostudies-literature
| S-EPMC8097070 | biostudies-literature
| S-EPMC10073113 | biostudies-literature