
Dataset Information


Disease prediction via Bayesian hyperparameter optimization and ensemble learning.


ABSTRACT: OBJECTIVE: Early disease screening and diagnosis are important for improving patient survival, so identifying early predictive features of disease is necessary. This paper presents a comprehensive comparative analysis of different Machine Learning (ML) systems and reports the standard deviation of the results obtained through sampling with replacement. The research focuses on (a) analyzing and comparing ML strategies used to predict Breast Cancer (BC) and Cardiovascular Disease (CVD) and (b) using feature importance ranking to identify early high-risk features. RESULTS: The Bayesian hyperparameter optimization method was more stable than the grid search and random search methods. In a BC diagnosis dataset, the Extreme Gradient Boosting (XGBoost) model had an accuracy of 94.74% and a sensitivity of 93.69%. The mean value of the cell nucleus in the Fine Needle Aspiration (FNA) digital image of a breast lump was identified as the most important predictive feature for BC. In a CVD dataset, the XGBoost model had an accuracy of 73.50% and a sensitivity of 69.54%. Systolic blood pressure was identified as the most important feature for CVD prediction.
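
The abstract reports accuracy and sensitivity together with a standard deviation obtained through sampling with replacement. The sketch below illustrates one way such a figure could be computed; it is not the authors' code. The function names, the resample count, the seed, and the toy labels are all hypothetical, and only the Python standard library is used.

```python
import random
import statistics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred):
    """True positive rate TP / (TP + FN); None if there are no positives."""
    positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not positives:
        return None
    return sum(p == 1 for p in positives) / len(positives)


def bootstrap_std(y_true, y_pred, metric, n_resamples=1000, seed=0):
    """Standard deviation of `metric` over resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Draw n indices with replacement and score the resampled pairs.
        idx = [rng.randrange(n) for _ in range(n)]
        score = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if score is not None:  # skip resamples where the metric is undefined
            scores.append(score)
    return statistics.stdev(scores)


# Hypothetical toy labels (1 = disease, 0 = healthy), not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

print("accuracy:", accuracy(y_true, y_pred))
print("sensitivity:", sensitivity(y_true, y_pred))
print("accuracy std:", bootstrap_std(y_true, y_pred, accuracy))
print("sensitivity std:", bootstrap_std(y_true, y_pred, sensitivity))
```

In practice the predictions would come from a trained model such as XGBoost evaluated on held-out data. The paper may have resampled differently, for example by refitting the model on each resample.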

SUBMITTER: Gao L 

PROVIDER: S-EPMC7146897 | biostudies-literature | 2020 Apr

REPOSITORIES: biostudies-literature


Publications

Disease prediction via Bayesian hyperparameter optimization and ensemble learning.

Gao Liyuan L, Ding Yongmei Y

BMC Research Notes 2020-04-10 1



Similar Datasets

| S-EPMC8495939 | biostudies-literature
| S-EPMC7038525 | biostudies-literature
| S-EPMC8584383 | biostudies-literature
| S-EPMC10280255 | biostudies-literature
| S-EPMC7396641 | biostudies-literature
| S-EPMC8064234 | biostudies-literature
| S-EPMC6152467 | biostudies-literature
| S-EPMC9969040 | biostudies-literature
| S-EPMC3031034 | biostudies-other
| S-EPMC9777370 | biostudies-literature