Dataset Information

An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence.


ABSTRACT:

Background

Obesity prevalence varies considerably across counties in the United States. Machine learning algorithms accurately predict this geographic variation, but the models are often uninterpretable and viewed as black boxes.

Objective

The goal of this study is to extract knowledge from machine learning models of county-level variation in obesity prevalence.

Methods

This study applies explainable artificial intelligence methods to machine learning models of cross-sectional obesity prevalence data collected from 3,142 counties in the United States. County-level features were drawn from 7 broad categories: health outcomes, health behaviors, clinical care, social and economic factors, physical environment, demographics, and severe housing conditions. The explainable methods applied to random forest prediction models include feature importance, accumulated local effects, global surrogate decision trees, and local interpretable model-agnostic explanations.
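As an illustration of one of the methods named above, the following is a minimal sketch of first-order accumulated local effects (ALE). It is not the study's code: the synthetic rows, the feature names `inactivity` and `smoking`, and the linear `toy_model` (standing in for a fitted random forest) are all assumptions made so the example runs with the standard library alone.

```python
import random

def ale_1d(predict, rows, feature, n_bins=5):
    """First-order accumulated local effects of `feature` on `predict`.

    predict: callable mapping a dict of feature values to a float.
    rows: list of dicts, one per observation.
    Returns (bin_edges, centered_ale), one ALE value per bin edge.
    """
    values = sorted(r[feature] for r in rows)
    n = len(values)
    # Quantile-based edges so each bin holds roughly the same number of rows.
    edges = sorted({values[round(i * (n - 1) / n_bins)] for i in range(n_bins + 1)})
    if len(edges) < 2:  # constant feature: no effect to accumulate
        return edges, [0.0] * len(edges)
    effects, counts = [], []
    for k in range(len(edges) - 1):
        lo, hi = edges[k], edges[k + 1]
        # Rows whose observed value lies in this bin (first bin keeps its left edge).
        in_bin = [r for r in rows
                  if lo < r[feature] <= hi or (k == 0 and r[feature] == lo)]
        # Local effect: prediction change when only `feature` moves across the bin.
        diffs = [predict({**r, feature: hi}) - predict({**r, feature: lo})
                 for r in in_bin]
        effects.append(sum(diffs) / len(diffs) if diffs else 0.0)
        counts.append(len(in_bin))
    # Accumulate the local effects across bins.
    ale = [0.0]
    for effect in effects:
        ale.append(ale[-1] + effect)
    # Center so the count-weighted average effect is zero.
    mids = [(ale[k] + ale[k + 1]) / 2 for k in range(len(effects))]
    offset = sum(m * c for m, c in zip(mids, counts)) / sum(counts)
    return edges, [a - offset for a in ale]

# Synthetic, correlated features (hypothetical; not the study's data).
random.seed(0)
rows = []
for _ in range(200):
    inactivity = random.uniform(15.0, 40.0)
    smoking = 0.5 * inactivity + random.gauss(0.0, 2.0)
    rows.append({"inactivity": inactivity, "smoking": smoking})

def toy_model(row):
    # Hypothetical stand-in for a fitted model's prediction function.
    return 10.0 + 2.0 * row["inactivity"] + 3.0 * row["smoking"]

edges, ale = ale_1d(toy_model, rows, "inactivity")
```

Because differences are taken only within each bin, using rows actually observed there, the curve for `inactivity` reflects its own slope in `toy_model` even though `smoking` is correlated with it. This is the property that makes ALE useful when predictors are correlated.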

Results

The results show that machine learning models explained 79% of the variance in obesity prevalence, with physical inactivity, diabetes, and smoking prevalence being the most important factors in predicting obesity prevalence.
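For readers unfamiliar with the "variance explained" figure, the sketch below computes the coefficient of determination, R², on a few made-up values. The abstract does not say exactly how the 79% was computed (for example, on held-out counties or via cross-validation), so treating it as a standard R² is an assumption, and the numbers here are purely illustrative.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical obesity prevalence (%) for four counties; not data from the study.
observed = [30.0, 35.0, 28.0, 40.0]
predicted = [31.0, 34.0, 29.0, 38.0]
print(f"R^2 = {r_squared(observed, predicted):.3f}")
```

An R² of 0.79 would mean the residual sum of squares is 21% of the total variation in the observed prevalence values.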

Conclusions

Interpretable machine learning models of health behaviors and outcomes provide substantial insight into obesity prevalence variation across counties in the United States.

SUBMITTER: Allen B 

PROVIDER: S-EPMC10553328 | biostudies-literature | 2023

REPOSITORIES: biostudies-literature

Publications

An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence.

Allen Ben B  

PloS one, 2023-10-05 (issue 10)



Similar Datasets

| S-EPMC10007880 | biostudies-literature
| S-EPMC9174200 | biostudies-literature
| S-EPMC8618183 | biostudies-literature
| S-EPMC9367834 | biostudies-literature
| S-EPMC11571574 | biostudies-literature
| S-EPMC11522690 | biostudies-literature
| S-EPMC9772995 | biostudies-literature
| S-EPMC11473981 | biostudies-literature
| S-EPMC11682355 | biostudies-literature
| S-EPMC11042710 | biostudies-literature