Dataset Information

Validation of a Machine Learning Model to Predict Childhood Lead Poisoning.


ABSTRACT:

Importance: Childhood lead poisoning causes irreversible neurobehavioral deficits, but current practice is secondary prevention.

Objective: To validate a machine learning (random forest) prediction model of elevated blood lead levels (EBLLs) by comparison with a parsimonious logistic regression.

Design, Setting, and Participants: This prognostic study for temporal validation of multivariable prediction models used data from the Women, Infants, and Children (WIC) program of the Chicago Department of Public Health. Participants included a development cohort of children born from January 1, 2007, to December 31, 2012, and a validation WIC cohort born from January 1 to December 31, 2013. Blood lead levels were measured until December 31, 2018. Data were analyzed from January 1 to October 31, 2019.

Exposures: Blood lead level test results; lead investigation findings; housing characteristics, permits, and violations; and demographic variables.

Main Outcomes and Measures: Incident EBLL (≥6 µg/dL). Models were assessed using the area under the receiver operating characteristic curve (AUC) and confusion matrix metrics (positive predictive value, sensitivity, and specificity) at various thresholds.

Results: Among 6812 children in the WIC validation cohort, 3451 (50.7%) were female, 3057 (44.9%) were Hispanic, 2804 (41.2%) were non-Hispanic Black, 458 (6.7%) were non-Hispanic White, and 442 (6.5%) were Asian (mean [SD] age, 5.5 [0.3] years). The median year of housing construction was 1919 (interquartile range, 1903-1948). Random forest AUC was 0.69 compared with 0.64 for logistic regression (difference, 0.05; 95% CI, 0.02-0.08). When predicting the 5% of children at highest risk to have EBLLs, random forest and logistic regression models had positive predictive values of 15.5% and 7.8%, respectively (difference, 7.7%; 95% CI, 3.7%-11.3%), sensitivity of 16.2% and 8.1%, respectively (difference, 8.1%; 95% CI, 3.9%-11.7%), and specificity of 95.5% and 95.1% (difference, 0.4%; 95% CI, 0.0%-0.7%).

Conclusions and Relevance: The machine learning model outperformed regression in predicting childhood lead poisoning, especially in identifying children at highest risk. Such a model could be used to target the allocation of lead poisoning prevention resources to these children.
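
The evaluation metrics named in the abstract (AUC, and positive predictive value, sensitivity, and specificity when flagging a fixed share of children at highest predicted risk) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `top_fraction_metrics` and `auc`, the tie handling, and the toy scores and labels are all assumptions made for illustration, and the study's confidence intervals are not reproduced here.

```python
from typing import Dict, Sequence


def top_fraction_metrics(
    scores: Sequence[float], labels: Sequence[int], fraction: float = 0.05
) -> Dict[str, float]:
    """Flag the `fraction` of individuals with the highest risk scores and
    compute confusion matrix metrics. Labels are assumed to be 0 or 1."""
    n = len(scores)
    if n == 0 or n != len(labels):
        raise ValueError("scores and labels must be non-empty and equal in length")

    k = max(1, round(n * fraction))  # number of individuals flagged
    # Indices sorted by descending score; ties keep their input order (assumption).
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = order[:k]

    tp = sum(labels[i] for i in flagged)  # flagged and truly positive
    fp = k - tp                           # flagged but negative
    fn = sum(labels) - tp                 # positive but not flagged
    tn = n - k - fn                       # negative and not flagged

    nan = float("nan")
    return {
        "ppv": tp / k,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise, so quadratic in size)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    print(top_fraction_metrics(toy_scores, toy_labels, fraction=0.2))
    print(auc(toy_scores, toy_labels))
```

Fixing the flagged share (here `fraction`) rather than a probability cutoff mirrors the abstract's framing of "the 5% of children at highest risk": both models then flag the same number of children, so differences in positive predictive value reflect ranking quality rather than threshold choice.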

SUBMITTER: Potash E 

PROVIDER: S-EPMC7495240 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature

Publications

Validation of a Machine Learning Model to Predict Childhood Lead Poisoning.

Eric Potash, Rayid Ghani, Joe Walsh, Emile Jorgensen, Cortland Lohff, Nik Prachand, Raed Mansour

JAMA Network Open, September 1, 2020, issue 9


Similar Datasets

| S-EPMC7610191 | biostudies-literature
| S-EPMC9281065 | biostudies-literature
| S-EPMC9937004 | biostudies-literature
| S-EPMC6210196 | biostudies-other
| S-EPMC10101605 | biostudies-literature
| S-EPMC10251538 | biostudies-literature
| S-EPMC8108129 | biostudies-literature
| S-EPMC10778506 | biostudies-literature
| S-EPMC5528905 | biostudies-other
| S-EPMC9233260 | biostudies-literature