Dataset Information

Predictive Models May Complement or Provide an Alternative to Existing Strategies for Assessing the Enteric Pathogen Contamination Status of Northeastern Streams Used to Provide Water for Produce Production.

ABSTRACT: While the Food Safety Modernization Act established standards for the use of surface water for produce production, water quality is known to vary over space and time. Targeted approaches for identifying hazards in water that account for this variation may improve growers' ability to address pre-harvest food safety risks. Models that utilize publicly available data (e.g., land-use, real-time weather) may be useful for developing these approaches. The objective of this study was to use pre-existing datasets collected in 2017 (N = 181 samples) and 2018 (N = 191 samples) to train and test models that predict the likelihood of detecting Salmonella and pathogenic E. coli markers (eaeA, stx) in agricultural water. Four types of features were used to train the models: microbial, physicochemical, spatial and weather. "Full models" were built using all four feature types, while "nested models" were built using between one and three types. Twenty learners were used to develop separate full models for each pathogen. Separately, to assess information gain associated with using different feature types, six learners were randomly selected and used to develop nine nested models each. Performance measures for each model were then calculated and compared against baseline models where E. coli concentration was the sole covariate. In the methods, we outline the advantages and disadvantages of each learner. Overall, full models built using ensemble (e.g., Node Harvest) and "black-box" (e.g., SVMs) learners out-performed full models built using more interpretable learners (e.g., tree- and rule-based learners) for both outcomes. However, nested eaeA-stx models built using interpretable learners and microbial data performed almost as well as these full models. While none of the nested Salmonella models performed as well as the full models, nested models built using spatial data consistently out-performed models that excluded spatial data.
These findings demonstrate that machine learning approaches can be used to predict when and where pathogens are likely to be present in agricultural water. This study serves as a proof-of-concept that can be built upon once larger datasets become available and provides guidance on the learner-data combinations that should be the foci of future efforts (e.g., tree-based microbial models for pathogenic E. coli).
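
The full, nested, and baseline feature sets described in the abstract can be sketched as follows. This is only an illustration, not the authors' code: the column names are hypothetical placeholders, and the sketch enumerates every combination of one to three feature types, whereas the study built nine nested models per learner (the abstract does not say which nine).

```python
from itertools import combinations

# The four feature types named in the abstract.
FEATURE_TYPES = ("microbial", "physicochemical", "spatial", "weather")

# Hypothetical column names grouped by feature type (illustration only;
# these are not taken from the study's datasets).
COLUMNS_BY_TYPE = {
    "microbial": ["ecoli_log10_mpn"],
    "physicochemical": ["turbidity", "ph"],
    "spatial": ["pct_pasture_upstream"],
    "weather": ["rain_mm_prev_24h"],
}


def feature_type_subsets(types, min_size, max_size):
    """Return every combination of feature types with size in [min_size, max_size]."""
    subsets = []
    for size in range(min_size, max_size + 1):
        subsets.extend(combinations(types, size))
    return subsets


def columns_for(subset):
    """Flatten the column names belonging to the chosen feature types."""
    return [col for ftype in subset for col in COLUMNS_BY_TYPE[ftype]]


# Candidate nested models: between one and three feature types.
nested = feature_type_subsets(FEATURE_TYPES, 1, 3)

# Full model: all four feature types.
full = tuple(FEATURE_TYPES)

# Baseline: E. coli concentration as the sole covariate.
baseline_columns = ["ecoli_log10_mpn"]

print(len(nested))                            # number of candidate subsets
print(columns_for(("microbial", "spatial")))  # columns for one nested model
print(columns_for(full))                      # columns for the full model
```

Each subset would then be passed to a learner and scored against the baseline; which learner and performance measure to use is left open here.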

SUBMITTER: Weller DL

PROVIDER: S-EPMC8009603 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

Predictive Models May Complement or Provide an Alternative to Existing Strategies for Assessing the Enteric Pathogen Contamination Status of Northeastern Streams Used to Provide Water for Produce Production.

Daniel L. Weller, Tanzy M. T. Love, Alexandra Belias, Martin Wiedmann

Frontiers in Sustainable Food Systems, 2020-10-06

PMID: 33791594

Similar Datasets

Landscape, Water Quality, and Weather Factors Associated With an Increased Likelihood of Foodborne Pathogen Contamination of New York Streams Used to Source Water for Produce Production.
| S-EPMC7241490 | biostudies-literature

Conditional Forest Models Built Using Metagenomic Data Accurately Predicted Salmonella Contamination in Northeastern Streams.
| S-EPMC10100987 | biostudies-literature

Associations between fecal contamination of the household environment and enteric pathogen detection in children living in Maputo, Mozambique.
| S-EPMC11952620 | biostudies-literature

Associations between Fecal Contamination of the Household Environment and Enteric Pathogen Detection in Children Living in Maputo, Mozambique.
| S-EPMC12281200 | biostudies-literature

Fecal Fingerprints of Enteric Pathogen Contamination in Public Environments of Kisumu, Kenya, Associated with Human Sanitation Conditions and Domestic Animals.
| S-EPMC6557411 | biostudies-literature

Podocytes Produce and Secrete Functional Complement C3 and Complement Factor H.
| S-EPMC7457071 | biostudies-literature

Spatial-Temporal Patterns in the Enteric Pathogen Contamination of Soil in the Public Environments of Low- and Middle-Income Neighborhoods in Nairobi, Kenya.
| S-EPMC11506941 | biostudies-literature

Tracking enteric pathogen contamination from on-site sanitation facilities to groundwater in selected rural areas of Vhembe District Municipality, Limpopo Province, South Africa.
| S-EPMC10937690 | biostudies-literature

Structure of an enteric pathogen, bovine parvovirus.
| S-EPMC4325758 | biostudies-literature

Swine Enteric Coronavirus: Diverse Pathogen-Host Interactions.
| S-EPMC8999375 | biostudies-literature