Dataset Information

A Dirichlet process model for classifying and forecasting epidemic curves.

ABSTRACT:

BACKGROUND: A forecast can be defined as an endeavor to quantitatively estimate a future event or probabilities assigned to a future occurrence. Forecasting stochastic processes such as epidemics is challenging since there are several biological, behavioral, and environmental factors that influence the number of cases observed at each point during an epidemic. However, accurate forecasts of epidemics would impact timely and effective implementation of public health interventions. In this study, we introduce a Dirichlet process (DP) model for classifying and forecasting influenza epidemic curves.

METHODS: The DP model is a nonparametric Bayesian approach that enables the matching of current influenza activity to simulated and historical patterns, identifies epidemic curves different from those observed in the past and enables prediction of the expected epidemic peak time. The method was validated using simulated influenza epidemics from an individual-based model and the accuracy was compared to that of the tree-based classification technique, Random Forest (RF), which has been shown to achieve high accuracy in the early prediction of epidemic curves using a classification approach. We also applied the method to forecasting influenza outbreaks in the United States from 1997-2013 using influenza-like illness (ILI) data from the Centers for Disease Control and Prevention (CDC).

RESULTS: We made the following observations. First, the DP model performed as well as RF in identifying several of the simulated epidemics. Second, the DP model correctly forecasted the peak time several days in advance for most of the simulated epidemics. Third, the accuracy of identifying epidemics different from those already observed improved with additional data, as expected. Fourth, both methods correctly classified epidemics with higher reproduction numbers (R) with a higher accuracy compared to epidemics with lower R values. Lastly, in the classification of seasonal influenza epidemics based on ILI data from the CDC, the methods' performance was comparable.

CONCLUSIONS: Although RF requires less computational time compared to the DP model, the algorithm is fully supervised, implying that epidemic curves different from those previously observed will always be misclassified. In contrast, the DP model can be unsupervised, semi-supervised or fully supervised. Since both methods have their relative merits, an approach that uses both RF and the DP model could be beneficial.
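
ILLUSTRATION: The abstract describes matching a partially observed epidemic curve to known patterns while still allowing for a curve unlike any seen before. The sketch below shows one common way a Dirichlet process prior supports this, using Chinese restaurant process weights. It is not the authors' implementation: the Gaussian likelihood, the base measure, the concentration parameter `alpha`, and every function and parameter name are assumptions chosen for illustration.

```python
import math


def log_normal(y, mu, sd):
    """Log density of a Gaussian with mean mu and standard deviation sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((y - mu) / sd) ** 2


def crp_assignment_probs(partial, cluster_means, cluster_sizes,
                         alpha=1.0, sigma=0.5, base_mean=0.0, base_sd=10.0):
    """Posterior probability that a partial curve belongs to each known
    cluster, with the last entry being the probability of a new cluster.

    Assumptions (hypothetical, for illustration only):
      - each known cluster is summarised by a mean curve,
      - observations are Gaussian around that mean with sd `sigma`,
      - a new cluster draws each time point's mean from N(base_mean, base_sd^2).
    """
    n = sum(cluster_sizes)
    log_w = []
    for mean, size in zip(cluster_means, cluster_sizes):
        # CRP prior weight for an existing cluster: size / (n + alpha)
        prior = math.log(size / (n + alpha))
        # zip stops at the shorter sequence, so only observed weeks are scored
        lik = sum(log_normal(y, m, sigma) for y, m in zip(partial, mean))
        log_w.append(prior + lik)

    # CRP prior weight for a new cluster: alpha / (n + alpha)
    new_prior = math.log(alpha / (n + alpha))
    # Marginal likelihood under the base measure (cluster mean integrated out)
    marginal_sd = math.sqrt(sigma ** 2 + base_sd ** 2)
    new_lik = sum(log_normal(y, base_mean, marginal_sd) for y in partial)
    log_w.append(new_prior + new_lik)

    # Normalise with log-sum-exp for numerical stability
    top = max(log_w)
    weights = [math.exp(w - top) for w in log_w]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical known epidemic shapes (weekly case counts, made-up numbers)
known_means = [[1, 2, 4, 8, 4, 2], [1, 1, 2, 3, 5, 8]]
known_sizes = [5, 5]

probs_similar = crp_assignment_probs([1, 2, 4], known_means, known_sizes)
probs_novel = crp_assignment_probs([10, 20, 30], known_means, known_sizes)
```

In this toy setup, a partial curve that tracks the first known shape puts most of its probability on that cluster, while a curve far from both known shapes puts most of its probability on the final "new cluster" entry. A fully supervised classifier has no such entry, which is the limitation the conclusions attribute to RF.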

SUBMITTER: Nsoesie EO

PROVIDER: S-EPMC3901791 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

A Dirichlet process mixture model for clustering longitudinal gene expression data.
| S-EPMC5583037 | biostudies-literature

Genome-scale MicroRNA target prediction through clustering with Dirichlet process mixture model.
| S-EPMC6157162 | biostudies-literature

Dirichlet Process Mixture Model for Correcting Technical Variation in Single-Cell Gene Expression Data.
| S-EPMC6004614 | biostudies-literature

A dependent Bayesian Dirichlet process model for source apportionment of particle number size distribution.
| S-EPMC10077992 | biostudies-literature

Discovering key transcriptomic regulators in pancreatic ductal adenocarcinoma using Dirichlet process Gaussian mixture model.
| S-EPMC8041769 | biostudies-literature

Neighbor-dependent Ramachandran probability distributions of amino acids developed from a hierarchical Dirichlet process model.
| S-EPMC2861699 | biostudies-literature

Fast Bayesian Inference in Dirichlet Process Mixture Models.
| S-EPMC3812957 | biostudies-literature

Forecasting Flu Activity in the United States: Benchmarking an Endemic-Epidemic Beta Model.
| S-EPMC7068443 | biostudies-literature

Forecasting the long-term trend of COVID-19 epidemic using a dynamic model.
| S-EPMC7713358 | biostudies-literature

ePyDGGA: automatic configuration for fitting epidemic curves.
| S-EPMC10774272 | biostudies-literature