Dataset Information

Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models.


ABSTRACT:

Background: The inherent difficulty of identifying and monitoring emerging outbreaks caused by novel pathogens can lead to their rapid spread; and if left unchecked, they may become major public health threats to the planet. The ongoing coronavirus disease (COVID-19) outbreak, which has infected over 2,300,000 individuals and caused over 150,000 deaths, is an example of one of these catastrophic events.

Objective: We present a timely and novel methodology that combines disease estimates from mechanistic models and digital traces, via interpretable machine learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real time.

Methods: Our method uses the following as inputs: (a) official health reports, (b) COVID-19-related internet search activity, (c) news media activity, and (d) daily forecasts of COVID-19 activity from a metapopulation mechanistic model. Our machine learning methodology uses a clustering technique that enables the exploitation of geospatial synchronicities of COVID-19 activity across Chinese provinces and a data augmentation technique to deal with the small number of historical disease observations characteristic of emerging outbreaks.

Results: Our model is able to produce stable and accurate forecasts 2 days ahead of the current time and outperforms a collection of baseline models in 27 out of 32 Chinese provinces.

Conclusions: Our methodology could be easily extended to other geographies currently affected by COVID-19 to aid decision makers with monitoring and possibly prevention.

SUBMITTER: Liu D 

PROVIDER: S-EPMC7459435 | biostudies-literature | 2020 Aug

REPOSITORIES: biostudies-literature

Publications

Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models.

Dianbo Liu, Leonardo Clemente, Canelle Poirier, Xiyu Ding, Matteo Chinazzi, Jessica Davis, Alessandro Vespignani, Mauricio Santillana

Journal of Medical Internet Research, 2020-08-17, issue 8



Similar Datasets

| S-EPMC10187357 | biostudies-literature
| S-EPMC10159233 | biostudies-literature
| S-EPMC5871642 | biostudies-literature
| S-EPMC7439145 | biostudies-literature
| S-EPMC6121893 | biostudies-other
| S-EPMC9427440 | biostudies-literature
| S-EPMC8725063 | biostudies-literature
| S-EPMC11333698 | biostudies-literature
| S-EPMC3405757 | biostudies-other
| S-EPMC6754403 | biostudies-other