Dataset Information

A comparison of five epidemiological models for transmission of SARS-CoV-2 in India.

ABSTRACT:

Background

Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures, lockdowns, and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline curve-fitting model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM).

Methods

Using COVID-19 case-recovery-death count data reported in India from March 15 to October 15 to train the models, we generate predictions from each of the five models from October 16 to December 31. To compare prediction accuracy with respect to reported cumulative and active case counts and reported cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. For reported cumulative cases and deaths, we compute Pearson's and Lin's correlation coefficients to investigate how well the projected and observed reported counts agree. We also present underreporting factors when available, and comment on uncertainty of projections from each model.
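
The three agreement measures named above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: it assumes one common definition of SMAPE (absolute error divided by the mean of the absolute observed and predicted values, averaged and expressed as a percentage) and the usual moment-based form of Lin's concordance correlation coefficient. The function names and the toy counts are hypothetical.

```python
from math import sqrt


def smape(observed, predicted):
    """Symmetric mean absolute prediction error, in percent (assumed definition)."""
    terms = [
        abs(p - o) / ((abs(p) + abs(o)) / 2)
        for o, p in zip(observed, predicted)
    ]
    return 100 * sum(terms) / len(terms)


def _moments(x, y):
    """Means, population variances and population covariance of two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson(x, y):
    """Pearson's correlation: strength of the linear relationship."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / sqrt(sxx * syy)


def lin_ccc(x, y):
    """Lin's concordance correlation: agreement with the 45-degree line."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Toy (hypothetical) counts: the projection overshoots by a constant amount.
observed = [100.0, 110.0, 120.0]
predicted = [110.0, 120.0, 130.0]

print(f"SMAPE:   {smape(observed, predicted):.2f}%")
print(f"Pearson: {pearson(observed, predicted):.3f}")
print(f"Lin CCC: {lin_ccc(observed, predicted):.3f}")
```

In the toy data the projection tracks the trend exactly but is shifted upward, so Pearson's coefficient is at its maximum while Lin's coefficient is penalized for the offset. This is why reporting both is informative when comparing projected and observed counts.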

Results

For active case counts, SMAPE values are 35.14% (SEIR-fansy) and 37.96% (eSIR). For cumulative case counts, SMAPE values are 6.89% (baseline), 6.59% (eSIR), 2.25% (SAPHIRE) and 2.29% (SEIR-fansy). For cumulative death counts, the SMAPE values are 4.74% (SEIR-fansy), 8.94% (eSIR) and 0.77% (ICM). Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) cumulative case counts as well. We compute underreporting factors as of October 31 and note that for cumulative cases, the SEIR-fansy model yields an underreporting factor of 7.25 and the ICM model yields 4.54 for the same quantity. For total (sum of reported and unreported) cumulative deaths, the SEIR-fansy model reports an underreporting factor of 2.97. On October 31, we observe 8.18 million cumulative reported cases. The corresponding projections (in millions, with 95% credible intervals) are 8.71 (8.63-8.80) from the baseline model, 8.35 (7.19-9.60) from eSIR, 8.17 (7.90-8.52) from SAPHIRE and 8.51 (8.18-8.85) from SEIR-fansy. Cumulative case projections from the eSIR model have the highest uncertainty in terms of width of 95% credible intervals, followed by those from SAPHIRE, the baseline model and finally SEIR-fansy.
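
As a rough check on the October 31 figures quoted above, the short Python sketch below computes each model's relative error against the observed 8.18 million reported cases and tests whether the observed value lies inside the stated 95% credible interval. The numbers are copied from the abstract; the `summarize` helper and the dictionary layout are hypothetical, and this single-day comparison is not a substitute for the SMAPE values computed over the whole test period.

```python
# Observed cumulative reported cases on October 31, in millions (from the abstract).
OBSERVED = 8.18

# Model -> (point projection, lower bound, upper bound), in millions (from the abstract).
projections = {
    "baseline": (8.71, 8.63, 8.80),
    "eSIR": (8.35, 7.19, 9.60),
    "SAPHIRE": (8.17, 7.90, 8.52),
    "SEIR-fansy": (8.51, 8.18, 8.85),
}


def summarize(observed, point, lower, upper):
    """Return (signed relative error in percent, whether the interval covers observed)."""
    rel_error_pct = 100 * (point - observed) / observed
    covered = lower <= observed <= upper
    return rel_error_pct, covered


summary = {
    model: summarize(OBSERVED, *values) for model, values in projections.items()
}

for model, (rel_error_pct, covered) in summary.items():
    print(f"{model:<11} relative error {rel_error_pct:+.2f}%  interval covers observed: {covered}")
```

Because the published bounds are rounded to two decimals, coverage is marginal for any model whose bound coincides with the observed value, so the boolean result for such a model should be read with caution.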

Conclusions

In this comparative paper, we describe five different models used to study the transmission dynamics of the SARS-CoV-2 virus in India. While simulation studies are the only gold-standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case counts against observed data on a test period. The largest variability across models is observed in predicting the "total" number of infections, including reported and unreported cases (on which we have no validation data). The degree of underreporting has been a major concern in India and is characterized in this report. Overall, the SEIR-fansy model appeared to be a good choice, with a publicly available R package and the desired flexibility and accuracy.

SUBMITTER: Purkayastha S

PROVIDER: S-EPMC8181542 | biostudies-literature |

REPOSITORIES: biostudies-literature


Similar Datasets

Project description: The current investigation was conducted with the objective of developing an epidemiological case definition of possible severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2) re-infection and assessing its magnitude in India. The epidemiological case definition for SARS-CoV-2 re-infection was developed from a literature review of data on viral kinetics. To achieve the second objective, the individuals who satisfied the developed case definition for SARS-CoV-2 re-infection were contacted telephonically. Taking available evidence into consideration, re-infection with SARS-CoV-2 in our study was defined as any individual who tested positive for SARS-CoV-2 on two separate occasions by either molecular tests or rapid antigen test at an interval of at least 102 days with one negative molecular test in between. In this archive-based, telephonic survey, 58 out of 1300 individuals (4.5%) fulfilled the above-mentioned definition; 38 individuals could be contacted, with healthcare workers (HCWs) accounting for 31.6% of the cases. A large proportion of participants was asymptomatic and had a higher Ct value during the first episode. While SARS-CoV-2 re-infection is still a rare phenomenon, there is a need for an epidemiological definition of re-infection for establishing surveillance systems, and this study contributes to such a goal.

Severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2) re-infection is an emerging concern and there is a need to define it. Therefore, a working epidemiological case definition for re-infection was developed and its magnitude was explored via an archive-based, telephonic survey. Re-infection with SARS-CoV-2 was defined as two positive tests at an interval of at least 102 days with one interim negative test. Thirty-eight of the 58 eligible patients could be contacted, with 12 (31.6%) being HCWs. The majority of the participants were asymptomatic and had a higher Ct value during their first episode. To conclude, a working epidemiological case definition of SARS-CoV-2 re-infection is important to strengthen surveillance. The present investigation contributes to this goal and records reinfection in 4.5% of SARS-CoV-2 infected individuals in India.

Project description:A detailed understanding of how and when severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) transmission occurs is crucial for designing effective prevention measures. Other than contact tracing, genome sequencing provides information to help infer who infected whom. However, the effectiveness of the genomic approach in this context depends on both (high enough) mutation and (low enough) transmission rates. Today, the level of resolution that we can obtain when describing SARS-CoV-2 outbreaks using just genomic information alone remains unclear. In order to answer this question, we sequenced forty-nine SARS-CoV-2 patient samples from ten local clusters in NW Spain for which partial epidemiological information was available and inferred transmission history using genomic variants. Importantly, we obtained high-quality genomic data, sequencing each sample twice and using unique barcodes to exclude cross-sample contamination. Phylogenetic and cluster analyses showed that consensus genomes were generally sufficient to discriminate among independent transmission clusters. However, levels of intrahost variation were low, which prevented in most cases the unambiguous identification of direct transmission events. After filtering out recurrent variants across clusters, the genomic data were generally compatible with the epidemiological information but did not support specific transmission events over possible alternatives. We estimated the effective transmission bottleneck size to be one to two viral particles for sample pairs whose donor-recipient relationship was likely. Our analyses suggest that intrahost genomic variation in SARS-CoV-2 might be generally limited and that homoplasy and recurrent errors complicate identifying shared intrahost variants. Reliable reconstruction of direct SARS-CoV-2 transmission based solely on genomic data seems hindered by a slow mutation rate, potential convergent events, and technical artifacts. 
Detailed contact tracing seems essential in most cases to study SARS-CoV-2 transmission at high resolution.

Project description: Background: The emergence of a novel coronavirus, SARS-CoV-2, in December 2019, progressed to become a world pandemic in a few months and reached South Africa at the beginning of March. To investigate introduction and understand the early transmission dynamics of the virus, we formed the South African Network for Genomics Surveillance of COVID (SANGS_COVID), a network of ten government and university laboratories. Here, we present the first results of this effort, which is a molecular epidemiological study of the first twenty-one SARS-CoV-2 whole genomes sampled in the first port of entry, KwaZulu-Natal (KZN), during the first month of the epidemic. By combining this with calculations of the effective reproduction number (R), we aim to shed light on the patterns of infections that define the epidemic in South Africa.

Methods: R was calculated using positive cases and deaths from reports provided by the four major provinces. Molecular epidemiology investigation involved sequencing viral genomes from patients in KZN using ARCTIC protocols and assembling whole genomes using meticulous alignment methods. Phylogenetic analysis was performed using maximum likelihood (ML) and Bayesian trees, lineage classification and molecular clock calculations.

Findings: The epidemic in South Africa has been very heterogeneous. Two of the largest provinces, Gauteng, home of the two large metropolises Johannesburg and Pretoria, and KwaZulu-Natal, home of the third largest city in the country, Durban, had a slow growth rate in the number of detected cases, whereas in the Western Cape, home of Cape Town, and the Eastern Cape provinces, the epidemic is spreading fast. Our estimates of transmission potential for South Africa suggest a decreasing transmission potential towards R=1 since the first cases and deaths were reported. However, between 06 May and 18 May 2020, we estimate that R was on average 1.39 (95% CI: 1.04-2.15). We also demonstrate that early transmission in KZN, and most probably in all main regions of SA, was associated with multiple international introductions and dominated by lineages B1 and B. The study also provides evidence for locally acquired infections in a hospital in Durban within the first month of the epidemic, which inflated early mortality in KZN.

Interpretation: This first report of the SANGS_COVID consortium focuses on understanding the epidemic heterogeneity and introduction of SARS-CoV-2 strains in the first month of the epidemic in South Africa. The early introduction of SARS-CoV-2 in KZN caused a localized outbreak in a hospital, which provides a potential explanation for the initially high death rates in the province. The current high rate of transmission of COVID-19 in the Western Cape and Eastern Cape highlights the crucial need to strengthen local genomic surveillance in South Africa.

Funding: UKZN Flagship Program entitled: Afrocentric Precision Approach to Control Health Epidemic, by a research Flagship grant from the South African Medical Research Council (MRC-RFA-UFSP-01-2013/UKZN HIVEPI), by the Technology Innovation Agency and the Department of Science and Innovation, and by the National Human Genome Research Institute of the National Institutes of Health under Award Number U24HG006941. H3ABioNet is an initiative of the Human Health and Heredity in Africa Consortium (H3Africa).