Project description:The human pathogen severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the major pandemic of the twenty-first century. We analyzed more than 4700 SARS-CoV-2 genomes and associated metadata retrieved from public repositories. SARS-CoV-2 sequences have a high sequence identity (>99.9%), which drops to >96% when compared to bat coronavirus genomes. We built a mutation-annotated reference SARS-CoV-2 phylogeny with two main macro-haplogroups, A and B, both of Asian origin, and more than 160 sub-branches representing virus strains of variable geographical origins worldwide, revealing a rather uniform mutation occurrence along branches that could have implications for diagnostics and the design of future vaccines. Identification of the root of SARS-CoV-2 genomes is not without problems, owing to conflicting interpretations derived from either using the bat coronavirus genomes as an outgroup or relying on the sampling chronology of the SARS-CoV-2 genomes and TMRCA estimates; however, the overall scenario favors haplogroup A as the ancestral node. Phylogenetic analysis indicates a TMRCA for SARS-CoV-2 genomes dating to November 12, 2019, thus matching epidemiological records. Sub-haplogroup A2 most likely originated in Europe from an Asian ancestor and gave rise to subclade A2a, which represents the major non-Asian outbreak, especially in Africa and Europe. Multiple founder effect episodes, most likely associated with super-spreader hosts, might explain the COVID-19 pandemic to a large extent.
Project description:Heterogeneity in transmission is a challenge for infectious disease dynamics and control. An 80-20 "Pareto" rule has been proposed to describe this heterogeneity whereby 80% of transmission is accounted for by 20% of individuals, herein called super-spreaders. It is unclear, however, whether super-spreading can be attributed to certain individuals or whether it is an unpredictable and unavoidable feature of epidemics. Here, we investigate heterogeneous malaria transmission at three sites in Uganda and find that super-spreading is negatively correlated with overall malaria transmission intensity. Mosquito biting among humans is 90-10 at the lowest transmission intensities declining to less than 70-30 at the highest intensities. For super-spreaders, biting ranges from 70-30 down to 60-40. The difference, approximately half the total variance, is due to environmental stochasticity. Super-spreading is thus partly due to super-spreaders, but modest gains are expected from targeting super-spreaders.
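The 80-20 style figures quoted above can be read as "the share of all transmission (or bites) accounted for by the top fraction of individuals". A minimal sketch of that calculation follows; the function name `top_share` and the bite counts are hypothetical illustrations, not data from the Uganda sites.

```python
def top_share(counts, top_fraction=0.2):
    """Return the share of the total contributed by the top `top_fraction` of individuals.

    `counts` holds one non-negative value per individual (for example, mosquito
    bites received). The values used below are made up for illustration.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    ordered = sorted(counts, reverse=True)
    # Always include at least one individual in the "top" group.
    n_top = max(1, round(top_fraction * len(ordered)))
    total = sum(ordered)
    if total == 0:
        return 0.0
    return sum(ordered[:n_top]) / total


# Hypothetical bite counts for ten people: two people receive most of the bites.
bites = [40, 40, 5, 5, 2, 2, 2, 2, 1, 1]
share_top_20 = top_share(bites, top_fraction=0.2)
print(f"Top 20% of individuals account for {share_top_20:.0%} of bites")
```

With this toy input the top two people receive 80 of 100 bites, an exact 80-20 pattern. A 90-10 pattern would correspond to `top_share(counts, 0.1)` returning about 0.9.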
Project description:Spain has been one of the main global pandemic epicenters for coronavirus disease 2019 (COVID-19). Here, we analyzed >41 000 genomes (including >26 000 high-quality (HQ) genomes) downloaded from the GISAID repository, including 1 245 (922 HQ) sampled in Spain. The aim of this study was to investigate genome variation of novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and reconstruct phylogeographic and transmission patterns in Spain. Phylogeographic analysis suggested at least 34 independent introductions of SARS-CoV-2 to Spain at the beginning of the outbreak. Six lineages spread very successfully in the country, probably favored by super-spreaders, namely, A2a4 (7.8%), A2a5 (38.4%), A2a10 (2.8%), B3a (30.1%), and B9 (8.7%), which accounted for 87.9% of all genomes in the Spanish database. One distinct feature of the Spanish SARS-CoV-2 genomes was the higher frequency of B lineages (39.3%, mainly B3a+B9) than found in any other European country. While B3a, B9 (and an important sub-lineage of A2a5, namely, A2a5c) most likely originated in Spain, the other three haplogroups were imported from other European locations. The B3a strain may have originated in the Basque Country from a B3 ancestor of uncertain geographic origin, whereas B9 likely emerged in Madrid. The time of the most recent common ancestor (TMRCA) of SARS-CoV-2 suggested that the first coronavirus entered the country around 11 February 2020, as estimated from the TMRCA of B3a, the first lineage detected in the country. Moreover, earlier claims that the D614G mutation is associated with higher transmissibility are not consistent with the very high prevalence of COVID-19 in Spain when compared to other countries with lower disease incidence but much higher frequency of this mutation (56.4% in Spain vs. 82.4% in the rest of Europe). Instead, the data support a major role of genetic drift in modeling the micro-geographic stratification of virus strains across the country as well as the role of SARS-CoV-2 super-spreaders.
Project description:As lockdowns and stay-at-home orders start to be lifted across the globe, governments are struggling to establish effective and practical guidelines to reopen their economies. In dense urban environments with people returning to work and public transportation resuming full capacity, enforcing strict social distancing measures will be extremely challenging, if not practically impossible. Governments are thus paying close attention to particular locations that may become the next cluster of disease spreading. Indeed, certain places, like some people, can be "super-spreaders". Is a bustling train station in a central business district more or less susceptible and vulnerable as compared to teeming bus interchanges in the suburbs? Here, we propose a quantitative and systematic framework to identify spatial super-spreaders and the novel concept of super-susceptibles, i.e. respectively, places most likely to contribute to disease spread or to people contracting it. Our proposed data-analytic framework is based on the daily-aggregated ridership data of public transport in Singapore. By constructing the directed and weighted human movement networks and integrating human flow intensity with two neighborhood diversity metrics, we are able to pinpoint super-spreader and super-susceptible locations. Our results reveal that most super-spreaders are also super-susceptibles and that counterintuitively, busy peripheral bus interchanges are riskier places than crowded central train stations. Our analysis is based on data from Singapore, but can be readily adapted and extended for any other major urban center. It therefore serves as a useful framework for devising targeted and cost-effective preventive measures for urban planning and epidemiological preparedness.
Project description:Quantifying nodal spreading abilities and identifying potential influential spreaders has recently been one of the most engaging research topics, and it is essential for facilitating information flow and ensuring the stable operation of social networks. However, most existing algorithms offer only a basic quantification that measures node importance from a certain attribute of the nodes. Moreover, it is difficult for these algorithms to balance accuracy and simplicity. To identify potential super-spreaders accurately, the present study proposes the CumulativeRank algorithm. This algorithm combines the local and global performances of nodes to measure nodal spreading abilities. For local performance, the proposed algorithm considers both the direct influence from the node's neighbourhood and the indirect influence from the nearest and the next nearest neighbours. For global performance, the concept of tenacity is introduced to assess the node's prominence in maintaining network connectivity. Extensive experiments carried out with the Susceptible-Infected-Recovered (SIR) model on real-world social networks demonstrate the accuracy and stability of the proposed algorithm. Furthermore, a comparison with existing well-known algorithms shows that the proposed algorithm has lower time complexity and is applicable to large-scale networks.
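The SIR-based evaluation mentioned above can be sketched as follows: seed an outbreak at a single node, simulate it many times, and use the average outbreak size as that node's empirical spreading ability. This is only an illustration of the evaluation step, not the CumulativeRank algorithm itself. The discrete-time SIR variant (each infected node is infectious for exactly one step), the star-shaped toy network, and all function names and parameter values are assumptions made for the example.

```python
import random


def sir_outbreak_size(adjacency, seed_node, beta, rng):
    """Run one discrete-time SIR outbreak and return the number of nodes ever infected.

    Assumption: each infected node tries to infect every susceptible neighbour
    with probability `beta` during one step, then recovers.
    """
    susceptible = set(adjacency) - {seed_node}
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for node in infected:
            for neighbour in adjacency[node]:
                if neighbour in susceptible and rng.random() < beta:
                    newly_infected.add(neighbour)
        susceptible -= newly_infected
        recovered |= infected
        infected = newly_infected
    return len(recovered)


def mean_outbreak_size(adjacency, seed_node, beta, runs=2000, seed=0):
    """Average outbreak size over `runs` simulations started from `seed_node`."""
    rng = random.Random(seed)
    total = sum(sir_outbreak_size(adjacency, seed_node, beta, rng) for _ in range(runs))
    return total / runs


# Hypothetical toy network: node 0 is a hub connected to ten leaves (1..10).
star = {0: list(range(1, 11))}
for leaf in range(1, 11):
    star[leaf] = [0]

hub_score = mean_outbreak_size(star, seed_node=0, beta=0.3)
leaf_score = mean_outbreak_size(star, seed_node=1, beta=0.3)
print(f"hub: {hub_score:.2f}, leaf: {leaf_score:.2f}")
```

In this toy network the hub reaches more nodes on average than a leaf does, which is the kind of ground-truth ranking a centrality measure is usually compared against.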
Project description:Phylodynamic analyses using pathogen genetic data have become popular for making epidemiological inferences. However, many methods assume that the underlying host population follows homogeneous mixing patterns. Nevertheless, in real disease outbreaks, a small number of individuals infect a disproportionately large number of others (super-spreaders). Our objective was to quantify the degree of bias in estimating the epidemic starting date in the presence of super-spreaders using different sample selection strategies. We simulated 100 epidemics of a hypothetical pathogen (fast-evolving, foot-and-mouth disease virus-like) over a real livestock movement network, allowing genetic mutations in the pathogen sequence. Genetic sequences were sampled serially over the epidemic and then used to estimate the epidemic starting date using Extended Bayesian Coalescent Skyline plot (EBSP) and Birth-death skyline plot (BDSKY) models. Our results showed that the degree of bias varies across epidemic situations, with substantial overestimation of the epidemic duration on some occasions. While the accuracy and precision of BDSKY deteriorated when a super-spreader generated a larger proportion of secondary cases, those of EBSP deteriorated when epidemics were shorter. The accuracy of the inference was similar irrespective of whether the analysis used all sampled sequences or only a subset of them, although the former required substantially longer computational times. When phylodynamic analyses need to be performed under a time constraint to inform policy makers, we suggest using multiple phylodynamic models simultaneously on a subset of the data to ascertain the robustness of inferences.
Project description:The clinical characteristics of patients infected with SARS-CoV-2 variants carrying the N501Y mutation (N501YV) are not fully understood, especially in the setting of general practice. In this retrospective cohort study, we analyzed COVID-19 patients admitted to one general practitioner clinic between 26 March and 26 May 2021. The characteristics, clinical symptoms and radiological findings before treatment were compared between N501YV and wild-type 501N. Twenty-eight patients were classified as wild-type 501N and 24 as N501YV. The mean (±standard deviation) age was 37.4 (±16.1) years, with no significant difference between groups. Among clinical symptoms, the prevalence of fever of 38 degrees Celsius (°C) or higher was significantly higher in the N501YV group than in the wild-type 501N group (p = 0.001). Multivariate analysis showed that fever of 38 °C or higher remained significantly associated with N501YV (adjusted odds ratio [aOR]: 6.07, 95% confidence interval [CI]: 1.68 to 21.94). For radiological findings, the lung involvement area was significantly larger in patients infected with N501YV (p = 0.013). In conclusion, fever of 38 °C or higher and extensive pneumonia were observed more frequently in the N501YV group than in the wild-type 501N group. There was no significant difference in other demographics and clinical symptoms.
Project description:In individual SARS-CoV-2 outbreaks, the counts of confirmed cases and deaths follow a Gompertz growth function for locations of very different sizes. This lack of dependence on region size leads us to hypothesize that virus spread depends on universal properties of the network of social interactions. We test this hypothesis by simulating the propagation of a virus on networks of different topologies. Our main finding is that the Gompertz growth observed for early outbreaks occurs only for a scale-free network, in which nodes with many more neighbors than average are common. These highly connected nodes are infected early in the outbreak and then spread the infection very rapidly. When these nodes are no longer infectious, the remaining nodes with the most neighbors take over and continue to spread the infection. In this way, the rate of spread is fastest at the very start and slows down immediately. Geometrically, the "surface" of the epidemic, that is, the number of susceptible nodes in contact with infected nodes, starts to decrease rapidly very early in the epidemic, as soon as the larger nodes have been infected. In our simulation, the speed and impact of an outbreak depend on three parameters: the average number of contacts each node makes, the probability of being infected by a neighbor, and the probability of recovery. Intelligent interventions to reduce the impact of future outbreaks need to focus on these critical parameters in order to minimize economic and social collateral damage.
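The statement that spread is fastest at the very start and slows down immediately is a defining property of Gompertz growth. A common parameterization is N(t) = K·exp(−b·exp(−c·t)), whose relative growth rate d(ln N)/dt = b·c·exp(−c·t) decays exponentially from t = 0. The sketch below checks this numerically. The parameterization and the parameter values are assumptions chosen for illustration, not the fits reported in the study.

```python
import math


def gompertz(t, K, b, c):
    """Cumulative count under Gompertz growth: N(t) = K * exp(-b * exp(-c * t))."""
    return K * math.exp(-b * math.exp(-c * t))


def relative_growth_rate(t, b, c):
    """Analytic d(ln N)/dt for the Gompertz curve: b * c * exp(-c * t)."""
    return b * c * math.exp(-c * t)


# Hypothetical parameters: final size K, displacement b, decay rate c (per day).
K, b, c = 10_000.0, 8.0, 0.1

for day in (0, 10, 20, 40):
    n = gompertz(day, K, b, c)
    r = relative_growth_rate(day, b, c)
    print(f"day {day:2d}: cumulative = {n:10.2f}, relative growth rate = {r:.4f}")
```

Because the relative growth rate is largest at t = 0 and only ever decreases, the curve never has an exponential phase with a constant rate, in contrast to the early phase of a homogeneous-mixing SIR model.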
Project description:Objectives: Super-spreading events caused by overdispersed secondary transmission are crucial in the transmission of coronavirus disease 2019 (COVID-19). However, the exact level of overdispersion, demographics, and other factors associated with secondary transmission remain elusive. In this study, we aimed to elucidate the frequency and patterns of secondary transmission of SARS-CoV-2 in Japan. Methods: We analyzed 16,471 cases between January and August 2020. We generated the distribution of the number of secondary cases and estimated the dispersion parameter (k) by fitting the negative binomial distribution in each phase. The frequencies of secondary transmission were compared by demographic and clinical characteristics, calculating the odds ratio with logistic regression models. Results: We observed that 76.7% of the primary cases did not generate secondary cases, with an estimated dispersion parameter k of 0.23. The demographic patterns of primary-secondary cases differed between phases, with 20-69 years being the predominant age group. There were higher proportions of secondary cases among elderly cases, symptomatic cases, and those with two days or more between onset and confirmation. Conclusions: The study estimated the frequency of secondary transmission of SARS-CoV-2 and the characteristics of the people who generated secondary transmission.
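For a negative binomial offspring distribution with mean R and dispersion parameter k, the probability that a case generates no secondary cases is (1 + R/k)^(−k), so a small k goes hand in hand with a large share of non-transmitting cases. The sketch below illustrates this relationship and a simple estimate of k. Note that it uses a method-of-moments estimator, k = m² / (v − m), as a stand-in for the distribution fitting described in the abstract, and the secondary-case counts are made up for the example.

```python
from statistics import mean, pvariance


def nb_zero_probability(r, k):
    """P(no secondary cases) under a negative binomial with mean `r` and dispersion `k`."""
    return (1.0 + r / k) ** (-k)


def moment_estimate_k(secondary_counts):
    """Method-of-moments estimate of the dispersion parameter k.

    Uses variance = m + m**2 / k, so k = m**2 / (v - m). This is a simple
    substitute for a likelihood-based fit and requires variance > mean.
    """
    m = mean(secondary_counts)
    v = pvariance(secondary_counts)
    if v <= m:
        raise ValueError("no overdispersion: variance must exceed the mean")
    return m * m / (v - m)


# Hypothetical data: 20 primary cases, most of whom infect nobody.
toy_counts = [0] * 15 + [1] * 3 + [2, 10]
k_hat = moment_estimate_k(toy_counts)
print(f"estimated k = {k_hat:.3f}")
print(f"P(0 secondary cases) at that k = {nb_zero_probability(mean(toy_counts), k_hat):.3f}")
```

Smaller values of k concentrate transmission in fewer cases: for the same mean, `nb_zero_probability` increases as k decreases.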