An open repository of real-time COVID-19 indicators.
ABSTRACT: The COVID-19 pandemic presented enormous data challenges in the United States. Policy makers, epidemiological modelers, and health researchers all require up-to-date data on the pandemic and relevant public behavior, ideally at fine spatial and temporal resolution. The COVIDcast API is our attempt to fill this need: Operational since April 2020, it provides open access to both traditional public health surveillance signals (cases, deaths, and hospitalizations) and many auxiliary indicators of COVID-19 activity, such as signals extracted from deidentified medical claims data, massive online surveys, cell phone mobility data, and internet search trends. These are available at a fine geographic resolution (mostly at the county level) and are updated daily. The COVIDcast API also tracks all revisions to historical data, allowing modelers to account for the frequent revisions and backfill that are common for many public health data sources. All of the data are available in a common format through the API and accompanying R and Python software packages. This paper describes the data sources and signals, and provides examples demonstrating that the auxiliary signals in the COVIDcast API present information relevant to tracking COVID activity, augmenting traditional public health reporting and empowering research and decision-making.
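The revision tracking described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical in-memory record layout (`Observation`, with a `reference_date` and an `issue_date`) and a hypothetical helper `as_of`; it is not the COVIDcast API or its R and Python clients, only an illustration of how stored revisions let a modeler see the data as it stood on a given day.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One published version of a signal value (hypothetical layout)."""
    geo: str
    reference_date: date  # the day the value describes
    issue_date: date      # the day this version was published
    value: float


def as_of(observations, query_date):
    """For each (geo, reference_date), keep the latest version issued on or before query_date."""
    latest = {}
    for obs in observations:
        if obs.issue_date > query_date:
            continue  # this version was not yet published on query_date
        key = (obs.geo, obs.reference_date)
        if key not in latest or obs.issue_date > latest[key].issue_date:
            latest[key] = obs
    return latest


# Synthetic example: a value first reported as 10.0, later backfilled to 14.0.
history = [
    Observation("42003", date(2020, 6, 1), date(2020, 6, 2), 10.0),
    Observation("42003", date(2020, 6, 1), date(2020, 6, 9), 14.0),
]
early = as_of(history, date(2020, 6, 3))
late = as_of(history, date(2020, 6, 10))
```

Training a forecaster on `early` rather than `late` avoids leaking revised values that were not available at forecast time.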
Project description: The COVID-19 outbreak is a global pandemic declared by the World Health Organization, with rapidly increasing cases in most countries. A wide range of research is urgently needed to understand the COVID-19 pandemic, including its transmissibility, geographic spread, risk factors for infection, and economic impacts. Reliable data archiving and sharing are essential to jump-start innovative research to combat COVID-19. This research is a collaborative and innovative effort to build such an archive, including the collection of various data resources relevant to COVID-19 research, such as daily cases, social media, population mobility, health facilities, climate, socioeconomic data, research articles, policy and regulation, and global news. Due to the heterogeneity between data sources, our effort also includes processing and integrating different datasets based on GIS (Geographic Information System) base maps to make them relatable and comparable. To keep the data files permanent, we published all open data to the Harvard Dataverse (https://dataverse.harvard.edu/dataverse/2019ncov), an online data management and sharing platform with a permanent Digital Object Identifier number for each dataset. Finally, preliminary studies were conducted based on the shared COVID-19 datasets and revealed different spatial transmission patterns among mainland China, Italy, and the United States.
Project description: Objectives: The ongoing coronavirus disease 2019 (COVID-19) pandemic has drastically impacted global health and the economy. Computed tomography (CT) is the prime imaging modality for diagnosis of lung infections in COVID-19 patients. Data-driven and artificial intelligence (AI)-powered solutions for automatic processing of CT images predominantly rely on large-scale, heterogeneous datasets. Owing to privacy and data availability issues, open-access and publicly available COVID-19 CT datasets are difficult to obtain, thus limiting the development of AI-enabled automatic diagnostic solutions. To tackle this problem, large CT image datasets encompassing diverse patterns of lung infections are in high demand. Data description: In the present study, we provide an open-source repository containing 1000+ CT images of COVID-19 lung infections established by a team of board-certified radiologists. CT images were acquired from two main general university hospitals in Mashhad, Iran from March 2020 until January 2021. COVID-19 infections were confirmed with matching tests including reverse transcription polymerase chain reaction (RT-PCR) and accompanying clinical symptoms. All data are 16-bit grayscale images composed of 512 × 512 pixels and are stored in the DICOM standard. Patient privacy is preserved by removing all patient-specific information from image headers. Subsequently, all images corresponding to each patient are compressed and stored in RAR format.
Project description: Early detection of infectious disease is crucial for reducing transmission and facilitating early intervention. We built a real-time smartwatch-based alerting system for the detection of aberrant physiological and activity signals (e.g. resting heart rate, steps) associated with early infection onset at the individual level. Upon applying this system to a cohort of 3,246 participants, we found that alerts were generated for pre-symptomatic and asymptomatic COVID-19 infections in 78% of cases, and pre-symptomatic signals were observed a median of three days prior to symptom onset. Furthermore, by examining over 100,000 survey annotations, we found that other respiratory infections as well as events not associated with COVID-19 (e.g. stress, alcohol consumption, travel) could trigger alerts, albeit for a shorter mean period (1.9 days) than that observed in the COVID-19 cases (4.3 days). Thus this system has potential both as an advance warning system for COVID-19 and as a general system for measuring health via detection of physiological shifts from personal baselines. The system is open-source and scalable to millions of users, offering a personal health monitoring system that can operate in real time on a global scale.
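The idea of alerting on shifts from a personal baseline can be sketched as follows. The abstract does not specify the detection algorithm, so this is only an assumed stand-in: a trailing-window z-score on daily resting heart rate, with a hypothetical function name `baseline_alerts` and hypothetical defaults (28-day baseline, threshold of 3 standard deviations).

```python
from statistics import mean, stdev


def baseline_alerts(resting_hr, baseline_days=28, z_threshold=3.0):
    """Return the day indices whose value sits far above the trailing personal baseline.

    Assumption: a simple z-score rule, not the published system's method.
    """
    alerts = []
    for i in range(baseline_days, len(resting_hr)):
        window = resting_hr[i - baseline_days:i]  # the preceding baseline_days values
        mu, sigma = mean(window), stdev(window)
        if sigma > 0 and (resting_hr[i] - mu) / sigma > z_threshold:
            alerts.append(i)
    return alerts


# Synthetic series: a stable 60-61 bpm baseline followed by a jump to 75 bpm.
series = [60, 61] * 14 + [60, 75]
alerts = baseline_alerts(series)
```

Because the baseline is computed per person, the same absolute heart rate can be normal for one wearer and anomalous for another.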
Project description: Background: The coronavirus disease (COVID) pandemic caused disruption globally and was particularly distressing in low- and middle-income countries such as India. This study aimed to provide population-representative estimates of COVID-related outcomes in India over time and characterize how COVID-related changes and impacts differ by key socioeconomic groups across the life course. Methods: The sample was leveraged from an existing nationally representative study on cognition and dementia in India: Harmonized Diagnostic Assessment of Dementia for the Longitudinal Aging Study in India (LASI-DAD). Wave 1 of LASI-DAD enrolled 4096 older adults aged 60 years and older in 3316 households from 18 states and union territories of India. Out of the 3316 LASI-DAD households, 2704 with valid phone numbers were contacted and invited to participate in the Real-Time Insights COVID-19 in India (RTI COVID-India) study. RTI COVID-India was a bi-monthly phone survey that provided insight into the individual's knowledge, attitudes, and behaviour towards COVID-19 and changes in the household's economic and health conditions throughout the pandemic. The survey was started in May 2020 and 9 rounds of data have been collected. Findings to date: Out of the 2704 LASI-DAD households with valid phone numbers, 1766 households participated in the RTI COVID-India survey at least once. Participants were in the age range of 18-102 years, 49% were female, and 66% resided in rural areas. Across all rounds, there was a higher report of infection among respondents aged 60-69 years. There was a greater prevalence of COVID-19 diagnosis reported in urban (23.0%) compared to rural areas (9.8%). Respondents with higher education had a greater prevalence of COVID-19 diagnosis compared to those with lower or no formal education. The highest prevalence of COVID-19 diagnosis was reported in households of high economic status, compared with those of middle and low economic status.
Comparing education gradients in experiencing COVID-19 symptoms and being diagnosed, we observe an opposite pattern: respondents with no formal schooling reported the highest level of experiencing COVID-19 symptoms, whereas the greatest proportion of the respondents with secondary school or higher education reported being diagnosed with COVID-19. Future plans: The study group will analyse the data collected showing the real-time changes throughout the pandemic and will make the data widely available for researchers to conduct further studies.
Project description: Electronic databases provide effective and efficient management of zebrafish colony operations, but commercially available options are expensive. In this study we have developed a free zebrafish management repository alternative using free Google applications. Husbandry information is logged into a Google Sheets-based catalog through Google Form (GF) entries. Form autopopulation can be streamlined by barcodes, which can be generated and deciphered through free smartphone applications. The repository is capable of calculating pertinent husbandry dates from GF input and sending e-mail reminders to users for specified tasks. A Google application-based repository allows for a free, simple zebrafish husbandry management solution.
Project description: Introduction: The COVID-19 pandemic has highlighted the need for robust data linkage systems and methods for identifying outbreaks of disease in near real-time. Objectives: The primary objective of this study was to develop a real-time geospatial surveillance system to monitor the spread of COVID-19 across the UK. Methods: Using self-reported app data and the Secure Anonymised Information Linkage (SAIL) Databank, we demonstrate the use of sophisticated spatial modelling for near-real-time prediction of COVID-19 prevalence at small-area resolution to inform strategic government policy areas. Results: We demonstrate that using a combination of crowd-sourced app data and sophisticated geo-statistical techniques it is possible to predict hot spots of COVID-19 at fine geographic scales, nationally. We are also able to produce estimates of their precision, which is an important pre-requisite to an effective control strategy to guard against over-reaction to potentially spurious features of 'best guess' predictions. Conclusion: In the UK, important emerging risk-factors such as social deprivation or ethnicity vary over small distances, hence risk needs to be modelled at fine spatial resolution to avoid aggregation bias. We demonstrate that existing geospatial statistical methods originally developed for global health applications are well-suited to this task and can be used in an anonymised databank environment, thus preserving the privacy of the individuals who contribute their data.
Project description: Mobility restrictions have been identified as key non-pharmaceutical interventions to limit the spread of the SARS-CoV-2 epidemic. However, these interventions present significant drawbacks for the social fabric and negative outcomes for the real economy. In this paper we propose a real-time monitoring framework for tracking the economic consequences of various forms of mobility reductions involving European countries. We adopt a granular representation of mobility patterns during both the first and second waves of SARS-CoV-2 in Italy, Germany, France and Spain to provide an analytical characterization of the rate of losses of industrial production by means of a nowcasting methodology. Our approach exploits the information encoded in massive datasets of human mobility provided by Facebook and Google, which are published at higher frequencies than the target economic variables, in order to obtain an early estimate before the official data become available. Our results show, first, the ability of mobility-related policies to induce a contraction of mobility patterns across jurisdictions. Besides this contraction, we observe a substitution effect which increases mobility within jurisdictions. Second, we show how industrial production closely follows the dynamics of population commuting patterns and of human mobility trends, which thus provide information on the day-by-day variations in countries' economic activities. Our work, besides shedding light on how policy interventions targeted to induce a mobility contraction impact the real economy, constitutes a practical toolbox for helping governments to design appropriate and balanced policy actions aimed at containing the SARS-CoV-2 spread, while mitigating the detrimental effect on the economy.
Our study reveals how complex mobility patterns can have unequal consequences for economic losses across countries and calls for a more tailored implementation of restrictions to balance the containment of contagion with the need to sustain economic activities.
Project description: Background: For each of the COVID-19 pandemic waves, hospitals have had to plan for deploying surge capacity and resources to manage large but transient increases in COVID-19 admissions. While a lot of effort has gone into predicting regional trends in COVID-19 cases and hospitalizations, there are far fewer successful tools for creating accurate hospital-level forecasts. Methods: Large-scale, anonymized mobile phone data has been shown to correlate with regional case counts during the first two waves of the pandemic (spring 2020, and fall/winter 2021). Building on this success, we developed a multi-step, recursive forecasting model to predict individual hospital admissions; this model incorporates the following data: (i) hospital-level COVID-19 admissions, (ii) statewide test positivity data, and (iii) aggregate measures of large-scale human mobility, contact patterns, and commuting volume. Results: Incorporating large-scale, aggregate mobility data as exogenous variables in prediction models allows us to make hospital-specific COVID-19 admission forecasts 21 days ahead. We show this through highly accurate predictions of hospital admissions for five hospitals in Massachusetts during the first year of the COVID-19 pandemic. Conclusions: The high predictive capability of the model was achieved by combining anonymized, aggregated mobile device data about users' contact patterns, commuting volume, and mobility range with COVID hospitalizations and test-positivity data. Mobility-informed forecasting models can increase the lead-time of accurate predictions for individual hospitals, giving managers valuable time to strategize how best to allocate resources to manage forthcoming surges.
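A multi-step, recursive forecast with an exogenous input can be sketched in a few lines. The abstract does not give the model form, so the sketch below assumes a hypothetical one-lag linear model with made-up coefficients (`coef_lag`, `coef_exog`, `intercept`) and a single exogenous series standing in for mobility; it shows only the recursion, in which each prediction is fed back as the input for the next step.

```python
def recursive_forecast(last_value, future_exog, coef_lag, coef_exog, intercept):
    """Roll a one-step model forward, one step per exogenous value supplied.

    Assumed model (illustrative only): y[t+1] = intercept + coef_lag * y[t] + coef_exog * x[t+1]
    """
    forecasts = []
    prev = last_value
    for x in future_exog:
        nxt = intercept + coef_lag * prev + coef_exog * x
        forecasts.append(nxt)
        prev = nxt  # feed the prediction back in as the next lagged value
    return forecasts


# Three-step example with synthetic numbers; a 21-day horizon would pass 21 exogenous values.
path = recursive_forecast(10.0, [1.0, 1.0, 1.0], coef_lag=0.5, coef_exog=2.0, intercept=1.0)
```

In this sketch the exogenous values for the forecast horizon must themselves be known or projected, and errors compound across steps because later predictions depend on earlier ones.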
Project description: Cases of a novel coronavirus were first reported in Wuhan, Hubei province, China, in December 2019 and have since spread across the world. Epidemiological studies have indicated human-to-human transmission in China and elsewhere. To aid the analysis and tracking of the COVID-19 epidemic we collected and curated individual-level data from national, provincial, and municipal health reports, as well as additional information from online reports. All data are geo-coded and, where available, include symptoms, key dates (date of onset, admission, and confirmation), and travel history. The generation of detailed, real-time, and robust data for emerging disease outbreaks is important and can help to generate robust evidence that will support and inform public health decision making.
Project description: In the opening months of the pandemic, the need for situational awareness was urgent. Forecasting models such as the Susceptible-Infectious-Recovered (SIR) model were hampered by limited testing data, and key information on mobility, contact tracing, and local policy variations would not be consistently available for months. New case counts from sources like Johns Hopkins University and the New York Times were systematically reliable. Using these data, we developed the novel COVID County Situational Awareness Tool (CCSAT) for reliable monitoring and decision support. In CCSAT, we developed a retrospective seven-day moving window semantic map of county-level disease magnitude and acceleration that smoothed noisy daily variations. We also developed a novel Bayesian model that reliably forecasted county-level magnitude and acceleration for the upcoming week based on population and new case count data. Together these formed a robust operational update including county-level maps of new case rate changes, estimates of new cases in the upcoming week, and measures of model reliability. We found CCSAT provided stable, reliable estimates across the seven-day time window, with the greatest errors occurring in cases of anomalous, single-day spikes. In this paper, we provide CCSAT details and apply it to a single week in June 2020.
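The seven-day moving window can be illustrated with a minimal sketch. The abstract does not define magnitude or acceleration precisely, so the definitions here are assumptions: magnitude is taken as the mean of daily new cases over the latest seven days, and acceleration as the change in that mean relative to the preceding seven days. The function names are hypothetical.

```python
def seven_day_mean(daily_cases, end):
    """Mean of the seven daily values ending at index `end` (inclusive)."""
    window = daily_cases[end - 6:end + 1]
    return sum(window) / 7


def magnitude_and_acceleration(daily_cases):
    """Return (magnitude, acceleration) for the most recent week.

    Assumed definitions: magnitude = latest seven-day mean;
    acceleration = latest seven-day mean minus the previous week's mean.
    """
    if len(daily_cases) < 14:
        raise ValueError("need at least 14 days of counts")
    last = len(daily_cases) - 1
    current = seven_day_mean(daily_cases, last)
    previous = seven_day_mean(daily_cases, last - 7)
    return current, current - previous


# Synthetic county series ending in a single-day spike of 84 cases.
cases = [7] * 7 + [14] * 6 + [84]
magnitude, acceleration = magnitude_and_acceleration(cases)
```

The example also shows why single-day spikes are troublesome: one anomalous count of 84 lifts the weekly mean from 14 to 24 even though the other six days are unchanged.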