Dataset Information

Re-Identification Risk versus Data Utility for Aggregated Mobility Research Using Mobile Phone Location Data.

ABSTRACT: Mobile phone location data is a newly emerging data source with great potential to support human mobility research. However, recent studies have indicated that many users can be easily re-identified based on their unique activity patterns. Privacy protection procedures usually alter the original data and reduce its utility for analysis. The need for detailed activity data must therefore be balanced against potential privacy risks. This study aims to reveal the re-identification risks for mobile users in a Chinese city and to examine the quantitative relationship between re-identification risk and data utility for aggregated mobility analysis. The first step applies two previously reported attack models, the top N locations and the spatio-temporal points, to evaluate the re-identification risks in Shenzhen City, a metropolis in China. A spatial generalization approach to protecting privacy is then proposed and implemented, and spatially aggregated analysis is used to assess the loss of data utility after privacy protection. The results demonstrate that the re-identification risks in Shenzhen City clearly differ from those reported for regions in Western countries, which demonstrates the spatial heterogeneity of re-identification risks in mobile phone location data. A uniform mathematical relationship has also been found between re-identification risk (x) and data utility (y) for both attack models: y = -ax^b + c, (a, b, c>0; 0
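As an illustration of the top N locations attack and of spatial generalization described in the abstract, the following is a minimal sketch, not the authors' implementation. The function names, the toy visit records, the cell-to-zone mapping, and the choice to rank locations by visit count with alphabetical tie-breaking are all assumptions made for this example.

```python
from collections import Counter

def top_n_signature(visits, n):
    """Return a user's n most frequently visited locations, most frequent first."""
    counts = Counter(visits)
    # Assumption: ties are broken by location id so the signature is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tuple(loc for loc, _ in ranked[:n])

def reidentification_risk(user_visits, n):
    """Fraction of users whose top-n signature is shared with no other user."""
    signatures = {u: top_n_signature(v, n) for u, v in user_visits.items()}
    freq = Counter(signatures.values())
    unique = sum(1 for s in signatures.values() if freq[s] == 1)
    return unique / len(signatures)

def generalize(user_visits, cell_to_zone):
    """Spatial generalization: replace each fine-grained cell with a coarser zone."""
    return {u: [cell_to_zone[c] for c in v] for u, v in user_visits.items()}

# Hypothetical toy data: four users, locations recorded as cell ids.
toy = {
    "u1": ["A1", "A1", "B1", "C1"],
    "u2": ["A1", "A1", "B2", "C1"],
    "u3": ["A2", "A2", "B1"],
    "u4": ["A1", "A1", "B1", "C2"],
}
# Hypothetical mapping from cells to coarser zones.
zones = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "C1": "C", "C2": "C"}

risk_raw = reidentification_risk(toy, 2)
risk_generalized = reidentification_risk(generalize(toy, zones), 2)
print(risk_raw, risk_generalized)
```

In this toy case, coarsening the spatial units makes every user's top-2 signature identical, so the risk drops while the spatial detail available for analysis drops with it, which is the trade-off the study quantifies.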

SUBMITTER: Yin L

PROVIDER: S-EPMC4607417 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Re-Identification Risk versus Data Utility for Aggregated Mobility Research Using Mobile Phone Location Data.

Yin L, Wang Q, Shaw SL, Fang Z, Hu J, Tao Y, Wang W

PLoS ONE, 2015-10-15, 10

PMID: 26469780

Similar Datasets

Project description: Background: In early 2020, the response to the SARS-CoV-2 pandemic focused on non-pharmaceutical interventions, some of which aimed to reduce transmission by changing mixing patterns between people. Aggregated location data from mobile phones are an important source of real-time information about human mobility on a population level, but the degree to which these mobility metrics capture the relevant contact patterns of individuals at risk of transmitting SARS-CoV-2 is not clear. In this study we describe changes in the relationship between mobile phone data and SARS-CoV-2 transmission in the USA. Methods: In this population-based study, we collected epidemiological data on COVID-19 cases and deaths, as well as human mobility metrics collated by advertisement technology that was derived from global positioning systems, from 1396 counties across the USA that had at least 100 laboratory-confirmed cases of COVID-19. We grouped these counties into six ordinal categories, defined by the National Center for Health Statistics (NCHS) and graded from urban to rural, and quantified the changes in COVID-19 transmission using estimates of the effective reproduction number (Rt) between Jan 22 and July 9, 2020, to investigate the relationship between aggregated mobility metrics and epidemic trajectory. For each county, we model the time series of Rt values with mobility proxies. Findings: We show that the reproduction number is most strongly associated with mobility proxies for change in the travel into counties (0·757 [95% CI 0·689 to 0·857]), but this relationship primarily holds for counties in the three most urban categories as defined by the NCHS. This relationship weakens considerably after the initial 15 weeks of the epidemic (0·442 [-0·492 to -0·392]), consistent with the emergence of more complex local policies and behaviours, including masking. Interpretation: Our study shows that the integration of mobility metrics into retrospective modelling efforts can be useful in identifying links between these metrics and Rt. Importantly, we highlight potential issues in the data generation process for transmission indicators derived from mobile phone data, representativeness, and equity of access, which must be addressed to improve the interpretability of these data in public health. Funding: There was no funding source for this study.

Project description: Background: Little is known about the effect of changes in mobility at the subcity level on subsequent COVID-19 incidence, which is particularly relevant in Latin America, where substantial barriers prevent COVID-19 vaccine access and non-pharmaceutical interventions are essential to mitigation efforts. We aimed to examine the longitudinal associations between population mobility and COVID-19 incidence at the subcity level across a large number of Latin American cities. Methods: In this longitudinal ecological study, we compiled aggregated mobile phone location data, daily confirmed COVID-19 cases, and features of urban and social environments to analyse population mobility and COVID-19 incidence at the subcity level among cities with more than 100 000 inhabitants in Argentina, Brazil, Colombia, Guatemala, and Mexico, from March 2 to Aug 29, 2020. Spatially aggregated mobile phone data were provided by the UN Development Programme in Latin America and the Caribbean and Grandata; confirmed COVID-19 cases were from national government reports and population and socioeconomic factors were from the latest national census in each country. We used mixed-effects negative binomial regression for a time-series analysis, to examine longitudinal associations between weekly mobility changes from baseline (prepandemic week of March 2-9, 2020) and subsequent COVID-19 incidence (lagged by 1-6 weeks) at the subcity level, adjusting for urban environmental and socioeconomic factors (time-invariant educational attainment, residential overcrowding, population density [all at the subcity level], and country). Findings: We included 1031 subcity areas, representing 314 Latin American cities, in Argentina (107 subcity areas), Brazil (416), Colombia (82), Guatemala (20), and Mexico (406). In the main adjusted model, we observed an incidence rate ratio (IRR) of 2·35 (95% CI 2·12-2·60) for COVID-19 incidence per log unit increase in the mobility ratio (vs baseline) during the previous week. Thus, 10% lower weekly mobility was associated with 8·6% (95% CI 7·6-9·6) lower incidence of COVID-19 in the following week. This association gradually weakened as the lag between mobility and COVID-19 incidence increased and was not different from null at a 6-week lag. Interpretation: Reduced population movement within a subcity area is associated with a subsequent decrease in COVID-19 incidence among residents of that subcity area. Policies that reduce population mobility at the subcity level might be an effective COVID-19 mitigation strategy, although they should be combined with strategies that mitigate any adverse social and economic consequences of reduced mobility for the most vulnerable groups. Funding: Wellcome Trust. Translation: For the Spanish translation of the abstract see Supplementary Materials section.
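The step from the reported IRR to the "8·6% lower incidence" figure can be checked with a short calculation. The sketch below assumes that "per log unit" refers to the natural logarithm of the mobility ratio, which the description does not state explicitly; the function name is hypothetical.

```python
import math

def incidence_change(irr_per_log_unit, mobility_ratio):
    """Relative change in incidence implied by an IRR per log unit of mobility ratio.

    Assumes a natural-log scale: the multiplier is IRR ** ln(mobility_ratio).
    """
    return irr_per_log_unit ** math.log(mobility_ratio) - 1.0

# Point estimate and 95% CI bounds of the IRR as reported in the description.
for irr in (2.35, 2.12, 2.60):
    # A mobility ratio of 0.9 corresponds to 10% lower weekly mobility.
    print(f"IRR {irr:.2f}: {incidence_change(irr, 0.9):+.1%}")
```

Under that assumption, the three IRR values give changes of about -8.6%, -7.6%, and -9.6%, which match the figures quoted in the findings.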
