Dataset Information

Pseudonymization for research data collection: is the juice worth the squeeze?

ABSTRACT:

BACKGROUND: The collection of data and biospecimens which characterize patients and probands in depth is a core element of modern biomedical research. Relevant data must be considered highly sensitive and needs to be protected from unauthorized use and re-identification. In this context, laws, regulations, guidelines and best practices often recommend or mandate pseudonymization, which means that directly identifying data of subjects (e.g. names and addresses) is stored separately from data which is primarily needed for scientific analyses.

DISCUSSION: When (authorized) re-identification of subjects is not an exceptional but a common procedure, e.g. due to longitudinal data collection, implementing pseudonymization can significantly increase the complexity of software solutions. For example, data stored in distributed databases needs to be dynamically combined, which requires additional interfaces for communication between the various subsystems. This increased complexity may lead to new attack vectors for intruders, which runs counter to the objective of improving data protection. What is lacking is a standardized process for evaluating and reporting risks, threats and countermeasures, which can be used to test whether integrating pseudonymization methods into data collection systems actually improves upon the degree of protection provided by system designs that simply follow common IT security best practices and implement fine-grained role-based access control models. To demonstrate that the methods used to describe systems employing pseudonymized data management are currently heterogeneous and ad hoc, we examined the extent to which twelve recent studies address each of the six basic security properties defined by the International Organization for Standardization (ISO) standard 27000. We show inconsistencies across the studies, with most of them failing to mention one or more security properties.

CONCLUSION: We discuss the degree of privacy protection provided by implementing pseudonymization into research data collection processes. We conclude that (1) more research is needed on the interplay of pseudonymity, information security and data protection, (2) problem-specific guidelines for evaluating and reporting risks, threats and countermeasures should be developed and (3) future work on pseudonymized research data collection should include the results of such structured and integrated analyses.
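The separation that the abstract describes (directly identifying data stored apart from research data, with authorized re-identification) can be illustrated with a minimal Python sketch. This is not the authors' system: the class names `IdentityStore` and `ResearchStore`, the role name `trusted_party`, and the use of random tokens as pseudonyms are all assumptions made for illustration only.

```python
import secrets


class IdentityStore:
    """Hypothetical store for directly identifying data, keyed by pseudonym."""

    def __init__(self):
        self._by_pseudonym = {}
        self._by_identity = {}

    def pseudonym_for(self, name, address):
        # Reuse the existing pseudonym so longitudinal data stays linkable.
        key = (name, address)
        if key not in self._by_identity:
            pseudonym = secrets.token_hex(8)  # random, carries no identifying information
            self._by_identity[key] = pseudonym
            self._by_pseudonym[pseudonym] = {"name": name, "address": address}
        return self._by_identity[key]

    def reidentify(self, pseudonym, role):
        # Assumed role check: only one designated role may resolve pseudonyms.
        if role != "trusted_party":
            raise PermissionError("role not authorized for re-identification")
        return dict(self._by_pseudonym[pseudonym])


class ResearchStore:
    """Hypothetical store for research data; it never sees names or addresses."""

    def __init__(self):
        self._records = {}

    def add(self, pseudonym, observation):
        self._records.setdefault(pseudonym, []).append(observation)

    def records(self, pseudonym):
        return list(self._records.get(pseudonym, []))


if __name__ == "__main__":
    identities = IdentityStore()
    research = ResearchStore()
    p = identities.pseudonym_for("Jane Doe", "1 Example Street")
    research.add(p, {"visit": 1, "systolic_bp": 120})
    print(research.records(p))
    print(identities.reidentify(p, "trusted_party"))
```

Even in this toy form, the two stores need an interface between them (`pseudonym_for` and `reidentify`), which is the kind of added complexity and attack surface the abstract questions.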

SUBMITTER: Kohlmayer F

PROVIDER: S-EPMC6727563 | biostudies-literature | 2019 Sep

REPOSITORIES: biostudies-literature


Publications

Pseudonymization for research data collection: is the juice worth the squeeze?

Florian Kohlmayer, Ronald Lautenschläger, Fabian Prasser

BMC Medical Informatics and Decision Making, 2019-09-04, issue 1


PMID: 31484555

Similar Datasets

Project description: Aim: Healthcare administrative databases represent a valuable source for real-life data analysis. The primary aim of this study is to compare the effectiveness and cost profile in non-small-cell lung cancer (NSCLC) patients harboring synchronous brain metastases (BMs) who received non-chemo first-line systemic therapy with or without advanced radiotherapy (aRT). Methods: Diagnostic ICD-9-CM codes were used to identify all patients with a new diagnosis of lung cancer between 2012 and 2019. Among these, patients who had started a first-line systemic treatment with either TKIs or pembrolizumab, alone or in combination with intensity-modulated or stereotactic RT, were selected. Clinical outcomes investigated included overall survival (OS), progression-free survival (PFS), and time-to-treatment failure (TTF). The cost outcome was defined as the average per capita cumulative direct healthcare costs of the treatment, including all inpatient and outpatient costs. Results: The final cohort included 177 patients, of whom 58 were treated with systemic treatment plus aRT (STRT) and 119 with systemic treatment alone. The addition of aRT to systemic treatment was associated with a significantly better OS (p = 0.020) and PFS (p = 0.041) than systemic therapy alone. The ICER (incremental cost-effectiveness ratio) value indicated an average cost of €3792 for each month of survival after STRT treatment and confirmed clinical effectiveness but higher healthcare costs. Conclusions: This real-world study suggests that upfront aRT for NSCLC patients with synchronous BMs represents a valid treatment strategy, boosting the efficacy of novel and emerging drug classes with sustainable costs for the health service. Translational relevance: The present real-world study reports that the use of upfront advanced radiotherapy (aRT) and new-generation systemic agents, such as TKIs and pembrolizumab, may have higher oncological control and an improved cost-effectiveness profile than the use of new-generation systemic agents alone in NSCLC patients with synchronous brain metastases. Acquired evidence can also be used to inform policymakers that adding advanced radiotherapy results in a sustainable cost for the health service. Since approximately 50% of patients do not meet RCT inclusion criteria, a significant proportion of them are receiving treatment that is not evidence-informed; therefore, these results warrant further studies to identify the best radiotherapy timing and possible dose escalation approaches to improve treatment efficacy in patient subgroups not typically represented in randomized controlled trials.
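For readers unfamiliar with the ICER mentioned in the project description above, the standard definition is the difference in cost between two strategies divided by the difference in their effect. The sketch below shows that ratio in Python. The function name `icer` and all input figures are hypothetical and chosen for easy arithmetic; they are not taken from the study, which reports only the resulting value of €3792 per month of survival.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when both strategies have the same effect")
    return (cost_new - cost_old) / delta_effect


if __name__ == "__main__":
    # Hypothetical inputs, NOT study data: mean per capita cost in euros and
    # mean survival in months for two illustrative strategies.
    value = icer(cost_new=30000.0, cost_old=20000.0, effect_new=14.0, effect_old=10.0)
    print(f"ICER: {value:.0f} euros per additional month of survival")
```

With these made-up inputs the extra 10000 euros buys 4 additional months, so the ratio is 2500 euros per month.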

Project description: Background: The ubiquitous use of mobile phones for sending and receiving text messages has become the norm for young people. Text messaging has become a new and important communication medium not only in the social realm but in education as well. The aim of this study is to evaluate the effectiveness of using text messaging as a means to collect data for a medical research project. A cross-sectional study was carried out during a double-blind, randomized controlled trial to assess the efficacy and safety of a probiotic in the management of Irritable Bowel Syndrome (IBS). The study aim was to assess the response rate of weekly symptom reports via Short Message Service (SMS). The subjects were undergraduates in a private medical university in Malaysia. They were identified through a previous university-wide study as suffering from IBS based on Rome III criteria. The subjects were randomly assigned to either the treatment arm receiving a daily probiotic, or the placebo arm. They were required to score their symptoms using an eight-item questionnaire at baseline, and thereafter weekly, for a total of 8 weeks. All subjects were given the choice to communicate their symptom scores by text messaging via mobile phones or by email. SMS text messages were sent to remind trial subjects to attend face-to-face visits and to complete a paper-based 34-item questionnaire on IBS quality of life at baseline and at the end of 8 weeks. Findings: The response rate of weekly symptom scores via SMS from a total of 38 subjects was 100%. Through the study, 342 reports were submitted: 33.3% of these were received on the due date without a reminder; 60.0% one day after the deadline, after a single reminder; 6.1% 2-3 days after the deadline, after 2-3 reminders; and 0.6% 5 days after the deadline, after an SMS, a phone reminder and a face-to-face encounter. All SMS symptom reports, whether on time or late, were complete. With the help of SMS reminders, all trial subjects completed the paper-based IBS quality of life assessment at baseline and at the end of the study. Conclusions: This study found text messaging via mobile phone to be an excellent instrument for collecting weekly symptom reports in response to trial medication, reminding trial subjects to attend face-to-face visits and prompting completion of a more complex paper-based evaluation. The 100% response rate of weekly symptom reports was facilitated by using simple number codes for SMS submission. Trial registration: Not appropriate.

Project description: Background: This scoping review reports on studies that collect survey data using quantitative research to measure self-reported oral health status outcomes. The objective of this review is to categorize the measures used to evaluate self-reported oral health status and oral health quality of life in surveys of general populations. Methods: The review is guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR), with the search run on four online bibliographic databases. The criteria include (1) peer-reviewed articles, (2) papers published between 2011 and 2021, (3) only studies using quantitative methods, and (4) containing outcome measures of self-assessed oral health status and/or oral health-related quality of life. All survey data collection methods are assessed, and papers whose methods employ newer technological approaches are also identified. Results: Of the 2981 unduplicated papers, 239 meet the eligibility criteria. Half of the papers use impact scores such as the OHIP-14; 10% use functional measures, such as the GOHAI; 26% use two or more measures; and 8% use rating scales of oral health status. The review identifies four data collection methods: in-person, mail-in, Internet-based, and telephone surveys. Most (86%) employ in-person surveys, and 39% are conducted in Asia-Pacific and Middle East countries, with 8% in North America. Sixty-six percent of the studies recruit participants directly from clinics and schools, where the surveys were carried out. The top three sampling methods are convenience sampling (52%), simple random sampling (12%), and stratified sampling (12%). Among the four data collection methods, in-person surveys have the highest response rate (91%), while the lowest response rate occurs in Internet-based surveys (37%). Telephone surveys are used to cover a wider population compared to other data collection methods. There are two noteworthy approaches: (1) sample selection, where researchers employ different platforms to access subjects, and (2) mode of interaction with subjects, with the use of computers to collect self-reported data. Conclusion: The study provides an assessment of oral health outcome measures, including subject-reported oral health status, and notes newly emerging computer-based approaches recently used in surveys conducted on general populations. These newer applications, though rarely used, hold promise for both researchers and the various populations that use or need oral health care.
