Dataset Information

Data quality assessment and subsampling strategies to correct distributional bias in prevalence studies.

ABSTRACT:

Background

Healthcare-associated infections (HAIs) represent a major public health issue. Hospital-based prevalence studies are a common tool of HAI surveillance, but data quality problems and non-representativeness can undermine their reliability.

Methods

This study proposes three algorithms that, given a convenience sample and variables relevant for the outcome of the study, select a subsample with specific distributional characteristics, boosting either representativeness (Probability and Distance procedures) or risk factors' balance (Uniformity procedure). A "Quality Score" (QS) was also developed to grade sampled units according to data completeness and reliability. The methodologies were evaluated through bootstrapping on a convenience sample of 135 hospitals collected during the 2016 Italian Point Prevalence Survey (PPS) on HAIs.
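The abstract does not specify how the procedures work internally, so the following is only a minimal sketch of one generic way to draw a more representative subsample from a convenience sample, not the authors' Probability procedure. It assumes a single categorical variable (here a hypothetical region label per hospital) and known target shares for the reference population; the function names, the weighting rule (target share divided by observed share) and the example data are all illustrative assumptions.

```python
import random
from collections import Counter


def stratum_shares(strata):
    """Return the share of units falling in each stratum."""
    counts = Counter(strata)
    total = len(strata)
    return {s: n / total for s, n in counts.items()}


def sampling_weights(strata, target_shares):
    """Weight each unit by target share / observed share of its stratum.

    Units from over-represented strata get weights below 1,
    units from under-represented strata get weights above 1.
    """
    observed = stratum_shares(strata)
    return [target_shares.get(s, 0.0) / observed[s] for s in strata]


def weighted_subsample(strata, target_shares, k, seed=0):
    """Draw k unit indices without replacement, favouring high weights.

    Uses Efraimidis-Spirakis keys (u ** (1 / w), keep the k largest),
    a standard technique chosen here for illustration only.
    """
    rng = random.Random(seed)
    weights = sampling_weights(strata, target_shares)
    keyed = []
    for i, w in enumerate(weights):
        if w > 0:  # units with zero weight are never selected
            keyed.append((rng.random() ** (1.0 / w), i))
    keyed.sort(reverse=True)
    return sorted(i for _, i in keyed[:k])


# Hypothetical convenience sample: 6 northern and 2 southern hospitals,
# while the (assumed) reference population is split evenly.
hospitals = ["North"] * 6 + ["South"] * 2
target = {"North": 0.5, "South": 0.5}
chosen = weighted_subsample(hospitals, target, k=4, seed=1)
print(stratum_shares([hospitals[i] for i in chosen]))
```

Repeating such a draw over many bootstrap resamples of the convenience sample, as the study describes for its own evaluation, would show how often the subsample's shares move closer to the target.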

Results

The QS highlighted wide variations in data quality among hospitals (median QS 52.9 points, range 7.98-628, lower meaning better quality), with most problems ascribable to ward and hospital-related data reporting. Both Distance and Probability procedures produced subsamples with lower distributional bias (Log-likelihood score increased from 7.3 to 29 points). The Uniformity procedure increased the homogeneity of the sample characteristics (e.g., -58.4% in geographical variability). The procedures selected hospitals with higher data quality, especially the Probability procedure (lower QS in 100% of bootstrap simulations). The Distance procedure produced lower HAI prevalence estimates (6.98% compared to 7.44% in the convenience sample), more in line with the European median.

Conclusions

The QS and the subsampling procedures proposed in this study could represent effective tools to improve the quality of prevalence studies, decreasing the biases that can arise due to non-probabilistic sample collection.

SUBMITTER: D'Ambrosio A

PROVIDER: S-EPMC8088017 | biostudies-literature | 2021 Apr

REPOSITORIES: biostudies-literature

Publications

Data quality assessment and subsampling strategies to correct distributional bias in prevalence studies.

D'Ambrosio A, Garlasco J, Quattrocolo F, Vicentini C, Zotti CM

BMC Medical Research Methodology, 2021 Apr 30, issue 1

PMID: 33931025

Similar Datasets

Project description: Background: Population-based surveys collect crucial data on anthropometric measures to track trends in stunting [height-for-age z score (HAZ) < -2SD] and wasting [weight-for-height z score (WHZ) < -2SD] prevalence among young children globally. However, the quality of the anthropometric data varies between surveys, which may affect population-based estimates of malnutrition. Objectives: We aimed to develop composite indices of anthropometric data quality for use in multisurvey analysis of child health and nutritional status. Methods: We used anthropometric data for children 0-59 mo of age from all publicly available Demographic and Health Surveys (DHS) from 2000 onwards. We derived 6 indicators of anthropometric data quality at the survey level, including 1) date of birth completeness, 2) anthropometric measure completeness, 3) digit preference for height and age, 4) difference in mean HAZ by month of birth, 5) proportion of biologically implausible values, and 6) dispersion of HAZ and WHZ distribution. Principal component factor analysis was used to generate a composite index of anthropometric data quality for HAZ and WHZ separately. Surveys were ranked from the highest (best) to the lowest (worst) index values in anthropometric quality across countries and over time. Results: Of the 145 DHS included, the majority (83 of 145; 57%) were conducted in Sub-Saharan Africa. Surveys were ranked from highest to lowest anthropometric data quality relative to other surveys using the composite index for HAZ. Although slightly higher values in recent DHS suggest potential improvements in anthropometric data quality over time, there continues to be substantial heterogeneity in the quality of anthropometric data across surveys. Results were similar for the WHZ data quality index. Conclusions: A composite index of anthropometric data quality using a parsimonious set of individual indicators can effectively discriminate among surveys with excellent and poor data quality. Such indices can be used to account for variations in anthropometric data quality in multisurvey epidemiologic analyses of child health.

Project description: Objectives: Within cost-effectiveness models, prevalence figures can inform transition probabilities. The methodological quality of studies can inform the choice of prevalence figures but no single obvious candidate tool exists for assessing quality of the observational epidemiological studies for selecting prevalence estimates. We aimed to compare different tools to assess the risk of bias of studies reporting prevalence, and develop and compare possible numerical scoring systems using these tools to set a threshold for inclusion of reports of prevalence in an economic analysis of neonatal hypoglycaemia. Design: Assessments of bias using two tools (Joanna Briggs Institute (JBI) Checklist for Prevalence Studies and a modified version of Risk Of Bias In Non-randomised Studies-of Interventions (ROBINS-I)) were compared for 18 studies relevant to a single setting (neonatal hypoglycaemia). Inclusions of studies for use in a decision analysis model were considered based on summary scores derived from these tools. Results: Both tools were considered easy to use and produced dispersed scores for each of the 40 study-outcome combinations. The modified ROBINS-I scores were more skewed than the JBI scores, particularly at higher thresholds. The studies selected for inclusion are generally the same using either tool; if 50% was used as the cut-off threshold using the Applicable Score both tools would yield the same results. However, the JBI tool is shorter and may be easier to interpret and apply to studies that do not involve a control group, while the modified ROBINS-I tool assesses more methodological detail in studies that include a control group. Conclusion: Both tools performed well for systematically assessing studies that report on outcome prevalence and provided similar discrimination between studies for risk of bias. This convergent validity supports use of both tools for the purpose of assessing risk of bias and selecting studies that report prevalence for inclusion in economic analyses.

Project description: Purpose: Multiple clinical and epidemiological studies have provided estimates of fibromyalgia prevalence and sex ratio, but different criteria sets and methodology, as well as bias, have led to widely varying (0.4% to >11%) estimates of prevalence and female predominance (>90% to <61%). In general, studies have failed to distinguish Criteria based fibromyalgia (CritFM) from Clinical fibromyalgia (ClinFM). In the current study we compare CritFM with ClinFM to investigate gender and other biases in the diagnosis of fibromyalgia. Methods: We used a rheumatic disease databank and 2016 fibromyalgia criteria to study prevalence and sex ratios in a selection biased sample of 1761 referred and diagnosed fibromyalgia patients and in an unbiased sample of 4342 patients with no diagnosis with respect to fibromyalgia. We compared diagnostic and clinical variables according to gender, and we reanalyzed a German population study (GPS) (n = 2435) using revised 2016 criteria for fibromyalgia. Results: In the selection-biased sample of referred patients with fibromyalgia, more than 90% were women. However, when an unselected sample of rheumatoid arthritis (RA) patients was studied for the presence of fibromyalgia, women represented 58.7% of fibromyalgia cases. Women had slightly more symptoms than men, including generalized pain (36.8% vs. 32.4%), count of 37 symptoms (4.7 vs. 3.7) and mean polysymptomatic distress scores (10.2 vs. 8.2). We also found a linear relation between the probability of being female and fibromyalgia and fibromyalgia severity. Women in the GPS represented 59.2% of cases. Discussion: The perception of fibromyalgia as almost exclusively (≥90%) a women's disorder is not supported by data in unbiased studies. Using validated self-report criteria and unbiased selection, the female proportion of fibromyalgia cases was ≤60% in the unbiased studies, and the observed CritFM prevalence of fibromyalgia in the GPS was ~2%. ClinFM is the public face of fibromyalgia, but is severely affected by selection and confirmation bias in the clinic and publications, underestimating men with fibromyalgia and overestimating women. We recommend the use of 2016 fibromyalgia criteria for clinical diagnosis and epidemiology because of its updated scoring and generalized pain requirement. Fibromyalgia and generalized pain positivity, widespread pain (WPI), symptom severity scale (SSS) and polysymptomatic distress (PSD) scale should always be reported.

Project description: Importance: Clinical care quality improvement (QI) strategies are critical to prevent and control cardiovascular disease (CVD). However, there is limited evidence regarding which components of the health system-, clinician-, and patient-based QI strategies contribute to their impact on CVD. Objectives: To identify, map, and organize evidence on the effectiveness and implementation of cardiovascular QI strategies that seek to improve outcomes in patients with CVD. Evidence review: Eight electronic databases (MEDLINE, EMBASE, CINAHL, PsycINFO, the Cochrane Library, ProQuest, ClinicalTrials.gov, and the World Health Organization International Clinical Trials Registry Platform) were searched for studies published between January 1, 2009, and October 25, 2019. Eligible study designs included randomized trials and preintervention and postintervention evaluations. Descriptive findings of included studies were reported using several frameworks to map the intervention components stratified by target population, setting, outcomes, and overall results. Findings: From 8066 screened titles and abstracts, 456 unique studies with 150,148 unique patients (38.1% women and 61.9% men; mean [SD] age, 64.6 [7.1] years) were identified, including 427 randomized trials, 21 quasi-randomized studies, and 8 preintervention and postintervention studies. Of 336 studies from 45 countries that were classified, 255 (75.9%) were from high-income countries; 68 (20.2%), upper-middle-income countries; 13 (3.9%), lower-middle-income countries; and 0, low-income countries, with diverse clinical settings and target patient populations (post-myocardial infarction, stroke, heart failure). Patient support (311 studies), information communication technology (ICT) for health (78 studies), community support (18 studies), supervision (15 studies), and high-intensity training (14 studies) were the most commonly evaluated QI strategies. Other strategies were group problem-solving (7 studies), printed information (5 studies), strengthening infrastructure (4 studies), and financial incentives (3 studies). Patient support, ICT for health, training, and community support were strategies that had been evaluated the most for clinical end points and showed modest associations with several clinical outcomes. The other strategies did not have outcome-driven evaluations reported. Group problem-solving was associated with improved patient self-care and quality of life. Strengthening infrastructure was associated with improved treatment satisfaction. Printed information and financial incentives showed no meaningful effect. Conclusions and relevance: This systematic review found that substantial variations exist in the types, effectiveness, and implementation of QI strategies for patients with CVD. A comprehensive map of QI strategies created by this study would be useful for researchers to identify where new knowledge is needed to improve cardiovascular outcomes. Outcome-driven evaluations and long-term studies are needed, particularly in low-income settings, to better understand the effects of QI strategies on prevention and control of CVD.
