Dataset Information

Unweighted regression models perform better than weighted regression techniques for respondent-driven sampling data: results from a simulation study.

ABSTRACT:

Background

It is unclear whether weighted or unweighted regression is preferred in the analysis of data derived from respondent-driven sampling (RDS). Our objective was to evaluate the validity of various regression models, with and without weights and with various controls for clustering, in the estimation of the risk of group membership from RDS data.

Methods

Twelve networked populations, with varying levels of homophily and prevalence and based on a known distribution of a continuous predictor, were simulated, and 1000 RDS samples were drawn from each population. Weighted and unweighted binomial and Poisson generalized linear models, with and without various clustering controls and standard error adjustments, were fitted to each sample and evaluated with respect to validity, bias and coverage rate. Population prevalence was also estimated.
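The contrast between weighted and unweighted log-link (Poisson) regression described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `fit_log_link_poisson`, the toy data, and the use of inverse reported degree as the weight are all assumptions made for the example. It returns point estimates only; the clustering controls and standard error adjustments mentioned in the abstract are not shown.

```python
import math

def fit_log_link_poisson(x, y, weights=None, iterations=50):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson (hypothetical helper).

    With a binary outcome y, exp(b1) estimates a risk ratio.
    `weights` are optional per-observation weights (default: all 1).
    """
    n = len(x)
    w = weights if weights is not None else [1.0] * n
    # Start at the intercept-only fit: log of the (weighted) mean outcome.
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    b0, b1 = math.log(mean_y), 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            mu = math.exp(b0 + b1 * xi)
            r = wi * (yi - mu)          # weighted score contribution
            g0 += r
            g1 += r * xi
            h00 += wi * mu              # weighted information matrix
            h01 += wi * mu * xi
            h11 += wi * mu * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

# Toy data (assumed): 100 unexposed with 10 events, 100 exposed with 20 events.
x = [0] * 100 + [1] * 100
y = [1] * 10 + [0] * 90 + [1] * 20 + [0] * 80

# Unweighted fit: exp(b1) is the ratio of the two group risks (0.2 / 0.1).
b0, b1 = fit_log_link_poisson(x, y)
print("unweighted risk ratio:", math.exp(b1))

# Weighted fit, assuming weights proportional to 1 / reported degree.
degrees = [1 + (i % 10) for i in range(200)]   # hypothetical reported degrees
weights = [1.0 / d for d in degrees]
wb0, wb1 = fit_log_link_poisson(x, y, weights)
print("weighted risk ratio:", math.exp(wb1))
```

With a single binary predictor the model is saturated, so each fit reproduces the (weighted) group risks exactly; the two risk ratios differ only because the weights shift how much each respondent counts.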

Results

In the regression analysis, the unweighted log-link (Poisson) models maintained the nominal type-I error rate across all populations. Bias was substantial and type-I error rates were unacceptably high for weighted binomial regression. Coverage rates for the estimation of prevalence were highest using RDS-weighted logistic regression, except at low prevalence (10%), where unweighted models are recommended.
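The two evaluation criteria reported here, coverage rate and type-I error rate, are simple proportions over simulation replicates. The sketch below shows one way to compute them; the function names and the example numbers are hypothetical and are not taken from the study.

```python
def coverage_rate(intervals, true_value):
    """Share of confidence intervals (lo, hi) that contain the true value."""
    covered = sum(1 for lo, hi in intervals if lo <= true_value <= hi)
    return covered / len(intervals)

def type_one_error_rate(p_values, alpha=0.05):
    """Share of replicates that reject a true null hypothesis at level alpha."""
    rejected = sum(1 for p in p_values if p < alpha)
    return rejected / len(p_values)

# Hypothetical replicates: three prevalence intervals, true prevalence 0.2.
example_intervals = [(0.10, 0.30), (0.25, 0.40), (0.00, 0.15)]
example_coverage = coverage_rate(example_intervals, 0.2)

# Hypothetical p-values from four replicates simulated under the null.
example_p_values = [0.01, 0.20, 0.50, 0.04]
example_error = type_one_error_rate(example_p_values, alpha=0.05)

print(example_coverage, example_error)
```

A method maintains the nominal type-I error rate when this proportion stays close to alpha across replicates, and a 95% interval procedure is well calibrated when its coverage rate is close to 0.95.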

Conclusions

Caution is warranted when undertaking regression analysis of RDS data. Even when reported degree is accurate, low reported degree can unduly influence regression estimates. Unweighted Poisson regression is therefore recommended.
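To see why a low reported degree can dominate a weighted estimate, consider an inverse-degree weighted prevalence. The abstract does not state which RDS weights were used, so the inverse-degree (RDS-II or Volz-Heckathorn style) estimator below is an assumption for illustration, as are the function name and the toy data.

```python
def inverse_degree_prevalence(outcomes, degrees):
    """Weighted prevalence with weights 1 / reported degree (assumed scheme)."""
    weights = [1.0 / d for d in degrees]
    weighted_events = sum(w * y for w, y in zip(weights, outcomes))
    return weighted_events / sum(weights)

# Toy sample (assumed): one positive respondent reporting degree 1,
# four negative respondents each reporting degree 10.
outcomes = [1, 0, 0, 0, 0]
degrees = [1, 10, 10, 10, 10]

unweighted = sum(outcomes) / len(outcomes)
weighted = inverse_degree_prevalence(outcomes, degrees)
print(unweighted, weighted)
```

In this toy sample the single respondent with degree 1 carries a weight of 1 against a combined weight of 0.4 for the other four, so the weighted estimate is pulled far from the unweighted proportion even though every reported degree is taken as accurate.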

SUBMITTER: Avery L

PROVIDER: S-EPMC6819607 | biostudies-literature | 2019 Oct

REPOSITORIES: biostudies-literature

Publications

Unweighted regression models perform better than weighted regression techniques for respondent-driven sampling data: results from a simulation study.

Lisa Avery, Nooshin Rotondi, Constance McKnight, Michelle Firestone, Janet Smylie, Michael Rotondi

BMC Medical Research Methodology, 2019 Oct 29 (1)

PMID: 31664912

Similar Datasets

Project description: Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data. Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion. Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved.
Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling method, and caution is required when interpreting findings based on the sampling method.

Project description: Background: Internationally, wheelchair users are an emerging demographic phenomenon, due to their increased prevalence and rapidly increasing life-span. While having significant healthcare implications, basic robust epidemiological information about wheelchair users is often lacking due, in part, to this population's 'hidden' nature. Increasingly popular in epidemiological research, Respondent Driven Sampling (RDS) provides a mechanism for generating unbiased population-based estimates for hard-to-reach populations, overcoming biases inherent within other sampling methods. This paper reports the first published study to employ RDS amongst wheelchair users. Methods: Between October 2015 and January 2016, a short, successfully piloted, internet-based national survey was initiated. Twenty seeds from diverse organisations were invited to complete the survey then circulate it to peers within their networks following a well-defined protocol. A predetermined reminder protocol was triggered when seeds or their peers failed to respond. All participants were entered into a draw for an iPad. Results: Overall, 19 people participated (nine women); 12 initial seeds, followed by seven second-wave participants arising from four seeds. Completion time for the survey ranged between 7 and 36 minutes. Despite repeated reminders, no further people were recruited. Discussion: While New Zealand wheelchair user numbers are unknown, an estimated 14% of people have physical impairments that limit mobility. The 19 respondents generated from adopting the RDS methodology here thus represent a negligible fraction of wheelchair users in New Zealand, and an insufficient number to ensure the equilibrium required for unbiased analyses. While successful in other hard-to-reach populations, applying RDS methodology to wheelchair users requires further consideration.
Formative research exploring areas of network characteristics, acceptability of RDS, appropriate incentive options, and seed selection amongst wheelchair users is needed.

