Dataset Information

On the role of marginal confounder prevalence - implications for the high-dimensional propensity score algorithm.

ABSTRACT: PURPOSE:The high-dimensional propensity score algorithm attempts to improve control of confounding in typical treatment effect studies in pharmacoepidemiology and is increasingly being used for the analysis of large administrative databases. Within this multi-step variable selection algorithm, the marginal prevalence of non-zero covariate values is considered to be an indicator for a count variable's potential confounding impact. We investigate the role of the marginal prevalence of confounder variables on potentially caused bias magnitudes when estimating risk ratios in point exposure studies with binary outcomes. METHODS:We apply the law of total probability in conjunction with an established bias formula to derive and illustrate relative bias boundaries with respect to marginal confounder prevalence. RESULTS:We show that maximum possible bias magnitudes can occur at any marginal prevalence level of a binary confounder variable. In particular, we demonstrate that, in case of rare or very common exposures, low and high prevalent confounder variables can still have large confounding impact on estimated risk ratios. CONCLUSIONS:Covariate pre-selection by prevalence may lead to sub-optimal confounder sampling within the high-dimensional propensity score algorithm. While we believe that the high-dimensional propensity score has important benefits in large-scale pharmacoepidemiologic studies, we recommend omitting the prevalence-based empirical identification of candidate covariates.

SUBMITTER: Schuster T

PROVIDER: S-EPMC5072887 | biostudies-literature | 2015 Sep

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

On the role of marginal confounder prevalence - implications for the high-dimensional propensity score algorithm.

Schuster Tibor T Pang Menglan M Platt Robert W RW

Pharmacoepidemiology and drug safety 20150410 9

<h4>Purpose</h4>The high-dimensional propensity score algorithm attempts to improve control of confounding in typical treatment effect studies in pharmacoepidemiology and is increasingly being used for the analysis of large administrative databases. Within this multi-step variable selection algorithm, the marginal prevalence of non-zero covariate values is considered to be an indicator for a count variable's potential confounding impact. We investigate the role of the marginal prevalence of conf ...[more]

PMID: 25866189

Similar Datasets

Project description:BackgroundThe High-Dimensional Propensity Score (hd-PS) algorithm can select and adjust for baseline confounders of treatment-outcome associations in pharmacoepidemiologic studies that use healthcare claims data. How hd-PS performance is affected by aggregating medications or medical diagnoses has not been assessed.MethodsWe evaluated the effects of aggregating medications or diagnoses on hd-PS performance in an empirical example using resampled cohorts with small sample size, rare outcome incidence, or low exposure prevalence. In a cohort study comparing the risk of upper gastrointestinal complications in celecoxib or traditional NSAIDs (diclofenac, ibuprofen) initiators with rheumatoid arthritis and osteoarthritis, we (1) aggregated medications and International Classification of Diseases-9 (ICD-9) diagnoses into hierarchies of the Anatomical Therapeutic Chemical classification (ATC) and the Clinical Classification Software (CCS), respectively, and (2) sampled the full cohort using techniques validated by simulations to create 9,600 samples to compare 16 aggregation scenarios across 50% and 20% samples with varying outcome incidence and exposure prevalence. We applied hd-PS to estimate relative risks (RR) using 5 dimensions, predefined confounders, ≤ 500 hd-PS covariates, and propensity score deciles. For each scenario, we calculated: (1) the geometric mean RR; (2) the difference between the scenario mean ln(RR) and the ln(RR) from published randomized controlled trials (RCT); and (3) the proportional difference in the degree of estimated confounding between that scenario and the base scenario (no aggregation).ResultsCompared with the base scenario, aggregations of medications into ATC level 4 alone or in combination with aggregation of diagnoses into CCS level 1 improved the hd-PS confounding adjustment in most scenarios, reducing residual confounding compared with the RCT findings by up to 19%.ConclusionsAggregation of codes using hierarchical coding systems may improve the performance of the hd-PS to control for confounders. The balance of advantages and disadvantages of aggregation is likely to vary across research settings.

Dataset Information

On the role of marginal confounder prevalence - implications for the high-dimensional propensity score algorithm.

Publications

On the role of marginal confounder prevalence - implications for the high-dimensional propensity score algorithm.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets