Dataset Information

Improving propensity score estimators' robustness to model misspecification using super learner.

ABSTRACT: The consistency of propensity score (PS) estimators relies on correct specification of the PS model. The PS is frequently estimated using main-effects logistic regression. However, the underlying model assumptions may not hold. Machine learning methods provide an alternative nonparametric approach to PS estimation. In this simulation study, we evaluated the benefit of using Super Learner (SL) for PS estimation. We created 1,000 simulated data sets (n = 500) under 4 different scenarios characterized by various degrees of deviance from the usual main-term logistic regression model for the true PS. We estimated the average treatment effect using PS matching and inverse probability of treatment weighting. The estimators' performance was evaluated in terms of PS prediction accuracy, covariate balance achieved, bias, standard error, coverage, and mean squared error. All methods exhibited adequate overall balancing properties, but in the case of model misspecification, SL performed better for highly unbalanced variables. The SL-based estimators were associated with the smallest bias in cases of severe model misspecification. Our results suggest that use of SL to estimate the PS can improve covariate balance and reduce bias in a meaningful manner in cases of serious model misspecification for treatment assignment.
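The effect of PS model misspecification on an inverse probability of treatment weighting (IPTW) estimate can be illustrated with a minimal sketch. This is not the authors' simulation and it does not implement Super Learner: the data-generating process, the sample size, the seed, and every function name below are assumptions chosen for illustration. Treatment assignment depends on the square of a single covariate, so a main-term logistic model in that covariate is misspecified, while a logistic model in its square is correctly specified.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(feature, treated, n_iter=25):
    """Fit logit P(A=1) = b0 + b1 * feature by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for f, a in zip(feature, treated):
            p = sigmoid(b0 + b1 * f)
            w = p * (1.0 - p)
            g0 += a - p            # score for the intercept
            g1 += (a - p) * f      # score for the slope
            h00 += w               # entries of the information matrix
            h01 += w * f
            h11 += w * f * f
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def iptw_ate(outcome, treated, ps):
    """Normalized (Hajek) IPTW estimate of the average treatment effect."""
    num1 = sum(a * y / p for y, a, p in zip(outcome, treated, ps))
    den1 = sum(a / p for a, p in zip(treated, ps))
    num0 = sum((1 - a) * y / (1.0 - p) for y, a, p in zip(outcome, treated, ps))
    den0 = sum((1 - a) / (1.0 - p) for a, p in zip(treated, ps))
    return num1 / den1 - num0 / den0


def simulate(n, rng):
    """Hypothetical scenario: true PS depends on x**2; true ATE is 1.0."""
    x, a, y = [], [], []
    for _ in range(n):
        xi = rng.uniform(-2.0, 2.0)
        ai = 1 if rng.random() < sigmoid(-1.0 + 0.75 * xi * xi) else 0
        yi = 1.0 * ai + xi * xi + rng.gauss(0.0, 1.0)
        x.append(xi)
        a.append(ai)
        y.append(yi)
    return x, a, y


def estimate(feature, treated, outcome):
    """Estimate the PS from one feature, then plug it into IPTW."""
    b0, b1 = fit_logistic(feature, treated)
    ps = [sigmoid(b0 + b1 * f) for f in feature]
    return iptw_ate(outcome, treated, ps)


rng = random.Random(0)
x, a, y = simulate(2000, rng)
ate_misspecified = estimate(x, a, y)               # main-term model in x
ate_correct = estimate([v * v for v in x], a, y)   # model in x**2
print("misspecified PS model:", round(ate_misspecified, 3))
print("correct PS model:     ", round(ate_correct, 3))
```

In this assumed scenario the main-term model cannot capture the dependence of treatment on the squared covariate, so its IPTW estimate stays confounded, whereas the correctly specified model recovers an estimate close to the true effect of 1.0. A flexible learner such as SL is meant to approximate the correct functional form without it being specified in advance.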

SUBMITTER: Pirracchio R

PROVIDER: S-EPMC4351345 | biostudies-other | 2015 Jan

REPOSITORIES: biostudies-other

Publications

Improving propensity score estimators' robustness to model misspecification using super learner.

Romain Pirracchio, Maya L. Petersen, Mark van der Laan

American Journal of Epidemiology, 2014 Dec 16 (issue 2)

PMID: 25515168
