Dataset Information

Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference.

ABSTRACT: This paper investigates the appropriateness of the integration of flexible propensity score modeling (nonparametric or machine learning approaches) in semiparametric models for the estimation of a causal quantity, such as the mean outcome under treatment. We begin with an overview of some of the issues involved in knowledge-based and statistical variable selection in causal inference and the potential pitfalls of automated selection based on the fit of the propensity score. Using a simple example, we directly show the consequences of adjusting for pure causes of the exposure when using inverse probability of treatment weighting (IPTW). Such variables are likely to be selected when using a naive approach to model selection for the propensity score. We describe how the method of Collaborative Targeted minimum loss-based estimation (C-TMLE; van der Laan and Gruber, 2010 [27]) capitalizes on the collaborative double robustness property of semiparametric efficient estimators to select covariates for the propensity score based on the error in the conditional outcome model. Finally, we compare several approaches to automated variable selection in low- and high-dimensional settings through a simulation study. From this simulation study, we conclude that using IPTW with flexible prediction for the propensity score can result in inferior estimation, while Targeted minimum loss-based estimation and C-TMLE may benefit from flexible prediction and remain robust to the presence of variables that are highly correlated with treatment. However, in our study, standard influence function-based methods for the variance underestimated the standard errors, resulting in poor coverage under certain data-generating scenarios.
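The claim about pure causes of the exposure can be illustrated with a small simulation. The sketch below is not the paper's example or simulation design; the data-generating values, the variable names (W for a confounder, Z for a pure cause of exposure) and the function simulate_iptw are assumptions made for illustration. To isolate the effect of the adjustment set, it plugs in the known propensity scores rather than fitting them, and compares the IPTW estimator of the mean outcome under treatment when weights use P(A=1 | W) versus P(A=1 | W, Z).

```python
import numpy as np

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def simulate_iptw(n_reps=2000, n=500, seed=0):
    """Repeatedly estimate E[Y^1] by IPTW with two valid propensity scores.

    Hypothetical setup: W is a binary confounder, Z is a binary pure cause
    of exposure (it affects A but not Y). True E[Y^1] = 1 + 0.5 + 1 = 2.5.
    """
    rng = np.random.default_rng(seed)
    est_w = np.empty(n_reps)    # weights based on P(A=1 | W)
    est_wz = np.empty(n_reps)   # weights based on P(A=1 | W, Z)
    for r in range(n_reps):
        w = rng.binomial(1, 0.5, n)
        z = rng.binomial(1, 0.5, n)
        # True propensity score given both W and Z (assumed known here).
        g_wz = expit(-2.0 + 0.5 * w + 3.0 * z)
        # True propensity score given W only: average over Z ~ Bernoulli(0.5).
        g_w = 0.5 * expit(-2.0 + 0.5 * w) + 0.5 * expit(1.0 + 0.5 * w)
        a = rng.binomial(1, g_wz)
        y = 1.0 + w + a + rng.normal(0.0, 1.0, n)  # Z does not enter the outcome
        # Horvitz-Thompson style IPTW estimators of the mean outcome under treatment.
        est_w[r] = np.mean(a * y / g_w)
        est_wz[r] = np.mean(a * y / g_wz)
    return est_w, est_wz

est_w, est_wz = simulate_iptw()
print("mean, adjusting for W only:  ", est_w.mean())
print("mean, adjusting for W and Z: ", est_wz.mean())
print("variance, W only:  ", est_w.var())
print("variance, W and Z: ", est_wz.var())
```

Under these assumptions both estimators are centered near 2.5, since W alone suffices for confounding control, but conditioning the weights on Z pushes some propensity scores toward zero and inflates the variance. A propensity score model selected purely on its fit to the exposure would favor including Z, which is the pitfall the abstract describes.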

SUBMITTER: Schnitzer ME

PROVIDER: S-EPMC4733443 | biostudies-literature | 2016 May

REPOSITORIES: biostudies-literature

Publications

Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference.

Schnitzer, Mireille E; Lok, Judith J; Gruber, Susan

The International Journal of Biostatistics, 2016 May 1, issue 1

PMID: 26226129

Similar Datasets

Outcome-adaptive lasso: Variable selection for causal inference.
| S-EPMC5591052 | biostudies-literature

Causal Inference in Multisensory Heading Estimation.
| S-EPMC5218471 | biostudies-literature

Identification of key somatic oncogenic mutation based on a confounder-free causal inference model.
| S-EPMC9499235 | biostudies-literature

Uncertainty in Propensity Score Estimation: Bayesian Methods for Variable Selection and Model Averaged Causal Effects.
| S-EPMC3969816 | biostudies-literature

Handling Missing Data in Instrumental Variable Methods for Causal Inference.
| S-EPMC8025985 | biostudies-literature

Causal Proportional Hazards Estimation with a Binary Instrumental Variable.
| S-EPMC8716008 | biostudies-literature

A biologist's guide to model selection and causal inference.
| S-EPMC7893255 | biostudies-literature

Missing data estimation in fMRI dynamic causal modeling.
| S-EPMC4082189 | biostudies-literature

A comparison of confounder selection and adjustment methods for estimating causal effects using large healthcare databases.
| S-EPMC9304306 | biostudies-literature

Flexible variable selection in the presence of missing data.
| S-EPMC11323294 | biostudies-literature