
Dataset Information

Conditional Sure Independence Screening.


ABSTRACT: Independence screening is powerful for variable selection when the number of variables is massive. Commonly used independence screening methods are based on marginal correlations or their variants. When some prior knowledge on a certain important set of variables is available, a natural assessment of the relative importance of the other predictors is their conditional contribution to the response given the known set of variables. This results in conditional sure independence screening (CSIS). CSIS produces a rich family of alternative screening methods through different choices of the conditioning set and can help reduce the number of false-positive and false-negative selections when covariates are highly correlated. This paper proposes and studies CSIS in generalized linear models. We give conditions under which sure screening is possible and derive an upper bound on the number of selected variables. We also spell out the situation under which CSIS yields model selection consistency and the properties of CSIS when a data-driven conditioning set is used. Moreover, we provide two data-driven methods to select the thresholding parameter of conditional screening. The utility of the procedure is illustrated by simulation studies and analysis of two real datasets.
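As a rough illustration of the idea in the abstract, the sketch below ranks each candidate predictor by the magnitude of its coefficient when it is fitted jointly with a known conditioning set. It is only a minimal reading of "conditional contribution", restricted to the Gaussian linear model (least squares) rather than a general GLM. The function name `conditional_screen`, its parameters, and the simulated data are all hypothetical and are not taken from the paper or its software. The data are built so that predictor 1 is marginally uncorrelated with the response in the population but matters once predictor 0 is conditioned on.

```python
import numpy as np

def conditional_screen(X, y, cond_idx, top_k):
    """Rank predictors outside cond_idx by |coef| in the fit y ~ 1 + X_C + X_j.

    Hypothetical helper: linear-model special case only, for illustration.
    """
    n, p = X.shape
    # Standardize columns so coefficient magnitudes are comparable across j.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    base = np.column_stack([np.ones(n), Xs[:, cond_idx]])
    scores = {}
    for j in range(p):
        if j in cond_idx:
            continue
        # Fit the conditioning set plus one candidate predictor.
        design = np.column_stack([base, Xs[:, j]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        scores[j] = abs(coef[-1])  # conditional contribution of X_j
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores

# Simulated data (assumed setup, not from the paper).
rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.standard_normal((n, p))
# Make X_1 correlated with X_0 (population correlation 0.5).
X[:, 1] = 0.5 * X[:, 0] + np.sqrt(0.75) * rng.standard_normal(n)
# Coefficients chosen so that cov(y, X_1) = 2 * 0.5 - 1 = 0 in the population.
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.standard_normal(n)

top, scores = conditional_screen(X, y, cond_idx=[0], top_k=5)
marginal_corr_x1 = np.corrcoef(X[:, 1], y)[0, 1]
print("top conditional ranks:", top)
print("marginal correlation of X_1 with y:", round(marginal_corr_x1, 3))
```

In this setup a purely marginal screen would tend to miss predictor 1, while conditioning on predictor 0 brings it to the top of the ranking. The threshold (here a fixed `top_k`) is a placeholder; the abstract mentions data-driven choices that this sketch does not implement.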

SUBMITTER: Barut E 

PROVIDER: S-EPMC5367860 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

Conditional Sure Independence Screening.

Emre Barut, Jianqing Fan, Anneleen Verhasselt

Journal of the American Statistical Association, 2016-10-18, issue 515


Similar Datasets

| S-EPMC6831100 | biostudies-literature
| S-EPMC3887322 | biostudies-literature
| S-EPMC4368776 | biostudies-literature
| S-EPMC3293491 | biostudies-literature
| S-EPMC5308866 | biostudies-literature
| S-EPMC4142445 | biostudies-literature
| S-EPMC5367320 | biostudies-literature
| S-EPMC545795 | biostudies-literature
| S-EPMC3400959 | biostudies-literature
| S-EPMC4634993 | biostudies-literature