Dataset Information

A two-step method for variable selection in the analysis of a case-cohort study.


ABSTRACT:

Background

Accurate detection and estimation of true exposure-outcome associations are important in aetiological analysis. When there are multiple potential exposure variables of interest, methods are required for detecting the subset of variables most likely to have true associations with the outcome. Case-cohort studies often collect data on a large number of variables which have not been measured in the entire cohort (e.g. panels of biomarkers), yet there is a lack of guidance on methods for variable selection in case-cohort studies.

Methods

We describe and explore the application of three variable selection methods to data from a case-cohort study. These are: (i) selecting variables based on their level of significance in univariable (i.e. one-at-a-time) Prentice-weighted Cox regression models; (ii) stepwise selection applied to Prentice-weighted Cox regression; and (iii) a two-step method which first applies a Bayesian variable selection algorithm, using multivariable logistic regression, to obtain a posterior probability of selection for each variable, and then estimates effects using Prentice-weighted Cox regression.
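All three methods rely on Prentice-weighted Cox regression, so it may help to see how that weighting shapes the risk sets. The sketch below is illustrative only and is not taken from the authors' R package. It assumes the standard Prentice scheme, in which subcohort members contribute to every risk set while they are under follow-up, and cases outside the subcohort contribute only at their own failure time. The `Participant` class and `prentice_risk_set` function are hypothetical names, and follow-up is assumed to start at time 0 for everyone.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Participant:
    """Hypothetical record for one case-cohort participant."""
    pid: str
    exit_time: float    # time of event or censoring (entry assumed at time 0)
    is_case: bool       # True if the outcome occurred at exit_time
    in_subcohort: bool  # True if sampled into the random subcohort


def prentice_risk_set(participants: List[Participant], failure_time: float) -> List[str]:
    """Return ids in the risk set at `failure_time` under Prentice weighting."""
    risk_set = []
    for p in participants:
        still_at_risk = p.exit_time >= failure_time
        fails_now = p.is_case and p.exit_time == failure_time
        if p.in_subcohort and still_at_risk:
            # Subcohort members count for as long as they are under follow-up.
            risk_set.append(p.pid)
        elif not p.in_subcohort and fails_now:
            # Cases outside the subcohort count only at their own failure time.
            risk_set.append(p.pid)
    return risk_set


# Toy data (invented for illustration, not from the study).
toy = [
    Participant("a", 5.0, False, True),
    Participant("b", 3.0, True, True),
    Participant("c", 2.0, True, False),
    Participant("d", 4.0, True, False),
]
risk_at_2 = prentice_risk_set(toy, 2.0)
risk_at_4 = prentice_risk_set(toy, 4.0)
```

In the toy data, the non-subcohort case `c` appears in the risk set at time 2.0 (its own failure time) but `d` does not, even though `d` is still event-free at that time. This is what distinguishes the Prentice pseudo-likelihood from a full-cohort partial likelihood.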

Results

Across nine different simulation scenarios, the two-step method demonstrated higher sensitivity and lower false discovery rate than the one-at-a-time and stepwise methods. In an application of the methods to data from the EPIC-InterAct case-cohort study, the two-step method identified an additional two fatty acids as being associated with incident type 2 diabetes, compared with the one-at-a-time and stepwise methods.
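The two performance measures in the simulation comparison can be made concrete with a small sketch. It assumes the usual definitions: sensitivity is the proportion of truly associated variables that are selected, and the false discovery proportion is the share of selected variables that are not truly associated (its average over simulation replicates gives the false discovery rate). The abstract does not state the exact definitions used, so treat these as assumptions. The function name `selection_metrics` and the variable labels are hypothetical.

```python
from typing import Iterable, Tuple


def selection_metrics(selected: Iterable[str], true_signals: Iterable[str]) -> Tuple[float, float]:
    """Return (sensitivity, false discovery proportion) for one simulated data set."""
    selected_set = set(selected)
    truth_set = set(true_signals)
    true_positives = len(selected_set & truth_set)
    false_positives = len(selected_set - truth_set)
    # Sensitivity is undefined when there are no true signals.
    sensitivity = true_positives / len(truth_set) if truth_set else float("nan")
    # By convention, selecting nothing gives a false discovery proportion of 0.
    fdp = false_positives / len(selected_set) if selected_set else 0.0
    return sensitivity, fdp


# Invented example: four true signals, three variables selected, one of them wrongly.
sens, fdp = selection_metrics(["x1", "x2", "x5"], ["x1", "x2", "x3", "x4"])
```

Here two of the four true signals are recovered and one of the three selections is false, so the sensitivity is 0.5 and the false discovery proportion is one third.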

Conclusions

The two-step method enables more powerful and accurate detection of exposure-outcome associations in case-cohort studies. An R package is available to enable researchers to apply this method.

SUBMITTER: Newcombe PJ 

PROVIDER: S-EPMC5913627 | biostudies-literature | 2018 Apr

REPOSITORIES: biostudies-literature

Publications

A two-step method for variable selection in the analysis of a case-cohort study.

Newcombe PJ, Connolly S, Seaman S, Richardson S, Sharp SJ

International Journal of Epidemiology, 1 April 2018, issue 2



Similar Datasets

| S-EPMC5436496 | biostudies-literature
| S-EPMC6026088 | biostudies-literature
| S-EPMC6748310 | biostudies-literature
| S-EPMC2804783 | biostudies-literature
| S-EPMC3531267 | biostudies-literature
| S-EPMC5787409 | biostudies-literature
| S-EPMC8778055 | biostudies-literature
| S-EPMC6268217 | biostudies-literature
| S-EPMC6901488 | biostudies-literature
| S-EPMC3025714 | biostudies-literature