Dataset Information

Adequate sample size for developing prediction models is not simply related to events per variable.


ABSTRACT:

Objectives

The choice of an adequate sample size for a Cox regression analysis is generally based on a rule of thumb, derived from simulation studies, of a minimum of 10 events per variable (EPV). One simulation study suggested scenarios in which the 10 EPV rule can be relaxed. The effect of a range of binary predictors with varying prevalence, reflecting clinical practice, has not yet been fully investigated.
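As a point of reference for the rule of thumb described above, EPV is the number of outcome events divided by the number of candidate predictor parameters. The sketch below is only illustrative: the function names and the example counts (120 events, 15 parameters) are assumptions and do not come from the study.

```python
import math


def events_per_variable(n_events: int, n_parameters: int) -> float:
    """Return EPV: outcome events divided by candidate predictor parameters."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return n_events / n_parameters


def min_events_for_epv(target_epv: float, n_parameters: int) -> int:
    """Return the smallest number of events that reaches the target EPV."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return math.ceil(target_epv * n_parameters)


# Hypothetical example values, not taken from the paper.
epv = events_per_variable(n_events=120, n_parameters=15)
needed = min_events_for_epv(target_epv=10, n_parameters=15)
print(f"EPV = {epv:.1f}; events needed for EPV 10 = {needed}")
```

Note that the denominator counts parameters rather than variables, so a categorical predictor with several levels contributes more than one.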

Study design and setting

We conducted an extended resampling study using a large general-practice data set, comprising over 2 million anonymized patient records, to examine the EPV requirements for prediction models with low-prevalence binary predictors developed using Cox regression. The performance of the models was then evaluated using an independent external validation data set. We investigated both fully specified models and models derived using variable selection.

Results

Our results indicated that an EPV rule of thumb should be data driven and that EPV ≥ 20 generally eliminates bias in regression coefficients when many low-prevalence predictors are included in a Cox model.

Conclusion

Higher EPV is needed when low-prevalence predictors are present in a model to eliminate bias in regression coefficients and improve predictive accuracy.
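One way to see why prevalence matters is to count how many events are expected among patients who actually carry a rare binary predictor. The sketch below is a rough illustration, not the authors' method: it assumes, for simplicity, that events are spread proportionally to prevalence (that is, the predictor is unrelated to the outcome), and the function name and numbers are hypothetical.

```python
def expected_exposed_events(n_events: int, prevalence: float) -> float:
    """Expected events among patients with the predictor present.

    Simplifying assumption: events are distributed proportionally to
    prevalence, i.e. the predictor has no association with the outcome.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    return n_events * prevalence


# Hypothetical scenario: 200 events and 10 predictors give EPV = 20.
n_events = 200
for prevalence in (0.5, 0.1, 0.02):
    exposed = expected_exposed_events(n_events, prevalence)
    print(f"prevalence {prevalence:.2f}: about {exposed:.0f} events among exposed patients")
```

Under these assumptions, the same overall EPV leaves only a handful of events to inform the coefficient of a 2%-prevalence predictor, which is consistent with the conclusion that EPV alone does not determine an adequate sample size.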

SUBMITTER: Ogundimu EO 

PROVIDER: S-EPMC5045274 | biostudies-literature | 2016 Aug

REPOSITORIES: biostudies-literature

Publications

Adequate sample size for developing prediction models is not simply related to events per variable.

Ogundimu EO, Altman DG, Collins GS

Journal of Clinical Epidemiology, 2016-03-08


Similar Datasets

| S-EPMC6710621 | biostudies-literature
| S-EPMC10439652 | biostudies-literature
| S-EPMC6590172 | biostudies-literature
| S-EPMC4048208 | biostudies-literature
| S-EPMC8649413 | biostudies-literature
| S-EPMC8026952 | biostudies-literature
| S-EPMC6874355 | biostudies-literature
| S-EPMC7868330 | biostudies-literature
| S-EPMC9995454 | biostudies-literature
| S-EPMC8352630 | biostudies-literature