Dataset Information

Two-stage sampling designs for external validation of personal risk models.

ABSTRACT: We propose a cost-effective sampling design and estimating procedure for validating personal risk models using right-censored cohort data. Validation involves using each subject's covariates, as ascertained at cohort entry, in a risk model (specified independently of the data) to assign him/her a probability of an adverse outcome within a future time period. Subjects are then grouped according to the magnitudes of their assigned risks, and within each group, the mean assigned risk is compared with the probability of outcome occurrence as estimated using the follow-up data. Such validation presents two complications. First, in the presence of right-censoring, estimating the probability of developing the outcomes before death requires competing risk analysis. Second, for rare outcomes, validation using the full cohort requires assembling covariates and assigning risks to thousands of subjects. This can be costly when some covariates involve analyzing biological specimens. A two-stage sampling design addresses this problem by assembling covariates and assigning risks only to those subjects most informative for estimating key parameters. We use this design to estimate the outcome probabilities needed to evaluate model performance and we provide theoretical and bootstrap estimates of their variances. We also describe how to choose two-stage designs with minimal efficiency loss for a parameter of interest when the quantities determining optimality are unknown at the time of design. We illustrate these methods by using subjects in the California Teachers Study to validate ovarian cancer risk models. We find that a design with optimal efficiency for one performance parameter need not be so for others, and trade-offs will be required. A two-stage design that samples all outcome-positive subjects and more outcome-negative than censored subjects will perform well in most circumstances. 
The methods are implemented in Risk Model Assessment Program, an R program freely available at http://med.stanford.edu/epidemiology/two-stage.html.
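As a rough illustration of the two-stage idea described in the abstract, the sketch below draws a second-stage subsample stratified by first-stage status (outcome-positive, outcome-negative, censored) and attaches inverse-probability weights to the sampled subjects. It is a minimal Python sketch under stated assumptions, not the authors' estimators and not the Risk Model Assessment Program: the function names, status labels, and sampling fractions are hypothetical, and the weighted mean shown does not handle censoring or competing risks.

```python
import random
from collections import defaultdict


def two_stage_sample(statuses, fractions, seed=0):
    """Select a second-stage subsample stratified by first-stage status.

    statuses  : list of first-stage labels, one per cohort member
    fractions : dict mapping each label to a sampling fraction in (0, 1]
    Returns {subject index: inverse-probability weight} for sampled subjects.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for i, label in enumerate(statuses):
        strata[label].append(i)

    weights = {}
    for label, members in strata.items():
        # Fixed sample size per stratum, at least one subject.
        n_take = max(1, round(fractions[label] * len(members)))
        chosen = rng.sample(members, n_take)
        w = len(members) / n_take  # each sampled subject represents w cohort members
        for i in chosen:
            weights[i] = w
    return weights


def weighted_mean(values, weights):
    """Weighted mean of `values` over the sampled subjects only."""
    total_w = sum(weights.values())
    return sum(values[i] * w for i, w in weights.items()) / total_w


# Assumed toy cohort: sample all positives, more negatives than censored.
statuses = ["positive"] * 20 + ["negative"] * 600 + ["censored"] * 380
fractions = {"positive": 1.0, "negative": 0.2, "censored": 0.1}
weights = two_stage_sample(statuses, fractions)
```

In this toy setup, covariates (and hence assigned risks) would only need to be assembled for the subjects in `weights`, and `weighted_mean` could then be applied to their assigned risks within each risk group. The fractions mirror the abstract's recommendation only qualitatively.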

SUBMITTER: Whittemore AS

PROVIDER: S-EPMC3971015 | biostudies-literature | 2016 Aug

REPOSITORIES: biostudies-literature

Publications

Two-stage sampling designs for external validation of personal risk models.

Whittemore AS, Halpern J

Statistical Methods in Medical Research. 2013 Apr 16;(4)

PMID: 23592716

Similar Datasets

Nonparametric Maximum Likelihood Estimators of Time-Dependent Accuracy Measures for Survival Outcome Under Two-Stage Sampling Designs.
| S-EPMC6291304 | biostudies-literature

Novel two-phase sampling designs for studying binary outcomes.
| S-EPMC7042058 | biostudies-literature

Severe radiation-induced lymphopenia during concurrent chemoradiotherapy for stage III non-small cell lung cancer: external validation of two prediction models.
| S-EPMC10665840 | biostudies-literature

A new concordance measure for risk prediction models in external validation settings.
| S-EPMC5550798 | biostudies-literature

External validation and comparison of risk score models in pediatric heart transplants.
| S-EPMC9157612 | biostudies-literature

Two-stage family-based designs for sequencing studies.
| S-EPMC4143728 | biostudies-literature

False discovery rate control in two-stage designs.
| S-EPMC3496575 | biostudies-literature

External validation of two mpMRI-risk calculators predicting risk of prostate cancer before biopsy.
| S-EPMC9512729 | biostudies-literature

External validation of risk prediction models for incident colorectal cancer using UK Biobank.
| S-EPMC5846069 | biostudies-literature

A permutation method to assess heterogeneity in external validation for risk prediction models.
| S-EPMC4301917 | biostudies-literature