Dataset Information

Robust risk prediction with biomarkers under two-phase stratified cohort design.

ABSTRACT: Identification of novel biomarkers for risk prediction is important for disease prevention and optimal treatment selection. However, studies aiming to discover which biomarkers are useful for risk prediction often require the use of stored biological samples from large assembled cohorts, and thus the depletion of a finite and precious resource. To make efficient use of such stored samples, two-phase sampling designs are often adopted as resource-efficient sampling strategies, especially when the outcome of interest is rare. Existing methods for analyzing data from two-phase studies focus primarily on single-marker analysis or on fitting the Cox regression model to combine information from multiple markers. However, the Cox model may not fit the data well, and under model misspecification the composite score derived from it may not perform well in predicting the outcome. Under a general two-phase stratified cohort sampling design, we present a novel approach to combining multiple markers to optimize prediction by fitting a flexible nonparametric transformation model. Using inverse probability weighting to account for the outcome-dependent sampling, we propose to estimate the model parameters by maximizing an objective function that can be interpreted as a weighted C-statistic for survival outcomes. Regardless of model adequacy, the proposed procedure yields a sensible composite risk score for prediction. A major obstacle to inference in two-phase studies is the correlation induced by finite population sampling, which prevents standard inference procedures such as the bootstrap from being used for variance estimation. We propose a resampling procedure to derive valid confidence intervals for the model parameters and the C-statistic accuracy measure. We illustrate the new methods with simulation studies and an analysis of a two-phase study of high-density lipoprotein cholesterol (HDL-C) subtypes for predicting the risk of coronary heart disease.
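
The inverse-probability-weighted C-statistic mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator or code: it assumes a generic Harrell-type concordance, assumes phase-two sampling weights equal to the stratum size divided by the number sampled in that stratum, and does not perform the maximization over model parameters. The function names `stratum_weights` and `weighted_c_statistic` are hypothetical.

```python
def stratum_weights(stratum, sampled):
    """Assumed weights: N_s / n_s for sampled subjects, 0 for unsampled ones."""
    totals, picked = {}, {}
    for s, keep in zip(stratum, sampled):
        totals[s] = totals.get(s, 0) + 1
        picked[s] = picked.get(s, 0) + (1 if keep else 0)
    # A sampled subject guarantees picked[s] >= 1, so the division is safe.
    return [totals[s] / picked[s] if keep else 0.0
            for s, keep in zip(stratum, sampled)]


def weighted_c_statistic(time, event, score, weight):
    """Weighted concordance between risk scores and right-censored times.

    A pair (i, j) is usable when subject i has an observed event before
    time[j]. It counts as concordant when score[i] > score[j]; ties in the
    score count as one half. Each pair is weighted by weight[i] * weight[j].
    """
    num = den = 0.0
    n = len(time)
    for i in range(n):
        if not event[i] or weight[i] == 0:
            continue
        for j in range(n):
            if time[i] < time[j] and weight[j] > 0:
                w = weight[i] * weight[j]
                den += w
                if score[i] > score[j]:
                    num += w
                elif score[i] == score[j]:
                    num += 0.5 * w
    if den == 0:
        raise ValueError("no usable pairs")
    return num / den


# Toy data (made up for illustration only).
weights = stratum_weights(stratum=[0, 0, 0, 1], sampled=[True, False, True, True])
c_stat = weighted_c_statistic(
    time=[1, 2, 3, 4], event=[1, 1, 0, 1], score=[4, 3, 2, 1], weight=weights
)
```

In a setting like the one described, the score would be a combination of several markers, and the combination coefficients would be chosen to make this weighted concordance as large as possible.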

SUBMITTER: Payne R

PROVIDER: S-EPMC5045782 | biostudies-literature | 2016 Dec

REPOSITORIES: biostudies-literature


Publications

Robust risk prediction with biomarkers under two-phase stratified cohort design.

Rebecca Payne, Ming Yang, Yingye Zheng, Majken K. Jensen, Tianxi Cai

Biometrics, 2016-04-01, issue 4


PMID: 27037494

Similar Datasets

Project description: BACKGROUND: Given the inherent challenges of conducting randomized phase III trials in older cancer patients, single-arm phase II trials that assess the feasibility of a treatment already shown to be effective in a younger population may provide a compelling alternative. Such an approach would need to evaluate treatment feasibility based on a composite endpoint that combines multiple clinical dimensions, and to stratify older patients as fit or frail to account for the heterogeneity of the study population when recommending an appropriate treatment approach. In this context, stratified adaptive two-stage designs for binary or composite endpoints, initially developed for biomarker studies, allow two subgroups to be included whilst maintaining competitive statistical performance. In practice, heterogeneity may affect more than one dimension, so incorporating co-primary endpoints, which independently assess each clinical dimension, appears pertinent. The current paper presents a novel phase II design for co-primary endpoints that takes into account the heterogeneity of a population. METHODS: We developed a stratified adaptive Bryant & Day design based on the Jones et al. and Parashar et al. algorithm. This two-stage design allows joint assessment of two dimensions (e.g. activity and toxicity) in two different subgroups. The operating characteristics of this new design were evaluated using examples and simulation comparisons with the Bryant & Day design in the context where the study population is stratified according to a pre-defined criterion. RESULTS: Simulation results demonstrated that the new design minimized the expected and maximum sample sizes compared with parallel Bryant & Day designs (one in each subgroup), whilst controlling type I error rates and maintaining competitive statistical power as well as a high probability of detecting heterogeneity. CONCLUSIONS: In a heterogeneous population, this two-stage stratified adaptive phase II design provides a useful alternative to the classical one and allows a subgroup of interest to be identified without dramatically increasing sample size. As heterogeneity is not limited to older populations, this new design may also be relevant to other study populations, such as children or adolescents and young adults, or to the development of targeted therapies based on a biomarker.


