Project description:The stepped wedge cluster randomized trial (SW-CRT) is an increasingly popular design for evaluating health service delivery or policy interventions. An essential consideration of this design is the need to account for both within-period and between-period correlations in sample size calculations. Especially when embedded in health care delivery systems, many SW-CRTs may have subclusters nested in clusters, within which outcomes are collected longitudinally. However, existing sample size methods that account for between-period correlations have not allowed for multiple levels of clustering. We present computationally efficient sample size procedures that properly differentiate within-period and between-period intracluster correlation coefficients in SW-CRTs in the presence of subclusters. We introduce an extended block exchangeable correlation matrix to characterize the complex dependencies of outcomes within clusters. For Gaussian outcomes, we derive a closed-form sample size expression that depends on the correlation structure only through two eigenvalues of the extended block exchangeable correlation structure. For non-Gaussian outcomes, we present a generic sample size algorithm based on linearization and elucidate simplifications under canonical link functions. For example, we show that the approximate sample size formula under a logistic linear mixed model depends on three eigenvalues of the extended block exchangeable correlation matrix. We provide an extension to accommodate unequal cluster sizes and validate the proposed methods via simulations. Finally, we illustrate our methods in two real SW-CRTs with subclusters.
Project description:The stepped wedge design is often used to evaluate interventions as they are rolled out across schools, health clinics, communities, or other clusters. Most models used in the design and analysis of stepped wedge trials assume that the intervention effect is immediate and constant over time following implementation of the intervention (the "exposure time"). This is known as the IT (immediate treatment effect) assumption. However, recent research has shown that using methods based on the IT assumption when the treatment effect varies over exposure time can give extremely misleading results. In this manuscript, we discuss the need to carefully specify an appropriate measure of the treatment effect when the IT assumption is violated and we show how a stepped wedge trial can be powered when it is anticipated that the treatment effect will vary as a function of the exposure time. Specifically, we describe how to power a trial when the exposure time indicator (ETI) model of Kenny et al. (Statistics in Medicine, 41, 4311-4339, 2022) is used and the estimand of interest is a weighted average of the time-varying treatment effects. We apply these methods to the ADDRESS-BP trial, a type 3 hybrid implementation study designed to address racial disparities in health care by evaluating a practice-based implementation strategy to reduce hypertension in African American communities.
Project description:Background: Controlled implementation trials often randomize the intervention at the site level, enrolling relatively few sites (e.g., 6-20) compared to trials that randomize by subject. Trials with few sites carry a substantial risk of an imbalance between intervened (cases) and non-intervened (control) sites in important site characteristics, thereby threatening the internal validity of the primary comparison. A stepped wedge design (SWD) staggers the intervention at sites over a sequence of times or time waves until all sites eventually receive the intervention. We propose a new randomization method, sequential balance, to control time trend in site allocation by minimizing sequential imbalance across multiple characteristics. We illustrate the new method by applying it to a SWD implementation trial. Methods: The trial investigated the impact of blended internal-external facilitation on the establishment of evidence-based teams in general mental health clinics in nine US Department of Veterans Affairs medical centers. Prior to randomization to start time, an expert panel of implementation researchers and health system program leaders identified by consensus a series of eight facility-level characteristics judged relevant to the success of implementation. We characterized each of the nine sites according to these consensus features. Using a weighted sum of these characteristics, we calculated imbalance scores for each of 1680 possible site assignments to identify the most sequentially balanced assignment schemes. Results: From 1680 possible site assignments, we identified 34 assignments with minimal imbalance scores, and then randomly selected one assignment by which to randomize start time. Initially, the mean imbalance score was 3.10, but restricted to the 34 assignments, it declined to 0.99. Conclusions: Sequential balancing of site characteristics across groups of sites in the time waves of a SWD strengthens the internal validity of study conclusions by minimizing potential confounding. Trial registration: Registered at ClinicalTrials.gov as NCT02543840; entered 9/4/2015.
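The enumerate-score-restrict-randomize procedure described above can be sketched in Python. The count of 1680 is consistent with nine sites split into three ordered waves of three sites each (9!/(3!·3!·3!) = 1680), but the abstract does not state the wave structure, so that layout is an assumption here. The site characteristics, the weights, and the `imbalance` function below are hypothetical placeholders, not the score used in the trial.

```python
from itertools import combinations
import random

def wave_assignments(sites, wave_size=3):
    """Yield every ordered split of `sites` into consecutive waves of `wave_size`."""
    sites = tuple(sites)
    if not sites:
        yield ()
        return
    for wave in combinations(sites, wave_size):
        rest = tuple(s for s in sites if s not in wave)
        for tail in wave_assignments(rest, wave_size):
            yield (wave,) + tail

def imbalance(assignment, traits, weights):
    """Hypothetical score: weighted sum, over characteristics, of the
    spread (max minus min) of the wave-level means."""
    score = 0.0
    for k, w in enumerate(weights):
        means = [sum(traits[s][k] for s in wave) / len(wave) for wave in assignment]
        score += w * (max(means) - min(means))
    return score

rng = random.Random(0)                      # fixed seed so the sketch is reproducible
sites = list(range(9))                      # nine sites, as in the abstract
# Made-up binary values for eight facility-level characteristics per site.
traits = {s: [rng.randint(0, 1) for _ in range(8)] for s in sites}
weights = [1.0] * 8                         # equal weights, assumed for illustration

assignments = list(wave_assignments(sites))
scores = [imbalance(a, traits, weights) for a in assignments]
best = min(scores)
# Restrict to the minimally imbalanced assignments, then randomize among them.
candidates = [a for a, sc in zip(assignments, scores) if sc <= best + 1e-9]
chosen = rng.choice(candidates)
```

Randomizing within the restricted set keeps the start-time allocation random while excluding the most imbalanced schemes. How many assignments tie for the minimum depends entirely on the score definition and the site data.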
Project description:In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator.
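To make the "two eigenvalues" statement concrete, the following Python sketch computes the eigenvalues of a block exchangeable correlation matrix and the resulting variance of the treatment effect estimator for a continuous outcome. It assumes equal cluster-period sizes `n`, categorical period effects, and a correctly specified correlation structure. The symbols `a0`, `a1`, `a2` (within-period, inter-period, and within-individual correlations) and all function names are notation chosen here. The variance expression follows from applying generalized least squares to cluster-period means under those assumptions and is not quoted from the paper.

```python
from statistics import NormalDist

def block_exchangeable_eigenvalues(n, T, a0, a1, a2):
    """Four distinct eigenvalues of the nT x nT block exchangeable matrix."""
    lam1 = 1 - a0 + a1 - a2
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    return lam1, lam2, lam3, lam4

def design_constants(X):
    """I clusters, T periods, and the sums U, W, V of the 0/1 treatment matrix X."""
    I, T = len(X), len(X[0])
    U = sum(map(sum, X))                                              # total treated cluster-periods
    W = sum(sum(X[i][j] for i in range(I)) ** 2 for j in range(T))    # squared column sums
    V = sum(sum(row) ** 2 for row in X)                               # squared row sums
    return I, T, U, W, V

def treatment_variance(X, n, sigma2, a0, a1, a2):
    """Variance of the treatment effect estimator; only lam3 and lam4 enter."""
    I, T, U, W, V = design_constants(X)
    _, _, lam3, lam4 = block_exchangeable_eigenvalues(n, T, a0, a1, a2)
    A = U**2 + I * T * U - T * W - I * V
    return (sigma2 / n) * I * T * lam3 * lam4 / (A * lam4 - (U**2 - I * V) * lam3)

def power(delta, variance, alpha=0.05):
    """Two-sided Wald-test power, normal approximation."""
    z = NormalDist()
    return z.cdf(abs(delta) / variance ** 0.5 - z.inv_cdf(1 - alpha / 2))

# Illustrative design (assumed): 8 clusters, 5 periods, 2 clusters per sequence.
X = [[int(j >= s) for j in range(5)] for s in (1, 2, 3, 4) for _ in range(2)]
var = treatment_variance(X, n=20, sigma2=1.0, a0=0.05, a1=0.025, a2=0.2)
pw = power(delta=0.3, variance=var)
```

Setting `a1 == a2` gives the cross-sectional case mentioned above. Setting all three correlations equal reduces the expression to the familiar random-intercept (Hussey and Hughes) variance.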
Project description:Numerous publications have now addressed the principles of designing, analyzing, and reporting the results of stepped-wedge cluster randomized trials. In contrast, there is little research available pertaining to the design and analysis of multiarm stepped-wedge cluster randomized trials, utilized to evaluate the effectiveness of multiple experimental interventions. In this paper, we address this by explaining how the required sample size in these multiarm trials can be ascertained when data are to be analyzed using a linear mixed model. We then go on to describe how the design of such trials can be optimized to balance between minimizing the cost of the trial and minimizing some function of the covariance matrix of the treatment effect estimates. Using a recently commenced trial that will evaluate the effectiveness of sensor monitoring in an occupational therapy rehabilitation program for older persons after hip fracture as an example, we demonstrate that our designs could reduce the number of observations required for a fixed power level by up to 58%. Consequently, when logistical constraints permit the utilization of any one of a range of possible multiarm stepped-wedge cluster randomized trial designs, researchers should consider employing our approach to optimize their trial's efficiency.
Project description:The ability to accurately estimate the sample size required by a stepped-wedge (SW) cluster randomized trial (CRT) routinely depends upon the specification of several nuisance parameters. If these parameters are misspecified, the trial could be overpowered, leading to increased cost, or underpowered, increasing the likelihood of a false negative. We address this issue here for cross-sectional SW-CRTs, analyzed with a particular linear mixed model, by proposing methods for blinded and unblinded sample size reestimation (SSRE). First, blinded estimators for the variance parameters of a SW-CRT analyzed using the Hussey and Hughes model are derived. Following this, procedures for blinded and unblinded SSRE after any time period in a SW-CRT are detailed. The performance of these procedures is then examined and contrasted using two example trial design scenarios. We find that if the two key variance parameters were underspecified by 50%, the SSRE procedures were able to increase power over the conventional SW-CRT design by up to 41%, resulting in an empirical power above the desired level. Thus, though there are practical issues to consider, the performance of the procedures means researchers should consider incorporating SSRE into future SW-CRTs.
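The sensitivity of power to the variance parameters can be illustrated with the closed-form variance of the treatment effect under the Hussey and Hughes model (cluster random intercept, categorical period effects, cross-sectional sampling). The Python sketch below is not the SSRE procedure itself. It only compares planned power with the power actually achieved when both variance components are twice their assumed values. The design, effect size, and variance values are illustrative assumptions, and the function names are hypothetical.

```python
from statistics import NormalDist

def standard_sw(n_sequences, clusters_per_sequence):
    """0/1 treatment matrix; sequence s (1-based) is treated from period s + 1 on."""
    T = n_sequences + 1
    rows = []
    for s in range(1, n_sequences + 1):
        row = [1 if j >= s else 0 for j in range(T)]   # j is the 0-based period index
        rows.extend(row[:] for _ in range(clusters_per_sequence))
    return rows

def hh_variance(X, n, sigma2_e, tau2):
    """Var(theta_hat) under the Hussey and Hughes model.
    sigma2_e: individual-level residual variance; tau2: cluster intercept variance;
    n: individuals per cluster-period."""
    I, T = len(X), len(X[0])
    s2 = sigma2_e / n                                   # variance of a cluster-period mean
    U = sum(map(sum, X))
    W = sum(sum(X[i][j] for i in range(I)) ** 2 for j in range(T))
    V = sum(sum(row) ** 2 for row in X)
    num = I * s2 * (s2 + T * tau2)
    den = (I * U - W) * s2 + (U**2 + I * T * U - T * W - I * V) * tau2
    return num / den

def power(theta, variance, alpha=0.05):
    """Two-sided Wald-test power, normal approximation."""
    z = NormalDist()
    return z.cdf(abs(theta) / variance ** 0.5 - z.inv_cdf(1 - alpha / 2))

X = standard_sw(n_sequences=4, clusters_per_sequence=2)   # 8 clusters, 5 periods
theta, n = 0.3, 20
assumed = dict(sigma2_e=1.0, tau2=0.05)                    # planning values (assumed)
true = {k: 2 * v for k, v in assumed.items()}              # both underspecified by 50%

power_planned = power(theta, hh_variance(X, n, **assumed))
power_true = power(theta, hh_variance(X, n, **true))
```

Because the variance expression is homogeneous of degree one in the two variance components, doubling both doubles the variance of the estimator. `power_true` is therefore below `power_planned`, which is the kind of shortfall a reestimation procedure is meant to correct.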
Project description:Objectives: To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study design and setting: We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results: For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is infeasible. Conclusion: Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario.
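As a minimal illustration of solving for cluster size given a fixed number of clusters, the Python sketch below treats only the simplest of the three designs, the parallel CRT, using its standard design effect 1 + (m - 1)·ICC. It does not reproduce the paper's formulas for the SW-CRT or CRT-BA, and the function names and example values are assumptions. Setting k·m equal to the individually randomized sample size times the design effect and solving for m gives m = n_ind·(1 - ICC) / (k - n_ind·ICC), which has a solution only when k > n_ind·ICC.

```python
import math
from statistics import NormalDist

def individual_n_per_arm(delta, sd, alpha=0.05, power=0.8):
    """Per-arm sample size for a two-sample z-test under individual randomization."""
    z = NormalDist()
    return 2 * (sd * (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / delta) ** 2

def crt_cluster_size(n_ind, k, icc):
    """Smallest cluster size m such that k clusters per arm meet
    k * m >= n_ind * (1 + (m - 1) * icc); None if no cluster size suffices."""
    if k <= n_ind * icc:
        return None                      # too few clusters for this ICC, at any m
    return math.ceil(n_ind * (1 - icc) / (k - n_ind * icc))

n_ind = individual_n_per_arm(delta=0.4, sd=1.0)   # illustrative effect size
m = crt_cluster_size(n_ind, k=10, icc=0.02)
```

The `None` branch reflects a well-known limit of cluster randomization: once the number of clusters falls to n_ind·ICC or below, enlarging the clusters cannot recover the target power. In that situation a design that adds within-cluster comparisons, such as the SW-CRT, may be worth considering.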
Project description:Recent years have seen a surge of interest in stepped-wedge cluster randomized trials (SW-CRTs). SW-CRTs include several design variations and methodology is rapidly developing. Accordingly, a variety of power and sample size calculation software for SW-CRTs has been developed. However, each calculator may support only a selected set of design features and may not be appropriate for all scenarios. Currently, there is no resource to assist researchers in selecting the most appropriate calculator for planning their trials. In this paper, we review and classify 18 existing calculators that can be implemented in major platforms, such as R, SAS, Stata, Microsoft Excel, PASS and nQuery. After reviewing the main sample size considerations for SW-CRTs, we summarize the features supported by the available calculators, including the types of designs, outcomes, correlation structures and treatment effects; whether incomplete designs, cluster-size variation or secular trends are accommodated; and the analytical approach used. We then discuss in more detail four main calculators and identify their strengths and limitations. We illustrate how to use these four calculators to compute power for two real SW-CRTs with a continuous and binary outcome and compare the results. We show that the choice of calculator can make a substantial difference in the calculated power and explain these differences. Finally, we make recommendations for implementing sample size or power calculations using the available calculators. An R Shiny app is available for users to select the calculator that meets their requirements (https://douyang.shinyapps.io/swcrtcalculator/).
Project description:Multivariate outcomes are common in pragmatic cluster randomized trials. While sample size calculation procedures for multivariate outcomes exist under parallel assignment, none have been developed for a stepped wedge design. In this article, we present computationally efficient power and sample size procedures for stepped wedge cluster randomized trials (SW-CRTs) with multivariate outcomes that differentiate the within-period and between-period intracluster correlation coefficients (ICCs). Under a multivariate linear mixed model, we derive the joint distribution of the intervention test statistics, which can be used for determining power under different hypotheses, and provide an example using the commonly utilized intersection-union test for co-primary outcomes. Simplifications under a common treatment effect and common ICCs across endpoints and an extension to closed-cohort designs are also provided. Finally, under the common ICC across endpoints assumption, we formally prove that the multivariate linear mixed model leads to a more efficient treatment effect estimator compared to the univariate linear mixed model, providing a rigorous justification for the use of the former with multivariate outcomes. We illustrate application of the proposed methods using data from an existing SW-CRT and present extensive simulations to validate the methods.
Project description:Stepped-wedge (SW) designs have been steadily implemented in a variety of trials. A SW design typically assumes a three-level hierarchical data structure where participants are nested within times or periods, which are in turn nested within clusters. Therefore, statistical models for analysis of SW trial data need to consider two correlations, the first- and second-level correlations. Existing power functions and sample size determination formulas have been derived based on statistical models for two-level data structures. Consequently, the second-level correlation has not been incorporated in conventional power analyses. In this paper, we derive a closed-form explicit power function based on a statistical model for three-level continuous outcome data. The power function is based on a pooled overall estimate of stratified cluster-specific estimates of an intervention effect. The sampling distribution of the pooled estimate is derived by applying a fixed-effect meta-analytic approach. Simulation studies verified that the derived power function is unbiased and is applicable to a varying number of participants per period per cluster. In addition, when data structures are assumed to have two levels, we compare three types of power functions by conducting additional simulation studies under a two-level statistical model. In this case, the power function based on a sampling distribution of a marginal, as opposed to pooled, estimate of the intervention effect performed the best. Extensions of power functions to binary outcomes are also suggested.