Project description:The stepped wedge cluster randomized trial (SW-CRT) is an increasingly popular design for evaluating health service delivery or policy interventions. An essential consideration of this design is the need to account for both within-period and between-period correlations in sample size calculations. Especially when embedded in health care delivery systems, many SW-CRTs may have subclusters nested in clusters, within which outcomes are collected longitudinally. However, existing sample size methods that account for between-period correlations have not allowed for multiple levels of clustering. We present computationally efficient sample size procedures that properly differentiate within-period and between-period intracluster correlation coefficients in SW-CRTs in the presence of subclusters. We introduce an extended block exchangeable correlation matrix to characterize the complex dependencies of outcomes within clusters. For Gaussian outcomes, we derive a closed-form sample size expression that depends on the correlation structure only through two eigenvalues of the extended block exchangeable correlation structure. For non-Gaussian outcomes, we present a generic sample size algorithm based on linearization and elucidate simplifications under canonical link functions. For example, we show that the approximate sample size formula under a logistic linear mixed model depends on three eigenvalues of the extended block exchangeable correlation matrix. We provide an extension to accommodate unequal cluster sizes and validate the proposed methods via simulations. Finally, we illustrate our methods in two real SW-CRTs with subclusters.
Project description:BACKGROUND:Controlled implementation trials often randomize the intervention at the site level, enrolling relatively few sites (e.g., 6-20) compared to trials that randomize by subject. Trials with few sites carry a substantial risk of an imbalance between intervened (cases) and non-intervened (control) sites in important site characteristics, thereby threatening the internal validity of the primary comparison. A stepped wedge design (SWD) staggers the intervention at sites over a sequence of times or time waves until all sites eventually receive the intervention. We propose a new randomization method, sequential balance, to control time trend in site allocation by minimizing sequential imbalance across multiple characteristics. We illustrate the new method by applying it to a SWD implementation trial. METHODS:The trial investigated the impact of blended internal-external facilitation on the establishment of evidence-based teams in general mental health clinics in nine US Department of Veterans Affairs medical centers. Prior to randomization to start time, an expert panel of implementation researchers and health system program leaders identified by consensus a series of eight facility-level characteristics judged relevant to the success of implementation. We characterized each of the nine sites according to these consensus features. Using a weighted sum of these characteristics, we calculated imbalance scores for each of 1680 possible site assignments to identify the most sequentially balanced assignment schemes. RESULTS:From 1680 possible site assignments, we identified 34 assignments with minimal imbalance scores, and then randomly selected one assignment by which to randomize start time. Initially, the mean imbalance score was 3.10, but restricted to the 34 assignments, it declined to 0.99. 
CONCLUSIONS:Sequential balancing of site characteristics across groups of sites in the time waves of a SWD strengthens the internal validity of study conclusions by minimizing potential confounding. TRIAL REGISTRATION:Registered at ClinicalTrials.gov as clinical trial #NCT02543840; entered 9/4/2015.
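The selection step described in this abstract can be sketched in a few lines. The abstract does not state the wave layout or the exact imbalance formula, so everything specific here is an assumption: three ordered waves of three sites (one layout consistent with the reported count of 1680, since 9!/(3!·3!·3!) = 1680), made-up binary site characteristics, equal weights, and a hypothetical score equal to the weighted sum of absolute differences between each wave mean and the overall mean.

```python
import random
from itertools import combinations

def wave_assignments(sites, wave_size):
    """Yield every ordered split of `sites` into waves of `wave_size` sites."""
    sites = tuple(sites)
    if not sites:
        yield ()
        return
    for first in combinations(sites, wave_size):
        rest = tuple(s for s in sites if s not in first)
        for tail in wave_assignments(rest, wave_size):
            yield (first,) + tail

def imbalance(assignment, traits, weights):
    """Hypothetical score: weighted sum of |wave mean - overall mean| per trait."""
    n_sites = sum(len(wave) for wave in assignment)
    score = 0.0
    for k, weight in enumerate(weights):
        overall = sum(traits[s][k] for wave in assignment for s in wave) / n_sites
        for wave in assignment:
            wave_mean = sum(traits[s][k] for s in wave) / len(wave)
            score += weight * abs(wave_mean - overall)
    return score

# Made-up data: 9 sites, 8 binary characteristics, equal weights (all assumptions).
traits = {s: [1 if (s * (k + 2) + k) % 3 == 0 else 0 for k in range(8)]
          for s in range(9)}
weights = [1.0] * 8

assignments = list(wave_assignments(range(9), 3))
scores = [imbalance(a, traits, weights) for a in assignments]
best = min(scores)

# Keep the assignments that tie for the minimal score, then pick one at random.
candidates = [a for a, s in zip(assignments, scores) if abs(s - best) < 1e-9]
chosen = random.Random(0).choice(candidates)
```

Restricting the random draw to `candidates` is what keeps the allocation random while bounding imbalance; the size of that set depends entirely on the scoring rule and data, so it will not reproduce the 34 assignments reported in the trial.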
Project description:In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator.
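To make the block exchangeable structure concrete, the following sketch builds the correlation matrix for one cluster of a closed cohort and checks its eigenvalues. The symbols `a0`, `a1`, `a2` (within-period, inter-period, and within-individual correlations) and the function names are illustrative assumptions, and the four eigenvalue expressions are derived here from the Kronecker structure of the matrix rather than quoted from the paper. The abstract's claim is that, for continuous responses, power depends on only two of them.

```python
def block_exchangeable(n, t, a0, a1, a2):
    """Correlation matrix for n individuals over t periods (period-major order).

    a0: same period, different individuals (within-period)
    a1: different periods, different individuals (inter-period)
    a2: different periods, same individual (within-individual)
    """
    size = n * t
    R = [[0.0] * size for _ in range(size)]
    for p in range(size):
        for q in range(size):
            same_period = p // n == q // n
            same_person = p % n == q % n
            if same_period and same_person:
                R[p][q] = 1.0
            elif same_period:
                R[p][q] = a0
            elif same_person:
                R[p][q] = a2
            else:
                R[p][q] = a1
    return R

def eigenvalues(n, t, a0, a1, a2):
    """The four distinct eigenvalues implied by the Kronecker structure."""
    return (
        1 - a0 + a1 - a2,
        1 - a0 - (t - 1) * (a1 - a2),
        1 + (n - 1) * (a0 - a1) - a2,
        1 + (n - 1) * a0 + (n - 1) * (t - 1) * a1 + (t - 1) * a2,
    )

def is_eigenpair(R, vec, lam, tol=1e-9):
    """Check R @ vec == lam * vec entry by entry."""
    product = [sum(r * v for r, v in zip(row, vec)) for row in R]
    return all(abs(p - lam * v) < tol for p, v in zip(product, vec))

def kron(u, v):
    """Kronecker product of a period vector u and an individual vector v."""
    return [uj * vi for uj in u for vi in v]

n, t = 4, 3                      # illustrative cohort size and number of periods
a0, a1, a2 = 0.05, 0.02, 0.30    # illustrative correlation values
R = block_exchangeable(n, t, a0, a1, a2)
lam = eigenvalues(n, t, a0, a1, a2)

ones_t, ones_n = [1.0] * t, [1.0] * n
contrast_t = [1.0, -1.0] + [0.0] * (t - 2)
contrast_n = [1.0, -1.0] + [0.0] * (n - 2)
eigvecs = (kron(contrast_t, contrast_n), kron(ones_t, contrast_n),
           kron(contrast_t, ones_n), kron(ones_t, ones_n))
checks = [is_eigenpair(R, v, l) for v, l in zip(eigvecs, lam)]
```

Setting `a2 == a1` removes the distinction between repeated measures on the same individual and on different individuals, which is the cross-sectional special case mentioned above.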
Project description:Numerous publications have now addressed the principles of designing, analyzing, and reporting the results of stepped-wedge cluster randomized trials. In contrast, there is little research available pertaining to the design and analysis of multiarm stepped-wedge cluster randomized trials, utilized to evaluate the effectiveness of multiple experimental interventions. In this paper, we address this by explaining how the required sample size in these multiarm trials can be ascertained when data are to be analyzed using a linear mixed model. We then go on to describe how the design of such trials can be optimized to balance between minimizing the cost of the trial and minimizing some function of the covariance matrix of the treatment effect estimates. Using a recently commenced trial that will evaluate the effectiveness of sensor monitoring in an occupational therapy rehabilitation program for older persons after hip fracture as an example, we demonstrate that our designs could reduce the number of observations required for a fixed power level by up to 58%. Consequently, when logistical constraints permit the utilization of any one of a range of possible multiarm stepped-wedge cluster randomized trial designs, researchers should consider employing our approach to optimize their trial's efficiency.
Project description:Objectives: To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study design and setting: We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results: For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Conclusion: Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario.
Project description:The ability to accurately estimate the sample size required by a stepped-wedge (SW) cluster randomized trial (CRT) routinely depends upon the specification of several nuisance parameters. If these parameters are misspecified, the trial could be overpowered, leading to increased cost, or underpowered, enhancing the likelihood of a false negative. We address this issue here for cross-sectional SW-CRTs, analyzed with a particular linear-mixed model, by proposing methods for blinded and unblinded sample size reestimation (SSRE). First, blinded estimators for the variance parameters of a SW-CRT analyzed using the Hussey and Hughes model are derived. Following this, procedures for blinded and unblinded SSRE after any time period in a SW-CRT are detailed. The performance of these procedures is then examined and contrasted using two example trial design scenarios. We find that if the two key variance parameters were underspecified by 50%, the SSRE procedures were able to increase power over the conventional SW-CRT design by up to 41%, resulting in an empirical power above the desired level. Thus, though there are practical issues to consider, the performance of the procedures means researchers should consider incorporating SSRE into future SW-CRTs.
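To illustrate why misspecified variance parameters matter, the sketch below computes the treatment-effect variance and power for a cross-sectional SW-CRT using the closed-form variance expression from the Hussey and Hughes model. It assumes known variance components and a two-sided Wald test with a normal approximation. It does not implement the reestimation procedures of the abstract. The function names, the design layout helper, and all numeric inputs are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def standard_design(n_clusters, n_steps):
    """0/1 treatment matrix (clusters x periods); equal clusters cross over per step.

    Assumes n_clusters is divisible by n_steps; there are n_steps + 1 periods.
    """
    per_step = n_clusters // n_steps
    periods = n_steps + 1
    return [[1 if j > i // per_step else 0 for j in range(periods)]
            for i in range(n_clusters)]

def hh_variance(X, n_per_cell, sigma2_e, tau2):
    """Variance of the treatment effect estimator under the Hussey and Hughes model.

    sigma2_e: individual-level variance; tau2: between-cluster variance.
    """
    I, T = len(X), len(X[0])
    sigma2 = sigma2_e / n_per_cell                     # variance of a cell mean
    U = sum(sum(row) for row in X)                     # total treated cells
    W = sum(sum(X[i][j] for i in range(I)) ** 2 for j in range(T))
    V = sum(sum(row) ** 2 for row in X)                # squared row sums
    numerator = I * sigma2 * (sigma2 + T * tau2)
    denominator = ((I * U - W) * sigma2
                   + (U * U + I * T * U - T * W - I * V) * tau2)
    return numerator / denominator

def wald_power(theta, variance, alpha=0.05):
    """Approximate two-sided power, ignoring the negligible opposite tail."""
    z = NormalDist()
    return z.cdf(abs(theta) / sqrt(variance) - z.inv_cdf(1 - alpha / 2))

X = standard_design(n_clusters=12, n_steps=4)

# Power at the planning values versus at larger (hypothetical) true values.
planned = wald_power(0.25, hh_variance(X, n_per_cell=10, sigma2_e=1.0, tau2=0.05))
actual = wald_power(0.25, hh_variance(X, n_per_cell=10, sigma2_e=1.5, tau2=0.075))
```

Comparing `planned` with `actual` shows the power shortfall that arises when the variance components assumed at the design stage are too small, which is the gap that sample size reestimation is meant to close.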
Project description:BACKGROUND:Cluster randomised trials with unequal sized clusters often have lower precision than with clusters of equal size. To allow for this, sample sizes are inflated by a modified version of the design effect for clustering. These inflation factors are valid under the assumption that randomisation is stratified by cluster size. We investigate the impact of unequal cluster size when that constraint is relaxed, with particular focus on the stepped-wedge cluster randomised trial, where this is more difficult to achieve. METHODS:Assuming a multi-level mixed effect model with exchangeable correlation structure for a cross-sectional design, we use simulation methods to compare the precision for a trial with clusters of unequal size to a trial with clusters of equal size (relative efficiency). For a range of scenarios we illustrate the impact of various design features (the cluster-mean correlation - a function of the intracluster correlation and the cluster size, the number of clusters, number of randomisation sequences) on the average and distribution of the relative efficiency. RESULTS:Simulations confirm that the average reduction in precision, due to varying cluster sizes, is smaller in a stepped-wedge trial compared to the parallel trial. However, the variance of the distribution of the relative efficiency is large; and is larger under the stepped-wedge design compared to the parallel design. This can result in large variations in actual power, depending on the allocation of clusters to sequences. Designs with larger variations in cluster sizes, smaller number of clusters and studies with smaller cluster-mean correlations (smaller cluster sizes or smaller intra-cluster correlation) are particularly at risk. CONCLUSION:The actual realised power in a stepped-wedge trial might be substantially higher or lower than that estimated. 
This is particularly important when there are a small number of clusters or the variability in cluster sizes is large. Constraining the randomisation on cluster size, where feasible, might mitigate this effect.
Project description:Stepped-wedge (SW) designs have been steadily implemented in a variety of trials. A SW design typically assumes a three-level hierarchical data structure where participants are nested within times or periods which are in turn nested within clusters. Therefore, statistical models for analysis of SW trial data need to consider two correlations, the first and second level correlations. Existing power functions and sample size determination formulas have been derived based on statistical models for two-level data structures. Consequently, the second-level correlation has not been incorporated in conventional power analyses. In this paper, we derive a closed-form explicit power function based on a statistical model for three-level continuous outcome data. The power function is based on a pooled overall estimate of stratified cluster-specific estimates of an intervention effect. The sampling distribution of the pooled estimate is derived by applying a fixed-effect meta-analytic approach. Simulation studies verified that the derived power function is unbiased and is applicable to varying numbers of participants per period per cluster. In addition, when data structures are assumed to have two levels, we compare three types of power functions by conducting additional simulation studies under a two-level statistical model. In this case, the power function based on a sampling distribution of a marginal, as opposed to pooled, estimate of the intervention effect performed the best. Extensions of power functions to binary outcomes are also suggested.
Project description:Background: A cluster trial with unequal cluster sizes often has lower precision than one with equal clusters, with a corresponding inflation of the design effect. For parallel group trials, adjustments to the design effect are available under sampling models with a single intracluster correlation. Design effects for equal clusters under more complex scenarios have appeared recently (including stepped wedge trials under cross-sectional or longitudinal sampling). We investigate the impact of unequal cluster size in these more general settings. Results: Assuming a linear mixed model with an exchangeable correlation structure that incorporates cluster and subject autocorrelation, we compute the relative efficiency (RE) of a trial with clusters of unequal size under a size-stratified randomization scheme, as compared to an equal cluster trial with the same total number of observations. If there are no within-cluster time effects, the RE exceeds that for a parallel trial. In general, the RE is a weighted average of the RE for a parallel trial and the RE for a crossover trial in the same clusters. Existing approximations for parallel designs are extended to the general setting. Increasing the cluster size by the factor (1 + CV²), where CV is the coefficient of variation of cluster size, leads to conservative sample sizes, as in a popular method for parallel trials. Conclusion: Methods to assess experimental precision for single-period parallel trials with unequal cluster sizes can be extended to stepped wedge and other complete layouts under longitudinal or cross-sectional sampling. In practice, the loss of precision due to unequal cluster sizes is unlikely to exceed 12%.
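The conservative inflation rule mentioned in this abstract is simple arithmetic, sketched below. The cluster sizes are made up, the function name is hypothetical, and the use of the population standard deviation for the coefficient of variation is an assumption, since the abstract does not specify the estimator.

```python
from math import ceil
from statistics import mean, pstdev

def cv_inflated_size(planned_size, cluster_sizes):
    """Inflate an equal-cluster size by (1 + CV^2), rounding up.

    planned_size: cluster size from an equal-cluster calculation.
    cluster_sizes: anticipated sizes, used only to estimate the CV.
    """
    cv = pstdev(cluster_sizes) / mean(cluster_sizes)   # population SD (assumption)
    return cv, ceil(planned_size * (1 + cv ** 2))

# Hypothetical anticipated cluster sizes with mean 50.
anticipated = [20, 35, 50, 65, 80]
cv, inflated = cv_inflated_size(planned_size=50, cluster_sizes=anticipated)
```

Because the abstract describes this factor as conservative, `inflated` should be read as an upper-end planning value rather than an exact requirement.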
Project description:Background: Stepped-wedge designs (SWDs) are currently being used in the investigation of interventions to reduce opioid-related deaths in communities located in several states. However, these interventions are competing with external factors such as newly initiated public policies limiting opioid prescriptions, media awareness campaigns, and COVID-19 social distancing mandates. Furthermore, control communities may prematurely adopt components of the intervention as they become available. The presence of time-varying external factors that impact study outcomes is a well-known limitation of SWDs; common approaches to adjusting for them make use of a mixed effects modeling framework. However, these models have several shortcomings when external factors differentially impact intervention and control clusters. Methods: We discuss limitations of commonly used mixed effects models in the context of proposed SWDs to investigate interventions intended to reduce opioid-related mortality, and propose extensions of these models to address these limitations. We conduct an extensive simulation study of anticipated data from SWD trials targeting the current opioid epidemic in order to examine the performance of these models in the presence of external factors. We consider confounding by time, premature adoption of components of the intervention, and time-varying effect modification, in which external factors differentially impact intervention and control clusters. Results: In the presence of confounding by time, commonly used mixed effects models yield unbiased intervention effect estimates, but can have inflated Type 1 error and result in undercoverage of confidence intervals. These models yield biased intervention effect estimates when premature intervention adoption or effect modification are present. 
In such scenarios, models incorporating fixed intervention-by-time interactions with an unstructured covariance for intervention-by-cluster-by-time random effects result in unbiased intervention effect estimates, reach nominal confidence interval coverage, and preserve Type 1 error. Conclusions: Mixed effects models can adjust for different combinations of external factors through correct specification of fixed and random time effects; misspecification can result in bias of the intervention effect estimate, undercoverage of confidence intervals, and Type 1 error inflation. Since model choice has considerable impact on validity of results and study power, careful consideration must be given to choosing appropriate models that account for potential external factors.