Sample Size Calculation for Cluster Randomized Trials

In cluster randomized trials, the intraclass correlation coefficient (ICC) is classically used to measure clustering. When the outcome is binary, the ICC is known to be associated with the prevalence of the outcome. This association complicates its interpretation and can be problematic for sample size calculation. To address this, Crespi et al. extended a coefficient named R, initially proposed by Rosner for ophthalmologic data, to cluster randomized trials. Crespi et al. asserted that R may be less influenced by the outcome prevalence than the ICC is, although they provided only empirical data to support this assertion. They also asserted that “the traditional ICC approach to sample size determination tends to overpower studies under many scenarios, calling for more clusters than truly required”, although they did not consider empirical power. The aim of this study was to investigate whether R could indeed be considered independent of the outcome prevalence. We also considered whether sample size calculation is better based on the R coefficient or on the ICC. Considering the particular case of 2 individuals per cluster, we demonstrated theoretically that R is not symmetrical around the 0.5 prevalence value, which in itself demonstrates the dependence of R on prevalence. We also conducted a simulation study to explore the case of both fixed and variable cluster sizes greater than 2. This simulation study showed that R decreases as prevalence increases from 0 to 1. Both the analytical and simulation results therefore show that R depends on the outcome prevalence. In terms of sample size calculation, we showed that an approach based on the ICC is preferable to an approach based on the R coefficient because, with the former, the empirical power is closer to the nominal one. Hence, the R coefficient offers no advantage over the ICC for binary outcomes.
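As an illustration of what an ICC-based sample size calculation for a binary outcome can look like, the sketch below inflates the usual two-proportion sample size by the standard design effect 1 + (m - 1) × ICC and converts the result into clusters per arm. It is a minimal sketch under stated assumptions (equal cluster size m, a two-sided test, unpooled variance), not the procedure used in the study above; the function name `clusters_per_arm` and its parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def clusters_per_arm(p1, p2, m, icc, alpha=0.05, power=0.80):
    """Approximate number of clusters per arm for comparing two proportions.

    Assumptions (illustrative only): equal cluster size m, two-sided test,
    unpooled variance, design effect 1 + (m - 1) * icc.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    # Sample size per arm under individual randomization.
    n_individual = (
        (z_alpha + z_beta) ** 2
        * (p1 * (1 - p1) + p2 * (1 - p2))
        / (p1 - p2) ** 2
    )
    # Inflate for clustering, then convert individuals to clusters.
    design_effect = 1 + (m - 1) * icc
    return ceil(n_individual * design_effect / m)


if __name__ == "__main__":
    print(clusters_per_arm(p1=0.5, p2=0.3, m=20, icc=0.05))
```

Because the design effect grows with the ICC, any dependence of the clustering coefficient on prevalence carries straight through to the required number of clusters.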

Comparison of empirical study power in sample size calculation approaches for cluster randomized trials with varying cluster sizes – a continuous outcome endpoint

Open Access Medical Statistics ◽

10.2147/oams.s96508 ◽

2016 ◽

pp. 1

Author(s):

Mavuto Mukaka ◽

Lawrence Moulton

Keyword(s):

Empirical Study ◽

Sample Size ◽

Randomized Trials ◽

Sample Size Calculation ◽

Cluster Randomized Trials ◽

Continuous Outcome ◽

Study Power ◽

Cluster Randomized

Statistical approaches to computing sample size in cluster randomized trials: a simulation study

10.21203/rs.2.18978/v1 ◽

2019 ◽

Author(s):

Ashutosh Ranjan ◽

Guangzi Song ◽

Christopher S Coffey ◽

Leslie A McClure

Keyword(s):

Sample Size ◽

Simulation Study ◽

Cluster Size ◽

Randomized Trials ◽

Sample Size Calculation ◽

Minimum Variance ◽

Type I ◽

Cluster Randomized Trials ◽

Weighted Analysis ◽

Cluster Randomized

Abstract Background: Cluster randomized trials, which randomize groups of individuals to an intervention, are common in health services research when one wants to evaluate improvement in a subject's outcome by intervening at an organizational level. For many such trials, sample size calculation is performed under the assumption of equal cluster size. Many trials that set out to recruit equal clusters end up with unequal clusters for various reasons. This leads to a misalignment between the method used for sample size calculation and the data analysis, which may affect trial power. Various weighted analysis methods for analyzing cluster means have been suggested to overcome the problem introduced by unbalanced clusters; however, the performance of such methods has not been evaluated extensively. Methods: We examine the use of the general linear model for analysis of cluster randomized trials that assume equal cluster sizes during the planning stage but end up with unequal clusters. We demonstrate the performance of three approaches using different weights for analyzing the cluster means: (1) the standard analysis of cluster means, (2) weighting by cluster size, and (3) minimum variance weights. Several distributions are used to generate cluster sizes to cover a wide range of patterns of imbalance. The variability in cluster size is measured by the coefficient of variation (CV). By means of a simulation study, we assess the impact of each of the three analysis methods on the type I error and power of the study, and how that impact is affected by the variability in cluster size. Results: Analyses that assume equal clusters provide a reasonable approximation when cluster sizes vary minimally (CV < 0.30). In analyses weighted by cluster size, type I errors were inflated, and the inflation worsened as the variation in cluster size increased. However, a minimum variance weighted analysis best maintained target power and level of significance under all degrees of imbalance considered. Conclusion: The unweighted analysis works well as an approximate method when the variation in cluster size is minimal. However, using minimum variance weights performs much better across the full range of variation in cluster size and is recommended.
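The three weighting schemes named in the Methods can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes the commonly used minimum variance weight m / (1 + (m - 1) × ICC), which is proportional to the inverse variance of a cluster mean under an exchangeable correlation structure, and the function names and the `scheme` labels are hypothetical.

```python
def cluster_mean_weights(sizes, icc, scheme):
    """Return normalized weights for cluster means under one of three schemes.

    scheme: "unweighted", "size", or "min_variance" (hypothetical labels).
    The minimum variance form m / (1 + (m - 1) * icc) is an assumption here.
    """
    if scheme == "unweighted":
        raw = [1.0 for _ in sizes]
    elif scheme == "size":
        raw = [float(m) for m in sizes]
    elif scheme == "min_variance":
        raw = [m / (1 + (m - 1) * icc) for m in sizes]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(raw)
    return [w / total for w in raw]


def weighted_arm_mean(cluster_means, sizes, icc, scheme):
    """Weighted average of the cluster means in one trial arm."""
    weights = cluster_mean_weights(sizes, icc, scheme)
    return sum(w * y for w, y in zip(weights, cluster_means))


if __name__ == "__main__":
    sizes = [10, 30, 50]
    for scheme in ("unweighted", "size", "min_variance"):
        print(scheme, cluster_mean_weights(sizes, icc=0.05, scheme=scheme))
```

Under this form, the minimum variance weights reduce to weighting by cluster size when the ICC is 0 and to the unweighted analysis when the ICC is 1, so they sit between the other two schemes for intermediate values.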

Simple sample size calculation for cluster-randomized trials

International Journal of Epidemiology ◽

10.1093/ije/28.2.319 ◽

1999 ◽

Vol 28 (2) ◽

pp. 319-326 ◽

Cited By ~ 507

Author(s):

R. J. Hayes ◽

S. Bennett

Keyword(s):

Sample Size ◽

Randomized Trials ◽

Sample Size Calculation ◽

Cluster Randomized Trials ◽

Cluster Randomized

Sample size calculation in three‐level cluster randomized trials using generalized estimating equation models

Statistics in Medicine ◽

10.1002/sim.8670 ◽

2020 ◽

Vol 39 (24) ◽

pp. 3347-3372

Author(s):

Jingxia Liu ◽

Graham A. Colditz

Keyword(s):

Sample Size ◽

Randomized Trials ◽

Sample Size Calculation ◽

Estimating Equation ◽

Generalized Estimating Equation ◽

Cluster Randomized Trials ◽

Cluster Randomized ◽

Level Cluster ◽

Generalized Estimating

Statistical approaches to computing sample size in cluster randomized trials: a simulation study

10.21203/rs.2.18978/v2 ◽

2020 ◽

Author(s):

Ashutosh Ranjan ◽

Guangzi Song ◽

Christopher S Coffey ◽

Leslie A McClure

Keyword(s):

Sample Size ◽

Cluster Size ◽

Randomized Trials ◽

Sample Size Calculation ◽

Minimum Variance ◽

Type I ◽

Cluster Randomized Trials ◽

Planning Stage ◽

Cluster Randomized ◽

The Impact

Abstract Background: Cluster randomized trials, which randomize groups of individuals to an intervention, are common in health services research when one wants to evaluate improvement in a subject's outcome by intervening at an organizational level. For many such trials, sample size calculation is performed under the assumption of equal cluster size. For a variety of reasons, many trials that set out to recruit clusters of the same size end up with unequal clusters. This leads to a misalignment between the method used for sample size calculation and the data analysis, which may affect trial power. Various weighted analysis methods for analyzing cluster means have been suggested to overcome the problem introduced by unbalanced clusters; however, the performance of such methods has not been evaluated extensively. Methods: We examine the use of the general linear model for analysis of cluster randomized trials that assume equal cluster sizes during the planning stage, but for which the realized cluster sizes are unequal. We demonstrate the performance of three approaches using different weights for analyzing the cluster means: (1) the standard analysis of cluster means, (2) weighting by cluster size, and (3) minimum variance weights. Several distributions are used to generate cluster sizes to assess a range of patterns of imbalance. The variability in cluster size is measured by the coefficient of variation (CV). Using simulations, we assess the impact of each of the three methods of analysis on the type I error and power of the study, and how each is affected by the variability in cluster size. Results: Analyses that assume equal clusters provide a reasonable approximation when cluster sizes vary minimally (CV < 0.30). For analyses weighted by cluster size, type I errors were inflated, and the inflation worsened as the variation in cluster size increased, despite reasonable power. However, minimum variance weighted analyses best maintained target power and level of significance under the scenarios considered. Conclusion: Unweighted analyses work well as an approximate method when variation in cluster size is minimal. However, using minimum variance weights performs much better across the full range of variation in cluster size and is recommended.
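To make the CV threshold concrete, the sketch below computes the coefficient of variation of a set of cluster sizes and plugs it into a commonly cited approximate design effect for unequal clusters, 1 + ((CV² + 1) × m̄ - 1) × ICC, where m̄ is the mean cluster size. This approximation does not come from the abstract above; it is included only as an assumption to show how cluster size variability can enter planning. The function names are hypothetical, and the population standard deviation is used for the CV as a simplifying choice.

```python
from statistics import mean, pstdev


def cluster_size_cv(sizes):
    """Coefficient of variation of cluster sizes (population SD / mean)."""
    return pstdev(sizes) / mean(sizes)


def design_effect_unequal(sizes, icc):
    """Approximate design effect allowing for unequal cluster sizes.

    Assumed form: 1 + ((cv**2 + 1) * mean_size - 1) * icc.
    With cv = 0 this reduces to the usual 1 + (m - 1) * icc.
    """
    mean_size = mean(sizes)
    cv = cluster_size_cv(sizes)
    return 1 + ((cv ** 2 + 1) * mean_size - 1) * icc


if __name__ == "__main__":
    planned = [20, 20, 20, 20]
    realized = [8, 15, 22, 35]
    print(cluster_size_cv(realized))
    print(design_effect_unequal(planned, icc=0.05))
    print(design_effect_unequal(realized, icc=0.05))
```

Comparing the planned and realized design effects gives a rough sense of how much an equal-size calculation understates the required sample when the realized CV is large.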
