Comparing completely and stratified randomized designs in cluster randomized trials when the stratifying factor is cluster size: a simulation study

Abstract. Background: Cluster randomized trials, which randomize groups of individuals to an intervention, are common in health services research when one wants to evaluate improvement in a subject's outcome by intervening at an organizational level. For many such trials, sample size calculation is performed under the assumption of equal cluster size. Many trials that set out to recruit equal clusters end up with unequal clusters for various reasons. This leads to a misalignment between the method used for sample size calculation and the data analysis, which may affect trial power. Various weighted analysis methods for analyzing cluster means have been suggested to overcome the problem introduced by unbalanced clusters; however, the performance of such methods has not been evaluated extensively. Methods: We examine the use of the general linear model for analysis of cluster randomized trials that assume equal cluster sizes during the planning stage but end up with unequal clusters. We demonstrate the performance of three approaches using different weights for analyzing the cluster means: (1) the standard analysis of cluster means, (2) weighting by cluster size, and (3) minimum variance weights. Several distributions are used to generate cluster sizes to cover a wide range of patterns of imbalance. The variability in cluster size is measured by the coefficient of variation (CV). By means of a simulation study, we assess the impact of using each of the three analysis methods on the type I error and power of the study, and how that impact is affected by the variability in cluster size. Results: Analyses that assume equal clusters provide a reasonable approximation when cluster sizes vary minimally (CV < 0.30). In an analysis weighted by cluster size, type I errors were inflated, and the inflation worsened as the variation in cluster size increased. However, a minimum variance weighted analysis best maintains target power and level of significance under all degrees of imbalance considered. Conclusion: The unweighted analysis works well as an approximate method when the variation in cluster size is minimal. However, using minimum variance weights performs much better across the full range of variation in cluster size and is recommended.
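
The three weighting schemes named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not give formulas, so the minimum variance weight is assumed here to take the standard inverse-variance form m / (1 + (m - 1) * icc) for a cluster of size m, and the CV is assumed to use the population standard deviation. All function names are hypothetical.

```python
from statistics import mean, pstdev


def cluster_size_cv(sizes):
    """Coefficient of variation of cluster sizes (population SD / mean; an assumption)."""
    return pstdev(sizes) / mean(sizes)


def cluster_weights(sizes, icc, scheme):
    """Return one weight per cluster for the chosen analysis of cluster means."""
    if scheme == "unweighted":
        # Standard analysis: every cluster mean counts equally.
        return [1.0 for _ in sizes]
    if scheme == "cluster_size":
        # Weight each cluster mean by the number of individuals in the cluster.
        return [float(m) for m in sizes]
    if scheme == "min_variance":
        # Assumed form: inverse of Var(cluster mean), which is proportional to
        # (1 + (m - 1) * icc) / m under a common intraclass correlation.
        return [m / (1.0 + (m - 1) * icc) for m in sizes]
    raise ValueError(f"unknown scheme: {scheme}")


def weighted_mean(cluster_means, weights):
    """Weighted average of cluster means for one trial arm."""
    return sum(w * y for w, y in zip(weights, cluster_means)) / sum(weights)


if __name__ == "__main__":
    sizes = [5, 12, 40]            # hypothetical cluster sizes in one arm
    means = [0.42, 0.55, 0.61]     # hypothetical cluster-level outcome means
    print("CV of cluster size:", cluster_size_cv(sizes))
    for scheme in ("unweighted", "cluster_size", "min_variance"):
        w = cluster_weights(sizes, icc=0.05, scheme=scheme)
        print(scheme, weighted_mean(means, w))
```

Under this assumed form, the minimum variance weights reduce to cluster-size weights when the intraclass correlation is 0 and to equal weights when it is 1, which is one way to see why they sit between the other two schemes.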

Sample size estimation for modified Poisson analysis of cluster randomized trials with a binary outcome

Statistical Methods in Medical Research, 2021, pp. 096228022199041, DOI 10.1177/0962280221990415

Author(s): Fan Li, Guangyu Tong

Keyword(s): Relative Risk, Sample Size, Poisson Regression, Cluster Size, Randomized Trials, Cluster Randomized Trials, Binomial Regression, Size Variability, Working Correlation, Cluster Randomized

The modified Poisson regression coupled with a robust sandwich variance has become a viable alternative to log-binomial regression for estimating the marginal relative risk in cluster randomized trials. However, a corresponding sample size formula for relative risk regression via the modified Poisson model is currently not available for cluster randomized trials. Through analytical derivations, we show that there is no loss of asymptotic efficiency for estimating the marginal relative risk via the modified Poisson regression relative to the log-binomial regression. This finding holds both under the independence working correlation and under the exchangeable working correlation provided a simple modification is used to obtain the consistent intraclass correlation coefficient estimate. Therefore, the sample size formulas developed for log-binomial regression naturally apply to the modified Poisson regression in cluster randomized trials. We further extend the sample size formulas to accommodate variable cluster sizes. An extensive Monte Carlo simulation study is carried out to validate the proposed formulas. We find that the proposed formulas have satisfactory performance across a range of cluster size variability, as long as suitable finite-sample corrections are applied to the sandwich variance estimator and the number of clusters is at least 10. Our findings also suggest that the sample size estimate under the exchangeable working correlation is more robust to cluster size variability, and recommend the use of an exchangeable working correlation over an independence working correlation for both design and analysis. The proposed sample size formulas are illustrated using the Stop Colorectal Cancer (STOP CRC) trial.
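
To make the role of cluster size variability concrete, the sketch below computes a number of clusters per arm for a relative risk comparison. It is not the formula derived in the paper, which the abstract does not state. It assumes a delta-method variance for the log relative risk under individual randomization, inflated by a widely used approximate design effect, 1 + ((CV^2 + 1) * mean size - 1) * ICC. It applies no finite-sample correction. Function names and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def design_effect(mean_size, cv, icc):
    """Approximate variance inflation with variable cluster sizes (an assumed form)."""
    return 1.0 + ((cv ** 2 + 1.0) * mean_size - 1.0) * icc


def clusters_per_arm(p0, p1, mean_size, cv, icc, alpha=0.05, power=0.80):
    """Clusters per arm to detect relative risk p1 / p0 with a two-sided test."""
    normal = NormalDist()
    z = normal.inv_cdf(1.0 - alpha / 2.0) + normal.inv_cdf(power)
    # Delta-method variance of log(RR) per individual in each arm.
    unit_variance = (1.0 - p0) / p0 + (1.0 - p1) / p1
    # Individuals per arm if individuals were randomized independently.
    n_individual = z ** 2 * unit_variance / math.log(p1 / p0) ** 2
    # Inflate for clustering, then convert individuals to clusters.
    return math.ceil(n_individual * design_effect(mean_size, cv, icc) / mean_size)


if __name__ == "__main__":
    # Hypothetical design inputs, not taken from any trial.
    for cv in (0.0, 0.4, 0.8):
        k = clusters_per_arm(p0=0.30, p1=0.20, mean_size=50, cv=cv, icc=0.03)
        print(f"CV = {cv}: {k} clusters per arm")
```

With CV set to 0 the design effect reduces to the familiar 1 + (m - 1) * ICC for equal cluster sizes, so the loop shows how the required number of clusters grows as cluster sizes become more variable.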

What are the statistical implications of treatment non‐compliance in cluster randomized trials: A simulation study

Statistics in Medicine, 2019, Vol 38 (26), pp. 5071-5084, DOI 10.1002/sim.8351, Cited by ~1

Author(s): Mirjam Moerbeek, Sander van Schie

Keyword(s): Simulation Study, Randomized Trials, Cluster Randomized Trials, Cluster Randomized

Generalized estimating equations in cluster randomized trials with a small number of clusters: Review of practice and simulation study

Clinical Trials, 2016, Vol 13 (4), pp. 445-449, DOI 10.1177/1740774516643498, Cited by ~26

Author(s): Shuang Huang, Mallorie H Fiero, Melanie L Bell

Keyword(s): Simulation Study, Generalized Estimating Equations, Randomized Trials, Estimating Equations, Cluster Randomized Trials, Number Of Clusters, Cluster Randomized, Generalized Estimating

How large are the consequences of covariate imbalance in cluster randomized trials: a simulation study with a continuous outcome and a binary covariate at the cluster level

BMC Medical Research Methodology, 2016, Vol 16 (1), DOI 10.1186/s12874-016-0182-7, Cited by ~4

Author(s): Mirjam Moerbeek, Sander van Schie

Keyword(s): Simulation Study, Randomized Trials, Cluster Randomized Trials, Continuous Outcome, Cluster Randomized, Cluster Level

The Effect of Cluster Size Variability on Statistical Power in Cluster-Randomized Trials

PLoS ONE, 2015, Vol 10 (4), pp. e0119074, DOI 10.1371/journal.pone.0119074, Cited by ~7

Author(s): Stephen A. Lauer, Ken P. Kleinman, Nicholas G. Reich

Keyword(s): Cluster Size, Statistical Power, Randomized Trials, Cluster Randomized Trials, Size Variability, Cluster Randomized

Propensity scores used for analysis of cluster randomized trials with selection bias: a simulation study

Statistics in Medicine, 2013, Vol 32 (19), pp. 3357-3372, DOI 10.1002/sim.5795, Cited by ~10

Author(s): C. Leyrat, A. Caille, A. Donner, B. Giraudeau

Keyword(s): Selection Bias, Simulation Study, Randomized Trials, Propensity Scores, Cluster Randomized Trials, Cluster Randomized

Is the R coefficient of interest in cluster randomized trials with a binary outcome?

Statistical Methods in Medical Research, 2020, Vol 29 (9), pp. 2470-2480, DOI 10.1177/0962280219900200

Author(s): Ariane M Mbekwe Yepnang, Agnès Caille, Sandra M Eldridge, Bruno Giraudeau

Keyword(s): Sample Size, Simulation Study, Randomized Trials, Intraclass Correlation, Sample Size Calculation, Sample Size Determination, Binary Outcomes, Cluster Randomized Trials, Empirical Power, Cluster Randomized

In cluster randomized trials, the intraclass correlation coefficient (ICC) is classically used to measure clustering. When the outcome is binary, the ICC is known to be associated with the prevalence of the outcome. This association challenges its interpretation and can be problematic for sample size calculation. To overcome these situations, Crespi et al. extended a coefficient named R, initially proposed by Rosner for ophthalmologic data, to cluster randomized trials. Crespi et al. asserted that R may be less influenced by the outcome prevalence than is the ICC, although the authors provided only empirical data to support their assertion. They also asserted that “the traditional ICC approach to sample size determination tends to overpower studies under many scenarios, calling for more clusters than truly required”, although they did not consider empirical power. The aim of this study was to investigate whether R could indeed be considered independent of the outcome prevalence. We also considered whether sample size calculation should be better based on the R coefficient or the ICC. Considering the particular case of 2 individuals per cluster, we theoretically demonstrated that R is not symmetrical around the 0.5 prevalence value. This in itself demonstrates the dependence of R on prevalence. We also conducted a simulation study to explore the case of both fixed and variable cluster sizes greater than 2. This simulation study demonstrated that R decreases when prevalence increases from 0 to 1. Both the analytical and simulation results demonstrate that R depends on the outcome prevalence. In terms of sample size calculation, we showed that an approach based on the ICC is preferable to an approach based on the R coefficient because with the former, the empirical power is closer to the nominal one. Hence, the R coefficient does not outperform the ICC for binary outcomes because it does not offer any advantage over the ICC.

Sample Size Issues for Cluster Randomized Trials With Discrete-Time Survival Endpoints

Methodology, 2012, Vol 8 (4), pp. 146-158, DOI 10.1027/1614-2241/a000047, Cited by ~4

Author(s): Mirjam Moerbeek

Keyword(s): Discrete Time, Cluster Size, Randomized Trials, Cluster Randomized Trial, Smoking Initiation, Fixed Number, Cluster Randomized Trials, Number Of Clusters, Sufficient Power, Cluster Randomized

In cluster randomized trials, complete groups of subjects are randomized to treatment conditions. An important question might be whether and when the subjects experience a particular event, such as smoking initiation or recovery from disease. In the social sciences, the timing of such events is often measured in discrete time by using time intervals. At the planning phase of a cluster randomized trial, one should decide on the number of clusters and the cluster size such that parameters are estimated accurately and sufficient power for the test of the treatment effect is achieved. On the basis of a simulation study, it is concluded that regression coefficients are estimated more accurately than the variance of the random cluster effect. In addition, it is shown that power increases with cluster size and number of clusters, and that sufficient power cannot always be achieved by using larger cluster sizes at a fixed number of clusters.
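
The last finding has a simple analogue for a continuous outcome, which may help intuition. The following is an illustration under assumed notation, not the discrete-time survival model studied in the paper: k clusters per arm, m subjects per cluster, total outcome variance \sigma^2, and intraclass correlation \rho.

```latex
\operatorname{Var}(\hat{\delta})
  = \frac{2\sigma^{2}}{k\,m}\,\bigl[\,1 + (m-1)\rho\,\bigr]
  \;\xrightarrow[\;m \to \infty\;]{}\;
  \frac{2\sigma^{2}\rho}{k}
```

With k fixed, the variance of the estimated treatment effect cannot fall below 2\sigma^{2}\rho / k however large the clusters become, so power levels off below 1 whenever \rho > 0. Adding clusters lowers that floor; enlarging clusters does not.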
