A review of methodology for sample size calculations in cluster randomised trials

Background: Sample size calculations for longitudinal cluster randomised trials, such as crossover and stepped-wedge trials, require estimates of the assumed correlation structure. This includes both within-period intra-cluster correlations, which importantly differ from conventional intra-cluster correlations by their dependence on period, and also cluster autocorrelation coefficients to model correlation decay. There are limited resources to inform these estimates. In this article, we provide a repository of correlation estimates from a bank of real-world clustered datasets. These are provided under several assumed correlation structures, namely exchangeable, block-exchangeable and discrete-time decay correlation structures. Methods: Longitudinal studies with clustered outcomes were collected to form the CLustered OUtcome Dataset bank. Forty-four available continuous outcomes from 29 datasets were obtained and analysed using each correlation structure. Patterns of within-period intra-cluster correlation coefficients and cluster autocorrelation coefficients were explored by study characteristics. Results: The median within-period intra-cluster correlation coefficient for the discrete-time decay model was 0.05 (interquartile range: 0.02–0.09) with a median cluster autocorrelation of 0.73 (interquartile range: 0.19–0.91). The within-period intra-cluster correlation coefficients were similar for the exchangeable, block-exchangeable and discrete-time decay correlation structures. Within-period intra-cluster correlation coefficients and cluster autocorrelations were found to vary with the number of participants per cluster-period, the period-length, type of cluster (primary care, secondary care, community or school) and country income status (high-income country or low- and middle-income country). The within-period intra-cluster correlation coefficients tended to decrease with increasing period-length and slightly decrease with increasing cluster-period sizes, while the cluster autocorrelations tended to move closer to 1 with increasing cluster-period size. Using the CLustered OUtcome Dataset bank, an RShiny app has been developed for determining plausible values of correlation coefficients for use in sample size calculations. Discussion: This study provides a repository of intra-cluster correlations and cluster autocorrelations for longitudinal cluster trials. This can help inform sample size calculations for future longitudinal cluster randomised trials.
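
The three correlation structures named in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code or the RShiny app. It assumes the commonly used definitions: under the exchangeable structure two participants in the same cluster share one correlation regardless of period; under the block-exchangeable structure the correlation across different periods is the within-period intra-cluster correlation multiplied by a constant cluster autocorrelation; under discrete-time decay it is multiplied by the cluster autocorrelation raised to the number of periods apart. The function names are hypothetical, and the inputs 0.05 and 0.73 are the medians reported for the discrete-time decay model.

```python
def between_individual_correlation(structure, wp_icc, cac, s, t):
    """Assumed correlation between two different participants in the same
    cluster, one measured in period s and the other in period t.

    wp_icc: within-period intra-cluster correlation
    cac:    cluster autocorrelation (ignored for "exchangeable")
    """
    lag = abs(s - t)
    if structure == "exchangeable":
        # Same correlation whatever the periods.
        return wp_icc
    if structure == "block-exchangeable":
        # One value within a period, one (smaller) value across periods.
        return wp_icc if lag == 0 else wp_icc * cac
    if structure == "discrete-time decay":
        # Correlation decays geometrically with the number of periods apart.
        return wp_icc * cac ** lag
    raise ValueError(f"unknown correlation structure: {structure}")


def correlation_table(structure, wp_icc, cac, periods):
    """Period-by-period table of between-individual correlations."""
    return [
        [between_individual_correlation(structure, wp_icc, cac, s, t)
         for t in range(periods)]
        for s in range(periods)
    ]


if __name__ == "__main__":
    # Medians reported in the abstract for the discrete-time decay model.
    for name in ("exchangeable", "block-exchangeable", "discrete-time decay"):
        print(name)
        for row in correlation_table(name, wp_icc=0.05, cac=0.73, periods=4):
            print("  ", [round(value, 4) for value in row])
```

Comparing the printed tables shows why the choice of structure matters for a sample size calculation: with many periods, the decay structure implies much weaker correlation between distant periods than the block-exchangeable structure does for the same two inputs.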

Sample size calculations for stepped wedge and cluster randomised trials: a unified approach

Journal of Clinical Epidemiology ◽ 10.1016/j.jclinepi.2015.08.015 ◽ 2016 ◽ Vol 69 ◽ pp. 137-146 ◽ Cited By ~ 63

Author(s): Karla Hemming ◽ Monica Taljaard

Keyword(s): Sample Size ◽ Randomised Trials ◽ Stepped Wedge ◽ Unified Approach ◽ Cluster Randomised Trials ◽ Cluster Randomised ◽ Sample Size Calculations

Inadequate reporting of sample size calculations in cluster randomised trials: a review

Trials ◽ 10.1186/1745-6215-14-s1-p122 ◽ 2013 ◽ Vol 14 (S1)

Author(s): Clare Rutterford ◽ Monica Taljaard ◽ Stephanie Dixon ◽ Andrew Copas ◽ Sandra Eldridge

Keyword(s): Sample Size ◽ Randomised Trials ◽ Cluster Randomised Trials ◽ Cluster Randomised ◽ Sample Size Calculations

Sample Size Calculations for Cluster Randomised Trials

Journal of Health Services Research & Policy ◽ 10.1177/135581960000500105 ◽ 2000 ◽ Vol 5 (1) ◽ pp. 12-16 ◽ Cited By ~ 198

Author(s): Marion Campbell ◽ Jeremy Grimshaw ◽ Nick Steen

Keyword(s): Sample Size ◽ Randomised Trials ◽ Cluster Randomised Trials ◽ Cluster Randomised ◽ Sample Size Calculations

Re-estimating sample size in cluster randomised trials with active recruitment within clusters

Statistics in Medicine ◽ 10.1002/sim.6172 ◽ 2014 ◽ Vol 33 (19) ◽ pp. 3253-3268 ◽ Cited By ~ 8

Author(s): S. van Schie ◽ M. Moerbeek

Keyword(s): Sample Size ◽ Randomised Trials ◽ Cluster Randomised Trials ◽ Cluster Randomised

Quality of reporting of pilot and feasibility cluster randomised trials: a systematic review

BMJ Open ◽ 10.1136/bmjopen-2017-016970 ◽ 2017 ◽ Vol 7 (11) ◽ pp. e016970 ◽ Cited By ~ 4

Author(s): Claire L Chan ◽ Clémence Leyrat ◽ Sandra M Eldridge

Keyword(s): Hypothesis Testing ◽ Sample Size ◽ Randomised Trials ◽ Quality Of Reporting ◽ Cluster Randomised Trials ◽ Cluster Randomised ◽ Pilot Trials ◽ Formal Hypothesis Testing ◽ Cluster Design

Objectives: To systematically review the quality of reporting of pilot and feasibility cluster randomised trials (CRTs). In particular, to assess (1) the number of pilot CRTs conducted between 1 January 2011 and 31 December 2014, (2) whether objectives and methods are appropriate and (3) reporting quality. Methods: We searched PubMed (2011–2014) for CRTs with ‘pilot’ or ‘feasibility’ in the title or abstract that were assessing some element of feasibility and showing evidence the study was in preparation for a main effectiveness/efficacy trial. Quality assessment criteria were based on the Consolidated Standards of Reporting Trials (CONSORT) extensions for pilot trials and CRTs. Results: Eighteen pilot CRTs were identified. Forty-four per cent did not have feasibility as their primary objective, and many (50%) performed formal hypothesis testing for effectiveness/efficacy despite being underpowered. Most (83%) included ‘pilot’ or ‘feasibility’ in the title, and discussed implications for progression from the pilot to the future definitive trial (89%), but fewer reported reasons for the randomised pilot trial (39%), sample size rationale (44%) or progression criteria (17%). Most defined the cluster (100%), and number of clusters randomised (94%), but few reported how the cluster design affected sample size (17%), whether consent was sought from clusters (11%), or who enrolled clusters (17%). Conclusions: That only 18 pilot CRTs were identified necessitates increased awareness of the importance of conducting and publishing pilot CRTs and improved reporting. Pilot CRTs should primarily be assessing feasibility, avoiding formal hypothesis testing for effectiveness/efficacy and reporting reasons for the pilot, sample size rationale and progression criteria, as well as enrolment of clusters, and how the cluster design affects design aspects. We recommend adherence to the CONSORT extensions for pilot trials and CRTs.

Systematic review finds major deficiencies in sample size methodology and reporting for stepped-wedge cluster randomised trials

BMJ Open ◽ 10.1136/bmjopen-2015-010166 ◽ 2016 ◽ Vol 6 (2) ◽ pp. e010166 ◽ Cited By ~ 35

Author(s): James Martin ◽ Monica Taljaard ◽ Alan Girling ◽ Karla Hemming

Keyword(s): Systematic Review ◽ Sample Size ◽ Randomised Trials ◽ Stepped Wedge ◽ Cluster Randomised Trials ◽ Cluster Randomised

Cluster randomised trials with different numbers of measurements at baseline and endline: Sample size and optimal allocation

Clinical Trials ◽ 10.1177/1740774519873888 ◽ 2019 ◽ Vol 17 (1) ◽ pp. 69-76

Author(s): Andrew J Copas ◽ Richard Hooper

Keyword(s): Sample Size ◽ Cluster Size ◽ Randomised Trial ◽ Sample Size Calculation ◽ Baseline Data ◽ Cluster Randomised Trial ◽ Randomised Trials ◽ Cluster Randomised Trials ◽ Cluster Randomised ◽ The Impact

Background/Aims: Published methods for sample size calculation for cluster randomised trials with baseline data are inflexible and primarily assume an equal amount of data collected at baseline and endline, that is, before and after the intervention has been implemented in some clusters. We extend these methods to any amount of baseline and endline data. We explain how to explore sample size for a trial if some baseline data from the trial clusters have already been collected as part of a separate study. Where such data aren’t available, we show how to choose the proportion of data collection devoted to the baseline within the trial, when a particular cluster size or range of cluster sizes is proposed. Methods: We provide a design effect given the cluster size and correlation parameters, assuming different participants are assessed at baseline and endline in the same clusters. We show how to produce plots to identify the impact of varying the amount of baseline data accounting for the inevitable uncertainty in the cluster autocorrelation. We illustrate the methodology using an example trial. Results: Baseline data provide more power, or allow a greater reduction in trial size, with greater values of the cluster size, intracluster correlation and cluster autocorrelation. Conclusion: Investigators should think carefully before collecting baseline data in a cluster randomised trial if this is at the expense of endline data. In some scenarios, this will increase the sample size required to achieve given power and precision.

Sample size in cluster randomised trials with unequal clusters

Trials ◽ 10.1186/1745-6215-12-s1-a25 ◽ 2011 ◽ Vol 12 (S1)

Author(s): Ivana Holloway ◽ Amanda Farrin

Keyword(s): Sample Size ◽ Randomised Trials ◽ Cluster Randomised Trials ◽ Cluster Randomised

Interim data monitoring in cluster randomised trials: Practical issues and a case study

Clinical Trials ◽ 10.1177/17407745211024751 ◽ 2021 ◽ pp. 174077452110247

Author(s): K Hemming ◽ J Martin ◽ I Gallos ◽ A Coomarasamy ◽ L Middleton

Keyword(s): Sample Size ◽ Selection Bias ◽ Statistical Testing ◽ Data Monitoring ◽ Randomised Trials ◽ Interim Monitoring ◽ Cluster Randomised Trials ◽ Cluster Randomised ◽ Post Randomisation

Background: There is an abundance of guidance for the interim monitoring of individually randomised trials. While methodological literature exists on how to extend these methods to cluster randomised trials, there is little guidance on practical implementation. Cluster trials have many features which make their monitoring needs different. We outline the methodological and practical challenges of interim monitoring of cluster trials and apply these considerations to a case study. Case study: The E-MOTIVE study is an 80-cluster randomised trial of a bundle of interventions to treat postpartum haemorrhage. The proposed data monitoring plan includes (1) monitoring sample size assumptions, (2) monitoring for evidence of selection bias, and (3) an interim assessment of the primary outcome, as well as monitoring data completeness. The timing of the sample size monitoring is chosen both with consideration of statistical precision and to allow time to recruit more clusters. Monitoring for selection bias involves comparing individual-level characteristics and numbers recruited between study arms to identify any post-randomisation participant identification bias. An interim analysis of outcomes presented with 99.9% confidence intervals using the Haybittle–Peto approach should mitigate any concern regarding the inflation of type-I error. The pragmatic nature of the trial means monitoring for adherence is not relevant, as it is built into a process evaluation. Conclusions: The interim analyses of cluster trials have a number of important differences to monitoring individually randomised trials. In cluster trials, there will often be a greater need to monitor nuisance parameters, yet there will often be considerable uncertainty in their estimation. This means the utility of sample size re-estimation can be questionable, particularly when there are practical or funding difficulties associated with making any changes to planned sample sizes. Perhaps most importantly, interim monitoring has the potential to identify selection bias, particularly in trials with post-randomisation identification or recruitment. Finally, the pragmatic nature of cluster trials might mean that methods to allow for interim monitoring of outcomes based on statistical testing, or monitoring for adherence to study interventions, are less relevant. Our intention is to facilitate the planning of future cluster randomised trials and to promote discussion and debate to improve monitoring of these studies.
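
The 99.9% interim confidence interval mentioned in the case study can be made concrete with a short Python sketch. It is an illustration only, not the E-MOTIVE analysis plan: it assumes a normal approximation, the function names are hypothetical, and the estimate and standard error in the usage lines are made-up numbers. In a cluster trial the standard error would need to come from an analysis that accounts for clustering.

```python
from statistics import NormalDist


def two_sided_critical_value(confidence_level):
    """Standard normal critical value for a two-sided confidence interval."""
    alpha = 1.0 - confidence_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def confidence_interval(estimate, standard_error, confidence_level):
    """Normal-approximation interval: estimate +/- z * standard_error."""
    z = two_sided_critical_value(confidence_level)
    return estimate - z * standard_error, estimate + z * standard_error


if __name__ == "__main__":
    # Hypothetical interim estimate and cluster-adjusted standard error.
    estimate, standard_error = -0.20, 0.10
    for level in (0.95, 0.999):
        low, high = confidence_interval(estimate, standard_error, level)
        print(f"{level:.1%} interval: ({low:.3f}, {high:.3f})")
```

The 99.9% interval is considerably wider than the conventional 95% interval, which is what makes an interim look of this kind conservative with respect to type-I error.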
