Comparing the performance of different multiple imputation strategies for missing binary outcomes in cluster randomized trials: a simulation study

In cluster randomized trials, the intraclass correlation coefficient (ICC) is classically used to measure clustering. When the outcome is binary, the ICC is known to be associated with the prevalence of the outcome. This association complicates its interpretation and can be problematic for sample size calculation. To address these problems, Crespi et al. extended a coefficient named R, initially proposed by Rosner for ophthalmologic data, to cluster randomized trials. Crespi et al. asserted that R may be less influenced by the outcome prevalence than is the ICC, although they provided only empirical data to support this assertion. They also asserted that “the traditional ICC approach to sample size determination tends to overpower studies under many scenarios, calling for more clusters than truly required”, although they did not consider empirical power. The aim of this study was to investigate whether R could indeed be considered independent of the outcome prevalence. We also considered whether sample size calculation is better based on the R coefficient or on the ICC. Considering the particular case of 2 individuals per cluster, we demonstrated theoretically that R is not symmetrical around the 0.5 prevalence value, which in itself demonstrates the dependence of R on prevalence. We also conducted a simulation study to explore both fixed and variable cluster sizes greater than 2. This simulation study demonstrated that R decreases when prevalence increases from 0 to 1. Both the analytical and simulation results demonstrate that R depends on the outcome prevalence. In terms of sample size calculation, we showed that an approach based on the ICC is preferable to an approach based on the R coefficient because, with the former, the empirical power is closer to the nominal one. Hence, the R coefficient offers no advantage over the ICC for binary outcomes.
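As a rough illustration of why holding the within-pair correlation fixed forces R to vary with prevalence, the sketch below makes two assumptions that are not stated in the abstract: that R is the ratio of the probability that two members of the same cluster both have the outcome to the squared prevalence, and that the ICC is the Pearson correlation between the two binary outcomes. Under those assumptions, R = 1 + ICC × (1 − π) / π for prevalence π. The function names are hypothetical, and the sketch does not reproduce the study's derivation or its simulation.

```python
def r_from_icc(icc: float, prevalence: float) -> float:
    """R implied by an ICC and a prevalence for a pair of cluster members.

    ASSUMPTION (not from the source): R = P(both have outcome) / prevalence**2,
    and the ICC is the Pearson correlation of the two binary outcomes.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return 1.0 + icc * (1.0 - prevalence) / prevalence


def icc_from_r(r: float, prevalence: float) -> float:
    """Inverse mapping under the same assumption: ICC implied by R."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return prevalence * (r - 1.0) / (1.0 - prevalence)


if __name__ == "__main__":
    # Hold the ICC fixed and vary prevalence symmetrically around 0.5.
    for prevalence in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(prevalence, round(r_from_icc(0.05, prevalence), 4))
```

With the ICC held fixed and positive, the implied R falls as prevalence rises and takes different values at prevalences equidistant from 0.5, which is consistent in direction with the findings summarized above.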

A simulation study of odds ratio estimation for binary outcomes from cluster randomized trials

Statistics in Medicine ◽ 10.1002/sim.2769 ◽ 2006 ◽ Vol 26 (18) ◽ pp. 3415-3428 ◽ Cited By ~ 18

Author(s): Obioha C. Ukoumunne ◽ John B. Carlin ◽ Martin C. Gulliford

Keyword(s): Simulation Study ◽ Odds Ratio ◽ Randomized Trials ◽ Binary Outcomes ◽ Ratio Estimation ◽ Cluster Randomized Trials ◽ Cluster Randomized

Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing binary outcomes: a simulation study

BMC Medical Research Methodology ◽ 10.1186/1471-2288-13-9 ◽ 2013 ◽ Vol 13 (1) ◽ Cited By ~ 15

Author(s): Jinhui Ma ◽ Parminder Raina ◽ Joseph Beyene ◽ Lehana Thabane

Keyword(s): Simulation Study ◽ Randomized Trials ◽ Binary Outcomes ◽ Cluster Randomized Trials ◽ Cluster Randomized

Properties and pitfalls of weighting as an alternative to multilevel multiple imputation in cluster randomized trials with missing binary outcomes under covariate-dependent missingness

Statistical Methods in Medical Research ◽ 10.1177/0962280219859915 ◽ 2019 ◽ Vol 29 (5) ◽ pp. 1338-1353

Author(s): Elizabeth L Turner ◽ Lanqiu Yao ◽ Fan Li ◽ Melanie Prague

Keyword(s): Multiple Imputation ◽ Randomized Trials ◽ Simulated Data ◽ Real Data ◽ Cluster Randomized Trial ◽ Outcome Data ◽ Binary Outcomes ◽ Cluster Randomized Trials ◽ Cluster Randomized ◽ Multilevel Multiple Imputation

The generalized estimating equation (GEE) approach can be used to analyze cluster randomized trial data to obtain population-averaged intervention effects. However, most cluster randomized trials have some missing outcome data and a GEE analysis of available data may be biased when outcome data are not missing completely at random. Although multilevel multiple imputation for GEE (MMI-GEE) has been widely used, alternative approaches such as weighted GEE are less common in practice. Using both simulations and a real data example, we evaluate the performance of inverse probability weighted GEE vs. MMI-GEE for binary outcomes. Simulated data are generated assuming a covariate-dependent missing data pattern across a range of missingness clustering (from none to high), where all covariates are measured at baseline and are fully observed (i.e. a type of missing-at-random mechanism). Two types of weights are estimated and used in the weighted GEE: (1) assuming no clustering of missingness (W-GEE) and (2) accounting for such clustering (CW-GEE). Results show that, even in settings with high missingness clustering, CW-GEE can lead to more bias and lower coverage than W-GEE, whereas W-GEE and MMI-GEE provide comparable results. W-GEE should be considered a viable strategy to account for missing outcomes in cluster randomized trials.
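The weighting idea behind an inverse probability weighted analysis can be sketched as follows. This is a minimal illustration of the weighting step only, not a GEE fit and not the authors' implementation. It assumes a single categorical baseline covariate, estimates the probability of being observed as the observed fraction within each covariate level, and ignores any clustering of missingness (conceptually closer to the W-GEE weights than to the CW-GEE weights). The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def ipw_weights(covariate, observed):
    """Weight 1 / P(observed | covariate level) for observed rows, 0 otherwise.

    ASSUMPTION: missingness depends only on one fully observed categorical
    baseline covariate, and clustering of missingness is ignored.
    """
    n_total = defaultdict(int)
    n_observed = defaultdict(int)
    for level, is_observed in zip(covariate, observed):
        n_total[level] += 1
        n_observed[level] += int(is_observed)
    return [
        n_total[level] / n_observed[level] if is_observed else 0.0
        for level, is_observed in zip(covariate, observed)
    ]


def weighted_prevalence(outcome, weights):
    """Weighted mean of a binary outcome; rows with weight 0 are skipped."""
    numerator = sum(w * y for y, w in zip(outcome, weights) if w > 0)
    return numerator / sum(weights)


if __name__ == "__main__":
    # Toy data: everyone with covariate 0 is observed; half of those with
    # covariate 1 are missing (outcome recorded as None).
    covariate = [0, 0, 0, 0, 1, 1, 1, 1]
    outcome = [0, 0, 0, 1, 1, 1, None, None]
    observed = [y is not None for y in outcome]
    weights = ipw_weights(covariate, observed)
    print(weighted_prevalence(outcome, weights))
```

In this toy data set the available-case prevalence is 3/6 = 0.5, whereas the weighted estimate is 0.625, because observed individuals in the covariate level with more missingness each count twice.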

Impact of unequal cluster sizes for GEE analyses of stepped wedge cluster randomized trials with binary outcomes

Biometrical Journal ◽ 10.1002/bimj.202100112 ◽ 2021

Author(s): Zibo Tian ◽ John S. Preisser ◽ Denise Esserman ◽ Elizabeth L. Turner ◽ Paul J. Rathouz ◽ ...

Keyword(s): Randomized Trials ◽ Binary Outcomes ◽ Cluster Randomized Trials ◽ Stepped Wedge ◽ Cluster Randomized ◽ Unequal Cluster

Handling missing data in cluster randomized trials: A demonstration of multiple imputation with PAN through SAS

The Quantitative Methods for Psychology ◽ 10.20982/tqmp.10.2.p153 ◽ 2014 ◽ Vol 10 (2) ◽ pp. 153-166

Author(s): Jiangxiu Zhou ◽ Lauren E. Connell ◽ John W. Graham

Keyword(s): Missing Data ◽ Multiple Imputation ◽ Randomized Trials ◽ Cluster Randomized Trials ◽ Cluster Randomized

What are the statistical implications of treatment non‐compliance in cluster randomized trials: A simulation study

Statistics in Medicine ◽ 10.1002/sim.8351 ◽ 2019 ◽ Vol 38 (26) ◽ pp. 5071-5084 ◽ Cited By ~ 1

Author(s): Mirjam Moerbeek ◽ Sander van Schie

Keyword(s): Simulation Study ◽ Randomized Trials ◽ Cluster Randomized Trials ◽ Cluster Randomized

A new dependence parameter approach to improve the design of cluster randomized trials with binary outcomes

Clinical Trials ◽ 10.1177/1740774511423851 ◽ 2011 ◽ Vol 8 (6) ◽ pp. 687-698 ◽ Cited By ~ 15

Author(s): Catherine M Crespi ◽ Weng Kee Wong ◽ Sheng Wu

Keyword(s): Sample Size ◽ Randomized Trials ◽ Pearson Correlation ◽ Sample Size Determination ◽ Binary Outcomes ◽ Alternative Measure ◽ Cluster Randomized Trials ◽ Sample Size Calculations ◽ Cluster Randomized ◽ Measure Of Dependence

Background and Purpose: Power and sample size calculations for cluster randomized trials require prediction of the degree of correlation that will be realized among outcomes of participants in the same cluster. This correlation is typically quantified as the intraclass correlation coefficient (ICC), defined as the Pearson correlation between two members of the same cluster or proportion of the total variance attributable to variance between clusters. It is widely known but perhaps not fully appreciated that for binary outcomes, the ICC is a function of outcome prevalence. Hence, the ICC and the outcome prevalence are intrinsically related, making the ICC poorly generalizable across study conditions and between studies with different outcome prevalences.

Methods: We use a simple parametrization of the ICC that aims to isolate that part of the ICC that measures dependence among responses within a cluster from the outcome prevalence. We incorporate this parametrization into sample size calculations for cluster randomized trials and compare our method to the traditional approach using the ICC.

Results: Our dependence parameter, R, may be less influenced by outcome prevalence and has an intuitive meaning that facilitates interpretation. Estimates of R from previous studies can be obtained using simple statistics. Comparison of methods showed that the traditional ICC approach to sample size determination tends to overpower studies under many scenarios, calling for more clusters than truly required.

Limitations: The methods are developed for equal-sized clusters, whereas cluster size may vary in practice.

Conclusions: The dependence parameter R is an alternative measure of dependence among binary outcomes in cluster randomized trials that has a number of advantages over the ICC.
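For readers unfamiliar with the traditional ICC approach to sample size that the abstract refers to, the sketch below shows one common textbook form: the individually randomized sample size for comparing two proportions, inflated by the design effect 1 + (m − 1) × ICC for equal clusters of size m. This is an assumed, generic formulation rather than the calculation used by the authors, and it is not their R-based method. The function name and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def clusters_per_arm(p_control, p_intervention, cluster_size, icc,
                     alpha=0.05, power=0.80):
    """Clusters needed per arm for a two-sided comparison of two proportions.

    ASSUMPTIONS: equal cluster sizes, a normal approximation, and the
    standard design effect 1 + (cluster_size - 1) * icc.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = (p_control * (1 - p_control)
                    + p_intervention * (1 - p_intervention))
    # Individuals per arm if randomization were at the individual level.
    n_individual = ((z_alpha + z_beta) ** 2 * variance_sum
                    / (p_control - p_intervention) ** 2)
    # Inflate for clustering, then convert individuals to whole clusters.
    design_effect = 1 + (cluster_size - 1) * icc
    return ceil(n_individual * design_effect / cluster_size)


if __name__ == "__main__":
    # Hypothetical inputs: 30% vs 20% prevalence, 20 people per cluster.
    for icc in (0.0, 0.01, 0.05, 0.10):
        print(icc, clusters_per_arm(0.30, 0.20, cluster_size=20, icc=icc))
```

Because the ICC enters only through the design effect, any dependence of the ICC on prevalence carries straight through to the required number of clusters, which is the generalizability concern raised in the abstract.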
