Constructing intervals for the intracluster correlation coefficient using Bayesian modelling, and application in cluster randomized trials

2006 ◽  
Vol 25 (9) ◽  
pp. 1443-1456 ◽  
Author(s):  
Rebecca M. Turner ◽  
Rumana Z. Omar ◽  
Simon G. Thompson
2021 ◽  
pp. 174077452110598
Author(s):  
Lee Kennedy-Shaffer ◽  
Michael D Hughes

Background/Aims Generalized estimating equations are commonly used to fit logistic regression models to clustered binary data from cluster randomized trials. A commonly used correlation structure assumes that the intracluster correlation coefficient does not vary by treatment arm or other covariates, but the consequences of this assumption are understudied. We aim to evaluate the effect of allowing variation of the intracluster correlation coefficient by treatment or other covariates on the efficiency of analysis and show how to account for such variation in sample size calculations. Methods We develop formulae for the asymptotic variance of the estimated difference in outcome between treatment arms obtained when the true exchangeable correlation structure depends on the treatment arm and the working correlation structure used in the generalized estimating equations analysis is: (i) correctly specified, (ii) independent, or (iii) exchangeable with no dependence on treatment arm. These formulae require a known distribution of cluster sizes; we also develop simplifications for the case when cluster sizes do not vary and approximations that can be used when the first two moments of the cluster size distribution are known. We then extend the results to settings with adjustment for a second binary cluster-level covariate. We provide formulae to calculate the required sample size for cluster randomized trials using these variances. Results We show that the asymptotic variance of the estimated difference in outcome between treatment arms using these three working correlation structures is the same if all clusters have the same size, and this asymptotic variance is approximately the same when intracluster correlation coefficient values are small. 
We illustrate these results using data from a recent cluster randomized trial for infectious disease prevention in which the clusters are groups of households and modest in size (mean 9.6 individuals), with intracluster correlation coefficient values of 0.078 in the control arm and 0.057 in an intervention arm. In this application, we found a negligible difference between the variances calculated using structures (i) and (iii) and only a small increase (typically [Formula: see text]) for the independent correlation structure (ii), and hence minimal effect on power or sample size requirements. The impact may be larger in other applications if there is greater variation in the ICC between treatment arms or with an additional covariate. Conclusion The common approach of fitting generalized estimating equations with an exchangeable working correlation structure with a common intracluster correlation coefficient across arms likely does not substantially reduce the power or efficiency of the analysis in the setting of a large number of small or modest-sized clusters, even if the intracluster correlation coefficient varies by treatment arm. Our formulae, however, allow formal evaluation of this and may identify situations in which variation in intracluster correlation coefficient by treatment arm or another binary covariate may have a more substantial impact on power and hence sample size requirements.
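
As a rough illustration of why small intracluster correlation coefficients and modest cluster sizes limit the impact on sample size, the sketch below computes the textbook design effect, 1 + (m - 1) × ICC, separately for each arm using the values quoted above (mean cluster size 9.6, ICC 0.078 and 0.057). This is not the authors' variance formula: it assumes equal cluster sizes, and the function name `design_effect` is a hypothetical helper introduced only for this example.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Textbook variance inflation factor for equal-sized clusters.

    Assumption: every cluster has the same size and an exchangeable
    correlation structure; this is NOT the formula from the abstract.
    """
    return 1.0 + (mean_cluster_size - 1.0) * icc


# Values quoted in the abstract above.
MEAN_CLUSTER_SIZE = 9.6
ARM_ICC = {"control": 0.078, "intervention": 0.057}

# Arm-specific design effects under the equal-cluster-size assumption.
arm_design_effects = {
    arm: design_effect(MEAN_CLUSTER_SIZE, icc) for arm, icc in ARM_ICC.items()
}

if __name__ == "__main__":
    for arm, deff in arm_design_effects.items():
        print(f"{arm}: design effect = {deff:.4f}")
```

Under these assumptions the two arms' inflation factors differ only modestly, which is in line with the abstract's finding that allowing the ICC to vary by arm has little effect on power here.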


2018 ◽  
Vol 16 (1) ◽  
pp. 41-51 ◽  
Author(s):  
Philip M Westgate

Background/aims Cluster randomized trials are popular in health-related research due to the need or desire to randomize clusters of subjects to different trial arms as opposed to randomizing each subject individually. As outcomes from subjects within the same cluster tend to be more alike than outcomes from subjects within other clusters, an exchangeable correlation arises that is measured via the intra-cluster correlation coefficient. Intra-cluster correlation coefficient estimation is especially important due to the increasing awareness of the need to publish such values from studies in order to help guide the design of future cluster randomized trials. Therefore, numerous methods have been proposed to accurately estimate the intra-cluster correlation coefficient, with much attention given to binary outcomes. As marginal models are often of interest, we focus on intra-cluster correlation coefficient estimation in the context of fitting such a model with binary outcomes using generalized estimating equations. Traditionally, intra-cluster correlation coefficient estimation with generalized estimating equations has been based on the method of moments, although such estimators can be negatively biased. Furthermore, alternative estimators that work well, such as the analysis of variance estimator, are not as readily applicable in the context of practical data analyses with generalized estimating equations. Therefore, in this article we assess, in terms of bias, the readily available residual pseudo-likelihood approach to intra-cluster correlation coefficient estimation with the GLIMMIX procedure of SAS (SAS Institute, Cary, NC). Furthermore, we study a possible corresponding approach to confidence interval construction for the intra-cluster correlation coefficient. Methods We utilize a simulation study and application example to assess bias in intra-cluster correlation coefficient estimates obtained from GLIMMIX using residual pseudo-likelihood. 
This estimator is contrasted with method of moments and analysis of variance estimators, which are standards of comparison. The approach to confidence interval construction is assessed by examining coverage probabilities. Results Overall, the residual pseudo-likelihood estimator performs very well. It has considerably less bias than moment estimators, which are its competitors for general generalized estimating equation–based analyses, and therefore, it is a major improvement in practice. Furthermore, it works almost as well as analysis of variance estimators when they are applicable. Confidence intervals have near-nominal coverage when the intra-cluster correlation coefficient estimate has negligible bias. Conclusion Our results show that the residual pseudo-likelihood estimator is a good option for intra-cluster correlation coefficient estimation when conducting a generalized estimating equation–based analysis of binary outcome data arising from cluster randomized trials. The estimator is practical in that it is simply a result of fitting a marginal model with GLIMMIX, and a confidence interval can be easily obtained. An additional advantage is that, unlike most other options for performing generalized estimating equation–based analyses, GLIMMIX gives analysts the option to use small-sample adjustments that ensure valid inference.
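
For readers unfamiliar with the analysis of variance estimator used as a comparator above, the following is a minimal sketch of the standard one-way ANOVA estimator of the intra-cluster correlation coefficient, applied to binary outcomes coded 0/1. It is not the residual pseudo-likelihood estimator studied in the article and does not reproduce any GLIMMIX output; the function name `anova_icc` and the toy data are assumptions made for illustration.

```python
from typing import Sequence


def anova_icc(clusters: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA estimator of the intra-cluster correlation coefficient.

    `clusters` holds one sequence of outcomes (e.g. 0/1) per cluster.
    Requires at least two clusters, more observations than clusters,
    and some variation in the outcome.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)

    grand_mean = sum(sum(c) for c in clusters) / n_total
    cluster_means = [sum(c) / len(c) for c in clusters]

    # Between-cluster and within-cluster sums of squares.
    ss_between = sum(
        n * (m - grand_mean) ** 2 for n, m in zip(sizes, cluster_means)
    )
    ss_within = sum(
        (y - m) ** 2 for c, m in zip(clusters, cluster_means) for y in c
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Adjusted average cluster size, which accounts for unequal sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


if __name__ == "__main__":
    # Hypothetical toy data: four clusters of binary outcomes.
    toy = [[1, 1, 0], [0, 0, 0, 1], [1, 1, 1], [0, 1, 0, 0]]
    print(f"ANOVA ICC estimate: {anova_icc(toy):.3f}")
```

The estimate can be negative when outcomes within clusters are less alike than outcomes across clusters; whether to truncate such values at zero is a design choice left to the analyst.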


2019 ◽  
Author(s):  
Xiaoran Han ◽  
Jiaye Lin ◽  
Jinjing Xu ◽  
Maggie Wang ◽  
Benny Zee ◽  
...  

Abstract Background Cluster randomized trials (CRTs) are widely adopted in health and primary care research. However, the cluster effect needs to be taken into account appropriately in the design and analysis of CRTs. The objectives of this study were (i) to review the reporting of intracluster correlations in CRTs; and (ii) to evaluate whether the intracluster correlation values assumed in sample size planning are consistent with those obtained in the analysis. Methods The Aggregate Analysis of ClinicalTrials.gov database was searched to identify CRTs registered between January 1, 2004 and March 27, 2016. The selected CRTs with accessible publications were screened according to eligibility criteria. Results Among the 281 CRTs identified, the percentage of studies accounting for the cluster effect increased annually. A total of 183 studies accounted for clustering in sample size estimation; 43% of these adopted the intraclass correlation coefficient (ICC), but the exact estimated value of the ICC was provided in only 26% of the included studies. Across intervention types, there were no statistically significant differences between the assumed and reported values of the ICC (all p-values >0.05). Conclusion Although the difference between the ICC values assumed in sample size planning and those reported in the analysis was not statistically significant, deficiencies in CRTs are still common, such as low rates of considering the cluster effect in sample size calculation and of reporting intracluster correlation estimates. We also suggest that researchers ought to be familiar with the properties of statistical approaches to improve the analysis of CRTs. Thus, more recommendations and guidelines, such as the CONSORT statement for CRTs, should be suggested to researchers.

