Has the time come to stop using the “standardised mean difference”?

2021 ◽  
Vol 3 (3) ◽  
Author(s):  
Pim Cuijpers

Background Most meta-analyses use the ‘standardised mean difference’ (effect size) to summarise the outcomes of studies. However, the effect size has important limitations that need to be considered. Method After a brief explanation of the standardised mean difference, its limitations are discussed and possible solutions in the context of meta-analyses are suggested. Results Three major limitations have to be considered when using the effect size. First, the effect size is a statistical concept: small effect sizes may have considerable clinical meaning, while large effect sizes may not. Second, specific assumptions underlying the effect size may not be correct. Third, and most importantly, it is very difficult to explain the meaning of the effect size to non-researchers. As possible solutions, the use of the ‘binomial effect size display’ and the number-needed-to-treat are discussed. Furthermore, I suggest the use of binary outcomes, which are often easier to understand. However, it is not clear what the best binary equivalent is for a continuous outcome. Conclusion The effect size remains useful, as long as its limitations are understood and binary outcomes are also reported.
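One way to make an effect size communicable is to re-express it as a number-needed-to-treat. The abstract does not say which conversion it discusses, so the sketch below uses one common choice, the Kraemer–Kupfer route through the area under the curve, purely as an illustration:

```python
from scipy.stats import norm

def nnt_from_d(d):
    """Kraemer & Kupfer (2006): d -> AUC = Phi(d / sqrt(2)),
    then NNT = 1 / (2 * AUC - 1)."""
    auc = norm.cdf(d / 2 ** 0.5)
    return 1.0 / (2.0 * auc - 1.0)

# The same standardised difference is arguably easier to communicate
# as "how many patients must be treated for one extra good outcome".
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: NNT ~ {nnt_from_d(d):.1f}")
```

Under this conversion a “large” d of 0.8 already corresponds to an NNT below 5, which illustrates the abstract's point that binary re-expressions carry more intuitive clinical meaning.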

2021 ◽  
pp. 0013189X2110513
Author(s):  
Joseph A. Taylor ◽  
Terri Pigott ◽  
Ryan Williams

Toward the goal of more rapid knowledge accumulation via better meta-analyses, this article explores statistical approaches intended to increase the precision and comparability of effect sizes from education research. The featured estimate of the proposed approach is a standardized mean difference effect size whose numerator is a mean difference that has been adjusted, at a minimum, for baseline differences in the outcome measure, and whose denominator is the total variance. The article describes the utility and efficiency of covariate adjustment through baseline measures and the need to standardize effects on a total variance that accounts for variation at multiple levels. Because computation of the total variance can be complex in multilevel studies, a Shiny application is provided to assist with computing the total variance and the resulting effect size. Examples show how to obtain the required calculator inputs and interpret the results.
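The core computation can be sketched for the simplest two-level case. The helper name and the reduction of the total variance to just two components are illustrative assumptions; the authors' Shiny application handles richer designs:

```python
import math

def smd_total_variance(adj_mean_diff, var_within, var_between):
    """Standardize a baseline-adjusted mean difference on the total SD,
    pooling within- and between-cluster variance components.
    (Hypothetical two-level helper, not the authors' calculator.)"""
    return adj_mean_diff / math.sqrt(var_within + var_between)

# e.g. an adjusted difference of 4 points with within-cluster variance 80
# and between-cluster variance 20 -> total SD = 10, effect size 0.40
print(smd_total_variance(4.0, 80.0, 20.0))
```

Standardizing on the total SD rather than the within-cluster SD alone keeps effect sizes from cluster-randomized studies comparable to those from individually randomized ones.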


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Liansheng Larry Tang ◽  
Michael Caudy ◽  
Faye Taxman

Multiple meta-analyses may use similar search criteria and focus on the same topic of interest, but they may yield different or sometimes discordant results. The lack of statistical methods for synthesizing these findings makes it challenging to properly interpret the results from multiple meta-analyses, especially when their results are conflicting. In this paper, we first introduce a method to synthesize the meta-analytic results when multiple meta-analyses use the same type of summary effect estimates. When meta-analyses use different types of effect sizes, the meta-analysis results cannot be directly combined. We propose a two-step frequentist procedure to first convert the effect size estimates to the same metric and then summarize them with a weighted mean estimate. Our proposed method offers several advantages over existing methods by Hemming et al. (2012). First, different types of summary effect sizes are considered. Second, our method provides the same overall effect size as conducting a meta-analysis on all individual studies from multiple meta-analyses. We illustrate the application of the proposed methods in two examples and discuss their implications for the field of meta-analysis.
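The two-step procedure can be sketched as follows, assuming one meta-analysis reports Cohen's d and another a log odds ratio. The Hasselblad–Hedges logistic conversion used in step 1 is a common choice for moving between those metrics, not necessarily the authors' exact method:

```python
import math

SQRT3_OVER_PI = math.sqrt(3) / math.pi

def lnor_to_d(ln_or, var_ln_or):
    """Hasselblad-Hedges logistic conversion of a log odds ratio:
    d = ln(OR) * sqrt(3)/pi, var(d) = var(ln OR) * 3/pi^2."""
    return ln_or * SQRT3_OVER_PI, var_ln_or * 3 / math.pi ** 2

def inverse_variance_pool(estimates, variances):
    """Fixed-effect pooled estimate with weights 1/variance."""
    weights = [1.0 / v for v in variances]
    est = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return est, 1.0 / sum(weights)

# Step 1: bring both summaries onto the d scale.
# Step 2: combine them with a precision-weighted mean.
d2, v2 = lnor_to_d(0.55, 0.04)            # meta-analysis reporting ln(OR)
pooled, pooled_var = inverse_variance_pool([0.30, d2], [0.01, v2])
print(round(pooled, 3), round(pooled_var, 4))
```

The pooled variance is smaller than either input variance, which is the sense in which the synthesis mimics a meta-analysis of all individual studies.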


1990 ◽  
Vol 24 (3) ◽  
pp. 405-415 ◽  
Author(s):  
Nathaniel McConaghy

Meta-analysis replaced statistical significance with effect size in the hope of resolving controversy concerning the evaluation of treatment effects. Statistical significance measured the reliability of a treatment effect, not its efficacy, and was strongly influenced by the number of subjects investigated. Effect size, as originally assessed, eliminated this influence, but by standardizing the size of the treatment effect it could distort it. Meta-analyses that combine the results of studies employing different subject types, outcome measures, treatment aims, no-treatment rather than placebo controls, or therapists with varying experience can be misleading. To ensure discussion of these variables, meta-analyses should be used as an aid to, rather than a substitute for, literature review. While meta-analyses produce contradictory findings, it seems unwise to rely on the conclusions of an individual analysis. Their consistent finding that placebo treatments obtain markedly higher effect sizes than no treatment will, it is hoped, render the use of untreated control groups obsolete.


2018 ◽  
Vol 38 (7) ◽  
pp. 866-880 ◽  
Author(s):  
Yong Yi Lee ◽  
Long Khanh-Dao Le ◽  
Emily A. Stockings ◽  
Phillipa Hay ◽  
Harvey A. Whiteford ◽  
...  

Introduction. The raw mean difference (RMD) and standardized mean difference (SMD) are continuous effect size measures that are not readily usable in decision-analytic models of health care interventions. This study compared the predictive performance of 3 methods by which continuous outcomes data collected using psychiatric rating scales can be converted to a relative risk (RR) effect size. Methods. Three methods to calculate RR effect sizes from continuous outcomes data are described: the RMD, SMD, and Cochrane conversion methods. Each conversion method was validated using data from randomized controlled trials (RCTs) examining the efficacy of interventions for the prevention of depression in youth (aged ≤17 years) and adults (aged ≥18 years) and the prevention of eating disorders in young women (aged ≤21 years). Validation analyses compared predicted RR effect sizes to actual RR effect sizes using scatterplots, correlation coefficients (r), and simple linear regression. An applied analysis was also conducted to examine the impact of using each conversion method in a cost-effectiveness model. Results. The predictive performances of the RMD and Cochrane conversion methods were strong relative to the SMD conversion method when analyzing RCTs involving depression in adults (RMD: r = 0.89–0.90; Cochrane: r = 0.73; SMD: r = 0.41–0.67) and eating disorders in young women (RMD: r = 0.89; Cochrane: r = 0.96). Moderate predictive performances were observed across the 3 methods when analyzing RCTs involving depression in youth (RMD: r = 0.50; Cochrane: r = 0.47; SMD: r = 0.46). Negligible differences were observed between the 3 methods when applied to a cost-effectiveness model. Conclusion. The RMD and Cochrane conversion methods are both valid for predicting RR effect sizes from continuous outcomes data, although further validation and refinement are required before they are applied more broadly.
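A Cochrane-style SMD-to-RR conversion might look like the sketch below. The exact formulas in the article may differ; this version chains the Cochrane Handbook's logistic SMD-to-OR approximation with the Zhang–Yu OR-to-RR formula, and the example inputs are invented:

```python
import math

def smd_to_rr(d, control_risk):
    """Step 1 (SMD -> OR): ln(OR) = d * pi / sqrt(3).
    Step 2 (OR -> RR) at an assumed control-group risk p0
    (Zhang & Yu, 1998): RR = OR / (1 - p0 + p0 * OR)."""
    or_ = math.exp(d * math.pi / math.sqrt(3))
    return or_ / (1 - control_risk + control_risk * or_)

# An SMD of -0.30 (fewer incident cases in the treated group) with a
# 25% control-group risk maps to an RR below 1.
print(round(smd_to_rr(-0.30, 0.25), 2))
```

Note that the RR produced this way depends on the assumed control-group risk, which is one reason such conversions need validation against trials reporting both metrics.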


2020 ◽  
Vol 63 (5) ◽  
pp. 1572-1580
Author(s):  
Laura Gaeta ◽  
Christopher R. Brydges

Purpose The purpose was to examine effect size distributions reported in published audiology and speech-language pathology research in order to provide researchers and clinicians with more relevant guidelines for the interpretation of potentially clinically meaningful findings. Method Cohen's d, Hedges' g, Pearson r, and sample sizes (n = 1,387) were extracted from 32 meta-analyses in journals in speech-language pathology and audiology. Percentile ranks (25th, 50th, 75th) were calculated to determine estimates for small, medium, and large effect sizes, respectively. The median sample size was also used to explore statistical power for small, medium, and large effect sizes. Results For individual differences research, effect sizes of Pearson r = .24, .41, and .64 were found. For group differences, Cohen's d/Hedges' g = 0.25, 0.55, and 0.93. These values can be interpreted as small, medium, and large effect sizes in speech-language pathology and audiology. The majority of published research was inadequately powered to detect a medium effect size. Conclusions Effect size interpretations from published research in audiology and speech-language pathology were found to be underestimated based on Cohen's (1988, 1992) guidelines. Researchers in the field should consider using Pearson r = .25, .40, and .65 and Cohen's d/Hedges' g = 0.25, 0.55, and 0.95 as small, medium, and large effect sizes, respectively, and collect larger sample sizes to ensure that both significant and nonsignificant findings are robust and replicable.
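The percentile-benchmark idea and the accompanying power check can be sketched like this. The simulated effect-size distribution and the normal approximation to t-test power are illustrative assumptions, not the paper's data or method:

```python
import numpy as np
from scipy.stats import norm

def empirical_benchmarks(effect_sizes):
    """25th/50th/75th percentiles of |effect size| as field-specific
    anchors for 'small', 'medium', and 'large'."""
    return np.percentile(np.abs(effect_sizes), [25, 50, 75])

def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Normal approximation to the power of a two-sided,
    two-sample t test with true effect d."""
    z_crit = norm.ppf(1 - alpha / 2)
    return 1 - norm.cdf(z_crit - d * np.sqrt(n_per_group / 2))

# Illustrative only: a right-skewed stand-in for a field's |d| values.
rng = np.random.default_rng(0)
benchmarks = empirical_benchmarks(rng.gamma(2.0, 0.3, size=500))
print("small/medium/large ~", np.round(benchmarks, 2))
print("power for d = 0.55 at n = 20/group ~",
      round(float(approx_power_two_sample(0.55, 20)), 2))
```

With 20 participants per group, power to detect the field's "medium" effect falls well below the conventional 0.80 target, mirroring the paper's conclusion about underpowered studies.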


1997 ◽  
Vol 22 (1) ◽  
pp. 109-117 ◽  
Author(s):  
Kenneth N. Thompson ◽  
Randall E. Schumacker

The binomial effect size display (BESD) has been proposed by Rosenthal and Rubin (1979, 1982; Rosenthal, 1990; Rosenthal & Rosnow, 1991) as a format for presenting effect sizes associated with certain experimental and nonexperimental research. An evaluation of the BESD suggests that its application is limited to presenting the results of 2 × 2 tables where φ is employed as the index of effect size. Findings indicate that the BESD provides little added information beyond an examination of the raw percentages in the 2 × 2 table and dramatically distorts effect sizes when binomial success rates depart from .50.
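The BESD itself is a one-line transformation, shown below. Note that it builds in the .50 baseline success rate whose violation the article identifies as the source of distortion:

```python
def besd(r):
    """Binomial effect size display (Rosenthal & Rubin): recast a
    correlation r as two 'success rates', .50 + r/2 for the treated
    group and .50 - r/2 for the control group."""
    return 0.5 + r / 2, 0.5 - r / 2

treated, control = besd(0.32)
print(f"treated {treated:.0%} vs control {control:.0%}")
```

When the actual marginal success rates in the 2 × 2 table are far from .50, these displayed rates can diverge sharply from the observed ones, which is the distortion the evaluation reports.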


1982 ◽  
Vol 7 (2) ◽  
pp. 119-137 ◽  
Author(s):  
Larry V. Hedges

One method of combining the results of a series of two-group experiments involves the estimation of the effect size (the population value of the standardized mean difference) for each experiment. When each experiment has the same effect size, a pooled estimate of effect size provides a summary of the results of the series of experiments. However, when effect sizes are not homogeneous, a pooled estimate can be misleading. A statistical test is provided for testing whether a series of experiments share the same effect size. A general strategy is provided for fitting models to the results of a series of experiments when the experiments do not share the same effect size and the collection of experiments is divided into a priori classes. The overall fit statistic H_T is partitioned into a between-class fit statistic H_B and a within-class fit statistic H_W. The statistics H_B and H_W permit the assessment of differences between effect sizes for different classes and of the homogeneity of effect size within classes.
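The partition can be sketched with inverse-variance weights. The data are invented, and the fit statistics are written as weighted sums of squared deviations, which is the standard form of such homogeneity statistics; the decomposition H_T = H_B + H_W then holds exactly:

```python
def pooled_mean(ds, vs):
    """Inverse-variance weighted mean of effect sizes."""
    ws = [1.0 / v for v in vs]
    return sum(w * d for w, d in zip(ws, ds)) / sum(ws)

def fit_stat(ds, vs, centre):
    """Weighted sum of squared deviations about `centre`."""
    return sum((d - centre) ** 2 / v for d, v in zip(ds, vs))

def partition_homogeneity(classes):
    """classes: {name: (effect sizes, variances)}, grouped a priori.
    Returns (H_T, H_B, H_W) with H_T = H_B + H_W."""
    all_d = [d for ds, _ in classes.values() for d in ds]
    all_v = [v for _, vs in classes.values() for v in vs]
    grand = pooled_mean(all_d, all_v)
    h_t = fit_stat(all_d, all_v, grand)
    h_w = sum(fit_stat(ds, vs, pooled_mean(ds, vs))
              for ds, vs in classes.values())
    h_b = sum(sum(1.0 / v for v in vs) * (pooled_mean(ds, vs) - grand) ** 2
              for ds, vs in classes.values())
    return h_t, h_b, h_w

classes = {  # illustrative data: two a priori classes of studies
    "behavioural": ([0.6, 0.7, 0.5], [0.04, 0.05, 0.04]),
    "dynamic":     ([0.2, 0.3],      [0.05, 0.06]),
}
h_t, h_b, h_w = partition_homogeneity(classes)
print(f"H_T = {h_t:.2f}, H_B = {h_b:.2f}, H_W = {h_w:.2f}")
```

A large H_B relative to H_W signals that the classes differ in effect size, so a single pooled estimate would be misleading.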

