Examining the Reproducibility of Meta-Analyses in Psychology: A Preliminary Report

Author(s):  
Daniel Lakens ◽  
Elizabeth Page-Gould ◽  
Marcel A. L. M. van Assen ◽  
Bobbie Spellman ◽  
Felix D. Schönbrodt ◽  
...  

Meta-analyses are an important tool to evaluate the literature. It is essential that meta-analyses can easily be reproduced to allow researchers to evaluate the impact of subjective choices on meta-analytic effect sizes, but also to update meta-analyses as new data comes in, or as novel statistical techniques (for example to correct for publication bias) are developed. Research in medicine has revealed meta-analyses often cannot be reproduced. In this project, we examined the reproducibility of meta-analyses in psychology by reproducing twenty published meta-analyses. Reproducing published meta-analyses was surprisingly difficult. 96% of meta-analyses published in 2013-2014 did not adhere to reporting guidelines. A third of these meta-analyses did not contain a table specifying all individual effect sizes. Five of the 20 randomly selected meta-analyses we attempted to reproduce could not be reproduced at all due to lack of access to raw data, no details about the effect sizes extracted from each study, or a lack of information about how effect sizes were coded. In the remaining meta-analyses, differences between the reported and reproduced effect size or sample size were common. We discuss a range of possible improvements, such as more clearly indicating which data were used to calculate an effect size, specifying all individual effect sizes, adding detailed information about equations that are used, and how multiple effect size estimates from the same study are combined, but also sharing raw data retrieved from original authors, or unpublished research reports. This project clearly illustrates there is a lot of room for improvement when it comes to the transparency and reproducibility of published meta-analyses.


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Liansheng Larry Tang ◽  
Michael Caudy ◽  
Faye Taxman

Multiple meta-analyses may use similar search criteria and focus on the same topic of interest, but they may yield different or sometimes discordant results. The lack of statistical methods for synthesizing these findings makes it challenging to properly interpret the results from multiple meta-analyses, especially when their results are conflicting. In this paper, we first introduce a method to synthesize the meta-analytic results when multiple meta-analyses use the same type of summary effect estimates. When meta-analyses use different types of effect sizes, the meta-analysis results cannot be directly combined. We propose a two-step frequentist procedure to first convert the effect size estimates to the same metric and then summarize them with a weighted mean estimate. Our proposed method offers several advantages over existing methods by Hemming et al. (2012). First, different types of summary effect sizes are considered. Second, our method provides the same overall effect size as conducting a meta-analysis on all individual studies from multiple meta-analyses. We illustrate the application of the proposed methods in two examples and discuss their implications for the field of meta-analysis.
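The two-step procedure described above (convert to a common metric, then take a weighted mean) can be sketched as follows. The abstract does not state which conversion or which weights the authors use, so this sketch assumes the common logit-based conversion from a log odds ratio to a standardized mean difference and inverse-variance weights. The function names are hypothetical and chosen only for illustration.

```python
import math


def log_odds_ratio_to_d(log_or, se_log_or):
    """Convert a log odds ratio to a standardized mean difference d.

    Assumption: the logit-based conversion d = ln(OR) * sqrt(3) / pi.
    The standard error is rescaled by the same factor.
    """
    factor = math.sqrt(3) / math.pi
    return log_or * factor, se_log_or * factor


def weighted_mean_effect(estimates, std_errors):
    """Inverse-variance weighted mean of summary effects and its standard error.

    Assumption: every estimate is already on the same metric and the
    estimates are treated as independent.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need equally many estimates and standard errors")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)


# Step 1: bring a log odds ratio onto the d metric (illustrative numbers only).
d_converted, se_converted = log_odds_ratio_to_d(0.8, 0.25)
# Step 2: pool it with a summary effect that was already reported as d.
pooled, pooled_se = weighted_mean_effect([d_converted, 0.35], [se_converted, 0.10])
```

With inverse-variance weights, the more precise meta-analysis dominates the pooled estimate. Whether that matches the weighting in the paper is not something the abstract confirms.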



2018 ◽  
Vol 11 (10) ◽  
pp. 42 ◽  
Author(s):  
Yujin Lee ◽  
Mary M. Capraro ◽  
Robert M. Capraro ◽  
Ali Bicer

Although algebraic reasoning is considered an important factor in students’ mathematical performance, many students struggle to build concrete algebraic reasoning. Metacognitive training has been regarded as one effective method to develop students’ algebraic reasoning; however, no published meta-analysis has examined the effects of metacognitive training on students’ algebraic reasoning. Therefore, the purpose of this meta-analysis was to examine the impact of metacognitive training on students’ algebraic reasoning. Eighteen studies with 22 effect sizes were selected for inclusion. During the analysis, one study was identified as an outlier; the meta-analysis was therefore rerun without the outlier to obtain more robust results. The overall effect size without the outlier was d=0.973 (SE=0.196), with Q=20.201 (p<.05) and I²=0.997, indicating heterogeneity among the studies. These results showed that metacognitive training had a statistically significant positive impact on students’ algebraic reasoning.



2015 ◽  
Vol 19 (2) ◽  
pp. 172-182 ◽  
Author(s):  
Michèle B. Nuijten ◽  
Marcel A. L. M. van Assen ◽  
Coosje L. S. Veldkamp ◽  
Jelte M. Wicherts

Replication is often viewed as the demarcation between science and nonscience. However, contrary to the commonly held view, we show that in the current (selective) publication system replications may increase bias in effect size estimates. Specifically, we examine the effect of replication on bias in estimated population effect size as a function of publication bias and the studies’ sample size or power. We analytically show that incorporating the results of published replication studies will in general not lead to less bias in the estimated population effect size. We therefore conclude that mere replication will not solve the problem of overestimation of effect sizes. We will discuss the implications of our findings for interpreting results of published and unpublished studies, and for conducting and interpreting results of meta-analyses. We also discuss solutions for the problem of overestimation of effect sizes, such as discarding and not publishing small studies with low power, and implementing practices that completely eliminate publication bias (e.g., study registration).



2021 ◽  
Vol 5 (1) ◽  
pp. e100135
Author(s):  
Xue Ying Zhang ◽  
Jan Vollert ◽  
Emily S Sena ◽  
Andrew SC Rice ◽  
Nadia Soliman

Objective: Thigmotaxis is an innate predator avoidance behaviour of rodents and is enhanced when animals are under stress. It is characterised by the preference of a rodent to seek shelter, rather than expose itself to the aversive open area. The behaviour has been proposed to be a measurable construct that can address the impact of pain on rodent behaviour. This systematic review will assess whether thigmotaxis can be influenced by experimental persistent pain and attenuated by pharmacological interventions in rodents. Search strategy: We will search three electronic databases to identify studies in which thigmotaxis was used as an outcome measure contextualised to a rodent model associated with persistent pain. All studies published until the date of the search will be considered. Screening and annotation: Two independent reviewers will screen studies, first by (1) titles and abstracts and then by (2) full texts. Data management and reporting: For meta-analysis, we will extract thigmotactic behavioural data and calculate effect sizes. Effect sizes will be combined using a random-effects model. We will assess heterogeneity and identify sources of heterogeneity. A risk-of-bias assessment will be conducted to evaluate study quality. Publication bias will be assessed using funnel plots, Egger’s regression and trim-and-fill analysis. We will also extract stimulus-evoked limb withdrawal data to assess its correlation with thigmotaxis in the same animals. The evidence obtained will provide a comprehensive understanding of the strengths and limitations of using the thigmotactic outcome measure in animal pain research so that future experimental designs can be optimised. We will follow the Preferred Reporting Items for Systematic Reviews and Meta-Analyses reporting guidelines and disseminate the review findings through publication and conference presentation.
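To make the random-effects pooling and heterogeneity assessment in the protocol above concrete, here is a minimal sketch. The protocol does not name an estimator, so the DerSimonian-Laird method is an assumption made for illustration, and the function name and input values are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator.

    Assumption: the protocol only says "random-effects model"; DL is one
    common choice, not necessarily the one the authors will use.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "i2": i2}


# Illustrative effect sizes and sampling variances (not data from any study).
result = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
```

For the two illustrative studies, Q is 5 on 1 degree of freedom, which gives tau² = 0.4 and I² = 0.8; the pooled estimate stays at 0.5 because both studies carry equal weight.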



Stroke ◽  
2021 ◽  
Author(s):  
Johanna Maria Ospel ◽  
Scott Brown ◽  
Manon Kappelhof ◽  
Wim van Zwam ◽  
Tudor Jovin ◽  
...  

Background and Purpose: Little is known about the combined effect of age and National Institutes of Health Stroke Scale (NIHSS) in endovascular treatment (EVT) for acute ischemic stroke due to large vessel occlusion, and it is not clear how the effects of baseline age and NIHSS on outcome compare to each other. The previously described Stroke Prognostication Using Age and NIHSS (SPAN) index adds up NIHSS and age to a 1:1 combined prognostic index. We added a weighting factor to the NIHSS/age SPAN index to compare the relative prognostic impact of NIHSS and age and assessed EVT effect based on weighted age and NIHSS. Methods: We performed adjusted logistic regression with good outcome (90-day modified Rankin Scale score 0–2) as primary outcome. From this model, the coefficients for NIHSS and age were obtained. The ratio between the NIHSS and age coefficients was calculated to determine a weighted SPAN index. We obtained adjusted effect size estimates for EVT in patient subgroups defined by weighted SPAN increments of 3, to evaluate potential changes in treatment effect. Results: We included 1750/1766 patients from the HERMES collaboration (Highly Effective Reperfusion Using Multiple Endovascular Devices) with available age and NIHSS data. Median NIHSS was 17 (interquartile range, 13–21), and median age was 68 (interquartile range, 57–76). Good outcome was achieved by 682/1743 (39%) patients. The NIHSS/age effect coefficient ratio was ([−0.0032]/[−0.111])=3.4, which was rounded to 3, resulting in a weighted SPAN index defined as ([3×NIHSS]+age). Cumulative EVT effect size estimates across weighted SPAN subgroups consistently favored EVT, with a number needed to treat ranging from 5.3 to 8.7. Conclusions: The impact on chance of good outcome of a 1-point increase in NIHSS roughly corresponded to a 3-year increase in patient age. EVT was beneficial across all weighted age/NIHSS subgroups.
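The weighted SPAN index defined in the abstract above, ([3×NIHSS]+age), is simple enough to write down directly. This is a minimal sketch; the function name and the input checks are hypothetical additions, and only the weight of 3 and the 1:1 original index come from the abstract.

```python
def weighted_span(nihss, age, nihss_weight=3):
    """Weighted SPAN index: nihss_weight * NIHSS + age.

    nihss_weight=3 is the rounded coefficient ratio reported in the abstract;
    nihss_weight=1 reproduces the original 1:1 SPAN index.
    """
    if nihss < 0 or age < 0:
        raise ValueError("NIHSS and age must be non-negative")
    return nihss_weight * nihss + age


# Median patient in the abstract: NIHSS 17, age 68.
median_index = weighted_span(17, 68)           # 3 * 17 + 68 = 119
original_span = weighted_span(17, 68, 1)       # 17 + 68 = 85

# One extra NIHSS point shifts the index as much as three extra years of age.
assert weighted_span(18, 68) == weighted_span(17, 71)
```

The final assertion restates the abstract's conclusion in terms of the index: under a weight of 3, a 1-point NIHSS increase and a 3-year age increase are interchangeable.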





BMJ Open ◽  
2019 ◽  
Vol 9 (6) ◽  
pp. e024886 ◽  
Author(s):  
Klaus Munkholm ◽  
Asger Sand Paludan-Müller ◽  
Kim Boesen

Objectives: To investigate whether the conclusion of a recent systematic review and network meta-analysis (Cipriani et al) that antidepressants are more efficacious than placebo for adult depression was supported by the evidence. Design: Reanalysis of a systematic review, with meta-analyses. Data sources: 522 trials (116 477 participants) as reported in the systematic review by Cipriani et al and clinical study reports for 19 of these trials. Analysis: We used the Cochrane Handbook’s risk of bias tool and the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach to evaluate the risk of bias and the certainty of evidence, respectively. The impact of several study characteristics and publication status was estimated using pairwise subgroup meta-analyses. Results: Several methodological limitations in the evidence base of antidepressants were either unrecognised or underestimated in the systematic review by Cipriani et al. The effect size for antidepressants versus placebo on investigator-rated depression symptom scales was higher in trials with a ‘placebo run-in’ study design compared with trials without a placebo run-in design (p=0.05). The effect size of antidepressants was higher in published trials compared with unpublished trials (p<0.0001). The outcome data reported by Cipriani et al differed from the clinical study reports in 12 (63%) of 19 trials. The certainty of the evidence for the placebo-controlled comparisons should be very low according to GRADE due to a high risk of bias, indirectness of the evidence and publication bias. The mean difference between antidepressants and placebo on the 17-item Hamilton depression rating scale (range 0–52 points) was 1.97 points (95% CI 1.74 to 2.21). Conclusions: The evidence does not support definitive conclusions regarding the benefits of antidepressants for depression in adults. It is unclear whether antidepressants are more efficacious than placebo.



Author(s):  
Liana R. Taylor ◽  
Avinash Bhati ◽  
Faye S. Taxman

The Washington State Institute for Public Policy (WSIPP) uses meta-analyses to help program administrators identify effective programs that reduce recidivism. The results are displayed as summary effect sizes. Yet, many programs are grouped within a category (such as Intensive Supervision or Correctional Education), even though the features of the programs might suggest they may be very different. The following research question was examined: What program features are related to the effect size in the WSIPP program category? Researchers at ACE! at George Mason University reviewed the studies analyzed by WSIPP and their effect sizes. The meta-regression global models showed recidivism decreased with certain program features, while other program features actually increased recidivism. A multivariate meta-regression showed substantial variation across Cognitive-Behavioral Therapy programs. These preliminary findings suggest the need to further research how differing program features contribute to client-level outcomes, and develop a scheme to better classify programs.



JAMA ◽  
2007 ◽  
Vol 297 (5) ◽  
pp. 465 ◽  
Author(s):  
Toshi A. Furukawa ◽  
Norio Watanabe ◽  
Ichiro M. Omori ◽  
Victor M. Montori ◽  
Gordon H. Guyatt


1990 ◽  
Vol 24 (3) ◽  
pp. 405-415 ◽  
Author(s):  
Nathaniel McConaghy

Meta-analysis replaced statistical significance with effect size in the hope of resolving controversy concerning evaluation of treatment effects. Statistical significance measured reliability of the effect of treatment, not its efficacy. It was strongly influenced by the number of subjects investigated. Effect size, as originally assessed, eliminated this influence but, by standardizing the size of the treatment effect, could distort it. Meta-analyses that combine the results of studies employing different subject types, outcome measures, treatment aims, no-treatment rather than placebo controls, or therapists with varying experience can be misleading. To ensure discussion of these variables, meta-analyses should be used as an aid to, rather than a substitute for, literature review. While meta-analyses produce contradictory findings, it seems unwise to rely on the conclusions of an individual analysis. Their consistent finding that placebo treatments obtain markedly higher effect sizes than no treatment will, it is hoped, render the use of untreated control groups obsolete.


