Addressing Publication Bias in Meta-Analysis

2020 ◽  
Vol 228 (1) ◽  
pp. 50-61 ◽  
Author(s):  
Sho Tsuji ◽  
Alejandrina Cristia ◽  
Michael C. Frank ◽  
Christina Bergmann

Abstract. Meta-analyses are an indispensable research synthesis tool for characterizing bodies of literature and advancing theories. One important open question concerns the inclusion of unpublished data into meta-analyses. Finding such studies can be effortful, but their exclusion potentially leads to consequential biases like overestimation of a literature’s mean effect. We address two questions about unpublished data using MetaLab, a collection of community-augmented meta-analyses focused on developmental psychology. First, we assess to what extent MetaLab datasets include gray literature, and by what search strategies they are unearthed. We find that an average of 11% of datapoints are from unpublished literature; standard search strategies like database searches, complemented with individualized approaches like including authors’ own data, contribute the majority of this literature. Second, we analyze the effect of including versus excluding unpublished literature on estimates of effect size and publication bias, and find this decision does not affect outcomes. We discuss lessons learned and implications.
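The comparison in the second question amounts to re-estimating the pooled effect with and without the gray-literature datapoints. Below is a minimal sketch of that comparison using a DerSimonian-Laird random-effects model; the effect sizes, sampling variances, and publication flags are hypothetical, and this is not MetaLab's actual analysis code.

```python
# Minimal sketch (not the authors' MetaLab pipeline): compare a DerSimonian-Laird
# random-effects estimate computed with and without unpublished datapoints.
# The arrays `effects`, `variances`, and `published` are hypothetical.
import numpy as np

def dersimonian_laird(y, v):
    """Random-effects pooled estimate via the DerSimonian-Laird tau^2 estimator."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v                                    # fixed-effect weights
    y_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fe) ** 2)                # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)        # between-study variance
    w_re = 1.0 / (v + tau2)                        # random-effects weights
    est = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return est, se

# Hypothetical effect sizes (e.g., Cohen's d), sampling variances, and publication status.
effects   = np.array([0.45, 0.30, 0.55, 0.10, 0.25, 0.05])
variances = np.array([0.02, 0.03, 0.04, 0.05, 0.03, 0.06])
published = np.array([True, True, True, True, False, False])

all_est, all_se = dersimonian_laird(effects, variances)
pub_est, pub_se = dersimonian_laird(effects[published], variances[published])
print(f"with unpublished data:  {all_est:.3f} (SE {all_se:.3f})")
print(f"published studies only: {pub_est:.3f} (SE {pub_se:.3f})")
```

In a full analysis along the lines the abstract describes, one would also compare publication-bias estimates for the two subsets, not only the pooled effect sizes.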

2019 ◽  
Author(s):  
Sho Tsuji ◽  
Alejandrina Cristia ◽  
Michael C. Frank ◽  
Christina Bergmann

Meta-analyses are an indispensable research synthesis tool for characterizing bodies of literature and advancing theories. One important open question concerns the inclusion of unpublished data into meta-analyses. Finding such studies can be effortful, but their exclusion potentially leads to consequential biases like overestimation of a literature’s mean effect. We address two key questions using MetaLab, a collection of community-augmented meta-analyses focused on developmental psychology. First, we assess to what extent these datasets include grey literature, and by what search strategies they are unearthed. An average of 11% of datapoints are from unpublished literature, and standard search strategies like database searches, complemented with individualized approaches like including authors’ own data, contribute the majority of this literature. Second, we analyze the effect of including versus excluding unpublished literature on estimates of effect size and publication bias, and find this decision does not affect outcomes. We discuss lessons learned and implications.


2019 ◽  
Author(s):  
Maya B Mathur ◽  
Tyler VanderWeele

Selective publication and reporting in individual papers compromise the scientific record, but are meta-analyses as compromised as their constituent studies? We systematically sampled 63 meta-analyses (each comprising at least 40 studies) in PLOS One, top medical journals, top psychology journals, and MetaLab, an online, open-data database of developmental psychology meta-analyses. We empirically estimated publication bias in each. Across all meta-analyses, “statistically significant” results in the expected direction were only 1.17 times more likely to be published than “nonsignificant” results or those in the unexpected direction (95% CI: [0.94, 1.47]), a confidence interval substantially overlapping the null. Comparable estimates were 0.83 for meta-analyses in PLOS One, 1.02 for top medical journals, 1.54 for top psychology journals, and 4.70 for MetaLab. The severity of publication bias did differ across individual meta-analyses; in a small minority (10%; 95% CI: [2%, 21%]), publication bias appeared to favor “significant” results in the expected direction by more than 3-fold. We estimated that for 89% of meta-analyses, the amount of publication bias required to attenuate the point estimate to the null exceeded the amount of publication bias estimated to be actually present in the vast majority of meta-analyses from the relevant scientific discipline (that is, it exceeded the 95th percentile of estimated publication bias). Study-level measures (“statistical significance” with a point estimate in the expected direction, and point estimate size) did not indicate more publication bias in higher-tier versus lower-tier journals, nor in the earliest studies published on a topic versus later studies. Overall, the mere act of performing a meta-analysis with a large number of studies (at least 40) that includes non-headline results may largely mitigate publication bias, suggesting optimism about the validity of meta-analytic results.
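As a rough illustration of the sensitivity logic described above, the sketch below reweights “nonaffirmative” studies (nonsignificant or in the unexpected direction) by a hypothetical publication-probability ratio eta in a simple fixed-effects model. This is a simplified reading of the approach, not the authors' implementation, and the study data are invented.

```python
# Sketch of a simplified, fixed-effects sensitivity analysis for publication bias.
# Assumption: "affirmative" results (p < .05 in the expected, positive direction) are
# `eta` times more likely to be published than nonaffirmative results.
import numpy as np

def corrected_estimate(y, v, eta):
    """Upweight nonaffirmative studies by eta to counteract the assumed selection."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    z = y / np.sqrt(v)
    affirmative = z > 1.96                       # significant and in the expected direction
    w = 1.0 / v                                  # inverse-variance weights
    w_adj = np.where(affirmative, w, eta * w)    # nonaffirmative studies are underrepresented
    return np.sum(w_adj * y) / np.sum(w_adj)

# Hypothetical study-level effects and sampling variances.
y = np.array([0.50, 0.42, 0.61, 0.38, 0.05, -0.10])
v = np.array([0.02, 0.02, 0.04, 0.03, 0.03, 0.05])

for eta in (1, 2, 5, 10):
    print(f"eta = {eta:>2}: corrected pooled estimate = {corrected_estimate(y, v, eta):.3f}")
```

With eta = 1 (no selection) the result is the ordinary fixed-effect estimate; increasing eta shows how much assumed selection is needed to pull the estimate toward the null.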


2018 ◽  
Author(s):  
Michele B. Nuijten ◽  
Marcel A. L. M. van Assen ◽  
Hilde Augusteijn ◽  
Elise Anne Victoire Crompvoets ◽  
Jelte M. Wicherts

In this meta-study, we analyzed 2,442 effect sizes from 131 meta-analyses in intelligence research, published from 1984 to 2014, to estimate the average effect size, median power, and evidence for bias. We found that the average effect size in intelligence research was a Pearson’s correlation of .26, and the median sample size was 60. Furthermore, across primary studies, we found a median power of 11.9% to detect a small effect, 54.5% to detect a medium effect, and 93.9% to detect a large effect. We documented differences in average effect size and median estimated power between different types of intelligence studies (correlational studies, studies of group differences, experiments, toxicology, and behavior genetics). On average, across all meta-analyses (but not in every meta-analysis), we found evidence for small-study effects, potentially indicating publication bias and overestimated effects. We found no differences in small-study effects between different study types. We also found no convincing evidence for the decline effect, US effect, or citation bias across meta-analyses. We conclude that intelligence research does show signs of low power and publication bias, but that these problems seem less severe than in many other scientific fields.
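For readers who want to reproduce this kind of power figure, the sketch below computes the approximate power of a two-sided test of a Pearson correlation at the median sample size of 60, assuming a Fisher-z test at alpha = .05 and conventional benchmarks for small, medium, and large correlations. The exact procedure the authors used may differ.

```python
# Sketch: approximate power of a two-sided Fisher-z test of H0: rho = 0 at alpha = .05,
# evaluated at a hypothetical sample size of n = 60.
import numpy as np
from scipy.stats import norm

def power_correlation(r_true, n, alpha=0.05):
    """Power via the Fisher z transform: z_r has SE 1/sqrt(n - 3) under normality."""
    z_crit = norm.ppf(1 - alpha / 2)
    ncp = np.arctanh(r_true) * np.sqrt(n - 3)      # noncentrality on the Fisher-z scale
    return norm.sf(z_crit - ncp) + norm.cdf(-z_crit - ncp)

for label, r in [("small (r = .10)", 0.10), ("medium (r = .30)", 0.30), ("large (r = .50)", 0.50)]:
    print(f"{label}: power at n = 60 is {power_correlation(r, 60):.3f}")
```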


2020 ◽  
Vol 46 (2-3) ◽  
pp. 343-354 ◽  
Author(s):  
Timothy R Levine ◽  
René Weber

Abstract. We examined the interplay between how communication researchers use meta-analyses to make claims and the prevalence, causes, and implications of unresolved heterogeneous findings. Heterogeneous findings can result from substantive moderators, methodological artifacts, and combined construct invalidity. An informal content analysis of meta-analyses published in four elite communication journals revealed that unresolved between-study effect heterogeneity was ubiquitous. Communication researchers mainly focus on computing mean effect sizes, to the exclusion of how effect sizes in primary studies are distributed and of what might be driving effect size distributions. We offer four recommendations for future meta-analyses. Researchers are advised to be more diligent and sophisticated in testing for heterogeneity. We encourage greater description of how effects are distributed, coupled with greater reliance on graphical displays. We counsel greater recognition of combined construct invalidity and advocate for content expertise. Finally, we endorse greater awareness of, and improved tests for, publication bias and questionable research practices.
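A first step toward the more diligent heterogeneity testing recommended here is to report Cochran's Q and I² alongside the mean effect. The sketch below computes both from hypothetical study-level effect sizes and sampling variances.

```python
# Sketch of basic heterogeneity diagnostics: Cochran's Q, its p-value, and I^2.
# The effect sizes and sampling variances below are hypothetical.
import numpy as np
from scipy.stats import chi2

def heterogeneity(y, v):
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v
    y_fe = np.sum(w * y) / np.sum(w)               # fixed-effect pooled estimate
    q = np.sum(w * (y - y_fe) ** 2)                # Cochran's Q
    df = len(y) - 1
    p = chi2.sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0   # % variability beyond sampling error
    return q, df, p, i2

y = np.array([0.12, 0.55, 0.31, -0.05, 0.48, 0.22])
v = np.array([0.02, 0.03, 0.02, 0.04, 0.05, 0.03])
q, df, p, i2 = heterogeneity(y, v)
print(f"Q = {q:.2f} (df = {df}), p = {p:.3f}, I^2 = {i2:.1f}%")
```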


Author(s):  
John C. Norcross ◽  
Thomas P. Hogan ◽  
Gerald P. Koocher ◽  
Lauren A. Maggio

Assessing and interpreting research reports involves examination of individual studies as well as summaries of many studies. Summaries may be conveyed in narrative reviews or, more typically, in meta-analyses. This chapter reviews how researchers conduct a meta-analysis and report the results, especially by means of forest plots, which incorporate measures of effect size and their confidence intervals. A meta-analysis may also use moderator analyses or meta-regressions to identify important influences on the results. Critical appraisal of a study requires careful attention to the details of the sample used, the independent variable (treatment), the dependent variable (outcome measure), the comparison groups, and the relation between the stated conclusions and the actual results. The CONSORT flow diagram provides a context for interpreting the sample and comparison groups. Finally, users must be alert to possible artifacts of publication bias.
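As a concrete illustration of the forest plots discussed in this chapter, the sketch below plots hypothetical study-level effects with 95% confidence intervals together with an inverse-variance pooled estimate; the study names and numbers are invented.

```python
# Sketch of a basic forest plot: study-level effects with 95% CIs plus a pooled estimate.
import numpy as np
import matplotlib.pyplot as plt

studies = ["Study A", "Study B", "Study C", "Study D"]
effects = np.array([0.35, 0.10, 0.52, 0.28])            # e.g., standardized mean differences
ses     = np.array([0.12, 0.15, 0.20, 0.10])            # their standard errors

w = 1.0 / ses ** 2                                       # fixed-effect (inverse-variance) weights
pooled = np.sum(w * effects) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))

fig, ax = plt.subplots(figsize=(6, 3))
ypos = np.arange(len(studies))[::-1] + 1                 # studies from top to bottom
ax.errorbar(effects, ypos, xerr=1.96 * ses, fmt="s", color="black", capsize=3)
ax.errorbar([pooled], [0], xerr=[1.96 * pooled_se], fmt="D", color="red", capsize=3)
ax.axvline(0, linestyle="--", linewidth=1)               # line of no effect
ax.set_yticks(list(ypos) + [0])
ax.set_yticklabels(studies + ["Pooled (fixed effect)"])
ax.set_xlabel("Effect size (95% CI)")
plt.tight_layout()
plt.show()
```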


2020 ◽  
Vol 8 (4) ◽  
pp. 36
Author(s):  
Michèle B. Nuijten ◽  
Marcel A. L. M. van Assen ◽  
Hilde E. M. Augusteijn ◽  
Elise A. V. Crompvoets ◽  
Jelte M. Wicherts

In this meta-study, we analyzed 2442 effect sizes from 131 meta-analyses in intelligence research, published from 1984 to 2014, to estimate the average effect size, median power, and evidence for bias. We found that the average effect size in intelligence research was a Pearson’s correlation of 0.26, and the median sample size was 60. Furthermore, across primary studies, we found a median power of 11.9% to detect a small effect, 54.5% to detect a medium effect, and 93.9% to detect a large effect. We documented differences in average effect size and median estimated power between different types of intelligence studies (correlational studies, studies of group differences, experiments, toxicology, and behavior genetics). On average, across all meta-analyses (but not in every meta-analysis), we found evidence for small-study effects, potentially indicating publication bias and overestimated effects. We found no differences in small-study effects between different study types. We also found no convincing evidence for the decline effect, US effect, or citation bias across meta-analyses. We concluded that intelligence research does show signs of low power and publication bias, but that these problems seem less severe than in many other scientific fields.


2018 ◽  
Author(s):  
Robbie Cornelis Maria van Aert

More and more scientific research is published every year, calling for statistical methods that enable researchers to get an overview of the literature in a particular research field. For that purpose, meta-analysis methods were developed to statistically combine the effect sizes of independent primary studies on the same topic. My dissertation focuses on two issues that are crucial when conducting a meta-analysis: publication bias and heterogeneity in primary studies’ true effect sizes. Accurate estimation of both the meta-analytic effect size and the between-study variance in true effect size is crucial, since the results of meta-analyses are often used for policy making.

Publication bias refers to situations where publication of a primary study depends on its results, and it distorts the results of a meta-analysis. We developed new meta-analysis methods, p-uniform and p-uniform*, which estimate effect sizes corrected for publication bias and also test for publication bias. Although these methods perform well in many conditions, they and the other existing methods are shown not to perform well when researchers use questionable research practices. Additionally, when publication bias is absent or limited, traditional methods that do not correct for publication bias outperform p-uniform and p-uniform*. Surprisingly, our pre-registered study of a large-scale data set consisting of 83 meta-analyses and 499 systematic reviews published in the fields of psychology and medicine found no strong evidence for the presence of publication bias.

We also developed two methods for meta-analyzing a statistically significant published original study and a replication of that study, a situation often encountered by researchers. One method is frequentist, the other Bayesian. Both methods are shown to perform better than traditional meta-analytic methods that do not take the statistical significance of the original study into account. Analytical studies of both methods also show that the original study is sometimes better discarded for optimal estimation of the true effect size. Finally, we developed a program for determining the required sample size in a replication, analogous to power analysis in null hypothesis testing. Computing the required sample size with this method revealed that large sample sizes (approximately 650 participants) are required to distinguish a zero from a small true effect.

In the last two chapters, we derived a new multi-step estimator for the between-study variance in primary studies’ true effect sizes, and examined the statistical properties of two methods (the Q-profile and generalized Q-statistic methods) for computing a confidence interval for the between-study variance in true effect size. We proved that the multi-step estimator converges to the Paule-Mandel estimator, which is nowadays one of the recommended methods for estimating the between-study variance in true effect sizes. Two Monte Carlo simulation studies showed that the coverage probabilities of the Q-profile and generalized Q-statistic methods can be substantially below the nominal coverage rate when the assumptions underlying the random-effects meta-analysis model are violated.
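To make the between-study variance discussion concrete, the sketch below computes the Paule-Mandel estimate of tau² as the value at which the generalized Q statistic equals k − 1, solved by root-finding on hypothetical data. This is an independent illustration, not code from the dissertation.

```python
# Sketch of the Paule-Mandel estimator of the between-study variance tau^2:
# find tau^2 >= 0 such that the generalized Q statistic equals k - 1.
import numpy as np
from scipy.optimize import brentq

def generalized_q(tau2, y, v):
    w = 1.0 / (v + tau2)
    mu = np.sum(w * y) / np.sum(w)
    return np.sum(w * (y - mu) ** 2)

def paule_mandel(y, v):
    y, v = np.asarray(y, float), np.asarray(v, float)
    k = len(y)
    f = lambda t2: generalized_q(t2, y, v) - (k - 1)
    if f(0.0) <= 0:                  # Q already at or below its expectation: truncate at zero
        return 0.0
    upper = 10.0 * np.var(y)         # crude upper bracket for the root
    while f(upper) > 0:
        upper *= 2
    return brentq(f, 0.0, upper)     # generalized Q is decreasing in tau^2

# Hypothetical effect sizes and sampling variances.
y = np.array([0.10, 0.45, 0.30, 0.62, -0.05, 0.25])
v = np.array([0.03, 0.02, 0.04, 0.05, 0.03, 0.02])
print(f"Paule-Mandel tau^2 = {paule_mandel(y, v):.4f}")
```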


2020 ◽  
Author(s):  
Robbie Cornelis Maria van Aert ◽  
Helen Niemeyer

Meta-analysis is the statistical method for synthesizing studies on the same topic and is often used in clinical psychology to quantify the efficacy of treatments. A major threat to the validity of meta-analysis is publication bias, which implies that some studies are less likely to be published and are therefore less often included in a meta-analysis. A consequence of publication bias is overestimation of the meta-analytic effect size, which may give a false impression of a treatment’s efficacy and can result in (avoidable) suffering of patients and wasted resources. Guidelines recommend routinely assessing publication bias in meta-analyses, but this is currently not common practice. This chapter describes popular and state-of-the-art methods for assessing publication bias in a meta-analysis and summarizes recommendations for applying them. We also illustrate how these methods can be applied to two meta-analyses that are typical of clinical psychology, so that psychologists can readily apply the methods to their own meta-analyses.
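One widely used publication-bias check that such a chapter typically covers is Egger's regression test for funnel-plot asymmetry. The sketch below runs it on hypothetical effect sizes and standard errors; whether this particular test is among the chapter's recommended methods is not stated in the abstract.

```python
# Sketch of Egger's regression test: regress each study's standard normal deviate on its
# precision; an intercept far from zero suggests funnel-plot asymmetry (possible bias).
# The data below are hypothetical.
import numpy as np
import statsmodels.api as sm

y  = np.array([0.55, 0.48, 0.30, 0.62, 0.20, 0.15])    # study effect sizes
se = np.array([0.25, 0.22, 0.15, 0.30, 0.10, 0.08])    # their standard errors

snd = y / se                                           # standard normal deviates
precision = 1.0 / se
model = sm.OLS(snd, sm.add_constant(precision)).fit()

intercept, slope = model.params
print(f"Egger intercept = {intercept:.3f} (p = {model.pvalues[0]:.3f})")
print(f"slope (precision-weighted effect proxy) = {slope:.3f}")
```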


2020 ◽  
Author(s):  
Malte Friese ◽  
Julius Frankenbach

Science depends on trustworthy evidence. A biased scientific record is therefore of questionable value: it impedes scientific progress, and the public receives advice based on unreliable evidence, which can have far-reaching detrimental consequences. Meta-analysis is a valid and reliable technique for summarizing research evidence. However, meta-analytic effect size estimates may themselves be biased, threatening the validity and usefulness of meta-analyses for promoting scientific progress. Here, we offer a large-scale simulation study to elucidate how p-hacking and publication bias distort meta-analytic effect size estimates under a broad array of circumstances reflecting conditions found across a variety of research areas. The results revealed that, first, very high levels of publication bias can severely distort the cumulative evidence. Second, p-hacking and publication bias interact: at relatively high and low levels of publication bias, p-hacking does comparatively little harm, but at medium levels of publication bias, p-hacking can contribute considerably to bias, especially when true effects are very small or approach zero. Third, p-hacking can severely increase the rate of false positives. A key implication is that, in addition to preventing p-hacking, policies in research institutions, funding agencies, and scientific journals need to make the prevention of publication bias a top priority in order to ensure a trustworthy base of evidence.
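The core mechanism can be illustrated with a much smaller simulation than the authors': simulate two-group studies, publish nonsignificant ones only with some probability, and compare the pooled estimate of published studies with the true effect. The sketch below does only this (no p-hacking), and all parameter values are illustrative.

```python
# Minimal Monte-Carlo sketch of how publication selection alone inflates a meta-analytic
# estimate. This is a far simpler setup than the authors' simulation and omits p-hacking.
import numpy as np

rng = np.random.default_rng(1)

def simulated_pooled_estimate(true_d, n_studies, n_per_group, p_publish_nonsig):
    """Simulate two-group studies, apply selective publication, return the pooled estimate."""
    v = 2.0 / n_per_group + true_d ** 2 / (4.0 * n_per_group)   # approx. variance of Cohen's d
    d_hat = rng.normal(true_d, np.sqrt(v), size=n_studies)
    significant = d_hat / np.sqrt(v) > 1.96                     # significant in expected direction
    published = significant | (rng.random(n_studies) < p_publish_nonsig)
    d_pub = d_hat[published]
    return d_pub.mean() if d_pub.size else np.nan               # equal n, so equal weights

true_d = 0.10
for p_pub in (1.0, 0.5, 0.1):
    est = np.mean([simulated_pooled_estimate(true_d, 60, 30, p_pub) for _ in range(2000)])
    print(f"P(publish | nonsignificant) = {p_pub:>3}: mean pooled estimate = {est:.3f} (true = {true_d})")
```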


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
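The Type I error part of such an evaluation can be sketched as follows: simulate literatures with no selective publication but increasing heterogeneity, and record how often one detection method (here Egger's regression test, as in the earlier sketch) falsely signals bias. The settings below are illustrative and far smaller than the authors' simulation.

```python
# Sketch: false-positive rate of Egger's regression test in simulated literatures with
# NO publication bias and varying between-study heterogeneity (tau). Settings are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def egger_p(y, se):
    model = sm.OLS(y / se, sm.add_constant(1.0 / se)).fit()
    return model.pvalues[0]                      # p-value of the intercept (asymmetry test)

def false_positive_rate(tau, n_lit=1000, k=30):
    rejections = 0
    for _ in range(n_lit):
        n = rng.integers(20, 200, size=k)        # per-group sample sizes
        se = np.sqrt(2.0 / n)                    # rough SE of a standardized mean difference
        theta = rng.normal(0.2, tau, size=k)     # heterogeneous true effects, no selection
        y = rng.normal(theta, se)
        rejections += egger_p(y, se) < 0.05
    return rejections / n_lit

for tau in (0.0, 0.2, 0.4):
    print(f"tau = {tau}: false-positive rate of Egger's test = {false_positive_rate(tau):.3f}")
```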

