The influence of unpublished studies on results of recent meta-analyses: publication bias, the file drawer problem, and implications for the replication crisis

Author(s):  
John A Wagner
2019 ◽  
Vol 40 (4) ◽  
pp. 416-430 ◽  
Author(s):  
Jessica S. Iwachiw ◽  
Amy Lynn Button ◽  
Jana Atlas

Researchers appear to assume that published research is limited to significant findings. If that is the case, it may be related to perceived or actual publication bias (i.e., journals publishing only significant findings) and/or the file-drawer problem (i.e., researchers not pursuing publication of null results). The lack of published null results can lead to faulty decision-making based on incomplete evidence. Thus, it is important to know the prevalence of, and the contributing factors to, researchers' failure to submit null results. Few studies have addressed this issue in psychology, and none have targeted school psychology. Consequently, this study examined the file drawer problem and perceptions of publication bias among school psychologists. Survey data from 95 school psychology faculty indicated that participants published about half of the studies they had conducted, suggesting that the file drawer problem is experienced by this population. While lack of time appeared to affect whether participants pursued publication, their responses also suggested a belief in publication bias. Obtaining null results substantially influenced the decision to write up studies for publication. Therefore, it seems that a sizeable percentage of school psychology research is not available for review by researchers or practitioners.


2017 ◽  
Author(s):  
Andreas Schneck

Background: Publication bias is a form of scientific misconduct. It threatens the validity of research results and the credibility of science. Although several tests for publication bias exist, no in-depth evaluations are available that suggest which test to use for a specific research problem. Methods: In the study at hand, four tests for publication bias (Egger’s test (FAT), p-uniform, the test of excess significance (TES), and the caliper test) were evaluated in a Monte Carlo simulation. Two different types of publication bias, as well as its degree (0%, 50%, 100%), were simulated. The type of publication bias was defined either as file-drawer, meaning the repeated analysis of new datasets, or p-hacking, meaning the inclusion of covariates in order to obtain a significant result. In addition, the underlying effect (β = 0, 0.5, 1, 1.5), effect heterogeneity, the number of observations in the simulated primary studies (N = 100, 500), and the number of observations for the publication bias tests (K = 100, 1,000) were varied. Results: All tests evaluated were able to identify publication bias in both the file-drawer and p-hacking conditions. The false positive rates were, with the exception of the 15%- and 20%-caliper tests, unbiased. The FAT had the largest statistical power in the file-drawer conditions, whereas under p-hacking the TES was, except under effect heterogeneity, slightly better. The caliper test was, however, inferior to the other tests under effect homogeneity and had decent statistical power only in conditions with 1,000 primary studies. Discussion: The FAT is recommended as a test for publication bias in standard meta-analyses with no or only small effect heterogeneity. If no clear direction of publication bias is suspected, the TES is the first alternative to the FAT. The 5%-caliper test is recommended under conditions of effect heterogeneity, which may be found if publication bias is examined in a discipline-wide setting where primary studies cover different research problems.
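A minimal sketch of the FAT (Egger's regression test) in the file-drawer condition may help readers unfamiliar with the test. The data-generating process, the one-sided selection rule, and names such as simulate_study and bias_degree are assumptions made for illustration; they mirror the abstract's notation but not the paper's actual code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulate_study(beta, n):
    """One primary study: regress y on x and return (estimate, standard error)."""
    x = rng.normal(size=n)
    y = beta * x + rng.normal(size=n)
    fit = stats.linregress(x, y)
    return fit.slope, fit.stderr

def file_drawer_sample(beta, k, bias_degree):
    """Collect k 'published' estimates; results that are not significant in the
    expected direction stay in the file drawer with probability bias_degree."""
    published = []
    while len(published) < k:
        n = int(rng.integers(20, 200))        # varying study sizes
        est, se = simulate_study(beta, n)
        if est / se > 1.96 or rng.random() > bias_degree:
            published.append((est, se))
    return np.array(published)

def egger_fat(estimates, ses):
    """FAT: regress the standardized effect on precision; an intercept far
    from zero signals small-study (funnel plot) asymmetry."""
    fit = stats.linregress(1.0 / ses, estimates / ses)
    t = fit.intercept / fit.intercept_stderr
    p = 2 * stats.t.sf(abs(t), df=len(ses) - 2)
    return fit.intercept, p

data = file_drawer_sample(beta=0.2, k=100, bias_degree=1.0)
intercept, p = egger_fat(data[:, 0], data[:, 1])
print(f"FAT intercept = {intercept:.2f}, p = {p:.4f}")
```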


2017 ◽  
Author(s):  
Freya Acar ◽  
Ruth Seurinck ◽  
Simon B. Eickhoff ◽  
Beatrijs Moerkerke

Abstract. The importance of integrating research findings is incontrovertible, and coordinate-based meta-analyses have become a popular approach to combining the results of fMRI studies when only peaks of activation are reported. As in classical meta-analyses, coordinate-based meta-analyses may be subject to different forms of publication bias, which affect the results and may invalidate findings. We develop a tool that assesses robustness to potential publication bias at the cluster level. We investigate the possible influence of the file-drawer effect, where studies that do not report certain results fail to get published, by determining the number of noise studies that can be added to an existing fMRI meta-analysis before the results are no longer statistically significant. In this paper we illustrate this tool through an example and test the effect of several parameters through extensive simulations. We provide an algorithm, for which code is freely available, that generates noise studies and enables users to determine the robustness of meta-analytic results.
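The "noise studies" idea can be sketched in its simplest form. The tool described in the abstract works on activation coordinates at the cluster level; the code below only illustrates the generic fail-safe logic on ordinary effect sizes with a fixed-effect pooled estimate, which is a simplifying assumption, not the authors' implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def pooled_z(effects, variances):
    """Fixed-effect (inverse-variance) pooled estimate expressed as a z-value."""
    w = 1.0 / np.asarray(variances)
    est = np.sum(w * effects) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return est / se

def fail_safe_count(effects, variances, alpha=0.05, max_noise=10_000):
    """Number of added noise studies (true effect 0, typical variance)
    needed before the pooled result is no longer significant."""
    effects, variances = list(effects), list(variances)
    typical_var = float(np.median(variances))
    crit = stats.norm.ppf(1 - alpha / 2)
    for added in range(max_noise + 1):
        if abs(pooled_z(effects, variances)) < crit:
            return added
        effects.append(rng.normal(0.0, np.sqrt(typical_var)))
        variances.append(typical_var)
    return max_noise  # robustness exceeds the search limit

# toy meta-analysis: 10 studies with a moderate underlying effect
true_effect, k = 0.3, 10
variances = rng.uniform(0.01, 0.05, size=k)
effects = rng.normal(true_effect, np.sqrt(variances))
print("noise studies needed:", fail_safe_count(effects, variances))
```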


1987 ◽  
Vol 11 (2) ◽  
pp. 233-242 ◽  
Author(s):  
Barbara Sommer

The file drawer problem refers to a publication bias for positive results, leading to studies which support the null hypothesis being relegated to the file drawer. The assumption is that researchers are unable to publish studies with nonsignificant findings. A survey of investigators studying the menstrual cycle showed this assumption to be unwarranted. Much of the research did not lend itself to a hypothesis-testing model. A more important contribution to the likelihood of publication was research productivity, and researchers whose first study was published were more likely to have continued their work.


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e4115 ◽  
Author(s):  
Andreas Schneck

Background: Publication bias is a form of scientific misconduct. It threatens the validity of research results and the credibility of science. Although several tests for publication bias exist, no in-depth evaluations are available that examine which test performs best in different research settings. Methods: Four tests for publication bias (Egger’s test (FAT), p-uniform, the test of excess significance (TES), and the caliper test) were evaluated in a Monte Carlo simulation. Two different types of publication bias and its degree (0%, 50%, 100%) were simulated. The type of publication bias was defined either as file-drawer, meaning the repeated analysis of new datasets, or p-hacking, meaning the inclusion of covariates in order to obtain a significant result. In addition, the underlying effect (β = 0, 0.5, 1, 1.5), effect heterogeneity, the number of observations in the simulated primary studies (N = 100, 500), and the number of observations for the publication bias tests (K = 100, 1,000) were varied. Results: All tests evaluated were able to identify publication bias in both the file-drawer and p-hacking conditions. The false positive rates were, with the exception of the 15%- and 20%-caliper tests, unbiased. The FAT had the largest statistical power in the file-drawer conditions, whereas under p-hacking the TES was, except under effect heterogeneity, slightly better. The caliper tests were, however, inferior to the other tests under effect homogeneity and had decent statistical power only in conditions with 1,000 primary studies. Discussion: The FAT is recommended as a test for publication bias in standard meta-analyses with no or only small effect heterogeneity. If two-sided publication bias is suspected, as well as under p-hacking, the TES is the first alternative to the FAT. The 5%-caliper test is recommended under conditions of effect heterogeneity and a large number of primary studies, which may be found if publication bias is examined in a discipline-wide setting where primary studies cover different research problems.
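The caliper test recommended here can be illustrated in its common textbook form: count test statistics falling just below versus just above the z = 1.96 significance threshold and compare the split to 50/50 with a binomial test. The caliper width definition, the two-sided binomial test, and the toy data below are assumptions of this sketch; the paper's exact implementation may differ.

```python
import numpy as np
from scipy import stats

def caliper_test(z_values, caliper=0.05, threshold=1.96):
    """Binomial test for an excess of z-values just above the threshold."""
    z = np.abs(np.asarray(z_values))
    lo, hi = threshold * (1 - caliper), threshold * (1 + caliper)
    below = int(np.sum((z >= lo) & (z < threshold)))
    above = int(np.sum((z >= threshold) & (z <= hi)))
    result = stats.binomtest(above, n=above + below, p=0.5)
    return above, below, result.pvalue

# toy example: a literature with suspiciously many "just significant" results
rng = np.random.default_rng(3)
honest = rng.normal(0, 1, size=900)            # unremarkable z-values
nudged = rng.uniform(1.96, 2.06, size=100)     # heaped just above 1.96
above, below, p = caliper_test(np.concatenate([honest, nudged]))
print(f"just above: {above}, just below: {below}, p = {p:.4f}")
```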


2017 ◽  
Author(s):  
Pantelis Samartsidis ◽  
Silvia Montagna ◽  
Angela R. Laird ◽  
Peter T. Fox ◽  
Timothy D. Johnson ◽  
...  

Abstract. Coordinate-based meta-analyses (CBMA) allow researchers to combine the results from multiple fMRI experiments with the goal of obtaining results that are more likely to generalise. However, the interpretation of CBMA findings can be impaired by the file drawer problem, a type of publication bias that refers to experiments that are carried out but not published. Using foci-per-contrast count data from the BrainMap database, we propose a zero-truncated modelling approach that allows us to estimate the prevalence of non-significant experiments. We validate our method with simulations and real coordinate data generated from the Human Connectome Project. Application of our method to the data from BrainMap provides evidence for the existence of a file drawer effect, with the rate of missing experiments estimated as at least 6 per 100 reported.
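The zero-truncated modelling idea can be made concrete with a deliberately simplified sketch: fit a Poisson rate to counts that are only observed when positive, then convert the implied probability of a zero count into an estimate of missing (file-drawer) experiments per 100 reported. The paper's actual models for BrainMap foci counts are richer; the plain zero-truncated Poisson, the toy data, and the per-100 conversion below are assumptions for illustration only.

```python
import numpy as np
from scipy import optimize, stats

def fit_zero_truncated_poisson(counts):
    """MLE of lambda under a Poisson distribution truncated at zero."""
    counts = np.asarray(counts)

    def neg_loglik(lam):
        # log P(X = x | X > 0) = log pmf(x; lam) - log(1 - e^{-lam})
        return -np.sum(stats.poisson.logpmf(counts, lam)
                       - np.log1p(-np.exp(-lam)))

    res = optimize.minimize_scalar(neg_loglik, bounds=(1e-6, counts.max()),
                                   method="bounded")
    return res.x

# toy data: foci counts per contrast; zero-count (null) contrasts go unseen
rng = np.random.default_rng(11)
all_counts = rng.poisson(2.0, size=2000)
observed = all_counts[all_counts > 0]          # the "published" contrasts

lam = fit_zero_truncated_poisson(observed)
p0 = np.exp(-lam)                              # implied share of unseen nulls
missing_per_100 = 100 * p0 / (1 - p0)
print(f"lambda = {lam:.2f}, estimated missing per 100 reported: {missing_per_100:.1f}")
```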


2019 ◽  
Author(s):  
Jian Saturn Chen

Whether enumeration of a small number of objects requires attention remains controversial. Although most recent studies argue for a role of attention in subitizing, these studies use varied stimuli and different methods of manipulating attention. It is unclear whether the attention effects observed in different studies are real effects, and whether there is publication bias in these studies. To answer these questions, a systematic review and meta-analysis was performed to evaluate the effects of attention on the enumeration of small numbers. A total of 14 studies (22 experiments, 35 comparisons) were included in a meta-analysis comparing attention effects on subitizing. Results from the meta-analysis suggest that manipulating attention can modulate behavioural performance in the subitizing range (response time, accuracy, and Weber fraction). These findings were consistently observed across different experimental designs and stimuli (p < .010, p < .001, and p < .001, respectively), suggesting that attention does play a role in subitizing. A new model was proposed to explain the mechanism of subitizing and enumeration. The findings may contribute to the understanding of the “number module” in the brain and to models of numerical cognition in education. However, publication bias was observed in this study, suggesting that the observed effects may not be accurately estimated. To better estimate the effect of attention manipulations in the subitizing range, studies with larger samples, or future meta-analyses that include unpublished outcomes and unpublished studies, may be required.
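For readers unfamiliar with how such comparisons are combined, the sketch below shows a common way of pooling standardized effect sizes: a random-effects meta-analysis with the DerSimonian-Laird heterogeneity estimator. This is a generic illustration of the technique; the abstract does not state which estimator was used, and the effect sizes in the sketch are hypothetical placeholders, not values from the reviewed experiments.

```python
import numpy as np
from scipy import stats

def dersimonian_laird(effects, variances):
    """Random-effects pooled effect, its SE, and tau^2 (between-study heterogeneity)."""
    y, v = np.asarray(effects), np.asarray(variances)
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_star = 1.0 / (v + tau2)
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, se, tau2

# illustrative attention-effect sizes (Hedges' g) from hypothetical comparisons
g = np.array([0.45, 0.30, 0.62, 0.15, 0.50, 0.38])
var_g = np.array([0.04, 0.03, 0.06, 0.05, 0.04, 0.03])
pooled, se, tau2 = dersimonian_laird(g, var_g)
z = pooled / se
print(f"pooled g = {pooled:.2f} (SE {se:.2f}), tau^2 = {tau2:.3f}, "
      f"p = {2 * stats.norm.sf(abs(z)):.4f}")
```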


2015 ◽  
Vol 19 (2) ◽  
pp. 172-182 ◽  
Author(s):  
Michèle B. Nuijten ◽  
Marcel A. L. M. van Assen ◽  
Coosje L. S. Veldkamp ◽  
Jelte M. Wicherts

Replication is often viewed as the demarcation between science and nonscience. However, contrary to the commonly held view, we show that in the current (selective) publication system replications may increase bias in effect size estimates. Specifically, we examine the effect of replication on bias in the estimated population effect size as a function of publication bias and the studies’ sample size or power. We show analytically that incorporating the results of published replication studies will in general not lead to less bias in the estimated population effect size. We therefore conclude that mere replication will not solve the problem of overestimation of effect sizes. We discuss the implications of our findings for interpreting the results of published and unpublished studies, and for conducting and interpreting the results of meta-analyses. We also discuss solutions to the problem of overestimation of effect sizes, such as discarding and not publishing small studies with low power, and implementing practices that completely eliminate publication bias (e.g., study registration).
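The paper's core point can be reproduced informally with a toy simulation: if replications pass through the same selective publication filter as original studies, averaging the two leaves the overestimation intact. The one-sided significance filter, the sample size, and the Cohen's d standard-error approximation below are assumptions chosen for illustration, not the authors' analytical model.

```python
import numpy as np

rng = np.random.default_rng(5)

def published_estimate(true_d, n, bias=1.0):
    """Draw study estimates until one is 'published' (significant in the
    expected direction, or lucky enough to slip past the bias filter)."""
    se = np.sqrt(2.0 / n)            # rough SE of a two-group Cohen's d
    while True:
        d_hat = rng.normal(true_d, se)
        if d_hat / se > 1.96 or rng.random() > bias:
            return d_hat

true_d, n, reps = 0.2, 40, 10_000
originals = np.array([published_estimate(true_d, n) for _ in range(reps)])
replications = np.array([published_estimate(true_d, n) for _ in range(reps)])

print(f"true effect:                    {true_d:.2f}")
print(f"mean published original:        {originals.mean():.2f}")
print(f"mean of original + replication: {((originals + replications) / 2).mean():.2f}")
```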


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
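One tool commonly used for this purpose (and discussed in the Schneck abstracts above), the test of excess significance (TES), can be sketched briefly: compare the observed number of significant studies with the number expected given each study's power at a plug-in "true" effect. Using the fixed-effect pooled estimate as that plug-in, the chi-square comparison, and the toy selection rule below are assumptions of this sketch; the simulated literatures and detection methods in the paper are more varied.

```python
import numpy as np
from scipy import stats

def test_excess_significance(effects, ses, alpha=0.05):
    """Compare the observed number of significant studies with the number
    expected given each study's power at the pooled (plug-in) effect."""
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    w = 1.0 / ses ** 2
    pooled = np.sum(w * effects) / np.sum(w)       # plug-in "true" effect
    crit = stats.norm.ppf(1 - alpha / 2)
    mu = pooled / ses                              # noncentrality per study
    power = stats.norm.sf(crit - mu) + stats.norm.cdf(-crit - mu)
    expected = power.sum()
    observed = int(np.sum(np.abs(effects / ses) > crit))
    chi2 = (observed - expected) ** 2 / expected \
         + (observed - expected) ** 2 / (len(effects) - expected)
    return observed, expected, stats.chi2.sf(chi2, df=1)

# toy literature: modest true effect, but only significant results are "published"
rng = np.random.default_rng(13)
effects, ses = [], []
while len(effects) < 30:
    se = rng.uniform(0.15, 0.35)
    est = rng.normal(0.2, se)
    if abs(est / se) > 1.96:                       # the publication filter
        effects.append(est)
        ses.append(se)

obs, exp, p = test_excess_significance(effects, ses)
print(f"observed significant: {obs}, expected: {exp:.1f}, p = {p:.4f}")
```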

