Low statistical power and overestimated anthropogenic impacts, exacerbated by publication bias, dominate field studies in global change biology

2021
Author(s): Yefeng Yang, Helmut Hillebrand, Malgorzata Lagisz, Ian Cleasby, Shinichi Nakagawa

Field studies are essential to reliably quantify ecological responses to global change because they subject ecosystems to realistic climate manipulations. Yet such studies typically have few replicates, resulting in low statistical power and, therefore, unreliable effect estimates. Further, while manipulative field experiments are assumed to be more powerful than non-manipulative observations, this assumption has rarely been scrutinized against extensive data. Here, using 3,847 field experiments designed to estimate the effect of environmental stressors on ecosystems, we systematically quantified their statistical power and their magnitude (Type M) and sign (Type S) errors. We focused on the reliability of field experiments in assessing stressor effects on both the magnitude and the variability of ecosystem responses. After controlling for publication bias, single experiments were underpowered to detect response magnitude (median power: 18%–38%, depending on the mean-difference metric). Single experiments also had much lower power to detect response variability (6%–12%, depending on the variance-difference metric) than response magnitude. Such underpowered studies could exaggerate estimates of response magnitude by 2–3 times (Type M errors) and of variability by 4–10 times. Type S errors were comparatively rare. These observations indicate that low power, coupled with publication bias, inflates estimates of anthropogenic impacts. Importantly, we found that meta-analyses largely mitigated the issues of low power and exaggerated effect size estimates. Rather surprisingly, manipulative experiments and non-manipulative observations had very similar power and Type M and Type S errors. Therefore, the previous assumption about the superiority of manipulative experiments in terms of power is overstated. These results call for highly powered field studies, achieved through more collaboration and team science and through large-scale ecosystem facilities, to reliably inform theory building and policymaking. Future studies also require transparent reporting and open science practices to move toward reproducible and reliable empirical work and evidence synthesis.
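To make the power, Type M and Type S concepts concrete, the sketch below performs a minimal design analysis in the spirit of Gelman and Carlin's retrodesign approach, under a normal approximation. It is an illustration, not the authors' analysis pipeline; the hypothesized true effect, standard error and α = 0.05 threshold are assumed values.

```python
# Minimal design analysis under a normal approximation: for a hypothesized true
# effect and a study's standard error, simulate estimates and compute power,
# Type M (exaggeration ratio) and Type S (wrong sign) errors.
# Illustrative values only; not the paper's analysis pipeline.
import numpy as np
from scipy import stats

def design_analysis(true_effect, se, alpha=0.05, n_sim=200_000, seed=0):
    rng = np.random.default_rng(seed)
    z_crit = stats.norm.ppf(1 - alpha / 2)              # two-sided critical value
    estimates = rng.normal(true_effect, se, n_sim)      # sampling distribution of the estimate
    significant = np.abs(estimates) > z_crit * se       # estimates that reach p < alpha
    power = significant.mean()
    type_m = np.abs(estimates[significant]).mean() / abs(true_effect)  # average exaggeration
    type_s = np.mean(np.sign(estimates[significant]) != np.sign(true_effect))
    return power, type_m, type_s

# Example: a modest true effect measured with a noisy design
power, type_m, type_s = design_analysis(true_effect=0.2, se=0.15)
print(f"power = {power:.2f}, Type M = {type_m:.1f}x, Type S = {type_s:.3f}")
```

With such a low-powered design, the estimates that happen to cross the significance threshold are, on average, considerably larger than the true effect, which is exactly the inflation described above.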


2019, Vol. 227(4), pp. 261-279
Author(s): Frank Renkewitz, Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
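As an illustration of the kind of simulation described above, the sketch below generates one censored literature and applies a single, simple detection method (an Egger-style regression of standardized effects on precision). The effect-size metric, censoring rule and parameter values are assumptions chosen for illustration; the paper itself evaluates six detection tools across far more conditions.

```python
# Toy version of the simulation described above: generate primary studies, censor
# part of the nonsignificant ones (publication bias), then apply one detection
# method (an Egger-style regression). Parameters are illustrative assumptions.
import numpy as np
from scipy import stats

def simulate_literature(true_d=0.2, tau=0.2, k=40, n_range=(20, 100),
                        p_publish_nonsig=0.3, seed=0):
    rng = np.random.default_rng(seed)
    effects, ses = [], []
    while len(effects) < k:
        n = rng.integers(*n_range)              # per-group sample size
        theta = rng.normal(true_d, tau)         # heterogeneous study-level true effect
        se = np.sqrt(2.0 / n)                   # rough SE of a standardized mean difference
        d = rng.normal(theta, se)               # observed effect
        if abs(d / se) > 1.96 or rng.random() < p_publish_nonsig:
            effects.append(d)                   # significant results always published
            ses.append(se)
    return np.array(effects), np.array(ses)

def egger_test(effects, ses):
    """Regress standardized effects on precision; an intercept far from zero
    indicates funnel-plot asymmetry consistent with publication bias."""
    z, precision = effects / ses, 1.0 / ses
    X = np.column_stack([np.ones_like(precision), precision])
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    dof = len(z) - 2
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    t_stat = beta[0] / np.sqrt(cov[0, 0])
    return beta[0], 2 * stats.t.sf(abs(t_stat), dof)

effects, ses = simulate_literature()
intercept, p_value = egger_test(effects, ses)
print(f"Egger-style intercept = {intercept:.2f}, p = {p_value:.3f}")
```

Repeating the simulation across many literatures, with and without censoring, yields the Type I error rate and power of the detection method for that set of conditions.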



2018, Vol. 374(1763), pp. 20170392
Author(s): Carrie Andrew, Jeffrey Diez, Timothy Y. James, Håvard Kauserud

For several hundred years, millions of fungal sporocarps have been collected and deposited in worldwide collections (fungaria) to support fungal taxonomy. Owing to large-scale digitization programs, metadata associated with the records are now becoming publicly available, including taxonomy, sampling location, collection date and habitat/substrate information. This metadata, as well as data extracted from the physical fungarium specimens themselves, such as DNA sequences and biochemical characteristics, provides a rich source of information not only for taxonomy but also for other lines of biological inquiry. Here, we highlight and discuss how this information can be used to investigate emerging topics in fungal global change biology and beyond. Fungarium data are a prime source of knowledge on fungal distributions and richness patterns, and for assessing red-listed and invasive species. Information on collection dates has been used to investigate shifts in fungal distributions as well as the phenology of sporocarp emergence in response to climate change. In addition to providing material for taxonomy and systematics, DNA sequences derived from the physical specimens provide information about fungal demography and dispersal patterns, and are emerging as a source of genomic data. As DNA analysis technologies develop further, the importance of fungarium specimens as easily accessible sources of information will likely continue to grow. This article is part of the theme issue ‘Biological collections for understanding biodiversity in the Anthropocene’.


Author(s): Thomas Groß

Abstract. Background: In recent years, cyber security user studies have been appraised in meta-research, mostly focusing on the completeness of their statistical inferences and the fidelity of their statistical reporting. However, estimates of the field's distribution of statistical power and of its publication bias have not received much attention. Aim: We aim to estimate the effect sizes present and their standard errors, as well as the implications for statistical power and publication bias. Method: We built upon a published systematic literature review of 146 user studies in cyber security (2006–2016). We took into account 431 statistical inferences, including t-, χ²-, r-, one-way F-, and Z-tests. In addition, we coded the corresponding total sample sizes, group sizes and test families. Given these data, we established the observed effect sizes and evaluated the overall publication bias. We further computed the statistical power against parametrized population thresholds to gain unbiased estimates of the power distribution. Results: We obtained a distribution of effect sizes and their conversion into comparable log odds ratios, together with their standard errors. We further gained funnel-plot estimates of the publication bias present in the sample, as well as insights into the power distribution and its consequences. Conclusions: Through the lenses of power and publication bias, we shed light on the statistical reliability of the studies in the field. The upshot of this introspection is practical recommendations on conducting and evaluating studies to advance the field.
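For concreteness, the snippet below sketches the textbook conversions that place heterogeneous effect sizes on a common log odds ratio scale with standard errors (Borenstein-style formulas). It is an assumed illustration of the general technique, not the study's own scripts.

```python
# Textbook conversions onto a common log odds ratio scale, of the kind used when
# pooling effects from mixed test families (t, chi-square, r, F, Z) for funnel
# plots and power estimates. Illustrative sketch; not the study's own scripts.
import math

def d_to_log_or(d, var_d):
    """Standardized mean difference (Cohen's d) and its variance to a log odds
    ratio and its standard error, via the logistic approximation ln(OR) = d*pi/sqrt(3)."""
    log_or = d * math.pi / math.sqrt(3)
    se_log_or = math.sqrt(var_d * math.pi ** 2 / 3)
    return log_or, se_log_or

def r_to_log_or(r, n):
    """Pearson r (with sample size n) to a log odds ratio, going through d."""
    d = 2 * r / math.sqrt(1 - r ** 2)
    var_d = 4 / ((n - 1) * (1 - r ** 2))
    return d_to_log_or(d, var_d)

# Example: a two-group comparison reported as d = 0.5 with n1 = n2 = 30
n1 = n2 = 30
d = 0.5
var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
print(d_to_log_or(d, var_d))    # comparable log OR and its standard error
print(r_to_log_or(0.3, 60))     # a correlation converted onto the same scale
```

Once every reported test is expressed as a log odds ratio with a standard error, the estimates can be placed on a single funnel plot and fed into a common power calculation.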


Author(s): Valentin Amrhein, Fränzi Korner-Nievergelt, Tobias Roth

The widespread use of 'statistical significance' as a license for claiming a scientific finding leads to considerable distortion of the scientific process (American Statistical Association, Wasserstein & Lazar 2016). We review why degrading p-values into 'significant' and 'nonsignificant' contributes to making studies irreproducible, or to making them seem irreproducible. A major problem is that we tend to take small p-values at face value but mistrust results with larger p-values. In either case, p-values tell us little about the reliability of research, because they are themselves hardly replicable even if an alternative hypothesis is true. Significance (p ≤ 0.05) is also hardly replicable: at a realistic statistical power of 40%, given that there is a true effect, only one in six studies will significantly replicate the significant result of another study. Even at a good power of 80%, results from two studies will conflict, in terms of significance, in one third of cases if there is a true effect. This means that a replication cannot be interpreted as having failed only because it is nonsignificant. Many apparent replication failures may thus reflect faulty judgement based on significance thresholds rather than a crisis of unreplicable research. Reliable conclusions on the replicability and practical importance of a finding can only be drawn using cumulative evidence from multiple independent studies. However, applying significance thresholds makes cumulative knowledge unreliable. One reason is that, with anything but ideal statistical power, significant effect sizes will be biased upwards. Interpreting inflated significant results while ignoring nonsignificant results will thus lead to wrong conclusions. Moreover, current incentives to hunt for significance lead to publication bias against nonsignificant findings. Data dredging, p-hacking and publication bias should be addressed by removing fixed significance thresholds. Consistent with the recommendations of the late Ronald Fisher, p-values should be interpreted as graded measures of the strength of evidence against the null hypothesis. Larger p-values also offer some evidence against the null hypothesis, and they cannot be interpreted as supporting the null hypothesis or as justifying the false conclusion that 'there is no effect'. Information on possible true effect sizes that are compatible with the data must be obtained from the observed effect size, e.g. from a sample average, and from a measure of uncertainty, such as a confidence interval. We review how confusion about the interpretation of larger p-values can be traced back to historical disputes among the founders of modern statistics. We further discuss potential arguments against removing significance thresholds, such as 'we need more stringent decision rules', 'sample sizes will decrease' or 'we need to get rid of p-values'.
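The replication figures quoted above follow from elementary probability, assuming two independent studies with the same power and a true effect. The short sketch below reproduces that arithmetic; it is an illustration, not the authors' code.

```python
# Arithmetic behind the replication figures above: two independent studies with
# the same statistical power, given a true effect. Illustrative sketch only.
def replication_of_significance(power):
    both_significant = power ** 2            # original and replication both reach p <= 0.05
    conflicting = 2 * power * (1 - power)    # exactly one of the two is significant
    return both_significant, conflicting

print(replication_of_significance(0.40))   # (0.16, 0.48): both significant in ~1 of 6 pairs
print(replication_of_significance(0.80))   # (0.64, 0.32): conflicting results in ~1/3 of pairs
```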

