Benefits of Open and High-Powered Research Outweigh Costs

2016 ◽  
Author(s):  
Etienne P. LeBel ◽  
Lorne Campbell ◽  
Timothy Loving

Several researchers recently outlined unacknowledged costs of open science practices, arguing these costs may outweigh benefits and stifle the discovery of novel findings. We scrutinize these researchers' (1) statistical concern that heightened stringency with respect to false positives will increase false negatives and (2) meta-scientific concern that larger samples and direct replications engender opportunity costs that will decrease the rate of making novel discoveries. We argue their statistical concern is unwarranted, given that open science proponents recommend such practices to reduce the inflated Type I error rate from .35 down to .05 and simultaneously call for high-powered research to reduce the inflated Type II error rate. Regarding their meta-scientific concern, we demonstrate that incurring some costs is required to increase the rate (and frequency) of making true discoveries, because distinguishing true from false hypotheses requires a low Type I error rate, high statistical power, and independent direct replications. We also examine pragmatic concerns raised regarding adopting open science practices in relationship science (pre-registration, open materials, open data, direct replications, sample size); while acknowledging these concerns, we argue they are overstated given available solutions. We conclude that the benefits of open science practices outweigh the costs for both individual researchers and the collective field in the long run, but that short-term costs may exist for researchers because of the currently dysfunctional academic incentive structure. Our analysis implies that the field's incentive structure needs to change so that researchers' career interests align better with the field's cumulative progress. We delineate recent proposals aimed at such incentive structure re-alignment.
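
The arithmetic behind this argument can be made concrete. Below is a minimal sketch (our illustration, not code from the paper) of how the Type I error rate and statistical power jointly determine the probability that a "significant" result reflects a true hypothesis; the .35 and .05 error rates echo the figures cited above, while the prior probability of a true hypothesis (0.25) and the low-power value (0.35) are assumed, illustrative numbers.

```python
# Positive predictive value of a significant finding, by Bayes' rule.
# All parameter values below are illustrative assumptions.

def positive_predictive_value(alpha, power, prior_true):
    """P(hypothesis is true | test is significant)."""
    true_positives = power * prior_true
    false_positives = alpha * (1 - prior_true)
    return true_positives / (true_positives + false_positives)

# Inflated error rates vs. the practices open science proponents recommend:
print(positive_predictive_value(alpha=0.35, power=0.35, prior_true=0.25))  # 0.25
print(positive_predictive_value(alpha=0.05, power=0.80, prior_true=0.25))  # ~0.84
```

Under these assumptions, lowering the Type I error rate while raising power more than triples the fraction of significant results that are true discoveries, which is the trade-off the abstract describes.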

2021 ◽  
Author(s):  
Haocheng Ding ◽  
Lingsong Meng ◽  
Andrew C. Liu ◽  
Michelle L. Gumz ◽  
Andrew J. Bryant ◽  
...  

Circadian rhythmicity in transcriptomic profiles has been shown in many physiological processes, and the disruption of circadian patterns has been found to be associated with several diseases. In this paper, we developed a series of likelihood-based methods to detect (i) circadian rhythmicity (denoted LR rhythmicity) and (ii) differential circadian patterns comparing two experimental conditions (denoted LR diff). In terms of circadian rhythmicity detection, we demonstrated that our proposed LR rhythmicity better controls the type I error rate compared to existing methods under a wide variety of simulation settings. In terms of differential circadian patterns, we developed methods for detecting differential amplitude, differential phase, differential basal level, and differential fit, all of which also successfully controlled the type I error rate. In addition, we demonstrated that the proposed LR diff achieves higher statistical power in detecting differential fit compared to existing methods. The superior performance of LR rhythmicity and LR diff was demonstrated in two real data applications: a brain aging dataset (gene expression microarray data from human postmortem brain) and a time-restricted feeding dataset (RNA sequencing data from human skeletal muscle). An R package implementing our methods is publicly available on GitHub: https://github.com/diffCircadian/diffCircadian.
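
To illustrate the general idea of likelihood-based rhythmicity detection, here is a simplified likelihood-ratio test under a cosinor (sinusoidal) model with a fixed 24 h period. This is a minimal sketch with simulated data, not the LR rhythmicity implementation from the diffCircadian package.

```python
# Likelihood-ratio test: sinusoid-plus-intercept vs. intercept-only model,
# assuming Gaussian errors and a known 24 h period.
import numpy as np
from scipy import stats

def lr_rhythmicity(t, y, period=24.0):
    n = len(y)
    w = 2 * np.pi / period
    # Full model: y ~ b0 + b1*cos(w t) + b2*sin(w t), fit by least squares.
    X = np.column_stack([np.ones(n), np.cos(w * t), np.sin(w * t)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss_full = np.sum((y - X @ beta) ** 2)
    rss_null = np.sum((y - y.mean()) ** 2)     # flat (non-rhythmic) model
    lr_stat = n * np.log(rss_null / rss_full)  # -2 log-likelihood ratio
    return lr_stat, stats.chi2.sf(lr_stat, df=2)  # df = 2 added parameters

# Simulated example: 24 time points over 48 h with a true 24 h rhythm.
rng = np.random.default_rng(0)
t = np.arange(0, 48, 2.0)
y = 5 + 2 * np.cos(2 * np.pi * (t - 8) / 24) + rng.normal(0, 1, len(t))
print(lr_rhythmicity(t, y))
```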


2019 ◽  
Author(s):  
Axel Mayer ◽  
Felix Thoemmes

The analysis of variance (ANOVA) is still one of the most widely used statistical methods in the social sciences. This paper is about stochastic group weights in ANOVA models, a neglected aspect in the literature. Stochastic group weights are present whenever the experimenter does not determine the exact group sizes before conducting the experiment. We show that classic ANOVA tests based on estimated marginal means can have an inflated type I error rate when stochastic group weights are not taken into account, even in randomized experiments. We propose two new ways to incorporate stochastic group weights in tests of average effects: one based on the general linear model and one based on multigroup structural equation models (SEMs). We show in simulation studies that our methods have nominal type I error rates in experiments with stochastic group weights, while classic approaches show an inflated type I error rate. The SEM approach can additionally handle heteroscedastic residual variances and latent variables. An easy-to-use software package with a graphical user interface is provided.
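
The mechanism can be seen in a toy simulation (my own construction with illustrative parameters, not the authors' simulation design): group membership is stochastic, the conditional treatment effects cancel so the true average effect is zero, and a test that treats the estimated group weights as fixed underestimates the variance of the estimated average effect, inflating the rejection rate above the nominal level.

```python
# Toy demonstration: ignoring the sampling variability of stochastic group
# weights understates the variance of the estimated average effect.
import numpy as np

rng = np.random.default_rng(7)

def one_replication(n=200, d=2.0):
    k = rng.integers(0, 2, n)            # stochastic group membership
    x = rng.integers(0, 2, n)            # randomized treatment
    delta = np.where(k == 1, d, -d)      # conditional effects +d / -d: the
    y = delta * x + rng.normal(0.0, 1.0, n)  # true average effect is zero
    est, var_naive = 0.0, 0.0
    for g in (0, 1):
        m = k == g
        w = m.mean()                     # estimated group weight
        y1, y0 = y[m & (x == 1)], y[m & (x == 0)]
        eff = y1.mean() - y0.mean()      # conditional effect estimate
        se2 = y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0)
        est += w * eff
        var_naive += w**2 * se2          # treats w as a fixed constant
    return abs(est) / np.sqrt(var_naive) > 1.96

rate = np.mean([one_replication() for _ in range(2000)])
print(f"empirical Type I error rate: {rate:.3f} (nominal 0.05)")
```

Because the naive variance omits the term contributed by the variability of the weights, the empirical rejection rate here comes out well above 5%.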


2021 ◽  
Author(s):  
Haiyang Jin

Analysis of variance (ANOVA) is one of the most popular statistical methods employed for data analysis in psychology and other fields. Nevertheless, ANOVA is frequently used as an exploratory approach, even in confirmatory studies with explicit hypotheses. Such misapplication may invalidate ANOVA conventions, reduce statistical power, and even threaten the validity of conclusions. This paper evaluates the appropriateness of ANOVA conventions, discusses motivations that researchers may have misunderstood, and provides practical suggestions. Moreover, this paper proposes controlling the Type I error rate with the Hypothesis-based Type I Error Rate, which considers both the number of tests and their logical relationships in rejecting the null hypothesis. Furthermore, this paper introduces simple interaction analysis, which uses the most straightforward interaction to test a hypothesis of interest. Finally, pre-registration is recommended to provide clarity for the selection of appropriate ANOVA tests in both confirmatory and exploratory studies.
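
A small numeric sketch (illustrative assumptions, not the paper's exact formulation) shows why both the number of tests and their logical relationship matter for the hypothesis-level error rate.

```python
# Hypothesis-level Type I error under two logical relationships among
# k independent tests run at the same per-test alpha.
alpha, k = 0.05, 3

# If rejecting ANY of the k tests counts as support for the hypothesis
# (logical OR), the hypothesis-level Type I error rate inflates:
print(1 - (1 - alpha) ** k)   # ~0.143

# If ALL k tests must reject (logical AND, e.g., a predicted pattern of
# effects), the hypothesis-level Type I error rate deflates:
print(alpha ** k)             # ~0.000125
```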


2018 ◽  
Author(s):  
David Zelený

Questions: The community weighted mean (CWM) approach analyses the relationship of species attributes (such as traits or Ellenberg-type indicator values) to sample attributes (environmental variables). It has recently been shown to suffer from an inflated Type I error rate when tested by a standard parametric or (row-based) permutation test. The results of many published studies are therefore likely affected, reporting overly optimistic relationships that are in fact merely numerical artefacts. Can we evaluate which studies are likely to be affected, and by how much?
Methods: I suggest classifying the hypotheses commonly tested by the CWM approach into three categories, which differ in the assumption they make about the link of species composition to either species attributes or sample attributes. I used a set of simulated datasets and one simple real dataset to show how the inflation of the Type I error rate is influenced by data characteristics.
Results: For hypotheses assuming a link of species composition to species attributes, the CWM approach with the standard test returns a correct Type I error rate. For the other two categories (assuming a link of species composition to sample attributes, or assuming no link at all), however, it returns an inflated Type I error rate and requires alternative tests to control it (the column-based test and the max test, respectively). The inflation index is negatively related to the beta diversity of species composition and positively related to the strength of the relationship between species composition and sample attributes and to the number of samples in the dataset. The inflation index is also influenced by modifications of the species composition matrix (transformation or removal of species). The relationship of CWM with intrinsic species attributes is a case of spurious correlation and can be tested by the column-based (modified) permutation test.
Conclusions: The concept of three hypothesis categories offers a simple tool to evaluate whether a given study reports a correct or an inflated Type I error rate, and how inflated that rate can be.
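
The following is a minimal sketch of a column-based permutation test for a CWM-environment correlation: species attributes are permuted among species (columns) rather than permuting sample attributes among samples (rows). The data and names here are toy assumptions, not the datasets analysed in the paper.

```python
# Column-based permutation test for the correlation between community
# weighted mean (CWM) of a species attribute and an environmental variable.
import numpy as np

rng = np.random.default_rng(42)

def cwm(L, traits):
    """CWM per sample; L rows = samples, columns = species abundances."""
    return (L * traits).sum(axis=1) / L.sum(axis=1)

def column_permutation_test(L, traits, env, n_perm=999):
    obs = np.corrcoef(cwm(L, traits), env)[0, 1]
    # Null distribution: permute the attribute values among species.
    perm = np.array([np.corrcoef(cwm(L, rng.permutation(traits)), env)[0, 1]
                     for _ in range(n_perm)])
    # Permutation p-value, counting the observed value in the null set.
    p = (np.sum(np.abs(perm) >= abs(obs)) + 1) / (n_perm + 1)
    return obs, p

# Toy data: 30 samples x 50 species, random composition and attributes.
L = rng.poisson(2.0, size=(30, 50))
traits = rng.normal(size=50)
env = rng.normal(size=30)
print(column_permutation_test(L, traits, env))
```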


2019 ◽  
Author(s):  
Varun Saravanan ◽  
Gordon J. Berman ◽  
Samuel J. Sober

A common feature of many neuroscience datasets is a hierarchical data structure, most commonly arising from recording the activity of multiple neurons in multiple animals across multiple trials. Accordingly, the measurements constituting the dataset are not independent, even though the traditional statistical analyses often applied in such cases (e.g. Student's t-test) treat them as such. The hierarchical bootstrap has been shown to be an effective tool for accurately analyzing such data, and while it has been used extensively in the statistical literature, its use is not widespread in neuroscience, despite the ubiquity of hierarchical datasets. In this paper, we illustrate the intuitiveness and utility of this approach for analyzing hierarchically nested datasets. We use simulated neural data to show that traditional statistical tests can yield a false positive rate of over 45%, even when the nominal Type I error rate is set at 5%. While summarizing data across non-independent points (or lower levels) can potentially fix this problem, that approach greatly reduces the statistical power of the analysis. The hierarchical bootstrap, when applied sequentially over the levels of the hierarchical structure, keeps the Type I error rate within the intended bound and retains more statistical power than summarizing methods. We conclude by demonstrating the effectiveness of the method in two real-world examples: first analyzing singing data in male Bengalese finches (Lonchura striata var. domestica), and second quantifying changes in behavior under optogenetic control in flies (Drosophila melanogaster).
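
To show the resampling scheme concretely, here is a minimal two-level hierarchical bootstrap (resample animals with replacement, then resample trials within each resampled animal). The structure and toy data are illustrative assumptions; the paper's examples use deeper hierarchies.

```python
# Two-level hierarchical bootstrap of a grand mean.
import numpy as np

rng = np.random.default_rng(0)

def hierarchical_bootstrap_mean(data, n_boot=1000):
    """data: list of 1-D arrays, one per animal (trials within animal)."""
    boot_means = np.empty(n_boot)
    n_animals = len(data)
    for b in range(n_boot):
        animals = rng.integers(0, n_animals, n_animals)    # resample level 1
        resampled = [rng.choice(data[a], size=len(data[a]), replace=True)
                     for a in animals]                      # resample level 2
        boot_means[b] = np.mean(np.concatenate(resampled))
    return boot_means

# Toy data: 5 animals with animal-level offsets and 50 trials each.
data = [rng.normal(rng.normal(0, 1), 1, size=50) for _ in range(5)]
boots = hierarchical_bootstrap_mean(data)
print(np.mean(boots), np.percentile(boots, [2.5, 97.5]))
```

Resampling at the animal level first is what lets the interval reflect between-animal variability instead of treating all trials as independent.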


2004 ◽  
Vol 94 (3_suppl) ◽  
pp. 1419-1420 ◽  
Author(s):  
Markus Neuhäuser

It is shown that the procedure of applying the Wilcoxon test after Levene's transformation can have an inflated Type I error rate when distributions are skewed. Thus, when the data may come from an asymmetric distribution, the Wilcoxon test should not be applied as a test for homogeneity of variances after Levene's transformation.
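
The reported problem can be checked by simulation. The sketch below (my construction, with illustrative distribution and sample sizes) applies Levene's transformation (absolute deviations from the group mean) and then the Wilcoxon rank-sum test to two samples drawn from the same skewed distribution, so the null hypothesis of equal variances holds and the rejection rate estimates the empirical size of the procedure.

```python
# Empirical size of "Levene's transformation + Wilcoxon test" under the null
# of equal variances, with skewed (exponential) data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def levene_then_wilcoxon(x, y):
    dx = np.abs(x - x.mean())          # Levene's transformation, group 1
    dy = np.abs(y - y.mean())          # Levene's transformation, group 2
    return stats.mannwhitneyu(dx, dy, alternative="two-sided").pvalue

n_sim, rejections = 2000, 0
for _ in range(n_sim):
    x = rng.exponential(scale=1.0, size=25)   # skewed, variance 1
    y = rng.exponential(scale=1.0, size=25)   # same distribution
    rejections += levene_then_wilcoxon(x, y) < 0.05
# Empirical size; per the result above, it can exceed the nominal 0.05
# for skewed distributions.
print(rejections / n_sim)
```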


2014 ◽  
Vol 53 (05) ◽  
pp. 343-343

We have to report marginal changes in the empirical type I error rates for the cut-offs 2/3 and 4/7 in Tables 4, 5 and 6 of the paper “Influence of Selection Bias on the Test Decision – A Simulation Study” by M. Tamm, E. Cramer, L. N. Kennes, N. Heussen (Methods Inf Med 2012; 51: 138–143). In a small number of cases, the internal representation of numeric values in SAS resulted in incorrect categorization, owing to a numeric representation error in the computed differences. We corrected the simulation by using the ROUND function of SAS in the calculation process, with the same seeds as before. For Table 4, the value for the cut-off 2/3 changes from 0.180323 to 0.153494. For Table 5, the value for the cut-off 4/7 changes from 0.144729 to 0.139626, and the value for the cut-off 2/3 changes from 0.114885 to 0.101773. For Table 6, the value for the cut-off 4/7 changes from 0.125528 to 0.122144, and the value for the cut-off 2/3 changes from 0.099488 to 0.090828. The sentence on p. 141 “E.g. for block size 4 and q = 2/3 the type I error rate is 18% (Table 4).” has to be replaced by “E.g. for block size 4 and q = 2/3 the type I error rate is 15.3% (Table 4).”. All changes are minor (smaller than 0.03) and do not affect the interpretation of the results or our recommendations.
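
The following is a small Python illustration (not the SAS code from the correction) of the kind of numeric representation error described: a difference that is mathematically equal to a cut-off can land on the wrong side of it in floating-point arithmetic, which rounding before categorization avoids.

```python
# A computed difference that is mathematically exactly 2/3 compares as
# strictly greater than the cut-off 2/3 in double-precision arithmetic.
cutoff = 2 / 3
diff = 1 - 1 / 3                 # mathematically exactly 2/3

print(diff > cutoff)             # True: 0.6666666666666667 > 0.6666666666666666
print(round(diff, 12) > round(cutoff, 12))   # False once both are rounded
```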


2003 ◽  
Vol 22 (5) ◽  
pp. 665-675 ◽  
Author(s):  
Weichung J. Shih ◽  
Peter Ouyang ◽  
Hui Quan ◽  
Yong Lin ◽  
Bart Michiels ◽  
...  
