Degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking

Author(s):  
Jelte M. Wicherts ◽  
Coosje Lisabet Sterre Veldkamp ◽  
Hilde Augusteijn ◽  
Marjan Bakker ◽  
Robbie Cornelis Maria van Aert ◽  
...  

Designing, collecting, analyzing, and reporting psychological studies entail many choices that are often arbitrary. The opportunistic use of these so-called researcher degrees of freedom aimed at obtaining statistically significant results is problematic because it increases the chance of false positive results and may inflate effect size estimates. In this review article, we present an extensive list of 34 degrees of freedom that researchers have in formulating hypotheses, and in designing, running, analyzing, and reporting psychological research. The list can be used in research methods education, as a checklist to assess the quality of preregistrations, and to determine the potential for bias due to (arbitrary) choices in unregistered studies.

2012 ◽  
Vol 7 (6) ◽  
pp. 562-571 ◽  
Author(s):  
Roger Giner-Sorolla

The current crisis in psychological research involves issues of fraud, replication, publication bias, and false positive results. I argue that this crisis follows the failure of widely adopted solutions to psychology’s similar crisis of the 1970s. The untouched root cause is an information-economic one: Too many studies divided by too few publication outlets equals a bottleneck. Articles cannot pass through just by showing theoretical meaning and methodological rigor; their results must appear to support the hypothesis perfectly. Consequently, psychologists must master the art of presenting perfect-looking results just to survive in the profession. This favors aesthetic criteria of presentation in a way that harms science’s search for truth. Shallow standards of statistical perfection distort analyses and undermine the accuracy of cumulative data; narrative expectations encourage dishonesty about the relationship between results and hypotheses; criteria of novelty suppress replication attempts. Concerns about truth in research are emerging in other sciences and may eventually descend on our heads in the form of difficult and insensitive regulations. I suggest a more palatable solution: to open the bottleneck, putting structures in place to reward broader forms of information sharing beyond the exquisite art of present-day journal publication.


2021 ◽  
Vol 44 ◽  
Author(s):  
Robert M. Ross ◽  
Robbie C. M. van Aert ◽  
Olmo R. van den Akker ◽  
Michiel van Elk

Lee and Schwarz interpret meta-analytic research and replication studies as providing evidence for the robustness of cleansing effects. We argue that the currently available evidence is unconvincing because (a) publication bias and the opportunistic use of researcher degrees of freedom appear to have inflated meta-analytic effect size estimates, and (b) preregistered replications failed to find any evidence of cleansing effects.


2018 ◽  
Author(s):  
Marjan Bakker ◽  
Coosje Lisabet Sterre Veldkamp ◽  
Marcel A. L. M. van Assen ◽  
Elise Anne Victoire Crompvoets ◽  
How Hwee Ong ◽  
...  

Researchers face many, often seemingly arbitrary choices in formulating hypotheses, designing protocols, collecting data, analyzing data, and reporting results. Opportunistic use of ‘researcher degrees of freedom’ aimed at obtaining statistical significance increases the likelihood of obtaining and publishing false positive results and overestimated effect sizes. Preregistration is a mechanism for reducing such degrees of freedom by specifying designs and analysis plans before observing the research outcomes. The effectiveness of preregistration may depend, in part, on whether the process facilitates sufficiently specific articulation of such plans. In this preregistered study, we compared two formats of preregistration available on the OSF: Standard Pre-Data Collection Registration and Prereg Challenge registration (now called “OSF Preregistration”, http://osf.io/prereg/). The Prereg Challenge format was a structured workflow with detailed instructions, and an independent review to confirm completeness; the “Standard” format was unstructured with minimal direct guidance to give researchers flexibility for what to pre-specify. Results of comparing random samples of 53 preregistrations from each format indicate that the structured format restricted the opportunistic use of researcher degrees of freedom better (Cliff’s Delta = 0.49) than the unstructured format, but neither eliminated all researcher degrees of freedom. We also observed very low concordance among coders about the number of hypotheses (14%), indicating that they are often not clearly stated. We conclude that effective preregistration is challenging, and registration formats that provide effective guidance may improve the quality of research.
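The Cliff's Delta reported here is a nonparametric effect size based on pairwise comparisons between two groups: the probability that a value from one group exceeds a value from the other, minus the reverse. A minimal sketch of the statistic, using made-up counts of remaining researcher degrees of freedom rather than the study's data:

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y).

    Ranges from -1 to 1; 0 indicates complete overlap between groups.
    """
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))

# Hypothetical per-preregistration counts of remaining degrees of freedom
unstructured = [4, 5, 6, 6, 7]
structured = [2, 3, 3, 4, 5]
print(cliffs_delta(unstructured, structured))  # → 0.84
```

A delta of 0.49, as found in the study, means that a randomly chosen unstructured preregistration left notably more degrees of freedom open than a randomly chosen structured one, well short of the complete separation a delta of 1 would indicate.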


2008 ◽  
Vol 106 (2) ◽  
pp. 645-649 ◽  
Author(s):  
Andrew Brand ◽  
Michael T. Bradley ◽  
Lisa A. Best ◽  
George Stoica

2021 ◽  
Author(s):  
Anton Olsson-Collentine ◽  
Robbie Cornelis Maria van Aert ◽  
Marjan Bakker ◽  
Jelte M. Wicherts

There are arbitrary decisions to be made (i.e., researcher degrees of freedom) in the execution and reporting of most research. These decisions allow for many possible outcomes from a single study. Selective reporting of results from this 'multiverse' of outcomes, whether intentional (p-hacking) or not, can lead to inflated effect size estimates and false positive results in the literature. In this study, we examine and illustrate the consequences of researcher degrees of freedom in primary research, both for primary outcomes and for subsequent meta-analyses. We used a set of 10 preregistered multi-lab direct replication projects from psychology (Registered Replication Reports) with a total of 14 primary outcome variables, 236 labs and 37,602 participants. By exploiting researcher degrees of freedom in each project, we were able to compute between 3,840 and 2,621,440 outcomes per lab. We show that researcher degrees of freedom in primary research can cause substantial variability in effect size that we denote the Underlying Multiverse Variability (UMV). In our data, the median UMV across labs was 0.1 standard deviations (interquartile range = 0.09–0.15). In one extreme case, the effect size estimate could change by d = 1.27, evidence that p-hacking in some (rare) cases can provide support for almost any conclusion. We also show that researcher degrees of freedom in primary research provide another source of uncertainty in meta-analysis beyond those usually estimated. This would not be a large concern for meta-analysis if researchers made all arbitrary decisions at random. However, emulating selective reporting of lab results led to inflation of meta-analytic average effect size estimates in our data by as much as 0.10–0.48 standard deviations, depending to a large degree on the number of possible outcomes at the lab level (i.e., multiverse size). Our results illustrate the importance of making research decisions transparent (e.g., through preregistration and multiverse analysis), evaluating studies for selective reporting, and whenever feasible making raw data available.
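The selective-reporting mechanism described above can be illustrated with a toy simulation (all parameters are assumed for illustration and are not taken from the study): each lab derives many slightly different outcomes from the same null data, and reporting the largest one inflates the average effect across labs even though the true effect is zero.

```python
import random
import statistics

random.seed(1)

def lab_effect(n=50, multiverse=20):
    """One lab studying a true effect of zero.

    Each of `multiverse` analysis paths perturbs the same base estimate
    slightly. Honest reporting picks one path at random; selective
    reporting picks the path with the largest estimate.
    """
    base = statistics.mean(random.gauss(0, 1) for _ in range(n))
    paths = [base + random.gauss(0, 0.05) for _ in range(multiverse)]
    return random.choice(paths), max(paths)

labs = [lab_effect() for _ in range(200)]
honest = statistics.mean(h for h, _ in labs)
selective = statistics.mean(s for _, s in labs)
print(round(honest, 3), round(selective, 3))
```

The honest average stays near zero, while the selectively reported average is systematically positive; the gap grows with the number of analysis paths, mirroring the dependence on multiverse size reported above.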



2016 ◽  
Vol 12 (3) ◽  
pp. 594-596 ◽  
Author(s):  
Michael J. Rovito

The debate over whether testicular self-examination (TSE) should be promoted among males generally centers on a harm–benefit trade-off. The benefits of TSE include improving health outcomes, inclusive of an increase in both quality of life and knowledge/awareness of potential health concerns, as well as promoting proactivity in achieving wellness. The harms include claims that false-positive results can increase anxiety and produce costs via unnecessary treatments and therapies. Further claims point to the lack of evidence suggesting TSE decreases testicular cancer mortality. This commentary primarily discusses the anxiety portion of this debate from a logic-based perspective. The argument that TSE should not be promoted among males due to the risk of inciting false-positive anxiety appears to be flawed. A 5-point perspective is presented on the illogical discouragement of TSE due to theorized levels of false-positive anxiety while existing evidence suggests late-stage testicular cancer is associated with anxiety and depression.


2019 ◽  
Author(s):  
Tyson S. Barrett ◽  
Ginger Lockhart ◽  
Rick Anthony Cruz

Mediation analysis is a widely used technique within the psychological sciences and has been shown to be an effective tool to evaluate explanatory pathways between predictors and outcomes. Multiple effect size metrics have been developed; however, mediation analysis has been slow to develop accessible, interpretable effect size metrics in the cases of categorical (or otherwise non-normally distributed) mediators and/or outcomes. Herein, we propose the use of average marginal effects within mediation analysis to alleviate these issues—termed Marginal Mediation Analysis. The method provides interpretable indirect and direct effect size estimates in the same units as the outcome even when mediators and/or outcomes are categorical, a count measure, or another non-normal distribution. The approach is shown to fit the causal definitions of mediation analysis. We further present results of Monte Carlo simulations that show the utility of the proposed method in psychological research. We also discuss the assumptions inherent in the approach. We conclude by showing an application of it to adolescent health-risk behavior data (n = 13,600), demonstrating the increased interpretability and information provided compared to other common approaches.
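In the fully linear, continuous case the average marginal effect of each path reduces to its regression coefficient, so the indirect effect is the familiar product of coefficients. A simulated sketch of that baseline case (illustrative only, with made-up coefficients; the paper's contribution is extending marginal effects to categorical, count, and other non-normal mediators and outcomes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: X -> M -> Y plus a direct X -> Y path
n = 5000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)             # a-path: 0.5
y = 0.4 * m + 0.2 * x + rng.normal(size=n)   # b-path: 0.4, direct: 0.2

def ols_slopes(predictors, target):
    """Least-squares slope estimates for target ~ predictors (with intercept)."""
    design = np.column_stack([np.ones(len(target))] + predictors)
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta[1:]  # drop the intercept

(a,) = ols_slopes([x], m)           # effect of X on M
b, direct = ols_slopes([m, x], y)   # effect of M on Y, and the direct effect
indirect = a * b                    # ≈ 0.5 * 0.4 = 0.2, in Y's units
print(round(indirect, 2), round(direct, 2))
```

Both the indirect and direct estimates come out in the outcome's units, which is the interpretability property the average-marginal-effects formulation preserves when the mediator or outcome is no longer continuous and normal.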

