The role of Bayes factors in testing interactions
Psychologists are often interested in whether an experimental manipulation has a different effect in condition A than in condition B. To answer such a question, one needs to compare the conditions directly (i.e., to test the interaction). Yet many researchers stop once they find a significant test in one condition and a non-significant test in the other, and deem this sufficient evidence for a difference between the two conditions. This tutorial aims to raise awareness of this inferential mistake when Bayes factors are used with conventional cut-offs to draw conclusions. For instance, some might falsely conclude that there must be sufficient evidence for the interaction if they find sufficient Bayesian evidence for H1 in condition A and sufficient Bayesian evidence for H0 in condition B. The case study we introduce highlights that ignoring the test of the interaction can lead to unjustified conclusions, and demonstrates that the principle that any assertion about the existence of an interaction necessitates a direct comparison of the conditions holds for Bayesian just as it does for frequentist statistics. We provide an R script for the analyses of the case study and a Shiny app that can be used with a 2x2 design to develop intuitions about the issue, and we introduce a rule of thumb for estimating the sample size needed for a well-powered design.
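The fallacy described above can be made concrete with a simulation. The sketch below is not the paper's R script; it is a minimal, hypothetical Python illustration in which the manipulation has the same true effect in both conditions (so the interaction is exactly zero), yet applying a conventional cut-off (BF > 3) separately per condition still regularly produces "evidence for H1" in condition A alongside "evidence for H0" in condition B. Bayes factors are approximated here with the BIC (unit-information) method, BF01 ≈ sqrt(N) · (1 + t²/df)^(−N/2); the sample sizes, effect size, and seed are arbitrary choices for illustration.

```python
import math
import random

random.seed(1)


def two_sample_t(x, y):
    """Classical two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    sp = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / (sp * math.sqrt(1 / nx + 1 / ny))


def bf10(x, y):
    """Approximate BF10 via the BIC / unit-information method."""
    t = two_sample_t(x, y)
    n = len(x) + len(y)
    df = n - 2
    bf01 = math.sqrt(n) * (1 + t * t / df) ** (-n / 2)
    return 1 / bf01


n_per_cell = 30   # participants per cell of a 2x2 design (illustrative)
true_d = 0.3      # identical true effect of the manipulation in BOTH conditions
n_sims = 2000
pattern = 0       # runs with BF10 > 3 in A and BF01 > 3 in B, despite no interaction

for _ in range(n_sims):
    # Condition A: control vs. manipulation, true effect d = 0.3
    a_ctrl = [random.gauss(0, 1) for _ in range(n_per_cell)]
    a_man = [random.gauss(true_d, 1) for _ in range(n_per_cell)]
    # Condition B: the SAME true effect, so the interaction is zero
    b_ctrl = [random.gauss(0, 1) for _ in range(n_per_cell)]
    b_man = [random.gauss(true_d, 1) for _ in range(n_per_cell)]

    if bf10(a_man, a_ctrl) > 3 and 1 / bf10(b_man, b_ctrl) > 3:
        pattern += 1

rate = pattern / n_sims
print(f"'Evidence for H1 in A, H0 in B' despite zero interaction: {rate:.1%}")
```

The point of the sketch is only qualitative: the misleading pattern occurs at a non-negligible rate even though the correct conclusion, reachable only by testing the interaction directly, is that the conditions do not differ.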