Advantages Masquerading as ‘Issues’ in Bayesian Hypothesis Testing: A Commentary on Tendeiro and Kiers (2019)

Author(s):  
Don van Ravenzwaaij ◽  
Eric-Jan Wagenmakers

Tendeiro and Kiers (2019) provide a detailed and scholarly critique of Null Hypothesis Bayesian Testing (NHBT) and its central component, the Bayes factor, which allows researchers to update knowledge and quantify statistical evidence. Tendeiro and Kiers conclude that NHBT constitutes an improvement over frequentist p-values, but they primarily elaborate on a list of eleven ‘issues’ with NHBT. In this commentary, we provide context for each issue and conclude that many of them may in fact be conceived of as pronounced advantages of NHBT.

2019 ◽  
Author(s):  
Jorge Tendeiro ◽  
Henk Kiers ◽  
Don van Ravenzwaaij

Description: The practice of sequentially testing a null hypothesis as data are collected, until the null hypothesis is rejected, is known as optional stopping. It is well known that optional stopping is problematic in the context of null hypothesis significance testing: the false positive rate quickly exceeds the single test’s significance level. However, the state of affairs under null hypothesis Bayesian testing, where p-values are replaced by Bayes factors, is, perhaps surprisingly, much less settled. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to the idea of sampling from prior predictive distributions. In this paper we provide formal mathematical derivations for Rouder’s approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, which is regarded as a random variable across repeated sampling. This paper therefore offers a solid mathematical footing for the literature, and we believe it is a valid contribution toward understanding the practice of optional stopping in the context of Bayesian hypothesis testing.
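The NHST half of this contrast is easy to verify by simulation. The sketch below is our illustration, not Rouder’s actual setup: the sample sizes, number of interim looks, and seed are arbitrary choices. It draws data from a true null, reruns a one-sample t-test after each batch, and stops at the first p < .05, which inflates the false positive rate well beyond the nominal 5%:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def optional_stopping_fp_rate(n_sims=2000, n_min=10, n_max=100, step=10, alpha=0.05):
    """Estimate the false positive rate when a one-sample t-test is
    rerun as data accumulate, stopping at the first p < alpha."""
    false_positives = 0
    for _ in range(n_sims):
        data = rng.normal(0.0, 1.0, n_max)  # the null is true: mean = 0
        for n in range(n_min, n_max + 1, step):
            p = stats.ttest_1samp(data[:n], 0.0).pvalue
            if p < alpha:  # stop at the first 'significant' look
                false_positives += 1
                break
    return false_positives / n_sims

rate = optional_stopping_fp_rate()
print(rate)  # well above the nominal 0.05
```

With ten interim looks the rejection rate is no longer the single test’s alpha; setting `n_min = n_max` (a single look) recovers the nominal rate.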


2021 ◽  
Vol 3 (1) ◽  
pp. 10
Author(s):  
Riko Kelter

The Full Bayesian Significance Test (FBST) has been proposed as a convenient method to replace frequentist p-values for testing a precise hypothesis. Although the FBST enjoys various appealing properties, the purpose of this paper is to investigate two aspects of the FBST which are sometimes perceived as measure-theoretic inconsistencies of the procedure and which have not been discussed rigorously in the literature. First, the FBST uses the posterior density as a reference for judging the Bayesian statistical evidence against a precise hypothesis. However, under absolutely continuous prior distributions, the posterior density is defined only up to Lebesgue null sets, which renders the reference criterion arbitrary. Second, the FBST statistical evidence seems to have no valid prior probability. It is shown that the former aspect can be circumvented by fixing a version of the posterior density before applying the FBST, and that the latter aspect is a consequence of the FBST’s measure-theoretic premises rather than an inconsistency. An illustrative example demonstrates the two aspects and their resolution. Together, the results in this paper show that neither of the two aspects sometimes perceived as measure-theoretic inconsistencies of the FBST is tenable as a criticism. The FBST thus provides a measure-theoretically coherent Bayesian alternative for testing a precise hypothesis.


Author(s):  
Alexander Ly ◽  
Eric-Jan Wagenmakers

Abstract: The “Full Bayesian Significance Test e-value”, henceforth FBST ev, has received increasing attention across a range of disciplines including psychology. We show that the FBST ev leads to four problems: (1) the FBST ev cannot quantify evidence in favor of a null hypothesis and therefore also cannot discriminate “evidence of absence” from “absence of evidence”; (2) the FBST ev is susceptible to sampling to a foregone conclusion; (3) the FBST ev violates the principle of predictive irrelevance, such that it is affected by data that are equally likely to occur under the null hypothesis and the alternative hypothesis; (4) the FBST ev suffers from the Jeffreys-Lindley paradox in that it does not include a correction for selection. These problems also plague the frequentist p-value. We conclude that although the FBST ev may be an improvement over the p-value, it does not provide a reasonable measure of evidence against the null hypothesis.


2018 ◽  
Vol 1 (2) ◽  
pp. 281-295 ◽  
Author(s):  
Alexander Etz ◽  
Julia M. Haaf ◽  
Jeffrey N. Rouder ◽  
Joachim Vandekerckhove

Hypothesis testing is a special form of model selection. Once a pair of competing models is fully defined, their definition immediately leads to a measure of how strongly the data support each model. The ratio of their support is often called the likelihood ratio or the Bayes factor. Critical in the model-selection endeavor is the specification of the models. In the case of hypothesis testing, it is of the greatest importance that the researcher specify exactly what is meant by a “null” hypothesis as well as the alternative to which it is contrasted, and that both are suitable instantiations of theoretical positions. Here, we provide an overview of different instantiations of null and alternative hypotheses that can be useful in practice; in all cases the inferential procedure is based on the same underlying method of likelihood comparison. An associated app can be found at https://osf.io/mvp53/. This article is the work of the authors and is reformatted from the original, which was published under a CC-By Attribution 4.0 International license and is available at https://psyarxiv.com/wmf3r/.
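As a concrete, textbook-style instance of such a likelihood comparison (our illustration, not the paper’s app), consider a binomial rate theta with a point null H0: theta = .5 against an alternative H1 that assigns theta a uniform prior. The uniform prior makes the marginal likelihood under H1 available in closed form, so the Bayes factor reduces to a ratio of two numbers:

```python
from math import comb

def binomial_bayes_factor(k, n):
    """Bayes factor BF10 for k successes in n trials:
    H0: theta = 0.5  vs  H1: theta ~ Uniform(0, 1).
    Under the uniform prior, the binomial marginal likelihood
    integrates exactly to 1 / (n + 1)."""
    m1 = 1.0 / (n + 1)          # marginal likelihood under H1
    m0 = comb(n, k) * 0.5 ** n  # likelihood of the data under H0
    return m1 / m0

print(round(binomial_bayes_factor(15, 20), 2))  # ≈ 3.22, modest evidence for H1
```

With 15 successes in 20 trials the data favor the alternative by a factor of about 3; with a balanced 10 of 20, the same ratio drops below 1 and favors the point null.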


2020 ◽  
pp. 109634802094732
Author(s):  
A. George Assaf ◽  
Mike Tsionas

In hospitality and tourism research, p-values remain the most common approach to hypothesis testing. In this article, we elaborate on some of the misconceptions associated with p-values. We discuss the advantages of the Bayesian approach and provide several important practical recommendations and considerations for Bayesian hypothesis testing. Because the main challenge of Bayesian hypothesis testing is the sensitivity of the results to prior distributions, we present several priors that can be used for that purpose and illustrate their performance in a regression context.
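The prior-sensitivity point can be illustrated with a deliberately simple normal-mean test rather than the regression setting of the article; the model, data, and prior scales below are hypothetical. With y_i ~ N(mu, 1), H0: mu = 0, and H1: mu ~ N(0, tau^2), the sample mean is sufficient and both marginal likelihoods are Gaussian, so the Bayes factor is exact, and the same data land on opposite sides of 1 depending on tau:

```python
from math import sqrt, exp, pi

def normal_pdf(x, var):
    """Density of N(0, var) at x."""
    return exp(-x * x / (2 * var)) / sqrt(2 * pi * var)

def bf01(ybar, n, tau):
    """BF01 for H0: mu = 0 vs H1: mu ~ N(0, tau^2), with y_i ~ N(mu, 1).
    Marginally, ybar ~ N(0, 1/n) under H0 and ybar ~ N(0, 1/n + tau^2) under H1."""
    return normal_pdf(ybar, 1 / n) / normal_pdf(ybar, 1 / n + tau ** 2)

# Same data, different prior scales: the Bayes factor flips direction.
for tau in (0.1, 1.0, 10.0):
    print(tau, round(bf01(ybar=0.3, n=50, tau=tau), 3))
```

A narrow prior (tau = 0.1) yields BF01 < 1 (evidence against the null), while a very diffuse prior (tau = 10) yields BF01 well above 1, the Jeffreys-Lindley pattern that motivates careful prior choice.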


2021 ◽  
Author(s):  
Jorge Tendeiro ◽  
Henk Kiers

In 2019 we published a paper (Tendeiro & Kiers, 2019) in Psychological Methods on null hypothesis Bayesian testing and its workhorse, the Bayes factor. Recently, van Ravenzwaaij and Wagenmakers (2021) offered a response to our piece, also in this journal. Although we welcome their contribution and its thought-provoking remarks on our paper, we concluded that there are enough ‘issues’ in van Ravenzwaaij and Wagenmakers (2021) to warrant a rebuttal. In this paper we both defend the main premises of our original paper and put the contribution of van Ravenzwaaij and Wagenmakers (2021) under critical appraisal. Our hope is that this exchange between scholars decisively contributes toward a better understanding among psychologists of null hypothesis Bayesian testing in general and of the Bayes factor in particular.


2016 ◽  
Vol 77 (4) ◽  
pp. 673-689 ◽  
Author(s):  
Rand R. Wilcox ◽  
Sarfaraz Serang

The article provides perspectives on p values, null hypothesis testing, and alternative techniques in light of modern robust statistical methods. Null hypothesis testing and p values can provide useful information provided they are interpreted in a sound manner, which includes taking into account insights and advances that have occurred during the past 50 years. There are, of course, limitations to what null hypothesis testing and p values reveal about data. But modern advances make it clear that there are serious limitations and concerns associated with conventional confidence intervals, standard Bayesian methods, and commonly used measures of effect size. Many of these concerns can be addressed using modern robust methods.

