Advantages Masquerading as ‘Issues’ in Bayesian Hypothesis Testing: A Commentary on Tendeiro and Kiers (2019)

Author(s):  
Don van Ravenzwaaij ◽  
Eric-Jan Wagenmakers

Tendeiro and Kiers (2019) provide a detailed and scholarly critique of Null Hypothesis Bayesian Testing (NHBT) and its central component, the Bayes factor, which allows researchers to update knowledge and quantify statistical evidence. Tendeiro and Kiers conclude that NHBT constitutes an improvement over frequentist p-values, but they primarily elaborate on a list of eleven ‘issues’ with NHBT. In this commentary, we provide context for each issue and conclude that many of them may in fact be conceived of as pronounced advantages of NHBT.

2019 ◽  
Author(s):  
Jorge Tendeiro ◽  
Henk Kiers ◽  
Don van Ravenzwaaij

Description: The practice of sequentially testing a null hypothesis as data are collected, until the null hypothesis is rejected, is known as optional stopping. It is well known that optional stopping is problematic in the context of null hypothesis significance testing: the false positive rate quickly exceeds the single test’s significance level. However, the state of affairs under null hypothesis Bayesian testing, where p-values are replaced by Bayes factors, is, perhaps surprisingly, much less settled. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to the idea of sampling from prior predictive distributions. In this paper we provide formal mathematical derivations for Rouder’s approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, which is regarded as a random variable across repeated sampling. This paper therefore offers a solid mathematical footing for the literature, and we believe it is a valid contribution toward understanding the practice of optional stopping in the context of Bayesian hypothesis testing.
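The NHST half of this contrast is easy to verify by simulation. The sketch below is our illustration, not Rouder’s actual setup: the sample sizes, number of interim looks, and seed are arbitrary choices. It draws data from a true null, reruns a one-sample t-test after each batch, and stops at the first p < .05, which inflates the false positive rate well beyond the nominal 5%:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def optional_stopping_fp_rate(n_sims=2000, n_min=10, n_max=100, step=10, alpha=0.05):
    """Estimate the false positive rate when a one-sample t-test is
    rerun as data accumulate, stopping at the first p < alpha."""
    false_positives = 0
    for _ in range(n_sims):
        data = rng.normal(0.0, 1.0, n_max)  # the null is true: mean = 0
        for n in range(n_min, n_max + 1, step):
            p = stats.ttest_1samp(data[:n], 0.0).pvalue
            if p < alpha:  # stop at the first 'significant' look
                false_positives += 1
                break
    return false_positives / n_sims

rate = optional_stopping_fp_rate()
print(rate)  # well above the nominal 0.05
```

With ten interim looks the rejection rate is no longer the single test’s alpha; setting `n_min = n_max` (a single look) recovers the nominal rate.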


2021 ◽  
Vol 3 (1) ◽  
pp. 10
Author(s):  
Riko Kelter

The Full Bayesian Significance Test (FBST) has been proposed as a convenient method to replace frequentist p-values for testing a precise hypothesis. Although the FBST enjoys various appealing properties, the purpose of this paper is to investigate two aspects of the FBST which are sometimes perceived as measure-theoretic inconsistencies of the procedure and which have not been discussed rigorously in the literature. First, the FBST uses the posterior density as a reference for judging the Bayesian statistical evidence against a precise hypothesis. However, under absolutely continuous prior distributions, the posterior density is defined only up to Lebesgue null sets, which renders the reference criterion arbitrary. Second, the FBST statistical evidence seems to have no valid prior probability. It is shown that the former aspect can be circumvented by fixing a version of the posterior density before applying the FBST, and that the latter aspect is a consequence of the FBST’s measure-theoretic premises rather than an inconsistency. An illustrative example demonstrates the two aspects and their resolution. Together, the results in this paper show that neither of the two aspects sometimes perceived as measure-theoretic inconsistencies of the FBST is tenable as a criticism. The FBST thus provides a measure-theoretically coherent Bayesian alternative for testing a precise hypothesis.


Author(s):  
Alexander Ly ◽  
Eric-Jan Wagenmakers

Abstract: The “Full Bayesian Significance Test e-value”, henceforth FBST ev, has received increasing attention across a range of disciplines including psychology. We show that the FBST ev leads to four problems: (1) the FBST ev cannot quantify evidence in favor of a null hypothesis and therefore also cannot discriminate “evidence of absence” from “absence of evidence”; (2) the FBST ev is susceptible to sampling to a foregone conclusion; (3) the FBST ev violates the principle of predictive irrelevance, such that it is affected by data that are equally likely to occur under the null hypothesis and the alternative hypothesis; (4) the FBST ev suffers from the Jeffreys-Lindley paradox in that it does not include a correction for selection. These problems also plague the frequentist p-value. We conclude that although the FBST ev may be an improvement over the p-value, it does not provide a reasonable measure of evidence against the null hypothesis.


2018 ◽  
Vol 1 (2) ◽  
pp. 281-295 ◽  
Author(s):  
Alexander Etz ◽  
Julia M. Haaf ◽  
Jeffrey N. Rouder ◽  
Joachim Vandekerckhove

Hypothesis testing is a special form of model selection. Once a pair of competing models is fully defined, their definition immediately leads to a measure of how strongly the data support each model. The ratio of their support is often called the likelihood ratio or the Bayes factor. Critical in the model-selection endeavor is the specification of the models. In the case of hypothesis testing, it is of the greatest importance that the researcher specify exactly what is meant by a “null” hypothesis as well as the alternative to which it is contrasted, and that both are suitable instantiations of theoretical positions. Here, we provide an overview of different instantiations of null and alternative hypotheses that can be useful in practice; in all cases the inferential procedure is based on the same underlying method of likelihood comparison. An associated app can be found at https://osf.io/mvp53/. This article is the work of the authors and is reformatted from the original, which was published under a CC-By Attribution 4.0 International license and is available at https://psyarxiv.com/wmf3r/.
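As a concrete, textbook-style instance of such a likelihood comparison (our illustration, not the paper’s app), consider a binomial rate theta with a point null H0: theta = .5 against an alternative H1 that assigns theta a uniform prior. The uniform prior makes the marginal likelihood under H1 available in closed form, so the Bayes factor reduces to a ratio of two numbers:

```python
from math import comb

def binomial_bayes_factor(k, n):
    """Bayes factor BF10 for k successes in n trials:
    H0: theta = 0.5  vs  H1: theta ~ Uniform(0, 1).
    Under the uniform prior, the binomial marginal likelihood
    integrates exactly to 1 / (n + 1)."""
    m1 = 1.0 / (n + 1)          # marginal likelihood under H1
    m0 = comb(n, k) * 0.5 ** n  # likelihood of the data under H0
    return m1 / m0

print(round(binomial_bayes_factor(15, 20), 2))  # ≈ 3.22, modest evidence for H1
```

With 15 successes in 20 trials the data favor the alternative by a factor of about 3; with a balanced 10 of 20, the same ratio drops below 1 and favors the point null.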


2020 ◽  
pp. 109634802094732
Author(s):  
A. George Assaf ◽  
Mike Tsionas

In hospitality and tourism research, p-values remain the most common approach to hypothesis testing. In this article, we elaborate on some of the misconceptions associated with p-values. We discuss the advantages of the Bayesian approach and provide several important practical recommendations and considerations for Bayesian hypothesis testing. Because the main challenge of Bayesian hypothesis testing is the sensitivity of the results to prior distributions, we present several priors that can be used for that purpose and illustrate their performance in a regression context.
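The prior-sensitivity point can be illustrated with a deliberately simple normal-mean test rather than the regression setting of the article; the model, data, and prior scales below are hypothetical. With y_i ~ N(mu, 1), H0: mu = 0, and H1: mu ~ N(0, tau^2), the sample mean is sufficient and both marginal likelihoods are Gaussian, so the Bayes factor is exact, and the same data land on opposite sides of 1 depending on tau:

```python
from math import sqrt, exp, pi

def normal_pdf(x, var):
    """Density of N(0, var) at x."""
    return exp(-x * x / (2 * var)) / sqrt(2 * pi * var)

def bf01(ybar, n, tau):
    """BF01 for H0: mu = 0 vs H1: mu ~ N(0, tau^2), with y_i ~ N(mu, 1).
    Marginally, ybar ~ N(0, 1/n) under H0 and ybar ~ N(0, 1/n + tau^2) under H1."""
    return normal_pdf(ybar, 1 / n) / normal_pdf(ybar, 1 / n + tau ** 2)

# Same data, different prior scales: the Bayes factor flips direction.
for tau in (0.1, 1.0, 10.0):
    print(tau, round(bf01(ybar=0.3, n=50, tau=tau), 3))
```

A narrow prior (tau = 0.1) yields BF01 < 1 (evidence against the null), while a very diffuse prior (tau = 10) yields BF01 well above 1, the Jeffreys-Lindley pattern that motivates careful prior choice.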


2021 ◽  
Author(s):  
Jorge Tendeiro ◽  
Henk Kiers

In 2019 we published a paper (Tendeiro & Kiers, 2019) in Psychological Methods on null hypothesis Bayesian testing and its workhorse, the Bayes factor. Recently, van Ravenzwaaij and Wagenmakers (2021) offered a response to our piece, also in this journal. Although we welcome their contribution and its thought-provoking remarks on our paper, we concluded that there are enough ‘issues’ in van Ravenzwaaij and Wagenmakers (2021) to warrant a rebuttal. In this paper we both defend the main premises of our original paper and put the contribution of van Ravenzwaaij and Wagenmakers (2021) under critical appraisal. Our hope is that this exchange between scholars decisively contributes toward a better understanding among psychologists of null hypothesis Bayesian testing in general and of the Bayes factor in particular.


2016 ◽  
Vol 77 (4) ◽  
pp. 673-689 ◽  
Author(s):  
Rand R. Wilcox ◽  
Sarfaraz Serang

The article provides perspectives on p values, null hypothesis testing, and alternative techniques in light of modern robust statistical methods. Null hypothesis testing and p values can provide useful information provided they are interpreted in a sound manner, which includes taking into account insights and advances that have occurred during the past 50 years. There are, of course, limitations to what null hypothesis testing and p values reveal about data. But modern advances make it clear that there are serious limitations and concerns associated with conventional confidence intervals, standard Bayesian methods, and commonly used measures of effect size. Many of these concerns can be addressed using modern robust methods.

