fbst: An R package for the Full Bayesian Significance Test for testing a sharp null hypothesis against its alternative via the e value

Author(s):  
Riko Kelter

Hypothesis testing is a central statistical method in psychology and the cognitive sciences. However, the problems of null hypothesis significance testing (NHST) and p values have been debated widely, but few attractive alternatives exist. This article introduces the R package fbst, which implements the Full Bayesian Significance Test (FBST) to test a sharp null hypothesis against its alternative via the e value. The statistical theory of the FBST was introduced more than two decades ago, and since then the FBST has been shown to be a Bayesian alternative to NHST and p values with highly appealing theoretical and practical properties. The algorithm provided in the package is applicable to any Bayesian model as long as the posterior distribution can be obtained at least numerically. The core function of the package provides the Bayesian evidence against the null hypothesis, the e value. Additionally, p values based on asymptotic arguments can be computed, and rich visualizations for communication and interpretation of the results can be produced. Three examples of frequently used statistical procedures in the cognitive sciences are given in this paper, which demonstrate how to apply the FBST in practice using the package. Based on the success of the FBST in statistical science, the package should be of interest to a broad range of researchers and hopefully will encourage researchers to consider the FBST as a possible alternative when conducting hypothesis tests of a sharp null hypothesis.
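To make the idea of the e value concrete, the following is a minimal sketch in Python, not the package's R implementation. It assumes a Beta posterior for a single proportion and a flat reference function, and it estimates the evidence against a sharp null theta = theta0 as the posterior mass of the set where the posterior density exceeds its value at theta0. The function names and the example data (14 successes in 20 trials with a uniform prior) are hypothetical.

```python
import math
import random


def beta_log_pdf(theta, a, b):
    """Log density of a Beta(a, b) distribution at theta in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)


def e_value_against_null(a, b, theta0, n_draws=100_000, seed=1):
    """Monte Carlo estimate of the evidence against H0: theta = theta0.

    Assumes a Beta(a, b) posterior with a, b > 1 and a flat reference
    function. The estimate is the share of posterior draws whose density
    is higher than the density at theta0 (the "tangential set").
    """
    rng = random.Random(seed)
    log_dens_null = beta_log_pdf(theta0, a, b)
    above = 0
    for _ in range(n_draws):
        draw = rng.betavariate(a, b)
        # Boundary draws have zero density when a, b > 1, so skip them.
        if 0.0 < draw < 1.0 and beta_log_pdf(draw, a, b) > log_dens_null:
            above += 1
    return above / n_draws


# Hypothetical data: 14 successes in 20 trials, uniform prior -> Beta(15, 7).
evidence_against = e_value_against_null(15, 7, theta0=0.5)
print(f"e value against H0: {evidence_against:.3f}")
```

A value close to 1 means the null value lies in a low-density region of the posterior; a value close to 0 means the null value sits near the posterior mode. Because the estimate only needs posterior draws and density evaluations, the same recipe carries over to posteriors that are available only numerically.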

2017 ◽  
Author(s):  
Matteo Colombo ◽  
Georgi Duev ◽  
Michele B. Nuijten ◽  
Jan Sprenger

Experimental philosophy (x-phi) is a young field of research in the intersection of philosophy and psychology. It aims to make progress on philosophical questions by using experimental methods traditionally associated with the psychological and behavioral sciences, such as null hypothesis significance testing (NHST). Motivated by recent discussions about a methodological crisis in the behavioral sciences, questions have been raised about the methodological standards of x-phi. Here, we focus on one aspect of this question, namely the rate of inconsistencies in statistical reporting. Previous research has examined the extent to which published articles in psychology and other behavioral sciences present statistical inconsistencies in reporting the results of NHST. In this study, we used the R package statcheck to detect statistical inconsistencies in x-phi, and compared rates of inconsistencies in psychology and philosophy. We found that rates of inconsistencies in x-phi are lower than in the psychological and behavioral sciences. From the point of view of statistical reporting consistency, x-phi seems to do no worse, and perhaps even better, than psychological science.


2018 ◽  
Author(s):  
Eike Mark Rinke ◽  
Frank M. Schneider

Across all areas of communication research, the most popular approach to generating insights about communication is the classical significance test (also called null hypothesis significance testing, NHST). The predominance of NHST in communication research is in spite of serious concerns about the ability of researchers to properly interpret its results. We draw on data from a survey of the ICA membership to assess the evidential basis of these concerns. The vast majority of communication researchers misinterpreted NHST (91%) and the most prominent alternative, confidence intervals (96%), while overestimating their competence. Academic seniority and statistical experience did not predict better interpretation outcomes. These findings indicate major problems regarding the generation of knowledge in the field of communication research.


2020 ◽  
Author(s):  
Norbert Hirschauer ◽  
Sven Gruener ◽  
Oliver Mußhoff ◽  
Claudia Becker

It has often been noted that the “null-hypothesis-significance-testing” (NHST) framework is an inconsistent hybrid of Neyman-Pearson’s “hypotheses testing” and Fisher’s “significance testing” approach that almost inevitably causes misinterpretations. To facilitate a realistic assessment of the potential and the limits of statistical inference, we briefly recall widespread inferential errors and outline the two original approaches of these famous statisticians. Based on the understanding of their irreconcilable perspectives, we propose “going back to the roots” and using the initial evidence in the data in terms of the size and the uncertainty of the estimate for the purpose of statistical inference. Finally, we make six propositions that hopefully contribute to improving the quality of inferences in future research.


1998 ◽  
Vol 21 (2) ◽  
pp. 219-220 ◽  
Author(s):  
Henderikus J. Stam ◽  
Grant A. Pasay

We argue that Chow's defense of hypothesis-testing procedures attempts to restore an aura of objectivity to the core procedures, allowing these to take on the role of judgment that should be reserved for the researcher. We provide a brief overview of what we call the historical case against hypothesis testing and argue that the latter has led to a constrained and simplified conception of what passes for theory in psychology.


2021 ◽  
Author(s):  
Ruslan Masharipov ◽  
Yaroslav Nikolaev ◽  
Alexander Korotkov ◽  
Michael Didur ◽  
Denis Cherednichenko ◽  
...  

Classical null hypothesis significance testing is limited to the rejection of the point-null hypothesis; it does not allow the interpretation of non-significant results. Moreover, studies with a sufficiently large sample size will find statistically significant results even when the effect is negligible and may be considered practically equivalent to the null effect. This leads to a publication bias against the null hypothesis. There are two main approaches to assess null effects: shifting from the point-null to the interval-null hypothesis and considering the practical significance in the frequentist approach; using the Bayesian parameter inference based on posterior probabilities, or the Bayesian model inference based on Bayes factors. Herein, we discuss these statistical methods with particular focus on the application of the Bayesian parameter inference, as it is conceptually connected to both frequentist and Bayesian model inferences. Although Bayesian methods have been theoretically elaborated and implemented in commonly used neuroimaging software, they are not widely used for null effect assessment. To demonstrate the advantages of using the Bayesian parameter inference, we compared it with classical null hypothesis significance testing for fMRI data group analysis. We also consider the problem of choosing a threshold for a practically significant effect and discuss possible applications of Bayesian parameter inference in fMRI studies. We argue that Bayesian inference, which directly provides evidence for both the null and alternative hypotheses, may be more intuitive and convenient for practical use than frequentist inference, which only provides evidence against the null hypothesis. Moreover, it may indicate that the obtained data are not sufficient to make a confident inference. Because interim analysis is easy to perform using Bayesian inference, one can evaluate the data as the sample size increases and decide to terminate the experiment if the obtained data are sufficient to make a confident inference. To facilitate the application of the Bayesian parameter inference to null effect assessment, scripts with a simple GUI were developed.
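The interval-null reasoning in the abstract above can be illustrated with a small Python sketch. It is not the authors' scripts. It assumes the posterior of the effect is approximately normal, and the function names, the effect-size threshold `gamma`, and the 0.95 decision threshold are all hypothetical choices made for illustration.

```python
from statistics import NormalDist


def interval_null_probabilities(post_mean, post_sd, gamma):
    """Posterior mass below, inside, and above the interval [-gamma, gamma].

    Assumes a normal posterior for the effect; gamma is the (hypothetical)
    threshold for a practically significant effect.
    """
    posterior = NormalDist(mu=post_mean, sigma=post_sd)
    p_below = posterior.cdf(-gamma)
    p_above = 1.0 - posterior.cdf(gamma)
    p_null = posterior.cdf(gamma) - p_below
    return {"below": p_below, "null": p_null, "above": p_above}


def decide(probabilities, threshold=0.95):
    """Return the region holding at least `threshold` mass, else 'undecided'."""
    for label, mass in probabilities.items():
        if mass >= threshold:
            return label
    return "undecided"


# Hypothetical posterior summaries for three effects.
for mean, sd in [(0.0, 0.05), (1.0, 0.1), (0.2, 0.3)]:
    probs = interval_null_probabilities(mean, sd, gamma=0.2)
    print(mean, sd, decide(probs))
```

The "undecided" outcome corresponds to the case the abstract mentions, where the data are not yet sufficient for a confident inference in either direction.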


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Nitipong Pichetpan ◽  
Mark W. Post

This article provides a comprehensive analysis of the little-known “bare classifier phrase” construction in Modern Standard Thai. It describes the syntax, semantics and discourse functions of Thai bare classifier phrases, and further proposes a diachronic account of their origin in reduction of post-posed numeral ‘one’. Following this synchronic and diachronic description, this article attempts to locate Thai within a working typology of bare classifier constructions in mainland Asian languages, and further argues for the importance of bare classifier constructions to the theory of classifiers more generally. Following Bisang (1999) and others, it argues that bare classifier constructions reveal the core function of classifiers in Asian languages to be individuation – a referential function. It therefore cautions against some recent proposals to merge classifiers and gender markers within a single categorical space defined on the semantic basis of nominal classification, and in favour of continuing to treat classifiers as a discrete linguistic category – in mainland Asian languages, at least.


Econometrics ◽  
2019 ◽  
Vol 7 (2) ◽  
pp. 26 ◽  
Author(s):  
David Trafimow

There has been much debate about null hypothesis significance testing, p-values without null hypothesis significance testing, and confidence intervals. The first major section of the present article addresses some of the main reasons these procedures are problematic. The conclusion is that none of them are satisfactory. However, there is a new procedure, termed the a priori procedure (APP), that validly aids researchers in obtaining sample statistics that have acceptable probabilities of being close to their corresponding population parameters. The second major section provides a description and review of APP advances. Not only does the APP avoid the problems that plague other inferential statistical procedures, but it is easy to perform too. Although the APP can be performed in conjunction with other procedures, the present recommendation is that it be used alone.


2016 ◽  
Vol 11 (4) ◽  
pp. 551-554 ◽  
Author(s):  
Martin Buchheit

The first sport-science-oriented and comprehensive paper on magnitude-based inferences (MBI) was published 10 y ago in the first issue of this journal. While debate continues, MBI is today well established in sport science and in other fields, particularly clinical medicine, where practical/clinical significance often takes priority over statistical significance. In this commentary, some reasons why both academics and sport scientists should abandon null-hypothesis significance testing and embrace MBI are reviewed. Apparent limitations and future areas of research are also discussed. The following arguments are presented: P values and, in turn, study conclusions are sample-size dependent, irrespective of the size of the effect; significance does not inform on magnitude of effects, yet magnitude is what matters the most; MBI allows authors to be honest with their sample size and better acknowledge trivial effects; the examination of magnitudes per se helps provide better research questions; MBI can be applied to assess changes in individuals; MBI improves data visualization; and MBI is supported by spreadsheets freely available on the Internet. Finally, recommendations to define the smallest important effect and improve the presentation of standardized effects are presented.

