Formalizing Statistical Beliefs in Hypothesis Testing Using Program Logic

2021
Author(s): Yusuke Kawamoto, Tetsuya Sato, Kohei Suenaga

We propose a new approach to formally describing requirements for statistical inference and to checking whether a statistical method is used appropriately in a program. Specifically, we define belief Hoare logic (BHL) for formalizing and reasoning about the statistical beliefs acquired via hypothesis testing. This logic is equipped with axiom schemas for hypothesis tests and rules for multiple tests, which can be instantiated to a variety of concrete tests. To the best of our knowledge, this is the first attempt to introduce a program logic with epistemic modal operators that can specify the preconditions for hypothesis tests to be applied appropriately.
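
As a purely schematic illustration of the kind of judgement such a logic manipulates (the notation below is ours, not the paper's), one can picture a Hoare triple whose precondition states the assumptions a test requires and whose postcondition states the belief acquired from its outcome:

\[
  \{\, \mathrm{assumptions}(T) \,\}\;\;
  r := \mathrm{test}_T(\mathit{data};\,\alpha)\;\;
  \{\, (r = \mathrm{reject}) \rightarrow \mathrm{B}\,\lnot H_0 \,\}
\]

where B is a belief modality and α the significance level; the logic's axiom schemas would be instantiated with a concrete test T and its concrete preconditions.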

Entropy, 2019, Vol 21 (9), pp. 883
Author(s): Luis Gustavo Esteves, Rafael Izbicki, Julio Michael Stern, Rafael Bassi Stern

This paper introduces pragmatic hypotheses and relates this concept to the spiral of scientific evolution. Previous works characterized logically consistent statistical hypothesis tests and showed that the modal operators obtained from such tests can be represented in the hexagon of oppositions. However, despite the importance of precise hypotheses in science, they cannot be accepted by logically consistent tests. Here, we show that this dilemma can be overcome by using pragmatic versions of precise hypotheses. These pragmatic versions allow a level of imprecision in the hypothesis that is small relative to other experimental conditions. The introduction of pragmatic hypotheses allows the evolution of scientific theories based on statistical hypothesis testing to be interpreted using the narratological structure of hexagonal spirals, as defined by Pierre Gallais.
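
For intuition (the notation here is illustrative, not taken from the paper), a pragmatic version of a precise null hypothesis replaces strict equality with a small tolerance region:

\[
  H_0 : \theta = \theta_0
  \quad\longrightarrow\quad
  H_0^{\varepsilon} : |\theta - \theta_0| \le \varepsilon ,
\]

where ε > 0 is small relative to the measurement error and other experimental conditions, so that the pragmatic hypothesis can be accepted by a logically consistent test even though the precise one cannot.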


2019, Vol 1 (3), pp. 945-961
Author(s): Frank Emmert-Streib, Matthias Dehmer

Statistical hypothesis testing is among the most misunderstood quantitative analysis methods in data science. Despite its seeming simplicity, it has complex interdependencies among its procedural components. In this paper, we discuss the underlying logic of statistical hypothesis testing, the formal meaning of its components, and the connections between them. Our presentation applies to all statistical hypothesis tests as a generic backbone and is, hence, useful across all application domains in data science and artificial intelligence.
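
As a minimal sketch of that generic backbone (the simulated data, the choice of a one-sample t-test, and the 0.05 significance level are our own illustrative assumptions, not part of the paper), every test follows the same procedural steps: state the null hypothesis and fix the significance level in advance, compute the test statistic and p-value from the data, then decide:

    # Minimal sketch of the generic hypothesis-testing workflow,
    # instantiated with a one-sample t-test; data are simulated for illustration.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.normal(loc=0.3, scale=1.0, size=30)   # observed sample (simulated)

    alpha = 0.05                                   # significance level, fixed in advance
    t_stat, p_value = stats.ttest_1samp(x, popmean=0.0)  # H0: population mean = 0

    print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
    if p_value < alpha:
        print("Reject H0 at level alpha (evidence against the null).")
    else:
        print("Fail to reject H0 (the data are compatible with the null).")

The components discussed in the paper (hypotheses, test statistic, significance level, p-value, decision rule) each correspond to one step of this workflow, whatever the concrete test.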


2016, Vol 21 (2), pp. 136-147
Author(s): James Nicholson, Sean McCusker

This paper is a response to Gorard's article, ‘Damaging real lives through obstinacy: re-emphasising why significance testing is wrong’ in Sociological Research Online 21(1). For many years Gorard has criticised the way hypothesis tests are used in social science, but recently he has gone much further and argued that the logical basis for hypothesis testing is flawed: that hypothesis testing does not work, even when used properly. We have sympathy with the view that hypothesis testing is often carried out in social science contexts when it should not be, and that outcomes are often described in inappropriate terms, but this does not mean the theory of hypothesis testing, or its use, is flawed per se. There needs to be evidence to support such a contention. Gorard claims that: ‘Anyone knowing the problems, as described over one hundred years, who continues to teach, use or publish significance tests is acting unethically, and knowingly risking the damage that ensues.’ This is a very strong statement which impugns the integrity, not just the competence, of a large number of highly respected academics. We argue that the evidence he puts forward in this paper does not stand up to scrutiny: that the paper misrepresents what hypothesis tests claim to do, and uses a sample size which is far too small to properly discriminate a 10% difference in means in a simulation he constructs. He then claims that this simulates emotive contexts in which a 10% difference would be important to detect, implicitly misrepresenting the simulation as a reasonable model of those contexts.
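
The sample-size point can be checked directly with a power simulation. The figures below (group means of 1.0 and 1.1, a common standard deviation of 0.5, and the per-group sample sizes) are our own illustrative choices, not the parameters of Gorard's simulation:

    # Illustrative power check: how often does a two-sample t-test detect
    # a 10% difference in means at various sample sizes? All parameters are
    # assumptions chosen only to illustrate the point about power.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    mean_a, mean_b, sd = 1.0, 1.1, 0.5     # a 10% difference in means (assumed sd)
    n_sims, alpha = 5000, 0.05

    for n in (20, 100, 500):
        rejections = 0
        for _ in range(n_sims):
            a = rng.normal(mean_a, sd, n)
            b = rng.normal(mean_b, sd, n)
            if stats.ttest_ind(a, b).pvalue < alpha:
                rejections += 1
        print(f"n = {n:4d} per group: power ≈ {rejections / n_sims:.2f}")

Under these assumed parameters, small per-group samples reject the null only rarely even though the 10% difference is real, which is the substance of the objection that the simulated sample size is too small.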


2007, Vol 22 (3), pp. 637-650
Author(s): Ian T. Jolliffe

Abstract When a forecast is assessed, a single value for a verification measure is often quoted. This is of limited use, as it needs to be complemented by some idea of the uncertainty associated with the value. If this uncertainty can be quantified, it is then possible to make statistical inferences based on the value observed. There are two main types of inference: confidence intervals can be constructed for an underlying “population” value of the measure, or hypotheses can be tested regarding the underlying value. This paper will review the main ideas of confidence intervals and hypothesis tests, together with the less well known “prediction intervals,” concentrating on aspects that are often poorly understood. Comparisons will be made between different methods of constructing confidence intervals—exact, asymptotic, bootstrap, and Bayesian—and the difference between prediction intervals and confidence intervals will be explained. For hypothesis testing, multiple testing will be briefly discussed, together with connections between hypothesis testing, prediction intervals, and confidence intervals.
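
As a sketch of the bootstrap approach mentioned above (the simulated forecast–observation data, the choice of mean absolute error as the verification measure, and the number of resamples are our own illustrative assumptions), a percentile confidence interval for a verification measure can be obtained by resampling forecast–observation pairs with replacement:

    # Sketch: percentile bootstrap confidence interval for a verification
    # measure (mean absolute error); data are simulated for illustration.
    import numpy as np

    rng = np.random.default_rng(2)
    obs = rng.normal(size=200)
    fcst = obs + rng.normal(scale=0.8, size=200)   # imperfect forecasts

    def mae(f, o):
        return np.mean(np.abs(f - o))

    point_estimate = mae(fcst, obs)
    boot = []
    for _ in range(10_000):
        idx = rng.integers(0, len(obs), len(obs))  # resample pairs with replacement
        boot.append(mae(fcst[idx], obs[idx]))
    lo, hi = np.percentile(boot, [2.5, 97.5])

    print(f"MAE = {point_estimate:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")

Resampling pairs rather than forecasts and observations separately keeps the forecast–observation dependence intact, which is what makes the interval refer to the verification measure rather than to the marginal distributions.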

