The Principle of Predictive Irrelevance or Why Intervals Should Not be Used for Model Comparison Featuring a Point Null Hypothesis

2019 ◽  
Author(s):  
Eric-Jan Wagenmakers ◽  
Michael D. Lee ◽  
Jeffrey N. Rouder ◽  
Richard D. Morey

The principle of predictive irrelevance states that when two competing models predict a data set equally well, that data set cannot be used to discriminate the models and --for that specific purpose-- the data set is evidentially irrelevant. To highlight the ramifications of the principle, we first show how a single binomial observation can be irrelevant in the sense that it carries no evidential value for discriminating the null hypothesis $\theta = 1/2$ from a broad class of alternative hypotheses that allow $\theta$ to lie between 0 and 1. In contrast, the Bayesian credible interval suggests that a single binomial observation does provide some evidence against the null hypothesis. We then generalize this paradoxical result to infinitely long data sequences that are predictively irrelevant throughout. Examples feature a test of a binomial rate and a test of a normal mean. These maximally uninformative data (MUD) sequences yield credible intervals and confidence intervals that are certain to exclude the point of test as the sequence lengthens. The resolution of this paradox requires the insight that interval estimation methods --and, consequently, p values-- may not be used for model comparison involving a point null hypothesis.
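The single-observation case can be made concrete with a short calculation. The sketch below (a minimal illustration, not taken from the paper, assuming a uniform Beta(1, 1) prior on $\theta$ under the alternative) shows that the Bayes factor after one success in one trial equals exactly 1, so the observation is predictively irrelevant, even though the posterior credible interval under the alternative is no longer centered on $\theta = 1/2$.

```python
# Minimal sketch (assumption: uniform Beta(1, 1) prior on theta under H1).
from scipy import stats
from scipy.integrate import quad

y, n = 1, 1  # a single binomial observation: one success in one trial

# Marginal likelihood under H0: theta = 1/2
m0 = stats.binom.pmf(y, n, 0.5)

# Marginal likelihood under H1: integrate the likelihood over the Beta(1, 1) prior
m1, _ = quad(lambda t: stats.binom.pmf(y, n, t) * stats.beta.pdf(t, 1, 1), 0, 1)

print(f"BF01 = {m0 / m1:.3f}")  # 1.000: both hypotheses predicted the data equally well

# Under H1 the posterior is Beta(1 + y, 1 + n - y) = Beta(2, 1); its 95% central
# credible interval shifts toward 1 even though the data cannot discriminate H0 from H1.
lo, hi = stats.beta.ppf([0.025, 0.975], 1 + y, 1 + n - y)
print(f"95% credible interval for theta under H1: [{lo:.3f}, {hi:.3f}]")
```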


2019 ◽  
Author(s):  
Henk Kiers ◽  
Jorge Tendeiro

Null Hypothesis Bayesian Testing (NHBT) has been proposed as an alternative to Null Hypothesis Significance Testing (NHST). Whereas NHST has a close link to parameter estimation via confidence intervals, the link between NHBT and Bayesian estimation via a posterior distribution is less straightforward, but it does exist and has recently been reiterated by Rouder, Haaf, and Vandekerckhove (2018). It hinges on a prior that combines a point mass probability with a probability density function (the spike-and-slab prior). In the present paper it is first carefully explained how the spike-and-slab prior is defined, and how results can be derived for which proofs were not given in Rouder et al. (2018). Next, it is shown that the spike-and-slab prior can be approximated by a pure probability density function with a narrow rectangular peak around the center that towers above the remainder of the density. Finally, we indicate how this ‘hill-and-chimney’ prior may in turn be approximated by fully continuous priors. In this way it is shown that NHBT results can be approximated well by results from estimation using a strongly peaked prior, and it is noted that the estimation itself offers more than merely the posterior odds ratio on which NHBT is based. It thereby complies with the strong APA requirement of not just reporting test results but also offering effect size information, and it provides a transparent perspective on the NHBT approach as one employing a prior with a strong peak around the chosen point-null value.
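As a rough numerical illustration of this approximation (a sketch under illustrative assumptions that are not taken from the paper: normal data with known unit variance, a standard-normal slab, equal prior mass on spike and slab, and a chimney of width 0.01), one can compare the spike-and-slab Bayes factor with the posterior odds implied by a continuous hill-and-chimney prior:

```python
# Sketch: spike-and-slab Bayes factor vs. posterior odds from a continuous
# 'hill-and-chimney' prior. Assumptions (illustrative, not from the paper):
# normal data with known sigma = 1, slab = Normal(0, 1), 50/50 prior mass,
# chimney width eps = 0.01.
import numpy as np
from scipy import stats
from scipy.integrate import quad

n, xbar = 25, 0.3              # hypothetical sample size and sample mean
se = 1 / np.sqrt(n)            # standard error of the mean

def lik(theta):
    """Likelihood of the observed sample mean given theta."""
    return stats.norm.pdf(xbar, loc=theta, scale=se)

# Spike-and-slab: P(theta = 0) = 1/2, otherwise theta ~ Normal(0, 1)
m0 = lik(0.0)
m1, _ = quad(lambda t: lik(t) * stats.norm.pdf(t, 0, 1), -10, 10)
bf01 = m0 / m1                 # equals the posterior odds under 50/50 prior odds

# Hill-and-chimney: the point mass is replaced by a uniform 'chimney' of width eps
eps = 0.01
def prior(t):
    chimney = stats.uniform.pdf(t, loc=-eps / 2, scale=eps)
    return 0.5 * chimney + 0.5 * stats.norm.pdf(t, 0, 1)

inside, _ = quad(lambda t: lik(t) * prior(t), -eps / 2, eps / 2)
left, _ = quad(lambda t: lik(t) * prior(t), -10, -eps / 2)
right, _ = quad(lambda t: lik(t) * prior(t), eps / 2, 10)
post_odds = inside / (left + right)

print(f"spike-and-slab BF01 (= posterior odds): {bf01:.4f}")
print(f"hill-and-chimney posterior odds:        {post_odds:.4f}")
```

For a chimney this narrow the two quantities agree to several decimal places, which is the sense in which the continuous prior reproduces the NHBT result while still yielding a full posterior for effect-size reporting.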


Econometrics ◽  
2019 ◽  
Vol 7 (2) ◽  
pp. 21
Author(s):  
Jae H. Kim ◽  
Andrew P. Robinson

This paper presents a brief review of interval-based hypothesis testing, widely used in biostatistics, medical science, and psychology, namely tests for minimum effect, equivalence, and non-inferiority. We present the methods in the context of a one-sample t-test and a test for linear restrictions in a regression. We present applications in testing for market efficiency, the validity of asset-pricing models, and the persistence of economic time series. We argue that, from the point of view of economics and finance, interval-based hypothesis testing provides more sensible inferential outcomes than those based on a point-null hypothesis. We propose that interval-based tests be routinely employed in empirical research in business, as an alternative to point-null hypothesis testing, especially in the new era of big data.
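For readers unfamiliar with the mechanics, the sketch below (a minimal illustration with hypothetical data and an arbitrarily chosen equivalence margin of ±0.2; it is not code from the paper) shows the two one-sided tests (TOST) form of a one-sample equivalence test:

```python
# Minimal one-sample equivalence test via two one-sided t-tests (TOST).
# Assumptions: hypothetical normal data, equivalence margin of +/-0.2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=0.05, scale=1.0, size=100)   # hypothetical sample
low, high = -0.2, 0.2                           # equivalence margin (illustrative)

n = x.size
mean, se = x.mean(), x.std(ddof=1) / np.sqrt(n)

# The null of non-equivalence is rejected only if BOTH one-sided tests reject:
t_lower = (mean - low) / se       # tests whether the mean exceeds the lower bound
t_upper = (mean - high) / se      # tests whether the mean falls below the upper bound
p_lower = stats.t.sf(t_lower, df=n - 1)
p_upper = stats.t.cdf(t_upper, df=n - 1)
p_tost = max(p_lower, p_upper)

print(f"mean = {mean:.3f}, TOST p-value = {p_tost:.4f}")
print("equivalent within the margin" if p_tost < 0.05 else "equivalence not established")
```

Note the reversed burden of proof relative to a point-null test: the null hypothesis here is non-equivalence, so the data must actively demonstrate that the mean lies inside the interval before equivalence is declared.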


2009 ◽  
Vol 39 (12) ◽  
pp. 2470-2485 ◽  
Author(s):  
Paul D. Anderson ◽  
Mark A. Meleason

The combined effectiveness of thinning and riparian buffers for increasing structural complexity while maintaining riparian function in second-growth forests is not well documented. We surveyed down wood and vegetation cover along transects from the stream center, through buffers ranging from <5 to 150 m in width, into thinned stands, patch openings, or unthinned stands of 40- to 65-year-old Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) forests in western Oregon, USA. Small-wood cover became more homogeneous among stream reaches within 5 years following thinning, primarily because of decreases in the reaches with the greatest pretreatment abundance. Mean shrub cover converged, predominantly because of decreases in patch openings. Herbaceous cover increased, particularly in patch openings. Relative to unthinned stands, herbaceous cover was similar in wide buffers and increased in the narrowest buffers and in narrow buffers adjacent to patch openings. Moss cover tended to increase in thinned areas and decrease in patch openings. Both conventional point-null hypothesis tests and inequivalence tests suggested that wood and vegetation responses within buffers ≥15 m wide were insensitive to the treatments. However, inherently conservative equivalence tests infrequently inferred similarity between thinned stands or buffers and untreated stands. Difficulties in defining ecologically important effect sizes can limit the inferential utility of equivalence–inequivalence testing.

