The Principle of Predictive Irrelevance or Why Intervals Should Not be Used for Model Comparison Featuring a Point Null Hypothesis

2019 ◽  
Author(s):  
Eric-Jan Wagenmakers ◽  
Michael D. Lee ◽  
Jeffrey N. Rouder ◽  
Richard D. Morey

The principle of predictive irrelevance states that when two competing models predict a data set equally well, that data set cannot be used to discriminate the models and --for that specific purpose-- the data set is evidentially irrelevant. To highlight the ramifications of the principle, we first show how a single binomial observation can be irrelevant in the sense that it carries no evidential value for discriminating the null hypothesis $\theta = 1/2$ from a broad class of alternative hypotheses that allow $\theta$ to lie between 0 and 1. In contrast, the Bayesian credible interval suggests that a single binomial observation does provide some evidence against the null hypothesis. We then generalize this paradoxical result to infinitely long data sequences that are predictively irrelevant throughout. Examples feature a test of a binomial rate and a test of a normal mean. These maximally uninformative data (MUD) sequences yield credible intervals and confidence intervals that are certain to exclude the point of test as the sequence lengthens. The resolution of this paradox requires the insight that interval estimation methods --and, consequently, p values-- may not be used for model comparison involving a point null hypothesis.
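The single-observation case can be made concrete with a short calculation. The sketch below (a minimal illustration, not taken from the paper, assuming a uniform Beta(1, 1) prior on $\theta$ under the alternative) shows that the Bayes factor after one success in one trial equals exactly 1, so the observation is predictively irrelevant, even though the posterior credible interval under the alternative is no longer centered on $\theta = 1/2$.

```python
# Minimal sketch (assumption: uniform Beta(1, 1) prior on theta under H1).
from scipy import stats
from scipy.integrate import quad

y, n = 1, 1  # a single binomial observation: one success in one trial

# Marginal likelihood under H0: theta = 1/2
m0 = stats.binom.pmf(y, n, 0.5)

# Marginal likelihood under H1: integrate the likelihood over the Beta(1, 1) prior
m1, _ = quad(lambda t: stats.binom.pmf(y, n, t) * stats.beta.pdf(t, 1, 1), 0, 1)

print(f"BF01 = {m0 / m1:.3f}")  # 1.000: both hypotheses predicted the data equally well

# Under H1 the posterior is Beta(1 + y, 1 + n - y) = Beta(2, 1); its 95% central
# credible interval shifts toward 1 even though the data cannot discriminate H0 from H1.
lo, hi = stats.beta.ppf([0.025, 0.975], 1 + y, 1 + n - y)
print(f"95% credible interval for theta under H1: [{lo:.3f}, {hi:.3f}]")
```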


2019 ◽  
Author(s):  
Henk Kiers ◽  
Jorge Tendeiro

Null Hypothesis Bayesian Testing (NHBT) has been proposed as an alternative to Null Hypothesis Significance Testing (NHST). Whereas NHST has a close link to parameter estimation via confidence intervals, the link between NHBT and Bayesian estimation via a posterior distribution is less straightforward, but it does exist and has recently been reiterated by Rouder, Haaf, and Vandekerckhove (2018). It hinges on a prior that combines a point mass probability with a probability density function (the spike-and-slab prior). In the present paper it is first carefully explained how the spike-and-slab prior is defined, and how results can be derived for which proofs were not given in Rouder et al. (2018). Next, it is shown that the spike-and-slab prior can be approximated by a pure probability density function with a narrow rectangular peak around the center that towers above the remainder of the density. Finally, we indicate how this ‘hill-and-chimney’ prior may in turn be approximated by fully continuous priors. In this way it is shown that NHBT results can be approximated well by results from estimation using a strongly peaked prior, and it is noted that the estimation itself offers more than merely the posterior odds ratio on which NHBT is based. It thereby complies with the strong APA requirement of not just reporting test results but also offering effect size information, and it provides a transparent perspective on the NHBT approach as one employing a prior with a strong peak around the chosen point-null value.
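As a rough numerical illustration of this approximation (a sketch under illustrative assumptions that are not taken from the paper: normal data with known unit variance, a standard-normal slab, equal prior mass on spike and slab, and a chimney of width 0.01), one can compare the spike-and-slab Bayes factor with the posterior odds implied by a continuous hill-and-chimney prior:

```python
# Sketch: spike-and-slab Bayes factor vs. posterior odds from a continuous
# 'hill-and-chimney' prior. Assumptions (illustrative, not from the paper):
# normal data with known sigma = 1, slab = Normal(0, 1), 50/50 prior mass,
# chimney width eps = 0.01.
import numpy as np
from scipy import stats
from scipy.integrate import quad

n, xbar = 25, 0.3              # hypothetical sample size and sample mean
se = 1 / np.sqrt(n)            # standard error of the mean

def lik(theta):
    """Likelihood of the observed sample mean given theta."""
    return stats.norm.pdf(xbar, loc=theta, scale=se)

# Spike-and-slab: P(theta = 0) = 1/2, otherwise theta ~ Normal(0, 1)
m0 = lik(0.0)
m1, _ = quad(lambda t: lik(t) * stats.norm.pdf(t, 0, 1), -10, 10)
bf01 = m0 / m1                 # equals the posterior odds under 50/50 prior odds

# Hill-and-chimney: the point mass is replaced by a uniform 'chimney' of width eps
eps = 0.01
def prior(t):
    chimney = stats.uniform.pdf(t, loc=-eps / 2, scale=eps)
    return 0.5 * chimney + 0.5 * stats.norm.pdf(t, 0, 1)

inside, _ = quad(lambda t: lik(t) * prior(t), -eps / 2, eps / 2)
left, _ = quad(lambda t: lik(t) * prior(t), -10, -eps / 2)
right, _ = quad(lambda t: lik(t) * prior(t), eps / 2, 10)
post_odds = inside / (left + right)

print(f"spike-and-slab BF01 (= posterior odds): {bf01:.4f}")
print(f"hill-and-chimney posterior odds:        {post_odds:.4f}")
```

For a chimney this narrow the two quantities agree to several decimal places, which is the sense in which the continuous prior reproduces the NHBT result while still yielding a full posterior for effect-size reporting.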


Econometrics ◽  
2019 ◽  
Vol 7 (2) ◽  
pp. 21
Author(s):  
Jae H. Kim ◽  
Andrew P. Robinson

This paper presents a brief review of interval-based hypothesis testing, widely used in biostatistics, medical science, and psychology, namely tests for minimum effect, equivalence, and non-inferiority. We present the methods in the context of a one-sample t-test and a test for linear restrictions in a regression. We present applications in testing for market efficiency, the validity of asset-pricing models, and the persistence of economic time series. We argue that, from the point of view of economics and finance, interval-based hypothesis testing provides more sensible inferential outcomes than those based on a point-null hypothesis. We propose that interval-based tests be routinely employed in empirical research in business, as an alternative to point-null hypothesis testing, especially in the new era of big data.
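For readers unfamiliar with the mechanics, the sketch below (a minimal illustration with hypothetical data and an arbitrarily chosen equivalence margin of ±0.2; it is not code from the paper) shows the two one-sided tests (TOST) form of a one-sample equivalence test:

```python
# Minimal one-sample equivalence test via two one-sided t-tests (TOST).
# Assumptions: hypothetical normal data, equivalence margin of +/-0.2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=0.05, scale=1.0, size=100)   # hypothetical sample
low, high = -0.2, 0.2                           # equivalence margin (illustrative)

n = x.size
mean, se = x.mean(), x.std(ddof=1) / np.sqrt(n)

# The null of non-equivalence is rejected only if BOTH one-sided tests reject:
t_lower = (mean - low) / se       # tests whether the mean exceeds the lower bound
t_upper = (mean - high) / se      # tests whether the mean falls below the upper bound
p_lower = stats.t.sf(t_lower, df=n - 1)
p_upper = stats.t.cdf(t_upper, df=n - 1)
p_tost = max(p_lower, p_upper)

print(f"mean = {mean:.3f}, TOST p-value = {p_tost:.4f}")
print("equivalent within the margin" if p_tost < 0.05 else "equivalence not established")
```

Note the reversed burden of proof relative to a point-null test: the null hypothesis here is non-equivalence, so the data must actively demonstrate that the mean lies inside the interval before equivalence is declared.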


2009 ◽  
Vol 39 (12) ◽  
pp. 2470-2485 ◽  
Author(s):  
Paul D. Anderson ◽  
Mark A. Meleason

The combined effectiveness of thinning and riparian buffers for increasing structural complexity while maintaining riparian function in second-growth forests is not well documented. We surveyed down wood and vegetation cover along transects from the stream center, through buffers ranging from <5 to 150 m in width, into thinned stands, patch openings, or unthinned stands of 40- to 65-year-old Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) forests in western Oregon, USA. Small-wood cover became more homogeneous among stream reaches within 5 years following thinning, primarily because of decreases in the reaches with the greatest pretreatment abundance. Mean shrub cover converged, predominantly because of decreases in patch openings. Herbaceous cover increased, particularly in patch openings. Relative to unthinned stands, herbaceous cover was similar in wide buffers and increased in the narrowest buffers and in narrow buffers adjacent to patch openings. Moss cover tended to increase in thinned areas and decrease in patch openings. Both conventional point-null hypothesis tests and inequivalence tests suggested that wood and vegetation responses within buffers ≥15 m wide were insensitive to the treatments. However, inherently conservative equivalence tests infrequently inferred similarity between thinned stands or buffers and untreated stands. Difficulties in defining ecologically important effect sizes can limit the inferential utility of equivalence–inequivalence testing.

