The Principle of Predictive Irrelevance, or Why Intervals Should Not be Used for Model Comparison Featuring a Point Null Hypothesis

2019 ◽  
Author(s):  
Eric-Jan Wagenmakers ◽  
Michael David Lee ◽  
Jeffrey N. Rouder ◽  
Richard Donald Morey

The principle of predictive irrelevance states that when two competing models predict a data set equally well, that data set cannot be used to discriminate the models and --for that specific purpose-- the data set is evidentially irrelevant. To highlight the ramifications of the principle, we first show how a single binomial observation can be irrelevant in the sense that it carries no evidential value for discriminating the null hypothesis $\theta = 1/2$ from a broad class of alternative hypotheses that allow $\theta$ to be between 0 and 1. In contrast, the Bayesian credible interval suggests that a single binomial observation does provide some evidence against the null hypothesis. We then generalize this paradoxical result to infinitely long data sequences that are predictively irrelevant throughout. Examples feature a test of a binomial rate and a test of a normal mean. These maximally uninformative data (MUD) sequences yield credible intervals and confidence intervals that are certain to exclude the point of test as the sequence lengthens. The resolution of this paradox requires the insight that interval estimation methods --and, consequently, p values-- may not be used for model comparison involving a point null hypothesis.
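
The single-observation case can be checked by hand. The following is a minimal worked sketch, not taken from the paper: it assumes the alternative $H_1$ places a prior $\pi(\theta)$ on $(0, 1)$ whose mean is $1/2$ (a uniform prior is one such choice), and the symbols $y$, $\pi$ and $\mathrm{BF}_{01}$ are introduced here for illustration only. For one Bernoulli observation $y = 1$:

$$
p(y = 1 \mid H_0) = \tfrac{1}{2},
\qquad
p(y = 1 \mid H_1) = \int_0^1 \theta \, \pi(\theta) \, d\theta = \mathbb{E}_\pi[\theta] = \tfrac{1}{2},
\qquad
\mathrm{BF}_{01} = \frac{p(y = 1 \mid H_0)}{p(y = 1 \mid H_1)} = 1 .
$$

The same holds for $y = 0$, since $\mathbb{E}_\pi[1 - \theta] = 1/2$. Under this assumption both models predict the observation equally well, so the Bayes factor is 1 even though the posterior for $\theta$ under $H_1$ shifts toward the observed outcome.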

2020 ◽  
Vol 18 (1) ◽  
pp. 2-27
Author(s):  
Miodrag M. Lovric

In frequentist statistics, point-null hypothesis testing based on significance tests and confidence intervals are harmonious procedures and lead to the same conclusion. This is not the case in the domain of the Bayesian framework. An inference made about the point-null hypothesis using the Bayes factor may lead to an opposite conclusion if it is based on the Bayesian credible interval. Bayesian suggestions to test point-nulls using credible intervals are misleading and should be dismissed. A null hypothesized value may be outside a credible interval but supported by the Bayes factor (a Type I conflict), or contrariwise, the null value may be inside a credible interval but not supported by the Bayes factor (a Type II conflict). Two computer programs in R have been developed that confirm the existence of a countably infinite number of cases for which Bayes credible intervals are not compatible with Bayesian hypothesis testing.
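
A Type I conflict of the kind described above can be illustrated with a small numerical sketch. This is not the author's R code: it assumes a binomial test of $\theta = 1/2$ against a uniform prior on $\theta$, uses a normal approximation to the Beta posterior for the interval, and the function names `bf01_uniform` and `approx_credible_interval`, as well as the data values, are hypothetical choices made for illustration.

```python
from math import comb, sqrt


def bf01_uniform(k, n):
    """Bayes factor for H0: theta = 1/2 versus H1: theta ~ Uniform(0, 1).

    Assumption: k successes in n binomial trials.
    p(k | H0) = C(n, k) / 2**n and p(k | H1) = 1 / (n + 1).
    """
    # Exact integer arithmetic first, then one true division,
    # so that 0.5**n underflow is avoided for large n.
    return comb(n, k) * (n + 1) / 2**n


def approx_credible_interval(k, n, z=1.96):
    """Approximate central 95% credible interval under the uniform prior.

    The posterior is Beta(k + 1, n - k + 1); the interval below is a
    normal approximation to it (an assumption, adequate for large n).
    """
    a, b = k + 1, n - k + 1
    mean = a / (a + b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean - z * sd, mean + z * sd


# Hypothetical data: 5110 successes in 10000 trials.
k, n = 5110, 10000
bf01 = bf01_uniform(k, n)
lower, upper = approx_credible_interval(k, n)

print(f"BF01 = {bf01:.2f}")
print(f"approx. 95% credible interval = ({lower:.4f}, {upper:.4f})")
print("null value inside interval:", lower <= 0.5 <= upper)
```

With these illustrative numbers the approximate interval lies entirely above 0.5 while the Bayes factor exceeds 1 in favour of the null, which is the Type I pattern: the null value is outside the credible interval but supported by the Bayes factor.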


2021 ◽  
Vol 20 ◽  
pp. 288-299
Author(s):  
Refah Mohammed Alotaibi ◽  
Yogesh Mani Tripathi ◽  
Sanku Dey ◽  
Hoda Ragab Rezk

In this paper, inference upon stress-strength reliability is considered for unit-Weibull distributions with a common parameter under the assumption that data are observed using progressive type II censoring. We obtain different estimators of system reliability using classical and Bayesian procedures. An asymptotic interval is constructed based on the Fisher information matrix. In addition, boot-p and boot-t intervals are obtained. We evaluate Bayes estimates using Lindley's technique and the Metropolis-Hastings (MH) algorithm. The Bayes credible interval is evaluated using the MH method. An unbiased estimator of this parametric function is also obtained for the case where the common parameter is known. Numerical simulations are performed to compare the estimation methods. Finally, a data set is studied for illustration purposes.


Author(s):  
Parisa Torkaman

The generalized inverted exponential distribution is introduced as a lifetime model with good statistical properties. This paper considers the estimation of its probability density function and cumulative distribution function with five different estimation methods: uniformly minimum variance unbiased (UMVU), maximum likelihood (ML), least squares (LS), weighted least squares (WLS) and percentile (PC) estimators. The performance of these estimation procedures is compared by numerical simulation, based on the mean squared error (MSE). The simulation studies show that the UMVU estimator performs better than the others and that, when the sample size is large enough, the ML and UMVU estimators are almost equivalent and more efficient than LS, WLS and PC. Finally, the results for a real data set are analyzed.


Genetics ◽  
1996 ◽  
Vol 143 (1) ◽  
pp. 589-602 ◽  
Author(s):  
Peter J E Goss ◽  
R C Lewontin

Regions of differing constraint, mutation rate or recombination along a sequence of DNA or amino acids lead to a nonuniform distribution of polymorphism within species or fixed differences between species. The power of five tests to reject the null hypothesis of a uniform distribution is studied for four classes of alternate hypothesis. The tests explored are the variance of interval lengths; a modified variance test, which includes covariance between neighboring intervals; the length of the longest interval; the length of the shortest third-order interval; and a composite test. Although there is no uniformly most powerful test over the range of alternate hypotheses tested, the variance and modified variance tests usually have the highest power. Therefore, we recommend that one of these two tests be used to test departure from uniformity in all circumstances. Tables of critical values for the variance and modified variance tests are given. The critical values depend both on the number of events and the number of positions in the sequence. A computer program is available on request that calculates both the critical values for a specified number of events and number of positions as well as the significance level of a given data set.


2020 ◽  
Vol 501 (2) ◽  
pp. 1663-1676
Author(s):  
R Barnett ◽  
S J Warren ◽  
N J G Cross ◽  
D J Mortlock ◽  
X Fan ◽  
...  

We present the results of a new, deeper, and complete search for high-redshift 6.5 < z < 9.3 quasars over 977 deg² of the VISTA Kilo-Degree Infrared Galaxy (VIKING) survey. This exploits a new list-driven data set providing photometry in all bands Z, Y, J, H, Ks, for all sources detected by VIKING in J. We use the Bayesian model comparison (BMC) selection method of Mortlock et al., producing a ranked list of just 21 candidates. The sources ranked 1, 2, 3, and 5 are the four known z > 6.5 quasars in this field. Additional observations of the other 17 candidates, primarily DESI Legacy Survey photometry and ESO FORS2 spectroscopy, confirm that none is a quasar. This is the first complete sample from the VIKING survey, and we provide the computed selection function. We include a detailed comparison of the BMC method against two other selection methods: colour cuts and minimum-χ² SED fitting. We find that: (i) BMC produces eight times fewer false positives than colour cuts, while also reaching 0.3 mag deeper, (ii) the minimum-χ² SED-fitting method is extremely efficient but reaches 0.7 mag less deep than the BMC method, and selects only one of the four known quasars. We show that BMC candidates, rejected because their photometric SEDs have high χ² values, include bright examples of galaxies with very strong [O iii] λλ4959,5007 emission in the Y band, identified in fainter surveys by Matsuoka et al. This is a potential contaminant population in Euclid searches for faint z > 7 quasars, not previously accounted for, and that requires better characterization.


Entropy ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. 126
Author(s):  
Sharu Theresa Jose ◽  
Osvaldo Simeone

Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model agnostic meta-learning (MAML), or joint within-task training and test sets, like reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.


2020 ◽  
Vol 70 (5) ◽  
pp. 1211-1230
Author(s):  
Abdus Saboor ◽  
Hassan S. Bakouch ◽  
Fernando A. Moala ◽  
Sheraz Hussain

In this paper, a bivariate extension of the exponentiated Fréchet distribution is introduced, namely a bivariate exponentiated Fréchet (BvEF) distribution whose marginals are univariate exponentiated Fréchet distributions. Several properties of the proposed distribution are discussed, such as the joint survival function, joint probability density function, marginal probability density function, conditional probability density function, moments, and marginal and bivariate moment generating functions. Moreover, the proposed distribution is obtained by the Marshall-Olkin survival copula. Estimation of the parameters is investigated by maximum likelihood with the observed information matrix. In addition to the maximum likelihood estimation method, we consider Bayesian inference and least squares estimation and compare these three methodologies for the BvEF. A simulation study is carried out to compare the performance of the estimators obtained by the presented estimation methods. The proposed bivariate distribution and other related bivariate distributions are fitted to a real-life paired data set. It is shown that the BvEF distribution has a superior performance among the compared distributions using several tests of goodness-of-fit.

