Classification accuracy comparison: Hypothesis tests and the use of confidence intervals in evaluations of difference, equivalence and non-inferiority

2009 · Vol 113 (8) · pp. 1658-1663
Author(s): Giles M. Foody

2019
Author(s): Amanda Kay Montoya, Andrew F. Hayes

Researchers interested in testing mediation often use designs where participants are measured on a dependent variable Y and a mediator M in both of two different circumstances. The dominant approach to assessing mediation in such a design, proposed by Judd, Kenny, and McClelland (2001), relies on a series of hypothesis tests about components of the mediation model and is not based on an estimate of or formal inference about the indirect effect. In this paper we recast Judd et al.'s approach in the path-analytic framework that is now commonly used in between-participant mediation analysis. Recast this way, it becomes apparent how to estimate the indirect effect of a within-participant manipulation on some outcome through a mediator as the product of paths of influence. This path-analytic approach eliminates the need for discrete hypothesis tests about components of the model to support a claim of mediation, as Judd et al.'s method requires, because it relies only on an inference about the product of paths, the indirect effect. We generalize methods of inference for the indirect effect widely used in between-participant designs to this within-participant version of mediation analysis, including bootstrap confidence intervals and Monte Carlo confidence intervals. Using this path-analytic approach, we extend the method to models with multiple mediators operating in parallel and serially and discuss the comparison of indirect effects in these more complex models. We offer macros and code for SPSS, SAS, and Mplus that conduct these analyses.
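As a concrete illustration of the Monte Carlo confidence interval mentioned in this abstract, the following Python sketch simulates the sampling distribution of the product of two paths and reads off percentile limits. The path estimates, standard errors, and seed are hypothetical placeholders, and the sketch is not the authors' SPSS, SAS, or Mplus code; it only conveys the idea under the usual normality assumption for each path estimate.

# Minimal Monte Carlo CI sketch for an indirect effect a*b (hypothetical values).
import numpy as np

rng = np.random.default_rng(seed=1)

a_hat, se_a = 0.40, 0.12   # hypothetical path: manipulation -> mediator difference
b_hat, se_b = 0.55, 0.15   # hypothetical path: mediator -> outcome
draws = 10_000

# Draw each path from its assumed normal sampling distribution and multiply,
# giving a simulated sampling distribution of the indirect effect.
ab = rng.normal(a_hat, se_a, draws) * rng.normal(b_hat, se_b, draws)

# A 95% Monte Carlo interval is the 2.5th and 97.5th percentile of the products;
# an interval excluding zero supports a claim of mediation.
lower, upper = np.percentile(ab, [2.5, 97.5])
print(f"indirect effect = {a_hat * b_hat:.3f}, 95% MC CI [{lower:.3f}, {upper:.3f}]")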


2007 · Vol 22 (3) · pp. 637-650
Author(s): Ian T. Jolliffe

When a forecast is assessed, a single value for a verification measure is often quoted. This is of limited use, as it needs to be complemented by some idea of the uncertainty associated with the value. If this uncertainty can be quantified, it is then possible to make statistical inferences based on the value observed. There are two main types of inference: confidence intervals can be constructed for an underlying “population” value of the measure, or hypotheses can be tested regarding the underlying value. This paper will review the main ideas of confidence intervals and hypothesis tests, together with the less well known “prediction intervals,” concentrating on aspects that are often poorly understood. Comparisons will be made between different methods of constructing confidence intervals—exact, asymptotic, bootstrap, and Bayesian—and the difference between prediction intervals and confidence intervals will be explained. For hypothesis testing, multiple testing will be briefly discussed, together with connections between hypothesis testing, prediction intervals, and confidence intervals.
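To make the bootstrap option concrete, here is a small Python sketch that attaches a percentile bootstrap confidence interval to a simple verification measure (proportion correct). The forecast-observation data are simulated placeholders rather than output from any real verification study, so the sketch only illustrates the resampling logic, not a recommended procedure for any particular measure.

# Percentile bootstrap CI for proportion correct, using simulated data only.
import numpy as np

rng = np.random.default_rng(seed=2)

n = 200
observed = rng.integers(0, 2, n)                  # 0/1 observed events (simulated)
forecast = np.where(rng.random(n) < 0.8,          # forecasts that match the observation
                    observed, 1 - observed)       # about 80% of the time

correct = (forecast == observed).astype(float)
point = correct.mean()

# Resample the n forecast-observation pairs with replacement, recompute the
# measure each time, and take the 2.5th/97.5th percentiles as a 95% interval.
boot = np.array([rng.choice(correct, size=n, replace=True).mean()
                 for _ in range(5_000)])
lower, upper = np.percentile(boot, [2.5, 97.5])
print(f"proportion correct = {point:.3f}, 95% bootstrap CI [{lower:.3f}, {upper:.3f}]")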


2019 · Vol 48 (4) · pp. 241-243
Author(s): Jordan Rickles, Jessica B. Heppen, Elaine Allensworth, Nicholas Sorensen, Kirk Walters

In response to the concerns White raises in his technical comment on Rickles, Heppen, Allensworth, Sorensen, and Walters (2018), we discuss whether it would have been appropriate to test for nominally equivalent outcomes, given that the study was initially conceived and designed to test for significant differences, and that the conclusion of no difference was not solely based on a null hypothesis test. To further support the article’s conclusion, confidence intervals for the null hypothesis tests and a test of equivalence are provided.
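For readers unfamiliar with equivalence testing, the sketch below shows a generic two one-sided tests (TOST) procedure in Python. The simulated outcomes and the equivalence margin are hypothetical placeholders; this is not the analysis reported in the article, only an illustration of how a test of equivalence differs from a test of difference.

# Two one-sided tests (TOST) for equivalence of two group means; simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)
treatment = rng.normal(0.02, 1.0, 150)   # simulated standardized outcomes, treatment
control = rng.normal(0.00, 1.0, 150)     # simulated standardized outcomes, control
margin = 0.20                            # hypothetical equivalence margin (+/- 0.20)

n1, n2 = len(treatment), len(control)
diff = treatment.mean() - control.mean()
sp2 = ((n1 - 1) * treatment.var(ddof=1) + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

# TOST rejects "diff <= -margin" and "diff >= +margin" separately; equivalence
# is claimed only if both one-sided p-values fall below the chosen alpha.
p_lower = 1 - stats.t.cdf((diff + margin) / se, df)   # H0: diff <= -margin
p_upper = stats.t.cdf((diff - margin) / se, df)       # H0: diff >= +margin
print(f"difference = {diff:.3f}, TOST p-values: {p_lower:.3f} and {p_upper:.3f}")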


2019
Author(s): Jan Sprenger

The replication crisis poses an enormous challenge to the epistemic authority of science and the logic of statistical inference in particular. Two prominent features of Null Hypothesis Significance Testing (NHST) arguably contribute to the crisis: the lack of guidance for interpreting non-significant results and the impossibility of quantifying support for the null hypothesis. In this paper, I argue that popular alternatives to NHST, such as confidence intervals and Bayesian inference, also fail to provide a satisfactory logic for evaluating hypothesis tests. As an alternative, I motivate and explicate the concept of corroboration of the null hypothesis. Finally, I show how degrees of corroboration give an interpretation to non-significant results, combat publication bias, and mitigate the replication crisis.

