Answering Two Criticisms of Hypothesis Testing

1999 ◽  
Vol 85 (1) ◽  
pp. 3-18 ◽  
Author(s):  
Les Leventhal

Two generations of methodologists have criticized hypothesis testing by claiming that most point null hypotheses are false and that hypothesis tests do not provide the probability that the null hypothesis is true. These criticisms are answered. (1) The point-null criticism, if correct, undermines only the traditional two-tailed test, not the one-tailed test or the little-known directional two-tailed test. The directional two-tailed test is the only hypothesis test that, properly used, provides for deciding the direction of a parameter, that is, deciding whether a parameter is positive or negative or whether it falls above or below some interesting nonzero value. The point-null criticism becomes unimportant if we replace traditional one- and two-tailed tests with the directional two-tailed test, a replacement already recommended for most purposes by previous writers. (2) If one interprets probability as a relative frequency, as most textbooks do, then the concept of probability cannot meaningfully be attached to the truth of an hypothesis; hence, it is meaningless to ask for the probability that the null is true. (3) Hypothesis tests provide the next best thing, namely, a relative frequency probability that the decision about the statistical hypotheses is correct. Two arguments are offered.
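
A minimal sketch of the three-decision logic behind the directional two-tailed test described above, assuming a one-sample t-test in Python; the data and the alpha level are illustrative, not from the article:

```python
# Sketch of the "directional two-tailed test" decision rule: split alpha across
# both tails and decide the DIRECTION of the parameter rather than merely
# rejecting a point null. Sample values and alpha are invented for illustration.
from scipy import stats

sample = [0.8, 1.2, -0.3, 0.9, 1.5, 0.4, 1.1, 0.7]   # hypothetical data
alpha = 0.05                                          # conventional level

t, p_two_sided = stats.ttest_1samp(sample, popmean=0.0)

if p_two_sided < alpha and t > 0:
    decision = "conclude the parameter is positive"
elif p_two_sided < alpha and t < 0:
    decision = "conclude the parameter is negative"
else:
    decision = "withhold judgment about the direction"

print(f"t = {t:.3f}, two-sided p = {p_two_sided:.3f}: {decision}")
```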

2019 ◽  
Vol 48 (4) ◽  
pp. 241-243
Author(s):  
Jordan Rickles ◽  
Jessica B. Heppen ◽  
Elaine Allensworth ◽  
Nicholas Sorensen ◽  
Kirk Walters

In response to the concerns White raises in his technical comment on Rickles, Heppen, Allensworth, Sorensen, and Walters (2018), we discuss whether it would have been appropriate to test for nominally equivalent outcomes, given that the study was initially conceived and designed to test for significant differences, and that the conclusion of no difference was not solely based on a null hypothesis test. To further support the article’s conclusion, confidence intervals for the null hypothesis tests and a test of equivalence are provided.
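
For illustration only, a hedged sketch of a two-one-sided-tests (TOST) equivalence check of the kind the reply describes; the scores, the equivalence margin, and the alpha level are invented assumptions, not the study's values:

```python
# TOST logic: the treatment-control difference is declared "equivalent to zero"
# only if it is significantly above the lower bound AND significantly below the
# upper bound. All numbers below are placeholders, not the study's data.
import numpy as np
from scipy import stats

treatment = np.array([71.0, 75, 69, 73, 70, 74, 72, 68])   # hypothetical scores
control   = np.array([70.0, 74, 71, 72, 69, 73, 70, 71])
delta = 3.0     # assumed equivalence margin in outcome units
alpha = 0.05

v1 = treatment.var(ddof=1) / len(treatment)
v2 = control.var(ddof=1) / len(control)
diff = treatment.mean() - control.mean()
se = np.sqrt(v1 + v2)
# Welch-Satterthwaite degrees of freedom
df = (v1 + v2) ** 2 / (v1 ** 2 / (len(treatment) - 1) + v2 ** 2 / (len(control) - 1))

p_lower = stats.t.sf((diff + delta) / se, df)    # H0: diff <= -delta
p_upper = stats.t.cdf((diff - delta) / se, df)   # H0: diff >= +delta

print(f"difference = {diff:.2f}")
print(f"equivalent within the margin: {max(p_lower, p_upper) < alpha}")
```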


Author(s):  
Richard McCleary ◽  
David McDowall ◽  
Bradley J. Bartos

Chapter 6 addresses the subcategory of internal validity that Shadish et al. define as statistical conclusion validity, or the “validity of inferences about the correlation (covariance) between treatment and outcome.” The common threats to statistical conclusion validity can arise, or become plausible, through either model misspecification or hypothesis testing. The risk of a serious model misspecification is inversely proportional to the length of the time series, for example, and so is the risk of misstating the Type I and Type II error rates. Threats to statistical conclusion validity arise from the classical and modern hybrid significance testing structures; the serious threats that weigh heavily in p-value tests are shown to be undefined in Bayesian tests. While the particularly vexing threats raised by modern null hypothesis testing are resolved by eliminating the modern null hypothesis test, threats to statistical conclusion validity would inevitably persist and new threats would arise.


1998 ◽  
Vol 21 (2) ◽  
pp. 215-216 ◽  
Author(s):  
David Rindskopf

Unfortunately, reading Chow's work is likely to leave the reader more confused than enlightened. My preferred solutions to the “controversy” about null-hypothesis testing are: (1) recognize that we really want to test the hypothesis that an effect is “small,” not null, and (2) use Bayesian methods, which are much more in keeping with the way humans naturally think than are classical statistical methods.


Author(s):  
D. Brynn Hibbert ◽  
J. Justin Gooding

• To understand the concept of the null hypothesis and the role of Type I and Type II errors.
• To test that data are normally distributed and whether a datum is an outlier.
• To determine whether there is systematic error in the mean of measurement results.
• To perform tests to compare the means of two sets of data. …

One of the uses to which data analysis is put is to answer questions about the data, or about the system that the data describes. In the former category are “is the data normally distributed?” and “are there any outliers in the data?” (see the discussions in chapter 1). Questions about the system might be “is the level of alcohol in the suspect’s blood greater than 0.05 g/100 mL?” or “does the new sensor give the same results as the traditional method?” In answering these questions we determine the probability of finding the data given the truth of a stated hypothesis; hence “hypothesis testing.” A hypothesis is a statement that might, or might not, be true. Usually the hypothesis is set up in such a way that it is possible to calculate the probability (P) of the data (or the test statistic calculated from the data) given the hypothesis, and then to make a decision about whether the hypothesis is to be accepted (high P) or rejected (low P). A particular case of a hypothesis test is one that determines whether or not the difference between two values is significant: a significance test. For this case we actually put forward the hypothesis that there is no real difference and the observed difference arises from random effects; it is called the null hypothesis (H0). If the probability that the data are consistent with the null hypothesis falls below a predetermined low value (say 0.05 or 0.01), then the hypothesis is rejected at that probability. Therefore, P < 0.05 means that if the null hypothesis were true we would find the observed data (or more accurately the value of the statistic, or greater, calculated from the data) in less than 5% of repeated experiments.
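
A minimal sketch of the significance-test decision rule described above, assuming a two-sample t-test in Python on hypothetical sensor readings; the measurement values and the 0.05 threshold are illustrative, not from the chapter:

```python
# H0: the new sensor and the traditional method give the same mean result.
# Reject H0 when the probability P of the data under H0 falls below 0.05.
from scipy import stats

sensor      = [0.051, 0.049, 0.053, 0.050, 0.052]   # hypothetical g/100 mL readings
traditional = [0.048, 0.047, 0.049, 0.048, 0.050]

t_stat, p_value = stats.ttest_ind(sensor, traditional)

if p_value < 0.05:
    print(f"P = {p_value:.3f}: reject H0, the methods appear to differ")
else:
    print(f"P = {p_value:.3f}: do not reject H0, no evidence of a difference")
```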


2021 ◽  
Author(s):  
◽  
Thuong Nguyen

For a long time, the goodness of fit (GOF) tests have been one of the main objects of the theory of testing of statistical hypotheses. These tests possess two essential properties. Firstly, the asymptotic distribution of GOF test statistics under the null hypothesis is free from the underlying distribution within the hypothetical family. Secondly, they are of omnibus nature, which means that they are sensitive to every alternative to the null hypothesis.

GOF tests are typically based on non-linear functionals from the empirical process. The first idea to change the focus from particular functionals to the transformation of the empirical process itself into another process, which will be asymptotically distribution free, was first formulated and accomplished by Khmaladze [Estate1]. Recently, the same author in consecutive papers [Estate] and [Estate2] introduced another method, called here the Khmaladze-2 transformation, which is distinct from the first Khmaladze transformation, can be used for an even wider class of hypothesis testing problems, and is simpler in implementation.

This thesis shows how the approach could be used to create the asymptotically distribution free empirical process in two well-known testing problems.

The first problem is the problem of testing independence of two discrete random variables/vectors in a contingency table context. Although this problem has a long history, the use of GOF tests for it has been restricted to only one possible choice: the chi-square test and its several modifications. We start our approach by viewing the problem as one of parametric hypothesis testing and suggest looking at the marginal distributions as parameters. The crucial difficulty is that when the dimension of the table is large, the dimension of the vector of parameters is large as well. Nevertheless, we demonstrate the efficiency of our approach and confirm by simulations the distribution free property of the new empirical process and the GOF tests based on it. The number of parameters is as big as 30. As an additional benefit, we point out some cases when the GOF tests based on the new process are more powerful than the traditional chi-square one.

The second problem is testing whether a distribution has a regularly varying tail. This problem is inspired mainly by the fact that regularly varying tail distributions play an essential role in characterization of the domain of attraction of extreme value distributions. While there are numerous studies on estimating the exponent of regular variation of the tail, using GOF tests for testing relevant distributions has appeared in few papers. We contribute to this latter aspect a construction of a class of GOF tests for testing regularly varying tail distributions.
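
As a point of reference for the “only one possible choice” mentioned above, here is a minimal sketch of the traditional chi-square test of independence on an invented contingency table; it is not the thesis's transformed empirical-process test:

```python
# Traditional chi-square test of independence for a contingency table.
# The observed counts below are placeholders for illustration only.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[30, 10, 20],
                     [20, 25, 15]])   # hypothetical 2x3 table of counts

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.4f}")
# A small p-value leads to rejecting independence of the row and column variables.
```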


2009 ◽  
Vol 33 (2) ◽  
pp. 81-86 ◽  
Author(s):  
Douglas Curran-Everett

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This second installment of Explorations in Statistics delves into test statistics and P values, two concepts fundamental to the test of a scientific null hypothesis. The essence of a test statistic is that it compares what we observe in the experiment to what we expect to see if the null hypothesis is true. The P value associated with the magnitude of that test statistic answers this question: if the null hypothesis is true, what proportion of possible values of the test statistic are at least as extreme as the one I got? Although statisticians continue to stress the limitations of hypothesis tests, there are two realities we must acknowledge: hypothesis tests are ingrained within science, and the simple test of a null hypothesis can be useful. As a result, it behooves us to explore the notions of hypothesis tests, test statistics, and P values.
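
A minimal simulation sketch of these two ideas, with invented data and a sign-flip null distribution chosen purely for illustration:

```python
# A test statistic contrasts what we observed with what H0 expects; the P value
# is the proportion of null-hypothesis values of that statistic at least as
# extreme as the one we got. Data and simulation settings are assumptions.
import numpy as np

rng = np.random.default_rng(0)
sample = np.array([1.9, 0.4, 1.1, 2.3, 0.8, 1.6, 1.2, 0.5])  # hypothetical data

def t_stat(x):
    # Standardized distance of the sample mean from the H0 mean of zero.
    return x.mean() / (x.std(ddof=1) / np.sqrt(len(x)))

observed = t_stat(sample)

# Approximate the null distribution by sign-flipping the data, which imposes
# the H0 claim that the values are centered on zero.
null_stats = np.array([
    t_stat(sample * rng.choice([-1, 1], size=len(sample)))
    for _ in range(10000)
])

p_value = np.mean(np.abs(null_stats) >= abs(observed))
print(f"t = {observed:.2f}, simulated two-sided P = {p_value:.4f}")
```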


2000 ◽  
Vol 87 (2) ◽  
pp. 579-581 ◽  
Author(s):  
Ronald C. Serlin

In a recent article, Leventhal (1999) responds to two criticisms of hypothesis testing by showing that the one-tailed test and the directional two-tailed test are valid even if all point null hypotheses are false, and that hypothesis tests can provide the probability that decisions based on the tests are correct. Unfortunately, the falseness of all point null hypotheses affects the operating characteristics of the directional two-tailed test, seeming to weaken certain of Leventhal's arguments in favor of this procedure.


2018 ◽  
Vol 2 (2) ◽  
pp. 43-57
Author(s):  
M. Ridhwan ◽  
Muhammad Taufik Ihsan ◽  
Naskah Naskah

The purpose of this study was to investigate the effect of using a comic strips strategy on students’ reading comprehension and writing ability at MTsN 1 Pekanbaru. A quasi-experimental design with non-equivalent pre-test and post-test groups was applied. The sample was two classes (VIII 3 and VIII 4), consisting of 20 students in the treatment class and 20 students in the control class. The data were analyzed with SPSS 20.0 using independent-sample and paired-sample t-tests. The findings revealed a significant effect of the comic strips strategy on students’ reading comprehension: on the paired-sample t-test, the treatment class mean was 77 and the control class mean was 64.5, the post-test t-value was -7.149, and the sig. (2-tailed) value of 0.000 was smaller than the 0.05 significance level. The data also revealed a significant effect on students’ writing ability: the treatment class mean was 79.6 and the control class mean was 54.2, the post-test t-value was -21.9, and the sig. (2-tailed) value of 0.000 was again smaller than 0.05. Therefore, the null hypothesis was rejected and the alternative hypothesis was accepted. From these data it can be concluded that there is a significant effect of using the comic strips strategy on students’ reading comprehension and writing ability.
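
For illustration only (the study analyzed its own data in SPSS 20.0), a hedged sketch of the same decision rule with an independent-sample t-test on invented post-test scores:

```python
# Compare the two-tailed significance of an independent-sample t-test against
# the 0.05 level. The scores below are placeholders, not the study's data.
from scipy import stats

treatment_posttest = [80, 75, 78, 82, 77, 79, 76, 81]   # hypothetical scores
control_posttest   = [65, 62, 66, 64, 63, 67, 61, 68]

t_stat, sig_2_tailed = stats.ttest_ind(treatment_posttest, control_posttest)

if sig_2_tailed < 0.05:
    print(f"t = {t_stat:.2f}, p = {sig_2_tailed:.3f}: reject H0, significant effect")
else:
    print(f"t = {t_stat:.2f}, p = {sig_2_tailed:.3f}: fail to reject H0")
```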


2014 ◽  
Vol 6 (1) ◽  
pp. 1032-1035 ◽  
Author(s):  
Ramzi Suleiman

The research on quasi-luminal neutrinos has sparked several experimental studies testing the "speed of light limit" hypothesis. To date, the overall evidence favors the "null" hypothesis, stating that there is no significant difference between the observed velocities of light and neutrinos. Despite numerous theoretical models proposed to explain the neutrinos' behavior, no attempt has been undertaken to predict the experimentally produced results. This paper presents a simple novel extension of Newton's mechanics to the domain of relativistic velocities. For a typical neutrino-velocity experiment, the proposed model is utilized to derive a general expression for . Comparison of the model's predictions with the results of six neutrino-velocity experiments, conducted by five collaborations, reveals that the model predicts all the reported results with striking accuracy. Because the direction of the neutrino flight matters in the proposed model, its success in accounting for all the tested data indicates a complete collapse of the Lorentz symmetry principle in situations involving quasi-luminal particles moving in two opposite directions. This conclusion is supported by previous findings showing that a Sagnac effect identical to the one documented for radial motion also occurs in linear motion.

