Answering Two Criticisms of Hypothesis Testing: A Comment

2000 ◽  
Vol 87 (2) ◽  
pp. 579-581 ◽  
Author(s):  
Ronald C. Serlin

In a recent article, Leventhal (1999) responds to two criticisms of hypothesis testing by showing that the one-tailed test and the directional two-tailed test are valid even if all point null hypotheses are false, and that hypothesis tests can provide the probability that decisions based on the tests are correct. Unfortunately, the falseness of all point null hypotheses affects the operating characteristics of the directional two-tailed test, which seems to weaken some of Leventhal's arguments in favor of this procedure.

1999 ◽  
Vol 85 (1) ◽  
pp. 3-18 ◽  
Author(s):  
Les Leventhal

Two generations of methodologists have criticized hypothesis testing by claiming that most point null hypotheses are false and that hypothesis tests do not provide the probability that the null hypothesis is true. These criticisms are answered. (1) The point-null criticism, if correct, undermines only the traditional two-tailed test, not the one-tailed test or the little-known directional two-tailed test. The directional two-tailed test is the only hypothesis test that, properly used, provides for deciding the direction of a parameter, that is, deciding whether a parameter is positive or negative or whether it falls above or below some interesting nonzero value. The point-null criticism becomes unimportant if we replace traditional one- and two-tailed tests with the directional two-tailed test, a replacement already recommended for most purposes by previous writers. (2) If one interprets probability as a relative frequency, as most textbooks do, then the concept of probability cannot meaningfully be attached to the truth of an hypothesis; hence, it is meaningless to ask for the probability that the null is true. (3) Hypothesis tests provide the next best thing, namely, a relative frequency probability that the decision about the statistical hypotheses is correct. Two arguments are offered.
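As an illustration only, the three-outcome decision rule usually associated with a directional two-tailed test can be sketched as follows. This is a minimal sketch, not Leventhal's own formulation: it assumes a z statistic that has already been computed against the reference value, and the function name, the outcome labels, and the default alpha of 0.05 are hypothetical choices made for this example.

```python
from statistics import NormalDist


def directional_two_tailed(z, alpha=0.05):
    """Return a three-way decision about direction from a z statistic.

    Assumption: z is a standard-normal test statistic measured against
    the reference value; alpha is split equally between the two tails.
    """
    # Two-sided critical value, about 1.96 for alpha = 0.05.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    if z >= crit:
        return "above"          # decide the parameter exceeds the reference value
    if z <= -crit:
        return "below"          # decide the parameter is less than the reference value
    return "undetermined"       # no decision about direction
```

Unlike the traditional two-tailed test, the rule never concludes that the parameter equals the reference value; it either decides a direction or withholds a decision.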


1965 ◽  
Vol 3 (12) ◽  
pp. 48-48

In a recent article (Drug Therap. Bull. April 30, 1965, p. 33) we mentioned that Asilone (Berk) and Diovol (Wallace) cost considerably more than other antacid preparations. The manufacturer has told us that Asilone is available in two versions: the one our article referred to contains aluminium hydroxide and 250 mg polymethylsiloxane per tablet, and costs 23/4 per 100 (basic NHS price); the other, Asilone 50, contains aluminium hydroxide and 50 mg polymethylsiloxane, and costs 10/- per 100 tablets. Our comment on the high cost of Asilone therefore does not apply to Asilone 50.


2016 ◽  
Vol 21 (2) ◽  
pp. 136-147 ◽  
Author(s):  
James Nicholson ◽  
Sean McCusker

This paper is a response to Gorard's article, ‘Damaging real lives through obstinacy: re-emphasising why significance testing is wrong’ in Sociological Research Online 21(1). For many years Gorard has criticised the way hypothesis tests are used in social science, but recently he has gone much further and argued that the logical basis for hypothesis testing is flawed: that hypothesis testing does not work, even when used properly. We have sympathy with the view that hypothesis testing is often carried out in social science contexts when it should not be, and that outcomes are often described in inappropriate terms, but this does not mean the theory of hypothesis testing, or its use, is flawed per se. There needs to be evidence to support such a contention. Gorard claims that: ‘Anyone knowing the problems, as described over one hundred years, who continues to teach, use or publish significance tests is acting unethically, and knowingly risking the damage that ensues.’ This is a very strong statement which impugns the integrity, not just the competence, of a large number of highly respected academics. We argue that the evidence he puts forward in this paper does not stand up to scrutiny: that the paper misrepresents what hypothesis tests claim to do, and uses a sample size which is far too small to discriminate properly a 10% difference in means in a simulation he constructs. He then claims that this simulates emotive contexts in which a 10% difference would be important to detect, implicitly misrepresenting the simulation as a reasonable model of those contexts.


2007 ◽  
Vol 22 (3) ◽  
pp. 637-650 ◽  
Author(s):  
Ian T. Jolliffe

When a forecast is assessed, a single value for a verification measure is often quoted. This is of limited use, as it needs to be complemented by some idea of the uncertainty associated with the value. If this uncertainty can be quantified, it is then possible to make statistical inferences based on the value observed. There are two main types of inference: confidence intervals can be constructed for an underlying “population” value of the measure, or hypotheses can be tested regarding the underlying value. This paper will review the main ideas of confidence intervals and hypothesis tests, together with the less well known “prediction intervals,” concentrating on aspects that are often poorly understood. Comparisons will be made between different methods of constructing confidence intervals—exact, asymptotic, bootstrap, and Bayesian—and the difference between prediction intervals and confidence intervals will be explained. For hypothesis testing, multiple testing will be briefly discussed, together with connections between hypothesis testing, prediction intervals, and confidence intervals.
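Of the interval methods listed above, the bootstrap is the simplest to sketch in code. The following is a generic percentile bootstrap, not necessarily the variant the paper reviews; the function name, the default of 2000 resamples, the fixed seed, and the use of the mean as the verification measure are all assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_ci(values, stat=mean, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(values).

    Assumption: the observations in `values` are independent, so that
    resampling them with replacement is reasonable.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    reps = sorted(stat(rng.choices(values, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return reps[lo_idx], reps[hi_idx]
```

Calling `bootstrap_ci(scores)` on a list of per-forecast scores returns a lower and an upper bound for the underlying value of the measure. Serially correlated forecasts would violate the independence assumption and call for a different resampling scheme.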


2021 ◽  
Author(s):  
Pipatphon Lapamonpinyo ◽  
Sybil Derrible ◽  
Francesco Corman

This article proposes a Python-based Amtrak and Weather Underground (PAWU) tool to collect data on Amtrak (the main passenger train operator in the United States) departure and arrival times with weather information. In addition, this article offers a database, developed with PAWU, of the operating characteristics of 16 Amtrak routes from 2008 to 2019. More specifically, PAWU enables users to retrieve Amtrak departure and arrival times of any train number throughout the United States. It then automatically retrieves weather information from Weather Underground for each rail station and stores the data collected in a local MySQL database. Users can easily select any desired train number(s) and date range(s) without dealing with the code and the raw data from those sources that are in different formats. The database itself can be used, in part, to develop, apply, and benchmark models that assess the performance of rail services such as the one offered by Amtrak.


2020 ◽  
Vol 60 (1) ◽  
pp. 49-55
Author(s):  
Jan Havlík ◽  
Tomáš Dlouhý ◽  
Michel Sabatini

This article investigates the effect of the filling ratio of indirect rotary dryers on their operating characteristics. For drying moist biomass before combustion, indirect drum dryers heated by low-pressure steam have proven highly suitable. When designing new dryers, the operating characteristics must be verified experimentally for the specific materials and drying conditions. For this purpose, a set of experiments was carried out on a steam-heated rotary drum dryer with green wood chips containing 60 to 66 wt% moisture. The following operating characteristics of the dryer were determined experimentally: drying curves describing the process, square and volumetric evaporation capacities, and drying heat consumption. Based on the experimental results, the effect of the drum filling ratio on these operating characteristics was analysed. A higher drum filling ratio increases the drying time, but the evaporation capacity also increases, while the specific energy consumption does not change significantly. The maximum evaporation capacity was reached when the drum was filled to 20 wt%. When the filling ratio was increased to 25 wt%, the evaporation capacity remained almost unchanged.


1983 ◽  
Vol 29 (1) ◽  
pp. 1-24 ◽  
Author(s):  
Charles W. Hedrick

In a recent article Helmut Koester argues against the current practice of distinguishing between canonical Gospels, on the one hand, and apocryphal gospels, on the other, and treating the apocryphal gospels as ‘step children’ of New Testament research. Koester maintains that there are a number of the ‘apocryphal’ gospels which ‘belong to a very early stage in the development of gospel literature — a stage that is comparable to the sources which were used by the gospels of the New Testament.’ One of those texts to which he points is the Nag Hammadi tractate the Apocryphon of James. This paper is an attempt to legitimize one ‘step child’ of New Testament scholarship as a valid source for investigating the earliest levels of the Jesus traditions.


1976 ◽  
Vol 30 (4) ◽  
pp. 415-422 ◽  
Author(s):  
E. R. Johnson ◽  
C. K. Mann ◽  
T. J. Vickers

A system for complete computer control of the important current waveform variables in the operation of pulsed hollow cathode lamps is described and characterized. The system is shown to provide a highly flexible approach for the rapid accumulation of data on lamp operating characteristics. By implementing a simplex optimization technique with the system, it is shown that a selected lamp response (average peak intensity or integrated peak intensity) can be observed as a function of one variable, while all other variables are at values which result in an optimized response. This procedure, which probably could not be carried out without a closed loop system such as that described, avoids the potential difficulties of the one-factor-at-a-time approach. Results are reported for optimization studies of two iron hollow cathode lamps, for a response surface mapping experiment, and for examination of the pulse shapes of iron, calcium, vanadium, and aluminum hollow cathode lamps.


1982 ◽  
Vol 16 (4) ◽  
pp. 265-278
Author(s):  
Bruce Boman

In a recent article appearing in this journal, a decision of the Administrative Appeals Tribunal granting a war pension for ischaemic heart disease arising out of the stresses of military service in World War II was severely criticised. The following is a literature review supporting the Tribunal's judgement by providing evidence for an association between both neurotic illness and stresses of varying severity on the one hand and cardiovascular disease on the other.

