Hypothesis Testing in the Real World

Critics of null hypothesis significance testing suggest that (a) its basic logic is invalid and (b) it addresses a question that is of no interest. In contrast to (a), I argue that the underlying logic of hypothesis testing is actually extremely straightforward and compelling. To substantiate that, I present examples showing that hypothesis testing logic is routinely used in everyday life. These same examples also refute (b) by showing circumstances in which the logic of hypothesis testing addresses a question of prime interest. Null hypothesis significance testing may sometimes be misunderstood or misapplied, but these problems should be addressed by improved education.

Do psychology students interpret null hypothesis significance testing critically?

10.31237/osf.io/9dz8w ◽ 2017

Author(s): Ivan Flis

Keyword(s): Hypothesis Testing ◽ Null Hypothesis ◽ Statistical Significance ◽ Psychology Students ◽ Significance Testing ◽ Null Hypothesis Significance Testing ◽ Graduate Studies ◽ Null Hypothesis Testing ◽ Level Of Understanding ◽ Grade Average

The goal of the study was to describe how Croatian psychology students understand null hypothesis significance testing, given that the way it is usually presented in textbooks is subject to Bayesian and interpretative criticism. The thesis also gives a short overview of the discussions on the meaning of significance testing and how it is taught to students. There were 350 participants from undergraduate and graduate programs at five faculties in Croatia (Zagreb: Centre for Croatian Studies and Faculty of Humanities and Social Sciences; Rijeka; Zadar; Osijek). Another goal was to ascertain whether students' understanding of null hypothesis testing can be predicted by their grades, attitudes and interests. The level of understanding was measured by the Test of statistical significance misinterpretations (NHST test) (Oakes, 1986; Haller and Krauss, 2002). Attitudes toward null hypothesis significance testing were measured by a questionnaire constructed for this study. Grades were operationalized as the grade average of courses taken during undergraduate studies, and as a separate grade average of methodological courses taken during undergraduate and graduate studies. The students showed limited understanding of null hypothesis testing: the percentage of correct answers in the NHST test was not higher than 56% for any of the six items. Croatian students also showed less understanding on each item than the German students in Haller and Krauss's (2002) study. None of the variables (general grade average, average in the methodological courses, two variables measuring the attitude toward null hypothesis significance testing, failing at least one methodological course, and main interest in psychology) predicted the odds of answering the items in the NHST test correctly. The conclusion of the study is that the way the meaning and interpretation of null hypothesis significance testing are taught needs to be reconsidered at Croatian psychology departments.

A Primer on p-Value Thresholds and α-Levels – Two Different Kettles of Fish

German Journal of Agricultural Economics ◽ 10.30430/70.2021.2.123-133 ◽ 2021 ◽ Vol 70 (2) ◽ pp. 123-133

Author(s): Norbert Hirschauer ◽ Sven Grüner ◽ Oliver Mußhoff ◽ Claudia Becker

Keyword(s): Hypothesis Testing ◽ Statistical Inference ◽ Null Hypothesis ◽ Significance Testing ◽ Future Research ◽ P Value ◽ Null Hypothesis Significance Testing ◽ Realistic Assessment

It has often been noted that the “null-hypothesis-significance-testing” (NHST) framework is an inconsistent hybrid of Neyman-Pearson’s “hypothesis testing” and Fisher’s “significance testing” that almost inevitably causes misinterpretations. To facilitate a realistic assessment of the potential and the limits of statistical inference, we briefly recall widespread inferential errors and outline the two original approaches of these famous statisticians. Based on the understanding of their irreconcilable perspectives, we propose “going back to the roots” and using the initial evidence in the data in terms of the size and the uncertainty of the estimate for the purpose of statistical inference. Finally, we make six propositions that hopefully contribute to improving the quality of inferences in future research.

The logic of null hypothesis testing

Behavioral and Brain Sciences ◽ 10.1017/s0140525x98261164 ◽ 1998 ◽ Vol 21 (2) ◽ pp. 197-198 ◽ Cited By ~ 2

Author(s): Edward Erwin

Keyword(s): Hypothesis Testing ◽ Null Hypothesis ◽ Power Analysis ◽ Meta Analysis ◽ Significance Testing ◽ Null Hypothesis Significance Testing ◽ Null Hypothesis Testing

In this commentary, I agree with Chow's treatment of null hypothesis significance testing as a noninferential procedure. However, I dispute his reconstruction of the logic of theory corroboration. I also challenge recent criticisms of NHSTP based on power analysis and meta-analysis.

The historical case against null-hypothesis significance testing

Behavioral and Brain Sciences ◽ 10.1017/s0140525x98501161 ◽ 1998 ◽ Vol 21 (2) ◽ pp. 219-220 ◽ Cited By ~ 3

Author(s): Henderikus J. Stam ◽ Grant A. Pasay

Keyword(s): Hypothesis Testing ◽ Null Hypothesis ◽ Significance Testing ◽ Null Hypothesis Significance Testing ◽ Historical Case ◽ Testing Procedures ◽ The Core

We argue that Chow's defense of hypothesis-testing procedures attempts to restore an aura of objectivity to the core procedures, allowing these to take on the role of judgment that should be reserved for the researcher. We provide a brief overview of what we call the historical case against hypothesis testing and argue that the latter has led to a constrained and simplified conception of what passes for theory in psychology.

Bayesian Hypothesis Testing: An Alternative to Null Hypothesis Significance Testing (NHST) in Psychology and Social Sciences

Bayesian Inference ◽ 10.5772/intechopen.70230 ◽ 2017 ◽ Cited By ~ 7

Author(s): Alonso Ortega ◽ Gorka Navarrete

Keyword(s): Social Sciences ◽ Hypothesis Testing ◽ Null Hypothesis ◽ Significance Testing ◽ Null Hypothesis Significance Testing ◽ Bayesian Hypothesis Testing

A Frequentist Alternative to Significance Testing, p-Values, and Confidence Intervals

Econometrics ◽ 10.3390/econometrics7020026 ◽ 2019 ◽ Vol 7 (2) ◽ pp. 26 ◽ Cited By ~ 7

Author(s): David Trafimow

Keyword(s): Present Article ◽ Confidence Intervals ◽ Null Hypothesis ◽ A Priori ◽ Significance Testing ◽ Population Parameters ◽ Null Hypothesis Significance Testing ◽ P Values ◽ Statistical Procedures ◽ Major Section

There has been much debate about null hypothesis significance testing, p-values without null hypothesis significance testing, and confidence intervals. The first major section of the present article addresses some of the main reasons these procedures are problematic. The conclusion is that none of them are satisfactory. However, there is a new procedure, termed the a priori procedure (APP), that validly aids researchers in obtaining sample statistics that have acceptable probabilities of being close to their corresponding population parameters. The second major section provides a description and review of APP advances. Not only does the APP avoid the problems that plague other inferential statistical procedures, but it is easy to perform too. Although the APP can be performed in conjunction with other procedures, the present recommendation is that it be used alone.
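The abstract above does not state a formula for the APP. As a rough illustration only, the sketch below assumes the simplest case sometimes described for such a priori calculations: estimating a single mean from a normally distributed population, where the required sample size is taken to be n = (z_c / f)^2, with f the desired closeness of the sample mean to the population mean (as a fraction of a standard deviation) and z_c the two-sided normal quantile for the desired probability c. The function name `app_sample_size` and its parameters are hypothetical, and the formula itself is an assumption rather than something confirmed by this listing.

```python
import math
from statistics import NormalDist


def app_sample_size(precision, confidence):
    """Sample size so that the sample mean lies within `precision` standard
    deviations of the population mean with probability `confidence`.

    Assumed setting: one mean, normal population (an illustrative
    simplification, not taken from the abstract).
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    if precision <= 0:
        raise ValueError("precision must be positive")
    # Two-sided quantile of the standard normal distribution.
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # Round up: a fractional participant is not possible.
    return math.ceil((z / precision) ** 2)


if __name__ == "__main__":
    for f in (0.4, 0.2, 0.1):
        print(f"precision={f}: n={app_sample_size(f, 0.95)}")
```

Note that under this assumption the calculation is done before any data are collected, which is what distinguishes it from a test or interval computed after the fact.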

The Numbers Will Love You Back in Return—I Promise

International Journal of Sports Physiology and Performance ◽ 10.1123/ijspp.2016-0214 ◽ 2016 ◽ Vol 11 (4) ◽ pp. 551-554 ◽ Cited By ~ 53

Author(s): Martin Buchheit

Keyword(s): Sample Size ◽ Null Hypothesis ◽ Clinical Medicine ◽ Statistical Significance ◽ Significance Testing ◽ Null Hypothesis Significance Testing ◽ Sport Science ◽ Size Dependent ◽ Research Questions ◽ Per Se

The first sport-science-oriented and comprehensive paper on magnitude-based inferences (MBI) was published 10 years ago in the first issue of this journal. While debate continues, MBI is today well established in sport science and in other fields, particularly clinical medicine, where practical/clinical significance often takes priority over statistical significance. In this commentary, some reasons why both academics and sport scientists should abandon null-hypothesis significance testing and embrace MBI are reviewed. Apparent limitations and future areas of research are also discussed. The following arguments are presented: P values and, in turn, study conclusions are sample-size dependent, irrespective of the size of the effect; significance does not inform on the magnitude of effects, yet magnitude is what matters the most; MBI allows authors to be honest with their sample size and better acknowledge trivial effects; the examination of magnitudes per se helps provide better research questions; MBI can be applied to assess changes in individuals; MBI improves data visualization; and MBI is supported by spreadsheets freely available on the Internet. Finally, recommendations to define the smallest important effect and improve the presentation of standardized effects are presented.
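The claim that P values depend on sample size regardless of effect size can be made concrete with a small sketch. It assumes a one-sample z-test with known variance and an observed standardized effect that is held fixed while n changes; the function names are hypothetical and the code does not implement MBI or reproduce the commentary's own calculations.

```python
import math


def two_sided_p(effect_size, n):
    """Two-sided p-value of a one-sample z-test.

    `effect_size` is the observed mean difference in standard-deviation
    units (assumed known variance), `n` the sample size.
    """
    z = effect_size * math.sqrt(n)
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


def p_values_by_sample_size(effect_size, sample_sizes):
    """Map each sample size to the p-value for the same observed effect."""
    return {n: two_sided_p(effect_size, n) for n in sample_sizes}


if __name__ == "__main__":
    # The same small effect (0.2 SD) at increasing sample sizes.
    for n, p in p_values_by_sample_size(0.2, [10, 50, 100, 400]).items():
        print(f"n={n:4d}  p={p:.4f}")
```

With the effect fixed at 0.2 standard deviations, the p-value falls from well above 0.05 at n = 10 to well below it at n = 400, although the magnitude of the effect has not changed.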

In support of null hypothesis significance testing

Proceedings of The Royal Society B Biological Sciences ◽ 10.1098/rsbl.2003.0105 ◽ 2004 ◽ Vol 271 (suppl_3) ◽ Cited By ~ 12

Author(s): Michael Mogie

Keyword(s): Null Hypothesis ◽ Significance Testing ◽ Null Hypothesis Significance Testing

The Role of the Effect Size Measure in Modern Psychological Research

10.52363/dcpp-2021.2.9 ◽ 2021

Author(s): Валерій Боснюк

Keyword(s): Null Hypothesis ◽ Significance Testing ◽ P Value ◽ Null Hypothesis Significance Testing

For many years, psychological research papers have used the null hypothesis significance testing (NHST) procedure, with dedicated statistical tests, to confirm study results. In most cases, the p-value is treated as equivalent to the importance of the results obtained and to the strength of the scientific evidence for a study's practical and theoretical effect. Such incorrect use and interpretation of the p-value casts doubt on the use of statistics in general and threatens the development of psychology as a science. Equating a statistical conclusion with a scientific conclusion, focusing exclusively on novelty in research, researchers' ritual adherence to the 0.05 significance level, and reliance on categorical "yes/no" statistical decisions lead psychology to accumulate only findings that an effect exists, without regard to its magnitude or practical value. This paper analyzes the limitations of the p-value in interpreting the results of psychological research and the advantages of reporting effect size. Using effect sizes allows a shift from dichotomous to estimation thinking, makes it possible to judge the value of results independently of the level of statistical significance, and supports more rational and well-grounded decisions. The paper argues that an author formulating a study's conclusions should not rely on the level of statistical significance as the sole indicator. Meaningful conclusions should rest on a sensible balance of the p-value and other, no less important parameters, one of which is effect size. An effect (difference, relationship, association) can be statistically significant while its practical (clinical) value is negligible or trivial. "Statistically significant" does not mean "useful," "important," "valuable," or "substantial." Psychologists should therefore be required to consider the observed effect size when interpreting study results.

The Falsificationist Foundation for Null Hypothesis Significance Testing

Statistical and Fuzzy Approaches to Data Processing, with Applications to Econometrics and Other Areas - Studies in Computational Intelligence ◽ 10.1007/978-3-030-45619-1_16 ◽ 2020 ◽ pp. 219-226

Author(s): David Trafimow

Keyword(s): Null Hypothesis ◽ Significance Testing ◽ Null Hypothesis Significance Testing
