Do psychology students interpret null hypothesis significance testing critically?

Psychology Students ◽

Significance Testing ◽

Graduate Studies ◽

Null Hypothesis Testing ◽

Level Of Understanding ◽

Grade Average

The goal of the study was to describe how Croatian psychology students understand null hypothesis significance testing, given that the interpretation usually presented in textbooks is subject to Bayesian and interpretative criticism. The thesis also gives a short overview of the discussions on the meaning of significance testing and how it is taught to students. A second goal was to ascertain whether students’ understanding of null hypothesis testing can be predicted by their grades, attitudes and interests. The participants were 350 undergraduate and graduate students from five faculties in Croatia (the Centre for Croatian Studies and the Faculty of Humanities and Social Sciences in Zagreb, and from Rijeka, Zadar and Osijek). The level of understanding of null hypothesis testing was measured by the Test of statistical significance misinterpretations (NHST test) (Oakes, 1986; Haller and Krauss, 2002). Attitudes toward null hypothesis significance testing were measured by a questionnaire constructed for this study. Grades were operationalized as the grade average of courses taken during undergraduate studies, and as a separate grade average of methodological courses taken during undergraduate and graduate studies. The students showed limited understanding of null hypothesis testing: the percentage of correct answers on the NHST test did not exceed 56% for any of the six items. Croatian students also showed less understanding on each item than the German students in Haller and Krauss’s (2002) study. None of the variables (general grade average, average in the methodological courses, two variables measuring the attitude toward null hypothesis significance testing, failing at least one methodological course, and the variable of main interest in psychology) predicted the odds of answering the NHST test items correctly.
The study concludes that the practices used to teach students the meaning and interpretation of null hypothesis significance testing need to be examined at Croatian psychology departments.

Failing Grade: 89% of Introduction-to-Psychology Textbooks That Define or Explain Statistical Significance Do So Incorrectly

Advances in Methods and Practices in Psychological Science ◽

10.1177/2515245919858072 ◽

2019 ◽

Vol 2 (3) ◽

pp. 233-239 ◽

Cited By ~ 3

Author(s):

Scott A. Cassidy ◽

Ralitza Dimova ◽

Benjamin Giguère ◽

Jeffrey R. Spence ◽

David J. Stanley

Keyword(s):

United States ◽

Null Hypothesis ◽

Classroom Instruction ◽

Psychology Students ◽

Introductory Psychology ◽

Significance Testing ◽

Do So

Null-hypothesis significance testing (NHST) is commonly used in psychology; however, it is widely acknowledged that NHST is not well understood by either psychology professors or psychology students. In the current study, we investigated whether introduction-to-psychology textbooks accurately define and explain statistical significance. We examined 30 introductory-psychology textbooks, including the best-selling books from the United States and Canada, and found that 89% incorrectly defined or explained statistical significance. Incorrect definitions and explanations were most often consistent with the odds-against-chance fallacy. These results suggest that it is common for introduction-to-psychology students to be taught incorrect interpretations of statistical significance. We hope that our results will create awareness among authors of introductory-psychology books and provide the impetus for corrective action. To help with classroom instruction, we provide slides that correctly describe NHST and may be useful for introductory-psychology instructors.
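
The odds-against-chance fallacy mentioned above treats a significant result as if it gave the probability that the null hypothesis is true. The following Python sketch is an illustration only, not material from the study: it applies Bayes' rule under assumed values (a prior probability that the null is true, a significance level, and a power) to show that the probability of the null given a significant result need not equal the significance level. The function name `prob_null_given_significant` and all numbers are hypothetical.

```python
def prob_null_given_significant(alpha: float, power: float, prior_null: float) -> float:
    """P(H0 is true | result is significant), by Bayes' rule.

    alpha      -- P(significant | H0 true), the significance level
    power      -- P(significant | H0 false)
    prior_null -- assumed prior probability that H0 is true
    """
    false_positive = alpha * prior_null          # significant and H0 true
    true_positive = power * (1.0 - prior_null)   # significant and H0 false
    return false_positive / (false_positive + true_positive)


# Assumed, purely illustrative inputs: alpha = 0.05, power = 0.5,
# and a prior probability of 0.8 that the null hypothesis is true.
posterior = prob_null_given_significant(alpha=0.05, power=0.5, prior_null=0.8)
print(f"P(H0 | significant) = {posterior:.3f}")
```

Under these assumed inputs the posterior probability of the null is well above 0.05, which is why reading a significance level as "the probability that the result is due to chance" is a misinterpretation.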

The logic of null hypothesis testing

Behavioral and Brain Sciences ◽

10.1017/s0140525x98261164 ◽

1998 ◽

Vol 21 (2) ◽

pp. 197-198 ◽

Cited By ~ 2

Author(s):

Edward Erwin

Keyword(s):

Hypothesis Testing ◽

Null Hypothesis ◽

Power Analysis ◽

Meta Analysis ◽

Significance Testing ◽

Null Hypothesis Testing

In this commentary, I agree with Chow's treatment of the null hypothesis significance testing procedure (NHSTP) as a noninferential procedure. However, I dispute his reconstruction of the logic of theory corroboration. I also challenge recent criticisms of NHSTP based on power analysis and meta-analysis.

The Numbers Will Love You Back in Return—I Promise

International Journal of Sports Physiology and Performance ◽

10.1123/ijspp.2016-0214 ◽

2016 ◽

Vol 11 (4) ◽

pp. 551-554 ◽

Cited By ~ 53

Author(s):

Martin Buchheit

Keyword(s):

Sample Size ◽

Null Hypothesis ◽

Clinical Medicine ◽

Significance Testing ◽

Sport Science ◽

Size Dependent ◽

Research Questions ◽

Per Se

The first sport-science-oriented and comprehensive paper on magnitude-based inferences (MBI) was published 10 years ago in the first issue of this journal. While debate continues, MBI is today well established in sport science and in other fields, particularly clinical medicine, where practical/clinical significance often takes priority over statistical significance. In this commentary, some reasons why both academics and sport scientists should abandon null-hypothesis significance testing and embrace MBI are reviewed. Apparent limitations and future areas of research are also discussed. The following arguments are presented: P values and, in turn, study conclusions are sample-size dependent, irrespective of the size of the effect; significance does not inform on magnitude of effects, yet magnitude is what matters the most; MBI allows authors to be honest with their sample size and better acknowledge trivial effects; the examination of magnitudes per se helps provide better research questions; MBI can be applied to assess changes in individuals; MBI improves data visualization; and MBI is supported by spreadsheets freely available on the Internet. Finally, recommendations to define the smallest important effect and improve the presentation of standardized effects are presented.
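
The claim that P values depend on sample size irrespective of the size of the effect can be sketched numerically. The Python example below is an illustration under stated assumptions, not the commentary's own analysis: it assumes a one-sample z-test with known standard deviation, so that an observed standardized mean difference `effect_size` at sample size `n` gives z = effect_size × √n. The function name `two_sided_p_value` and the chosen values are hypothetical.

```python
import math


def two_sided_p_value(effect_size: float, n: int) -> float:
    """Two-sided p value of a one-sample z-test (known SD assumed).

    effect_size -- observed mean difference in standard-deviation units
    n           -- sample size
    """
    z = abs(effect_size) * math.sqrt(n)
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# The same small observed effect (0.2 SD) at increasing sample sizes.
for n in (10, 50, 200, 1000):
    print(f"n = {n:4d}  p = {two_sided_p_value(0.2, n):.4f}")
```

With the observed effect held fixed at 0.2 standard deviations, the p value falls steadily as `n` grows and crosses the conventional 0.05 threshold somewhere between n = 50 and n = 200, even though the magnitude of the effect has not changed.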

Tests of Statistical Significance

10.1093/oso/9780190222055.003.0003 ◽

2018 ◽

Author(s):

Brian D. Haig

Keyword(s):

Null Hypothesis ◽

Lessons Learned ◽

Significance Testing ◽

Behavioral Sciences ◽

Tests Of Significance ◽

Philosophy Of Statistics ◽

Critical Literature

Chapter 3 provides a brief overview of null hypothesis significance testing and points out its primary defects. It then outlines the neo-Fisherian account of tests of statistical significance, along with a second option contained in the philosophy of statistics known as the error-statistical philosophy, both of which are defensible. Tests of statistical significance are the most widely used means for evaluating hypotheses and theories in psychology. A massive critical literature has developed in psychology, and the behavioral sciences more generally, regarding the worth of these tests. The chapter provides a list of important lessons learned from the ongoing debates about tests of significance.

Statistical significance testing, hypothetico-deductive method, and theory evaluation

Behavioral and Brain Sciences ◽

10.1017/s0140525x00262444 ◽

2000 ◽

Vol 23 (2) ◽

pp. 292-293 ◽

Cited By ~ 1

Author(s):

Brian D. Haig

Keyword(s):

Null Hypothesis ◽

Meta Analysis ◽

Significance Testing ◽

Statistical Significance Testing ◽

Theory Evaluation ◽

Limited Role ◽

Deductive Method

Chow's endorsement of a limited role for null hypothesis significance testing is a needed corrective of research malpractice, but his decision to place this procedure in a hypothetico-deductive framework of Popperian cast is unwise. Various failures of this version of the hypothetico-deductive method have negative implications for Chow's treatment of significance testing, meta-analysis, and theory evaluation.

Costs and benefits of statistical significance tests

Behavioral and Brain Sciences ◽

10.1017/s0140525x98481160 ◽

1998 ◽

Vol 21 (2) ◽

pp. 218-219

Author(s):

Michael G. Shafto

Keyword(s):

Null Hypothesis ◽

Testing Procedure ◽

Significance Testing ◽

Costs And Benefits ◽

Significance Tests ◽

Social Scientists ◽

Conventional Tests

Chow's book provides a thorough analysis of the confusing array of issues surrounding conventional tests of statistical significance. This book should be required reading for behavioral and social scientists. Chow concludes that the null-hypothesis significance-testing procedure (NHSTP) plays a limited, but necessary, role in the experimental sciences. Another possibility is that – owing in part to its metaphorical underpinnings and convoluted logic – the NHSTP is declining in importance in those few sciences in which it ever played a role.

Hypothesis Testing in the Real World

Educational and Psychological Measurement ◽

10.1177/0013164416667984 ◽

2016 ◽

Vol 77 (4) ◽

pp. 663-672 ◽

Cited By ~ 1

Author(s):

Jeff Miller

Keyword(s):

Hypothesis Testing ◽

Everyday Life ◽

Real World ◽

Null Hypothesis ◽

Significance Testing ◽

Basic Logic ◽

The Real

Critics of null hypothesis significance testing suggest that (a) its basic logic is invalid and (b) it addresses a question that is of no interest. In contrast to (a), I argue that the underlying logic of hypothesis testing is actually extremely straightforward and compelling. To substantiate that, I present examples showing that hypothesis testing logic is routinely used in everyday life. These same examples also refute (b) by showing circumstances in which the logic of hypothesis testing addresses a question of prime interest. Null hypothesis significance testing may sometimes be misunderstood or misapplied, but these problems should be addressed by improved education.

New-day statistical thinking: A bold proposal for a radical change in practices

Journal of International Business Studies ◽

10.1057/s41267-019-00288-8 ◽

2019 ◽

Vol 51 (2) ◽

pp. 274-278 ◽

Cited By ~ 1

Author(s):

Arjen van Witteloostuijn

Keyword(s):

Null Hypothesis ◽

Radical Change ◽

Significance Testing ◽

Statistical Thinking ◽

Special Issue ◽

Statistical Significance Testing ◽

Questionable Research Practices ◽

Know How ◽

Null Hypothesis Testing

In this commentary, I argue why we should stop engaging in null hypothesis statistical significance testing altogether. Artificial and misleading it may be, but we know how to play the p value threshold and null hypothesis-testing game. We feel secure; we love the certainty. The fly in the ointment is that the conventions have led to questionable research practices. Wasserstein, Schirm, & Lazar (Am Stat 73(sup1):1–19, 2019. 10.1080/00031305.2019.1583913) explain why, in their thought-provoking editorial introducing a special issue of The American Statistician: “As ‘statistical significance’ is used less, statistical thinking will be used more.” Perhaps we empirical researchers can together find a way to work ourselves out of the straitjacket that binds us.

The null hypothesis is always rejected with statistical tricks: Why do you need it?

Revista Interamericana de Psicología/Interamerican Journal of Psychology ◽

10.30849/rip/ijp.v53i1.1166 ◽

2019 ◽

Vol 53 (1) ◽

pp. 17-27

Author(s):

Freddy A. Paniagua

Keyword(s):

Null Hypothesis ◽

Effect Sizes ◽

Practical Significance ◽

Significance Testing ◽

Behavioral Sciences ◽

Very High

Ferguson (2015) observed that the proportion of studies supporting the experimental hypothesis and rejecting the null hypothesis is very high. This paper argues that the reason is that researchers in the behavioral sciences have learned that the null hypothesis can always be rejected if one knows the statistical tricks for rejecting it (e.g., the probability of rejecting the null hypothesis is higher with p = 0.05 than with p = 0.01). Examples of the advancement of science without the need to formulate the null hypothesis are also discussed, as well as alternatives to null hypothesis significance testing (NHST), such as effect sizes, and the importance of distinguishing the statistical significance of results from their practical significance.

A Primer on p-Value Thresholds and α-Levels – Two Different Kettles of Fish

German Journal of Agricultural Economics ◽

10.30430/70.2021.2.123-133 ◽

2021 ◽

Vol 70 (2) ◽

pp. 123-133

Author(s):

Norbert Hirschauer ◽

Sven Grüner ◽

Oliver Mußhoff ◽

Claudia Becker

Keyword(s):

Hypothesis Testing ◽

Statistical Inference ◽

Null Hypothesis ◽

Significance Testing ◽

Future Research ◽

P Value ◽

Realistic Assessment

It has often been noted that the “null-hypothesis-significance-testing” (NHST) framework is an inconsistent hybrid of Neyman-Pearson’s “hypothesis testing” and Fisher’s “significance testing” that almost inevitably causes misinterpretations. To facilitate a realistic assessment of the potential and the limits of statistical inference, we briefly recall widespread inferential errors and outline the two original approaches of these famous statisticians. Based on the understanding of their irreconcilable perspectives, we propose “going back to the roots” and using the initial evidence in the data in terms of the size and the uncertainty of the estimate for the purpose of statistical inference. Finally, we make six propositions that hopefully contribute to improving the quality of inferences in future research.