Meta-analysis, power analysis, and the null-hypothesis significance-test procedure

1998 ◽  
Vol 21 (2) ◽  
pp. 216-217 ◽  
Author(s):  
Joseph S. Rossi

Chow's (1996) defense of the null-hypothesis significance-test procedure (NHSTP) is thoughtful and compelling in many respects. Nevertheless, techniques such as meta-analysis, power analysis, effect size estimation, and confidence intervals can be useful supplements to NHSTP in furthering the cumulative nature of behavioral research, as illustrated by the history of research on the spontaneous recovery of verbal learning.

1998 ◽  
Vol 21 (2) ◽  
pp. 228-235 ◽  
Author(s):  
Siu L. Chow

Entertaining diverse assumptions about empirical research, commentators give a wide range of verdicts on the NHSTP defence in Statistical significance. The null-hypothesis significance-test procedure (NHSTP) is defended in a framework in which deductive and inductive rules are deployed in theory corroboration in the spirit of Popper's Conjectures and refutations (1968b). The defensible hypothetico-deductive structure of the framework is used to make explicit the distinctions between (1) substantive and statistical hypotheses, (2) statistical alternative and conceptual alternative hypotheses, and (3) making statistical decisions and drawing theoretical conclusions. These distinctions make it easier to show that (1) H0 can be true, (2) the effect size is irrelevant to theory corroboration, and (3) “strong” hypotheses make no difference to NHSTP. Reservations about statistical power, meta-analysis, and the Bayesian approach are still warranted.


Psychology ◽  
2019 ◽  
Author(s):  
David B. Flora

Simply put, effect size (ES) is the magnitude or strength of association between or among variables. Effect sizes (ESs) are commonly represented numerically (i.e., as parameters for population ESs and statistics for sample estimates of population ESs) but also may be communicated graphically. Although the word “effect” may imply that an ES quantifies the strength of a causal association (“cause and effect”), ESs are used more broadly to represent any empirical association between variables. Effect sizes serve three general purposes: research results reporting, power analysis, and meta-analysis. Even under the same research design, an ES that is appropriate for one of these purposes may not be ideal for another. Effect size can be conveyed graphically or numerically using either unstandardized metrics, which are interpreted relative to the original scales of the variables involved (e.g., the difference between two means or an unstandardized regression slope), or standardized metrics, which are interpreted in relative terms (e.g., Cohen’s d or multiple R2). Whereas unstandardized ESs and graphs illustrating ES are typically most effective for research reporting, that is, communicating the original findings of an empirical study, many standardized ES measures have been developed for use in power analysis and especially meta-analysis. Although the concept of ES is clearly fundamental to data analysis, ES reporting has been advocated as an essential complement to null hypothesis significance testing (NHST), or even as a replacement for NHST. A null hypothesis significance test involves making a dichotomous judgment about whether to reject a hypothesis that a true population effect equals zero. Even in the context of a traditional NHST paradigm, ES is a critical concept because of its central role in power analysis.
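The distinction the abstract draws between unstandardized and standardized metrics can be illustrated with a short sketch. The data and function below are hypothetical examples, not taken from the article: the unstandardized ES is simply the raw difference between two group means, while Cohen's d divides that difference by the pooled standard deviation.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Standardized mean difference (Cohen's d) using the pooled SD."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

# Hypothetical scores for two groups
treatment = [5.1, 6.3, 5.8, 6.9, 5.5]
control = [4.2, 5.0, 4.8, 5.6, 4.4]

raw_diff = mean(treatment) - mean(control)  # unstandardized ES, in scale units
d = cohens_d(treatment, control)            # standardized ES, in SD units
```

The raw difference is interpreted on the original measurement scale, whereas d is scale-free, which is why standardized metrics like d dominate in power analysis and meta-analysis, where results from differently scaled studies must be combined.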


1998 ◽  
Vol 21 (2) ◽  
pp. 213-213 ◽  
Author(s):  
Marks R. Nester

Chow's one-tailed null-hypothesis significance-test procedure, with its rationale based on the elimination of chance influences, is not appropriate for theory-corroboration experiments. Estimated effect sizes and their associated standard errors or confidence limits will always suffice.


1998 ◽  
Vol 21 (2) ◽  
pp. 194-195
Author(s):  
Charles F. Blaich

If the NHSTP is essential for controlling for chance, why is there little, if any, discussion of the nature of chance by Chow and other advocates of the procedure? Also, many criticisms that Chow takes to be aimed against the NHSTP (null-hypothesis significance-test procedure) are actually directed against the kind of theory that is tested by the procedure.


2018 ◽  
Vol 22 (4) ◽  
pp. 469-476 ◽  
Author(s):  
Ian J. Davidson

The reporting and interpretation of effect sizes is often promoted as a panacea for the ramifications of institutionalized statistical rituals associated with the null-hypothesis significance test. Mechanical objectivity—conflating the use of a method with the obtainment of truth—is a useful theoretical tool for understanding the possible failure of effect size reporting (Porter, 1995). This article helps elucidate the ouroboros of psychological methodology. This is the cycle of improved tools to produce trustworthy knowledge, leading to their institutionalization and adoption as forms of thinking, leading to methodologists eventually admonishing researchers for relying too heavily on rituals, finally leading to the production of more new improved quantitative tools that may follow along this circular path. Despite many critiques and warnings, research psychologists’ superficial adoption of effect sizes might preclude expert interpretation much like in the null-hypothesis significance test as widely received. One solution to this situation is bottom-up: promoting a balance of mechanical objectivity and expertise in the teaching of methods and research. This would require the acceptance and encouragement of expert interpretation within psychological science.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Elizabeth Collins ◽  
Roger Watt

Statistical power is key to planning studies if understood and used correctly. Power is the probability of obtaining a statistically significant p-value, given a set alpha, sample size, and population effect size. The literature suggests that psychology studies are underpowered due to small sample sizes, and that researchers do not hold accurate intuitions about sensible sample sizes and associated levels of power. In this study, we surveyed 214 psychological researchers, and asked them about their experiences of using a priori power analysis, effect size estimation methods, post hoc power, and their understanding of what the term “power” actually means. Power analysis use was high, although participants reported difficulties with complex research designs, and effect size estimation. Participants also typically could not accurately define power. If psychological researchers are expected to compute a priori power analyses to plan their research, clearer educational material and guidelines should be made available.
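The abstract's definition of power (the probability of a significant result given alpha, sample size, and population effect size) can be made concrete with a small sketch. This is an illustrative normal-approximation calculation for a two-sided, two-sample t-test, not a method from the study itself; the function name and figures are assumptions for the example.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample t-test with equal
    group sizes, using the normal approximation."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)       # critical value for given alpha
    ncp = d * sqrt(n_per_group / 2)         # noncentrality parameter
    # Probability the test statistic falls in either rejection region
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)

# A medium effect (d = 0.5) with 64 participants per group yields
# roughly 80% power at alpha = .05, a common planning benchmark.
power = power_two_sample(0.5, 64)
```

Inverting this relationship (solving for the n that achieves a target power) is what an a priori power analysis does, and the need to supply a plausible population effect size up front is exactly the estimation difficulty the surveyed researchers reported.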


2021 ◽  
Author(s):  
Noah N'Djaye Nikolai van Dongen

The content of this dissertation spans four years of work, which was carried out in the Netherlands (Tilburg University and University of Amsterdam) and Italy (University of Turin). It is part of the ERC project “Making Scientific Inference More Objective” led by professor Jan Sprenger, for which philosophy of science and empirical research were combined. The dissertation can be summarized as a small set of modest attempts to contribute to improving scientific practice. Each of these attempts was geared towards either increasing understanding of a particular problem or making a contribution to how science can be practiced. The general focus was on philosophical nuance while remaining methodologically practicable. The five papers contained in this dissertation are both methodologically and philosophically diverse. The first three (Chapters 2 through 4) are more empirical in nature and are focused on understanding and evaluating how science is practiced: a meta-analysis of semantic intuitions research in experimental philosophy; a systematic review on essay literature on the null hypothesis significance test; and an experiment on how teams of statisticians analyze the same data. The last two (Chapters 5 and 6) are focused on the improvement of scientific practice by providing tools for the improvement of empirical research with a strong philosophical foundation: a practicable and testable definition of scientific objectivity and a Bayesian operationalization of Popper’s concept of a severe test.


1997 ◽  
Vol 8 (1) ◽  
pp. 12-15 ◽  
Author(s):  
Robert P. Abelson

Criticisms of null-hypothesis significance tests (NHSTs) are reviewed. Used as formal, two-valued decision procedures, they often generate misleading conclusions. However, critics who argue that NHSTs are totally meaningless because the null hypothesis is virtually always false are overstating their case. Critics also neglect the whole class of valuable significance tests that assess goodness of fit of models to data. Even as applied to simple mean differences, NHSTs can be rhetorically useful in defending research against criticisms that random factors adequately explain the results, or that the direction of mean difference was not demonstrated convincingly. Principled argument and counterargument produce the lore, or communal understanding, in a field, which in turn helps guide new research. Alternative procedures (confidence intervals, effect sizes, and meta-analysis) are discussed. Although these alternatives are not totally free from criticism either, they deserve more frequent use, without an unwise ban on NHSTs.

