Effect Sizes and "What If" Analyses as Supplements to Statistical Significance Tests

2003 ◽  
Vol 25 (4) ◽  
pp. 310-319 ◽  
Author(s):  
Susan Pedersen


Author(s):  
H. S. Steyn ◽  
S. M. Ellis

The determination of the significance of differences in means and of relationships between variables is important in many empirical studies. Usually only statistical significance is reported, which does not necessarily indicate an important (practically significant) difference or relationship. In studies based on probability samples, effect size indices should be reported in addition to statistical significance tests in order to comment on practical significance. Where complete populations or convenience samples are used, the determination of statistical significance is, strictly speaking, no longer relevant, while effect size indices can still serve as a basis for judging significance. This article examines the use of effect size indices to establish practical significance. It is also shown how these indices are utilized in a few fields of statistical application and how they receive attention in the statistical literature and in computer packages. The use of effect sizes is illustrated by a few examples from the research literature.
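Cohen's d is the most familiar of the effect size indices the abstract has in mind. A minimal sketch follows, with made-up data and Cohen's conventional 0.2/0.5/0.8 benchmarks (the abstract itself specifies neither), showing how practical significance can be judged independently of a p-value:

```python
# Minimal sketch: Cohen's d as an effect size index for practical significance.
# Data are simulated; benchmarks (0.2 small, 0.5 medium, 0.8 large) follow
# Cohen's conventions, not anything stated in the abstract.
import numpy as np

def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

rng = np.random.default_rng(0)
group_a = rng.normal(10.0, 2.0, size=50)   # hypothetical measurements
group_b = rng.normal(11.2, 2.0, size=50)

d = cohens_d(group_b, group_a)
label = "large" if abs(d) >= 0.8 else "medium" if abs(d) >= 0.5 else "small"
print(f"d = {d:.2f} ({label} by Cohen's benchmarks)")
```

Note that this judgment is available even for a complete population or convenience sample, where a p-value would not be meaningful.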


1998 ◽  
Vol 15 (2) ◽  
pp. 103-118 ◽  
Author(s):  
Vinson H. Sutlive ◽  
Dale A. Ulrich

The unqualified use of statistical significance tests for interpreting the results of empirical research has been called into question by researchers in a number of behavioral disciplines. This paper reviews what statistical significance tells us and what it does not, with particular attention paid to criticisms of using the results of these tests as the sole basis for evaluating the overall significance of research findings. In addition, implications for adapted physical activity research are discussed. Based on the recent literature in other disciplines, several recommendations for evaluating and reporting research findings are made. They include calculating and reporting effect sizes, selecting an alpha level larger than the conventional .05 level, placing greater emphasis on replication of results, evaluating results in a sample-size context, and employing simple research designs. Adapted physical activity researchers are encouraged to use specific modifiers when describing findings as significant.


Beverages ◽  
2020 ◽  
Vol 6 (2) ◽  
pp. 35 ◽  
Author(s):  
Beth Desira ◽  
Shaun Watson ◽  
George Van Doorn ◽  
Justin Timora ◽  
Charles Spence

Our emotions influence our perception. To determine whether emotion influences the perception of beer, 32 participants watched either a scene from the movie WALL-E to induce joviality, or a short clip from The Shawshank Redemption to induce sadness. The participants were then required to sample up to 250 mL of Yenda Pale Ale beer and rate it on a variety of taste and flavor characteristics (e.g., bitterness) before completing the Positive and Negative Affect Schedule-X (PANAS-X). The data were analyzed using Bayesian t-tests and Null Hypothesis Significance Tests (NHSTs). After applying conservative corrections for multiple comparisons, the NHSTs failed to reach statistical significance. However, the effect sizes suggested that inducing joviality, relative to inducing sadness, resulted in the beer being rated as (a) tasting more pleasant, (b) tasting sweeter, and (c) being of higher quality. Following the induction of joviality, participants were also willing to pay more for the beer. The Bayesian analyses indicated that induced emotion can influence flavor perception for complex taste stimuli. The effect sizes and Bayesian analyses are interpreted in terms of the feelings-as-information theory. These preliminary findings can tentatively be applied to real-world environments such as venues that serve and/or market alcohol.
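The study's exact pipeline is not reproduced here, but the general pattern the abstract describes (per-scale t-tests with a conservative Bonferroni correction, reporting Cohen's d alongside each p-value) can be sketched on simulated rating data. Group sizes, scale count, and means below are illustrative assumptions, not the study's values:

```python
# Hedged sketch of the analysis pattern: independent-samples t-tests on
# several rating scales, Bonferroni-corrected, with Cohen's d reported
# alongside p-values. All data are simulated, not the study's.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_scales = 5                    # e.g., pleasantness, sweetness, quality, ...
alpha = 0.05 / n_scales         # conservative Bonferroni correction

for scale in range(n_scales):
    jovial = rng.normal(6.0, 1.5, size=16)   # hypothetical ratings, joviality group
    sad = rng.normal(5.4, 1.5, size=16)      # hypothetical ratings, sadness group
    t, p = stats.ttest_ind(jovial, sad)
    pooled_sd = np.sqrt((np.var(jovial, ddof=1) + np.var(sad, ddof=1)) / 2)
    d = (jovial.mean() - sad.mean()) / pooled_sd
    verdict = "significant" if p < alpha else "n.s."
    print(f"scale {scale}: p = {p:.3f} ({verdict}), d = {d:.2f}")
```

With small samples, exactly the pattern the abstract reports can arise: corrected p-values miss the threshold while the effect sizes remain informative.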


2017 ◽  
Author(s):  
C Patrick Doncaster ◽  
Thomas H G Ezard

Statistical significance provides evidence for or against an explanation of a population of interest, not a description of data sampled from the population. This simple distinction gets ignored in hundreds of thousands of research publications yearly, which confuse statistical with biological significance by referring to hypothesis-testing analyses as demonstrating significant results. Here we identify three key steps to objective reporting of evidence-based analyses. Firstly, by interpreting P-values correctly as explanation not description, authors set their inferences in the context of the design of the study and its purpose to test for effects of biologically relevant size; nowhere in this process is it informative to use the word ‘significant’. Secondly, empirical effect sizes demand interpretation with respect to a size of relevance to the test hypothesis. Thirdly, even without an a priori expectation of biological relevance, authors can and should interpret significance tests with respect to effects of reliably detectable size.
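The third step can be made concrete with a standard power calculation: fix the sample size and alpha, solve for the smallest effect detectable at a chosen power, and interpret the test against effects of at least that size. A minimal sketch using statsmodels, with illustrative numbers not drawn from the paper:

```python
# Sketch of interpreting a test against effects of reliably detectable size:
# solve for the smallest standardized effect a two-sample t-test can detect
# given the sample size, alpha, and target power. Numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
detectable_d = analysis.solve_power(effect_size=None, nobs1=30,
                                    alpha=0.05, power=0.80)
print(f"Smallest reliably detectable effect: d = {detectable_d:.2f}")

# A non-significant result then argues against effects of d >= detectable_d,
# not against the existence of any effect at all.
```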


Author(s):  
Scott B. Morris ◽  
Arash Shokri

To understand and communicate research findings, it is important for researchers to consider two types of information provided by research results: the magnitude of the effect and the degree of uncertainty in the outcome. Statistical significance tests have long served as the mainstream method for statistical inference. However, the widespread misinterpretation and misuse of significance tests has led critics to question their usefulness in evaluating research findings and to raise concerns about the far-reaching effects of this practice on scientific progress. An alternative approach involves reporting and interpreting measures of effect size along with confidence intervals. An effect size is an indicator of the magnitude and direction of a statistical observation. Effect size statistics have been developed to represent a wide range of research questions, including indicators of the mean difference between groups, the relative odds of an event, and the degree of correlation among variables. Effect sizes play a key role in evaluating practical significance, conducting power analysis, and conducting meta-analysis. While effect sizes summarize the magnitude of an effect, confidence intervals represent the degree of uncertainty in the result. By presenting a range of plausible alternative values that might have occurred due to sampling error, confidence intervals provide an intuitive indicator of how strongly researchers should rely on the results of a single study.
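As a minimal sketch of this reporting style (assumed data; a percentile bootstrap is used here for simplicity, though noncentral-t intervals are also common), an effect size can be reported together with its confidence interval like so:

```python
# Minimal sketch: report an effect size with a confidence interval rather
# than a bare p-value. A percentile bootstrap keeps the example
# dependency-free; data are simulated.
import numpy as np

def cohens_d(x, y):
    pooled_sd = np.sqrt((np.var(x, ddof=1) + np.var(y, ddof=1)) / 2)
    return (np.mean(x) - np.mean(y)) / pooled_sd

rng = np.random.default_rng(2)
x = rng.normal(0.5, 1.0, size=40)    # hypothetical treatment scores
y = rng.normal(0.0, 1.0, size=40)    # hypothetical control scores

# Resample each group with replacement and recompute d to map its sampling error.
boot = [cohens_d(rng.choice(x, len(x)), rng.choice(y, len(y)))
        for _ in range(5000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"d = {cohens_d(x, y):.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

A wide interval signals that a single study pins the effect down only loosely, which is exactly the uncertainty a bare significance verdict hides.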


2016 ◽  
Vol 8 (1) ◽  
pp. 18-45
Author(s):  
Klaus G. Troitzsch

This article discusses the question of whether significance tests on simulation results are meaningful at all. It argues that it is the effect size, much more than the mere existence of an effect, that matters, and that what is important is the description of the distribution function of the stochastic process incorporated in the simulation model. This is particularly true when the distribution is far from normal, as is often the case when the simulation model is nonlinear. To this end, the article uses three different agent-based models to demonstrate that the effects of input parameters on output metrics can often be made “statistically significant” at any desired level by increasing the number of runs, even for negligible effect sizes. The examples are also used to give hints as to how many runs are necessary to estimate effect sizes and how the input parameters determine output metrics.
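The article's central demonstration can be reproduced in miniature: give two model configurations a negligible true difference in an output metric and watch the p-value collapse as runs accumulate, while the effect size stays negligible. A sketch with simulated output (the agent-based models themselves are not reproduced here):

```python
# Demonstration of the article's point: with a negligible true effect
# (d ≈ 0.02), the p-value of a t-test can be driven below any threshold
# simply by adding simulation runs, while the effect size stays negligible.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
for n_runs in (100, 10_000, 1_000_000):
    a = rng.normal(0.00, 1.0, size=n_runs)   # output metric, parameter off
    b = rng.normal(0.02, 1.0, size=n_runs)   # output metric, parameter on
    t, p = stats.ttest_ind(a, b)
    d = (b.mean() - a.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    print(f"runs = {n_runs:>9}: p = {p:.2e}, d = {d:.3f}")
```

Since the experimenter controls the number of runs, "significance" is effectively a free parameter of the study design; the effect size is not.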


2021 ◽  
pp. 1-6
Author(s):  
David M. Garner ◽  
Gláucia S. Barreto ◽  
Vitor E. Valenti ◽  
Franciele M. Vanderlei ◽  
Andrey A. Porto ◽  
...  

Introduction: Approximate Entropy is an extensively applied metric for evaluating chaotic responses and irregularities in RR intervals sourced from an electrocardiogram. To estimate these responses, however, it presents one major problem: the accurate determination of tolerances and embedding dimensions. We therefore aimed to overcome this potential hazard by calculating numerous alternatives to detect their optimal values in malnourished children. Materials and methods: We evaluated 70 subjects, split equally into malnourished children and controls. To estimate autonomic modulation, heart rate was measured in the absence of any physical, sensory or pharmacological stimuli. For the time series obtained, Approximate Entropy was computed for tolerances (0.1→0.5 in intervals of 0.1) and embedding dimensions (1→5 in intervals of 1), and the statistical significance of the differences between the groups was quantified by Cohen’s ds and Hedges’s gs. Results: The highest statistical significance achieved for the effect sizes across all combinations was −0.2897 (Cohen’s ds) and −0.2865 (Hedges’s gs), attained with embedding dimension = 5 and tolerance = 0.3. Conclusions: Approximate Entropy was able to identify a reduction in chaotic response in malnourished children. The best values of embedding dimension and tolerance of the Approximate Entropy for identifying malnourished children were, respectively, embedding dimension = 5 and tolerance = 0.3. Nevertheless, Approximate Entropy remains an unreliable mathematical marker for this determination.
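For readers unfamiliar with the metric, a minimal sketch of Approximate Entropy follows, parameterized by embedding dimension m and tolerance r (scaled here by the series' standard deviation, a common convention the abstract does not spell out). The RR series is synthetic, and the m = 5, r = 0.3 values echo the paper's best-performing combination:

```python
# Minimal sketch of Approximate Entropy ApEn(m, r) for an RR-interval series.
# The RR data are synthetic; m = 5 and r = 0.3 * SD echo the paper's
# best-performing combination.
import numpy as np

def approx_entropy(series, m, r_frac):
    x = np.asarray(series, dtype=float)
    n = len(x)
    r = r_frac * np.std(x)               # tolerance scaled by the series SD

    def phi(m):
        # Embed the series: all length-m windows.
        emb = np.array([x[i:i + m] for i in range(n - m + 1)])
        # Chebyshev (max-coordinate) distance between every pair of windows.
        dist = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)
        # Fraction of windows within tolerance r of each window (self included).
        c = np.mean(dist <= r, axis=1)
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)

rng = np.random.default_rng(4)
rr = rng.normal(800, 50, size=300)       # synthetic RR intervals (ms)
print(f"ApEn(m=5, r=0.3*SD) = {approx_entropy(rr, 5, 0.3):.3f}")
```

Lower ApEn indicates more regularity; the paper's finding is that malnourished children show reduced chaotic response by this measure, though the metric's sensitivity to m and r is exactly the problem the study probes.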


1998 ◽  
Vol 21 (2) ◽  
pp. 221-222
Author(s):  
Louis G. Tassinary

Chow (1996) offers a reconceptualization of statistical significance that is reasoned and comprehensive. Despite a somewhat rough presentation, his arguments are compelling and deserve to be taken seriously by the scientific community. It is argued, however, that his characterizations of literal replication, types of research, effect size, and experimental control are in need of revision.

