The use of effect size indices to determine practical significance

Author(s):  
H. S. Steyn ◽
S. M. Ellis

The determination of significance of differences in means and of relationships between variables is of importance in many empirical studies. Usually only statistical significance is reported, which does not necessarily indicate an important (practically significant) difference or relationship. With studies based on probability samples, effect size indices should be reported in addition to statistical significance tests in order to comment on practical significance. Where complete populations or convenience samples are worked with, the determination of statistical significance is, strictly speaking, no longer relevant, while the effect size indices can be used as a basis to judge significance. In this article attention is paid to the use of effect size indices in order to establish practical significance. It is also shown how these indices are utilized in a few fields of statistical application and how they receive attention in the statistical literature and computer packages. The use of effect sizes is illustrated by a few examples from the research literature.

Author(s):  
Scott B. Morris ◽  
Arash Shokri

To understand and communicate research findings, it is important for researchers to consider two types of information provided by research results: the magnitude of the effect and the degree of uncertainty in the outcome. Statistical significance tests have long served as the mainstream method for statistical inferences. However, the widespread misinterpretation and misuse of significance tests has led critics to question their usefulness in evaluating research findings and to raise concerns about the far-reaching effects of this practice on scientific progress. An alternative approach involves reporting and interpreting measures of effect size along with confidence intervals. An effect size is an indicator of magnitude and direction of a statistical observation. Effect size statistics have been developed to represent a wide range of research questions, including indicators of the mean difference between groups, the relative odds of an event, or the degree of correlation among variables. Effect sizes play a key role in evaluating practical significance, conducting power analysis, and conducting meta-analysis. While effect sizes summarize the magnitude of an effect, the confidence intervals represent the degree of uncertainty in the result. By presenting a range of plausible alternate values that might have occurred due to sampling error, confidence intervals provide an intuitive indicator of how strongly researchers should rely on the results from a single study.
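The two kinds of information the abstract distinguishes, magnitude and uncertainty, can be computed in a few lines. Below is a minimal sketch of a standardized mean difference (Cohen's d) and an approximate interval for the raw mean difference; the group data and all numbers are invented for illustration.

```python
# Sketch: an effect size (Cohen's d) and a confidence interval for a
# mean difference. Data below are invented for illustration only.
import math
import statistics

def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x) +
                  (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var)

def mean_diff_ci(x, y, t_crit=2.0):
    """Approximate CI for the raw mean difference (t_crit ~ 2 for ~95%)."""
    nx, ny = len(x), len(y)
    se = math.sqrt(statistics.variance(x) / nx + statistics.variance(y) / ny)
    diff = statistics.mean(x) - statistics.mean(y)
    return diff - t_crit * se, diff + t_crit * se

treatment = [5.1, 4.8, 6.0, 5.5, 5.9, 4.7]
control = [4.2, 4.0, 5.1, 4.4, 4.9, 3.8]
d = cohens_d(treatment, control)          # magnitude of the effect
lo, hi = mean_diff_ci(treatment, control) # uncertainty around it
```

Reporting both d and the interval (rather than a bare p-value) conveys exactly the pairing the abstract recommends.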


2013 ◽  
Vol 12 (3) ◽  
pp. 345-351 ◽  
Author(s):  
Jessica Middlemis Maher ◽  
Jonathan C. Markey ◽  
Diane Ebert-May

Statistical significance testing is the cornerstone of quantitative research, but studies that fail to report measures of effect size are potentially missing a robust part of the analysis. We provide a rationale for why effect size measures should be included in quantitative discipline-based education research. Examples from both biological and educational research demonstrate the utility of effect size for evaluating practical significance. We also provide details about some effect size indices that are paired with common statistical significance tests used in educational research and offer general suggestions for interpreting effect size measures. Finally, we discuss some inherent limitations of effect size measures and provide further recommendations about reporting confidence intervals.


2019 ◽  
Author(s):  
Miguel Alejandro Silan

One of the main criticisms of NHST is that statistical significance is not practical significance. This evaluation of the practical significance of effects often takes an implicit but consequential form in the field: from informal conversations among researchers when evaluating findings, to peer reviewers deciding the importance of an article. This primer seeks to make explicit what we mean when we talk about practical significance, organize what we know of it, and propose a framework for how we can evaluate and establish it. The practical significance of effects is appraised by analyzing them (i) across different levels of analysis, (ii) across different outcomes, (iii) across time and (iv) across relevant moderators; these dimensions also underlie the conditions under which small effect sizes can be consequential. Practical significance is contrasted with often-conflated terms, including statistical significance, effect size and effect size benchmarks, as well as theoretical significance. Promising directions are then presented.


2016 ◽  
Vol 51 (12) ◽  
pp. 1045-1048 ◽  
Author(s):  
Monica Lininger ◽  
Bryan L. Riemann

Objective: To describe confidence intervals (CIs) and effect sizes and provide practical examples to assist clinicians in assessing clinical meaningfulness. Background: As discussed in our first article in 2015, which addressed the difference between statistical significance and clinical meaningfulness, evaluating the clinical meaningfulness of a research study remains a challenge to many readers. In this paper, we will build on this topic by examining CIs and effect sizes. Description: A CI is a range estimated from sample data (the data we collect) that is likely to include the population parameter (value) of interest. Conceptually, this constitutes the lower and upper limits of the sample data, which would likely include, for example, the mean from the unknown population. An effect size is the magnitude of difference between 2 means. When a statistically significant difference exists between 2 means, effect size is used to describe how large or small that difference actually is. Confidence intervals and effect sizes enhance the practical interpretation of research results. Recommendations: Along with statistical significance, the CI and effect size can assist practitioners in better understanding the clinical meaningfulness of a research study.
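The abstract's definition of a CI, a range estimated from sample data that is likely to include the population parameter, can be checked directly by simulation. The sketch below uses an invented population and a normal-approximation interval to count how often a nominal 95% interval captures a known mean; all numbers are illustrative assumptions.

```python
# Sketch: simulating what "likely to include the population parameter"
# means. Population mean, SD, sample size and rep count are invented.
import math
import random
import statistics

random.seed(7)
TRUE_MEAN, SD, N, REPS = 50.0, 10.0, 30, 2000

hits = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    lower, upper = m - 1.96 * se, m + 1.96 * se  # normal approximation
    if lower <= TRUE_MEAN <= upper:
        hits += 1

coverage = hits / REPS  # should land near 0.95
```

Each individual interval either contains the true mean or it does not; the 95% refers to the long-run proportion of intervals that do, which is what the loop estimates.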


2016 ◽  
Vol 8 (1) ◽  
pp. 18-45
Author(s):  
Klaus G. Troitzsch

This article discusses the question of whether significance tests on simulation results are meaningful at all. It is argued that it is the effect size, much more than the mere existence of an effect, that matters, together with the description of the distribution function of the stochastic process incorporated in the simulation model. This is particularly true when that distribution is far from normal, which is often the case when the simulation model is nonlinear. To this end, this article uses three different agent-based models to demonstrate that the effects of input parameters on output metrics can often be made “statistically significant” at any desired level by increasing the number of runs, even for negligible effect sizes. The examples are also used to give hints as to how many runs are necessary to estimate effect sizes and how the input parameters determine output metrics.
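The article's central point is easy to reproduce in miniature: hold a negligible difference between two "model configurations" fixed and the t statistic grows with the number of runs while the effect size stays flat. The toy model below is plain Gaussian noise, not one of the article's agent-based models; the difference delta and all other values are invented.

```python
# Sketch: a fixed, tiny true difference (delta) between two simulated
# configurations becomes "statistically significant" as runs increase,
# while Cohen's d stays negligible. Values are invented for illustration.
import math
import random
import statistics

random.seed(1)

def t_and_d(n, delta=0.02, sd=1.0):
    """Return (|t|, Cohen's d) for two groups of n runs differing by delta."""
    a = [random.gauss(0.0, sd) for _ in range(n)]
    b = [random.gauss(delta, sd) for _ in range(n)]
    pooled = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    d = (statistics.mean(b) - statistics.mean(a)) / pooled
    t = d * math.sqrt(n / 2)  # equal-n two-sample t statistic
    return abs(t), d

for n in (100, 10_000, 250_000):
    t, d = t_and_d(n)
    print(f"runs per group = {n:>7}: |t| = {t:6.2f}, d = {d:+.3f}")
```

Because t = d·√(n/2) for equal groups, any nonzero d, however small, crosses any significance threshold once n is large enough, which is exactly the hazard the article demonstrates for simulation studies.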


Author(s):  
Natuya Zhuori ◽  
Yu Cai ◽  
Yan Yan ◽  
Yu Cui ◽  
Minjuan Zhao

As the trend of aging in rural China has intensified, research on the factors affecting the health of the elderly in rural areas has become a hot issue. However, the conclusions of existing studies are inconsistent and even contradictory, making it difficult to form constructive policies with practical value. To explore the reasons for the inconsistent conclusions drawn by relevant research, in this paper we constructed a meta-regression database based on 65 relevant studies published in the past 25 years. To obtain more valid samples and reduce publication bias, we also set the statistical significance of social support for the health of the elderly in rural areas as a dependent variable. Finally, combined with multi-dimensional social support and its implications for the health of the elderly, meta-regression analysis was carried out on the results of 171 empirical studies. The results show that (1) subjective support rather than objective support has a significant impact on the health of the elderly in rural areas, and there is no significant difference between other dimensions of social support and objective support; (2) the health status of the elderly in rural areas in samples involving western regions is more sensitive to social support than that in samples not involving the western regions; (3) among the elderly in rural areas, social support for the older male elderly is more likely to improve their health than that for the younger female elderly; and (4) both data sources and econometric models greatly affect the heterogeneity of the effect of social support on the health of the elderly in rural areas, but neither publication year nor journal is significant. Finally, relevant policies and follow-up studies on the impact of social support on the health of the elderly in rural areas are discussed.


2021 ◽  
pp. 1-6
Author(s):  
David M. Garner ◽  
Gláucia S. Barreto ◽  
Vitor E. Valenti ◽  
Franciele M. Vanderlei ◽  
Andrey A. Porto ◽  
...  

Abstract Introduction: Approximate Entropy is an extensively applied metric to evaluate chaotic responses and irregularities of RR intervals sourced from an electrocardiogram. However, to estimate their responses, it has one major problem: the accurate determination of tolerances and embedding dimensions. So, we aimed to avert this potential hazard by calculating numerous alternatives to detect their optimality in malnourished children. Materials and methods: We evaluated 70 subjects split equally: malnourished children and controls. To estimate autonomic modulation, the heart rate was measured in the absence of any physical, sensory or pharmacologic stimuli. In the time series attained, Approximate Entropy was computed for tolerance (0.1→0.5 in intervals of 0.1) and embedding dimension (1→5 in intervals of 1), and the statistical significances between the groups, via their Cohen's ds and Hedges's gs, were calculated. Results: The highest value of statistical significance accomplished by the effect sizes for any of the combinations was −0.2897 (Cohen's ds) and −0.2865 (Hedges's gs). This was achieved with embedding dimension = 5 and tolerance = 0.3. Conclusions: Approximate Entropy was able to identify a reduction in chaotic response in malnourished children. The best values of embedding dimension and tolerance of the Approximate Entropy to identify malnourished children were, respectively, embedding dimension = 5 and tolerance = 0.3. Nevertheless, Approximate Entropy is still an unreliable mathematical marker for this purpose.
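For readers unfamiliar with the metric, a compact and purely illustrative implementation of Approximate Entropy follows, exposing the two inputs the study tunes: embedding dimension m and tolerance r (taken, as is conventional, as a fraction of the series' standard deviation). The RR-like series is invented; this is a textbook-style sketch, not the authors' code.

```python
# Sketch of Approximate Entropy: ApEn(m, r) = phi(m) - phi(m + 1), where
# phi averages the log-frequency of template matches within tolerance r.
import math
import statistics

def approx_entropy(series, m, r):
    """ApEn with embedding dimension m and tolerance r (self-matches counted)."""
    def phi(dim):
        n = len(series) - dim + 1
        templates = [series[i:i + dim] for i in range(n)]
        counts = []
        for a in templates:
            # Chebyshev distance: templates match if every element is within r
            c = sum(1 for b in templates
                    if max(abs(x - y) for x, y in zip(a, b)) <= r)
            counts.append(c / n)
        return sum(math.log(c) for c in counts) / n
    return phi(m) - phi(m + 1)

# Invented RR-interval-like series (ms); tolerance set to 0.3 of its SD,
# mirroring the tolerance = 0.3 reported in the abstract (the abstract's
# best embedding dimension of 5 needs far longer records than this toy).
rr = [812, 830, 801, 845, 820, 808, 835, 815, 826, 810,
      840, 805, 822, 818, 833, 809, 828, 814, 837, 811]
r = 0.3 * statistics.stdev(rr)
apen = approx_entropy(rr, m=2, r=r)
```

The "major problem" the abstract names is visible here: the returned value shifts with every choice of m and r, which is why the study sweeps both grids before comparing groups.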


1998 ◽  
Vol 21 (2) ◽  
pp. 221-222
Author(s):  
Louis G. Tassinary

Chow (1996) offers a reconceptualization of statistical significance that is reasoned and comprehensive. Despite a somewhat rough presentation, his arguments are compelling and deserve to be taken seriously by the scientific community. It is argued that his characterization of literal replication, types of research, effect size, and experimental control is in need of revision.


1998 ◽  
Vol 15 (2) ◽  
pp. 103-118 ◽  
Author(s):  
Vinson H. Sutlive ◽  
Dale A. Ulrich

The unqualified use of statistical significance tests for interpreting the results of empirical research has been called into question by researchers in a number of behavioral disciplines. This paper reviews what statistical significance tells us and what it does not, with particular attention paid to criticisms of using the results of these tests as the sole basis for evaluating the overall significance of research findings. In addition, implications for adapted physical activity research are discussed. Based on the recent literature of other disciplines, several recommendations for evaluating and reporting research findings are made. They include calculating and reporting effect sizes, selecting an alpha level larger than the conventional .05 level, placing greater emphasis on replication of results, evaluating results in a sample size context, and employing simple research designs. Adapted physical activity researchers are encouraged to use specific modifiers when describing findings as significant.


2019 ◽  
Vol 34 (6) ◽  
pp. 876-876
Author(s):  
A Walker ◽  
A Hauson ◽  
S Sarkissians ◽  
A Pollard ◽  
C Flora-Tostado ◽  
...  

Abstract Objective The Category Test (CT) has consistently been found to be sensitive at detecting the effects of alcohol on the brain. However, this test has not been as widely used in examining the effects of methamphetamine. The current meta-analysis compared effect sizes of studies that have examined performance on the CT in alcohol versus methamphetamine dependent participants. Data selection Three researchers independently searched nine databases (e.g., PsycINFO, Pubmed, ProceedingsFirst), extracted required data, and calculated effect sizes. Inclusion criteria identified studies that had (a) compared alcohol or methamphetamine dependent groups to healthy controls and (b) matched groups on either age, education, or IQ (at least 2 out of 3). Studies were excluded if participants were reported to have Axis I diagnoses (other than alcohol or methamphetamine dependence) or comorbidities known to impact neuropsychological functioning. Sixteen articles were coded and analyzed for the current study. Data synthesis Alcohol studies showed a large effect size (g = 0.745, p < 0.001) while methamphetamine studies evidenced a moderate effect size (g = 0.406, p = 0.001); both without statistically significant heterogeneity (I² = 0). Subgroup analysis revealed a statistically significant difference between the effect sizes from alcohol versus methamphetamine studies (Q-between = 5.647, p = 0.017). Conclusions The CT is sensitive to the effects of both alcohol and methamphetamine and should be considered when examining dependent patients who might exhibit problem solving, concept formation, and set loss difficulties in everyday living.
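Hedges's g, the effect size reported in this meta-analysis, is Cohen's d scaled by a small-sample correction factor. A brief sketch from invented summary statistics shows the standard computation; none of the numbers below come from the studies above.

```python
# Sketch: Hedges's g = J * Cohen's d, where J is the small-sample
# correction. Summary statistics are invented for illustration only.
import math

def hedges_g(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference with Hedges's small-sample correction."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # correction factor J (< 1)
    return j * d

# Invented example: a dependent group vs. matched healthy controls
g = hedges_g(mean1=52.0, mean2=45.0, sd1=9.5, sd2=9.0, n1=25, n2=25)
```

Because J is below 1, g is slightly smaller than d, which matters most for the small matched samples typical of clinical comparisons like those pooled here.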

