Significance tests: Necessary but not sufficient

1998 ◽  
Vol 21 (2) ◽  
pp. 221-222
Author(s):  
Louis G. Tassinary

Chow (1996) offers a reconceptualization of statistical significance that is reasoned and comprehensive. Despite a somewhat rough presentation, his arguments are compelling and deserve to be taken seriously by the scientific community. It is argued here, however, that his characterizations of literal replication, types of research, effect size, and experimental control are in need of revision.

2013 ◽  
Vol 12 (3) ◽  
pp. 345-351 ◽  
Author(s):  
Jessica Middlemis Maher ◽  
Jonathan C. Markey ◽  
Diane Ebert-May

Statistical significance testing is the cornerstone of quantitative research, but studies that fail to report measures of effect size are potentially missing an essential part of the analysis. We provide a rationale for why effect size measures should be included in quantitative discipline-based education research. Examples from both biological and educational research demonstrate the utility of effect size for evaluating practical significance. We also provide details about some effect size indices that are paired with common statistical significance tests used in educational research and offer general suggestions for interpreting effect size measures. Finally, we discuss some inherent limitations of effect size measures and provide further recommendations about reporting confidence intervals.
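As a minimal sketch of the pairing the authors recommend, the example below reports an independent-samples t-test alongside its matching effect size index, Cohen's d. The data and group labels are hypothetical, not drawn from the studies above.

```python
# Pairing a significance test with its effect size: t-test + Cohen's d.
# Illustrative data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(loc=70.0, scale=10.0, size=40)    # hypothetical scores
treatment = rng.normal(loc=75.0, scale=10.0, size=40)

t, p = stats.ttest_ind(treatment, control)

# Cohen's d: mean difference divided by the pooled standard deviation.
n1, n2 = len(treatment), len(control)
pooled_sd = np.sqrt(((n1 - 1) * treatment.var(ddof=1) +
                     (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2))
d = (treatment.mean() - control.mean()) / pooled_sd

print(f"t = {t:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}")
```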


Author(s):  
H. S. Styn ◽  
S. M. Ellis

The determination of the significance of differences in means and of relationships between variables is important in many empirical studies. Usually only statistical significance is reported, which does not necessarily indicate an important (practically significant) difference or relationship. In studies based on probability samples, effect size indices should be reported in addition to statistical significance tests in order to comment on practical significance. Where complete populations or convenience samples are used, the determination of statistical significance is, strictly speaking, no longer relevant, while effect size indices can be used as a basis for judging significance. In this article attention is paid to the use of effect size indices to establish practical significance. It is also shown how these indices are used in a few fields of statistical application and how they are treated in the statistical literature and in computer packages. The use of effect sizes is illustrated with a few examples from the research literature.


2019 ◽  
Vol 28 (4) ◽  
pp. 468-485 ◽  
Author(s):  
Paul HP Hanel ◽  
David MA Mehler

Transparent communication of research is key to fostering understanding within and beyond the scientific community. An increased focus on reporting effect sizes, in addition to p value–based significance statements or Bayes factors, may improve scientific communication with the general public. Across three studies (N = 652), we compared subjective informativeness ratings for five effect sizes, the Bayes factor, and commonly used significance statements. Results showed that Cohen’s U3 was rated as most informative. For example, 440 participants (69%) found U3 more informative than Cohen’s d, while 95 (15%) found d more informative than U3, with 99 participants (16%) finding both effect sizes equally informative. This effect was not moderated by level of education. We therefore suggest that, in general, Cohen’s U3 be used when scientific findings are communicated. However, the choice of effect size may vary depending on what a researcher wants to highlight (e.g. differences or similarities).
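The relationship between the two measures compared above is simple under the standard assumptions of normality and equal variances: Cohen's U3 is the standard normal CDF evaluated at Cohen's d, i.e. the share of the comparison group falling below the mean of the treatment group. The sketch below is a hedged illustration of that conversion; the function name and the value of d are made up.

```python
# Converting Cohen's d to Cohen's U3, assuming normal distributions
# with equal variances (the textbook relationship U3 = Phi(d)).
from scipy.stats import norm

def u3_from_d(d: float) -> float:
    """Cohen's U3 (a proportion in [0, 1]) from Cohen's d."""
    return norm.cdf(d)

d = 0.5  # a "medium" effect by Cohen's conventional benchmarks
print(f"d = {d:.2f} -> U3 = {u3_from_d(d):.0%} of the comparison group "
      "falls below the treatment-group mean")
```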


Author(s):  
Scott B. Morris ◽  
Arash Shokri

To understand and communicate research findings, it is important for researchers to consider two types of information provided by research results: the magnitude of the effect and the degree of uncertainty in the outcome. Statistical significance tests have long served as the mainstream method for statistical inference. However, the widespread misinterpretation and misuse of significance tests has led critics to question their usefulness in evaluating research findings and to raise concerns about the far-reaching effects of this practice on scientific progress. An alternative approach involves reporting and interpreting measures of effect size along with confidence intervals. An effect size is an indicator of the magnitude and direction of an observed effect. Effect size statistics have been developed to represent a wide range of research questions, including indicators of the mean difference between groups, the relative odds of an event, and the degree of correlation among variables. Effect sizes play a key role in evaluating practical significance, conducting power analysis, and conducting meta-analysis. While effect sizes summarize the magnitude of an effect, confidence intervals represent the degree of uncertainty in the result. By presenting a range of plausible alternative values that might have occurred due to sampling error, confidence intervals provide an intuitive indicator of how strongly researchers should rely on the results of a single study.
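Since the entry names correlation among variables as one family of effect sizes, here is a hedged sketch of reporting such an effect size together with its confidence interval: a Pearson r with an approximate 95% CI via the standard Fisher z-transformation. The data and function name are illustrative assumptions, not taken from the article.

```python
# Effect size + confidence interval: Pearson r with a Fisher-z CI.
import numpy as np
from scipy.stats import norm

def pearson_r_ci(x, y, level=0.95):
    """Pearson r with an approximate CI from the Fisher z-transformation."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    z, se = np.arctanh(r), 1.0 / np.sqrt(n - 3)   # z-scale estimate and SE
    zcrit = norm.ppf(0.5 + level / 2)
    lo, hi = np.tanh([z - zcrit * se, z + zcrit * se])  # back-transform
    return r, (lo, hi)

# Illustrative data only: two weakly related variables.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.3 * x + rng.normal(size=100)
r, (lo, hi) = pearson_r_ci(x, y)
print(f"r = {r:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```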


2016 ◽  
Vol 8 (1) ◽  
pp. 18-45
Author(s):  
Klaus G. Troitzsch

This article discusses the question of whether significance tests on simulation results are meaningful at all. It is also argued that it is the effect size, much more than the mere existence of an effect, that matters, and that what is important is the description of the distribution function of the stochastic process incorporated in the simulation model. This holds particularly when the distribution is far from normal, which is often the case when the simulation model is nonlinear. To this end, the article uses three different agent-based models to demonstrate that the effects of input parameters on output metrics can often be made “statistically significant” at any desired level by increasing the number of runs, even for negligible effect sizes. The examples are also used to give hints as to how many runs are necessary to estimate effect sizes and how the input parameters determine output metrics.
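The article's central point is easy to reproduce in a toy setting. The sketch below (illustrative only, not the author's agent-based models) fixes a negligible true difference between two parameter settings and shows that adding runs drives the p-value below any threshold while the estimated effect size stays trivially small.

```python
# With enough simulation runs, a negligible effect becomes "significant"
# at any level, while Cohen's d stays near zero. Toy model, not the
# article's agent-based simulations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
true_shift = 0.02  # negligible difference between two parameter settings

for n_runs in (100, 10_000, 1_000_000):
    a = rng.normal(0.0, 1.0, n_runs)          # output metric, setting A
    b = rng.normal(true_shift, 1.0, n_runs)   # output metric, setting B
    t, p = stats.ttest_ind(a, b)
    d = (b.mean() - a.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    print(f"runs = {n_runs:>9,}  p = {p:.3g}  d = {d:.3f}")
```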


Nutrients ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 404
Author(s):  
Emma Altobelli ◽  
Paolo Matteo Angeletti ◽  
Ciro Marziliano ◽  
Marianna Mastrodomenico ◽  
Anna Rita Giuliani ◽  
...  

Diabetes mellitus is an important public health issue, and its prevalence is growing worldwide. In recent years, there has been growing research interest in evidence for the efficacy of curcumin in the regulation of glycemia and lipidemia. The molecular structure of curcumin allows it to intercept reactive oxygen species (ROS), which are particularly harmful in chronic inflammation and tumorigenesis models. The aim of our study was to perform a systematic review and meta-analysis evaluating the effect of curcumin on the glycemic and lipid profile of subjects with uncomplicated type 2 diabetes. The papers included in the meta-analysis were sought in the MEDLINE, EMBASE, Scopus, Clinicaltrials.gov, Web of Science, and Cochrane Library databases as of October 2020. Effect sizes were pooled across studies in order to obtain an overall effect size, and a random effects model was used to account for different sources of variation among studies. Cohen’s d with its 95% confidence interval (CI) was used as the measure of effect size. Heterogeneity was assessed using the Q statistic, and the ANOVA-Q test was used to evaluate differences among groups. Publication bias was analyzed and represented by a funnel plot. Curcumin treatment did not show a statistically significant reduction between treated and untreated patients on all outcomes. On the other hand, glycosylated hemoglobin, the homeostasis model assessment (HOMA), and low-density lipoprotein (LDL) showed statistically significant reductions in subjects treated with curcumin (p = 0.008, p < 0.001, and p = 0.021, respectively). For HbA1c, the meta-regressions showed statistical significance only for gender (p = 0.034). Our meta-analysis seems to confirm the benefits of curcumin for glucose metabolism, with results that appear more solid than those for lipid metabolism. However, further studies are needed to test the efficacy and safety of curcumin in uncomplicated type 2 diabetes.
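For readers unfamiliar with the machinery named in the abstract, here is a hedged sketch of random-effects pooling of Cohen's d with Cochran's Q for heterogeneity, using the common DerSimonian-Laird estimator (one standard choice of random effects model; the abstract does not say which was used). The per-study effect sizes and variances are invented for illustration and are not the trial data from this meta-analysis.

```python
# Random-effects meta-analysis: pooled Cohen's d, 95% CI, and Q test.
# Hypothetical per-study inputs, DerSimonian-Laird between-study variance.
import numpy as np
from scipy.stats import chi2, norm

d = np.array([-0.40, -0.15, -0.55, -0.10])   # per-study Cohen's d (made up)
v = np.array([0.040, 0.030, 0.060, 0.025])   # per-study sampling variances

w = 1.0 / v                                  # fixed-effect weights
d_fixed = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - d_fixed) ** 2)           # Cochran's Q
k = len(d)
p_het = chi2.sf(Q, k - 1)                    # heterogeneity p-value

# DerSimonian-Laird between-study variance, then random-effects weights.
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
w_re = 1.0 / (v + tau2)
d_pooled = np.sum(w_re * d) / np.sum(w_re)
se = 1.0 / np.sqrt(np.sum(w_re))
z = norm.ppf(0.975)
print(f"Q = {Q:.2f} (p = {p_het:.3f}), tau^2 = {tau2:.4f}")
print(f"pooled d = {d_pooled:.2f}, "
      f"95% CI [{d_pooled - z*se:.2f}, {d_pooled + z*se:.2f}]")
```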


Author(s):  
S. D. Sivasubramaniam ◽  
M. Cosentino ◽  
L. Ribeiro ◽  
F. Marino

The data produced by the scientific community impact academia, clinicians, and the general public; the scientific community and other regulatory bodies have therefore been focusing on ethical codes of conduct. Despite the measures taken by several research councils, unethical research, publishing, and/or reviewing behaviours still take place. This exploratory study considers some of the current unethical practices and the reasons behind them, and explores ways to discourage these within research and other professional disciplinary bodies. Interviews/discussions with PhD students, technicians, and academics/principal investigators (PIs) (N = 110) were conducted mostly in European higher education institutions, including in the UK, Italy, Ireland, Portugal, the Czech Republic, and the Netherlands. Through collegiate discussions, the sharing of experiences, and the examination of previously published/reported information, the authors have identified several less-reported behaviours. Some of these practices are influenced either by undue institutional expectations of research esteem or by changes in the journal review process. These malpractices can be divided into two categories: (a) methodological malpractices, including data management, and (b) those that contravene publishing ethics. The former is mostly related to “committed bias”, by which authors selectively use data to suit their own hypothesis; methodological malpractice also covers the selection of outdated protocols that are not suited to the intended work. Although these are usually unintentional, incidences of intentional manipulation were reported to the authors of this study, for example carrying out investigations without positive (or negative) controls but including such controls from a previous study. Other methodological malpractices include unfair repetitions of experiments to gain statistical significance, and retrospective ethical approvals. Publication-related malpractices, such as authorship malpractices and ethical clearance irregularities, were also reported. The findings suggest that a globalised approach with clear punitive measures for offenders is needed to tackle this problem.


2016 ◽  
Vol 21 (1) ◽  
pp. 102-115 ◽  
Author(s):  
Stephen Gorard

This paper reminds readers of the absurdity of statistical significance testing, despite its continued widespread use as a supposed method for analysing numeric data. There have been complaints about the poor quality of research employing significance tests for a hundred years, and repeated calls for researchers to stop using and reporting them. There have even been attempted bans. Many thousands of papers have now been written, in all areas of research, explaining why significance tests do not work; there are too many for all to be cited here. This paper summarises the logical problems as described in over 100 of these prior pieces. It then presents a series of demonstrations showing that significance tests do not work in practice; in fact, they are more likely to produce a wrong answer than a right one. The confused use of significance testing has practical and damaging consequences for people's lives. Ending the use of significance tests is therefore a pressing ethical issue for research. Anyone who knows these problems, as described for over one hundred years, yet continues to teach, use or publish significance tests is acting unethically and knowingly risking the damage that ensues.

