Verifying the accuracy of statistical significance testing in Campbell Collaboration systematic reviews through the use of the R package statcheck

2018 ◽  
Vol 14 (1) ◽  
pp. 1-36 ◽  
Author(s):  
Joshua R. Polanin ◽  
Michèle B. Nuijten

2020 ◽  
pp. bmjebm-2019-111257 ◽  
Author(s):  
Phoebe Rose Marson Smith ◽  
Lynda Ware ◽  
Clive Adams ◽  
Iain Chalmers

Estimates of treatment effects/differences derived from controlled comparisons are subject to uncertainty, both because of the quality of the data and the play of chance. Despite this, authors sometimes use statistical significance testing to make definitive statements that ‘no difference exists between’ treatments. A survey to assess abstracts of Cochrane reviews published in 2001/2002 identified unqualified claims of ‘no difference’ or ‘no effect’ in 259 (21.3%) of the 1212 review abstracts surveyed. We repeated the survey to assess the frequency of such claims among the abstracts of Cochrane and other systematic reviews published in 2017. We surveyed the 643 Cochrane review abstracts published in 2017 and a random sample of 643 abstracts of other systematic reviews published in the same year. We excluded review abstracts that referred only to a protocol, lacked a conclusion or did not contain any relevant information. We took steps to reduce biases during our survey. ‘No difference/no effect’ was claimed in the abstracts of 36 (7.8%) of 460 Cochrane reviews and in the abstracts of 13 (6.0%) of 218 other systematic reviews. Incorrect claims of no difference/no effect of treatments were substantially less common in Cochrane reviews published in 2017 than they were in abstracts of reviews published in 2001/2002. We hope that this reflects greater efforts to reduce biases and inconsistent judgements in the later survey, as well as more careful wording of review abstracts. There are numerous other ways of wording treatment claims incorrectly. These must be addressed because they can have adverse effects on healthcare and health research.
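The abstract's central point is that an interval estimate communicates uncertainty where a bare "no difference" claim does not. As a minimal illustration (not from the paper itself), the counts reported above, 36/460 Cochrane versus 13/218 other reviews, can be turned into a Wald (normal-approximation) 95% confidence interval for the difference in proportions:

```python
from math import sqrt

def diff_ci(x1, n1, x2, n2, z=1.96):
    """Wald 95% CI for the difference p1 - p2 between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, (diff - z * se, diff + z * se)

# Counts reported in the abstract: 36/460 Cochrane, 13/218 other reviews.
diff, (lo, hi) = diff_ci(36, 460, 13, 218)
print(f"difference = {diff:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

The interval spans zero but is wide, so the data are compatible with a range of underlying differences; reporting the interval, rather than asserting "no difference", is exactly the practice the survey authors advocate.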


2021 ◽  
pp. 204589402110249 ◽  
Author(s):  
David D Ivy ◽  
Damien Bonnet ◽  
Rolf MF Berger ◽  
Gisela Meyer ◽  
Simin Baygani ◽  
...  

Objective: This study evaluated the efficacy and safety of tadalafil in pediatric patients with pulmonary arterial hypertension (PAH). Methods: This phase-3, international, randomized, multicenter, add-on (to the patient’s current endothelin receptor antagonist therapy) study, comprising a 24-week double-blind placebo-controlled period and a 2-year open-label extension period, included pediatric patients aged <18 years with PAH. Patients received tadalafil 20 mg or 40 mg based on their weight (heavy-weight: ≥40 kg; middle-weight: ≥25 to <40 kg) or placebo orally QD for 24 weeks. The primary endpoint was change from baseline in 6-minute walk (6MW) distance in patients aged ≥6 years at Week 24. The sample size was amended from 134 to ≥34 patients because of serious recruitment challenges; therefore, statistical significance testing was not performed between treatment groups. Results: Patient demographics and baseline characteristics (N=35; tadalafil=17; placebo=18) were comparable between treatment groups; median age was 14.2 years (range 6.2 to 17.9 years) and the majority (71.4%, n=25) of patients were in the heavy-weight cohort. Least squares mean (SE) change from baseline in 6MW distance at Week 24 was numerically greater with tadalafil versus placebo (60.48 [20.41] vs 36.60 [20.78] meters; placebo-adjusted mean difference [SD] 23.88 [29.11]). Safety of tadalafil treatment was as expected, without any new safety concerns. During study period 1, two patients (1 in each group) discontinued owing to investigator-reported clinical worsening, and no deaths were reported. Conclusions: Statistical significance testing was not performed between the treatment groups because of the small sample size; however, the results show a positive trend of improvement in non-invasive measurements commonly used by clinicians to evaluate disease status in children with PAH. Safety of tadalafil treatment was as expected, without any new safety signals.


2016 ◽  
Vol 21 (1) ◽  
pp. 102-115 ◽  
Author(s):  
Stephen Gorard

This paper reminds readers of the absurdity of statistical significance testing, despite its continued widespread use as a supposed method for analysing numeric data. There have been complaints about the poor quality of research employing significance tests for a hundred years, and repeated calls for researchers to stop using and reporting them. There have even been attempted bans. Many thousands of papers have now been written, in all areas of research, explaining why significance tests do not work. There are too many for all to be cited here. This paper summarises the logical problems as described in over 100 of these prior pieces. It then presents a series of demonstrations showing that significance tests do not work in practice. In fact, they are more likely to produce the wrong answer than a right one. The confused use of significance testing has practical and damaging consequences for people's lives. Ending the use of significance tests is a pressing ethical issue for research. Anyone knowing the problems, as described over one hundred years, who continues to teach, use or publish significance tests is acting unethically, and knowingly risking the damage that ensues.


2013 ◽  
Vol 12 (3) ◽  
pp. 345-351 ◽  
Author(s):  
Jessica Middlemis Maher ◽  
Jonathan C. Markey ◽  
Diane Ebert-May

Statistical significance testing is the cornerstone of quantitative research, but studies that fail to report measures of effect size are potentially missing a robust part of the analysis. We provide a rationale for why effect size measures should be included in quantitative discipline-based education research. Examples from both biological and educational research demonstrate the utility of effect size for evaluating practical significance. We also provide details about some effect size indices that are paired with common statistical significance tests used in educational research and offer general suggestions for interpreting effect size measures. Finally, we discuss some inherent limitations of effect size measures and provide further recommendations about reporting confidence intervals.
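The pairing the abstract recommends, a significance test accompanied by an effect size index, can be sketched with Cohen's d, one of the indices commonly paired with the independent-samples t-test. This is an illustrative example only; the data below are made up, and the pooled-SD formula is the standard one, not taken from the paper:

```python
from statistics import mean, stdev
from math import sqrt

def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                  / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

# Hypothetical exam scores for two instructional conditions.
group_a = [78, 85, 90, 72, 88, 81, 79, 92]
group_b = [70, 75, 83, 68, 77, 74, 80, 71]
d = cohens_d(group_a, group_b)
# Conventional benchmarks: ~0.2 small, ~0.5 medium, ~0.8 large.
print(f"Cohen's d = {d:.2f}")
```

Reporting d (ideally with a confidence interval) alongside the p-value lets readers judge practical significance, which is the abstract's point: a tiny effect can be "significant" in a large sample, and a large effect can fail to reach significance in a small one.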

