The Misuse of ‘No Significant Difference’ in British Orthopaedic Literature

2008 ◽  
Vol 90 (1) ◽  
pp. 58-61 ◽  
Author(s):  
SA Sexton ◽  
N Ferguson ◽  
C Pearce ◽  
DM Ricketts

INTRODUCTION: Many studies published in medical journals do not consider the statistical power required to detect a meaningful difference between study groups. As a result, these studies are often underpowered: the sample size may not be large enough to detect a statistically significant difference (or other effect of interest) of a given size between the study groups. A conclusion of no statistically significant difference between groups therefore cannot be drawn unless the study has been shown to have sufficient power. The aim of this study was to establish the prevalence of negative studies with inadequate statistical power in British journals to which orthopaedic surgeons regularly submit.

MATERIALS AND METHODS: We assessed all papers in the six consecutive issues immediately preceding the start of the study (April 2005) of The Journal of Bone and Joint Surgery (British), Injury, and Annals of the Royal College of Surgeons of England. We sought published evidence that a power analysis had been performed in association with the main hypothesis of each paper.

RESULTS: There were 170 papers in which a statistical comparison of two or more groups was undertaken. Of these 170 papers, 49 (28.8%) stated as their primary conclusion that there was no statistically significant difference between the groups studied. Of these 49 papers, only 3 (6.1%) had performed a power analysis demonstrating adequate sample size.

CONCLUSIONS: These results demonstrate that the majority of negative studies in the British orthopaedic literature we sampled did not perform the statistical analysis necessary to support their stated conclusions. To remedy this, we recommend that the journals sampled include the following guidance in their instructions to authors: the statement ‘no statistically significant difference was found between study groups’ should be accompanied by the results of a power analysis.
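As a sketch of the recommended practice (not taken from the paper), the following a priori power analysis for a two-group comparison uses Python's statsmodels; the minimally meaningful difference, outcome standard deviation, alpha, and target power are all hypothetical values chosen for illustration.

```python
# Minimal sketch: a priori power analysis for a two-group comparison, the
# kind of calculation the authors recommend should accompany any
# "no statistically significant difference" conclusion.
# Assumed inputs (illustrative only): a minimally meaningful difference of
# 5 points on some outcome scale with SD 12, alpha = 0.05, power = 0.80.
from statsmodels.stats.power import TTestIndPower

meaningful_diff = 5.0   # smallest difference worth detecting (hypothetical)
sd = 12.0               # assumed standard deviation of the outcome
effect_size = meaningful_diff / sd  # Cohen's d

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=effect_size,
                                   alpha=0.05,
                                   power=0.80,
                                   alternative='two-sided')
print(f"Cohen's d = {effect_size:.2f}; required n per group ~ {n_per_group:.0f}")
```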

2019 ◽  
Author(s):  
Marjan Bakker ◽  
Coosje Lisabet Sterre Veldkamp ◽  
Olmo Van den Akker ◽  
Marcel A. L. M. van Assen ◽  
Elise Anne Victoire Crompvoets ◽  
...  

In this preregistered study, we investigated whether the statistical power of a study is higher when researchers are asked to perform a formal power analysis before collecting data. We compared the sample size descriptions from two sources: (i) a sample of pre-registrations created according to the guidelines of the Center for Open Science Preregistration Challenge (PCRs) and a sample of institutional review board (IRB) proposals from the Tilburg School of Social and Behavioral Sciences, both of which include a recommendation to conduct a formal power analysis, and (ii) a sample of pre-registrations created according to the guidelines for Open Science Framework Standard Pre-Data Collection Registrations (SPRs), which give no guidance on sample size planning. We found that the PCRs and IRB proposals (72%) more often included sample size decisions based on power analyses than the SPRs (45%). However, this did not result in larger planned sample sizes: the determined sample size of the PCRs and IRB proposals (Md = 90.50) was not higher than that of the SPRs (Md = 126.00; W = 3389.5, p = 0.936). Typically, power analyses in the registrations were conducted with G*Power, assuming a medium effect size, α = .05, and a power of .80. Only 20% of the power analyses contained enough information to fully reproduce the results, and only 62% of these power analyses pertained to the main hypothesis test in the pre-registration. We therefore see ample room for improvement in the quality of the registrations and offer several recommendations to that end.
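The "typical" registered power analysis described above (medium effect size, α = .05, power = .80) can be reproduced in a few lines; the sketch below uses Python's statsmodels rather than G*Power and assumes a two-sided independent-samples t-test.

```python
# Sketch reproducing the "typical" registered power analysis the authors
# describe (medium effect size, alpha = .05, power = .80), here with
# statsmodels rather than G*Power; the test type is assumed.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5,  # "medium" Cohen's d
                                          alpha=0.05,
                                          power=0.80,
                                          alternative='two-sided')
print(f"Required n per group for d = 0.5: {n_per_group:.1f}")  # ~63.8
```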


1990 ◽  
Vol 47 (1) ◽  
pp. 2-15 ◽  
Author(s):  
Randall M. Peterman

Ninety-eight percent of recently surveyed papers in fisheries and aquatic sciences that did not reject some null hypothesis (H0) failed to report β, the probability of making a type II error (not rejecting H0 when it should have been rejected), or statistical power (1 – β). However, 52% of those papers drew conclusions as if H0 were true. A false H0 could have been missed because of a low-power experiment, caused by small sample size or large sampling variability. Costs of type II errors can be large (for example, when studies fail to detect harmful effects of an industrial effluent or a significant effect of fishing on stock depletion). Past statistical power analyses show that abundance estimation techniques usually have high β and that only large effects are detectable. I review the relationships among β, power, detectable effect size, sample size, and sampling variability. I show how statistical power analysis can help interpret past results and improve the design of future experiments, impact assessments, and management regulations. I make recommendations for researchers and decision makers, including routine application of power analysis, more cautious management, and reversal of the burden of proof so that it rests on industry rather than management agencies.
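The relationships reviewed here, among β, power, detectable effect size, and sample size, can be illustrated with a short sketch for a generic two-sample comparison; the numbers are hypothetical and statsmodels is used in place of the fisheries-specific examples in the paper.

```python
# Illustrative sketch: for a two-sample comparison, how power (1 - beta)
# depends on sample size and on the standardized effect size, and what the
# smallest detectable effect is at a given sample size. Numbers are hypothetical.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n in (10, 25, 50, 100):
    for d in (0.2, 0.5, 0.8):        # small, medium, large effects
        power = analysis.power(effect_size=d, nobs1=n, alpha=0.05,
                               alternative='two-sided')
        print(f"n per group = {n:3d}, d = {d:.1f}: power = {power:.2f}, "
              f"beta = {1 - power:.2f}")

# Detectable effect: the smallest d for which power reaches 0.80 at n = 25
d_detectable = analysis.solve_power(nobs1=25, alpha=0.05, power=0.80,
                                    alternative='two-sided')
print(f"Minimum detectable d at n = 25 per group: {d_detectable:.2f}")
```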


2019 ◽  
Author(s):  
Rob Cribbie ◽  
Nataly Beribisky ◽  
Udi Alter

Many bodies recommend that a sample planning procedure, such as a traditional NHST a priori power analysis, be conducted during the planning stages of a study. Power analysis allows the researcher to estimate how many participants are required to detect a minimally meaningful effect size at a specified level of power and Type I error rate. However, there are several drawbacks to the procedure that render it “a mess.” Specifically, identifying the minimally meaningful effect size is often difficult yet unavoidable if the procedure is to be conducted properly, the procedure is not precision oriented, and it does not guide the researcher to collect as many participants as feasibly possible. In this study, we explore how these three theoretical issues are reflected in applied psychological research in order to better understand whether they are concerns in practice. To investigate how power analysis is currently used, we reviewed the reporting of 443 power analyses in high-impact psychology journals in 2016 and 2017. We found that researchers rarely use the minimally meaningful effect size as the rationale for the effect size chosen in a power analysis. Further, precision-based approaches and collecting the maximum sample size feasible are almost never used in tandem with power analyses. In light of these findings, we suggest that researchers focus on tools beyond traditional power analysis when planning sample sizes, such as collecting the maximum sample size feasible.
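The contrast the authors draw between traditional power analysis and precision-oriented planning can be sketched as follows; the target effect size and desired confidence-interval half-width are illustrative assumptions, and the standard-error formula is the usual large-sample approximation for Cohen's d.

```python
# Hedged sketch contrasting (a) a traditional a priori power analysis with
# (b) a precision-oriented plan that targets a maximum 95% CI half-width
# for Cohen's d. Target values are assumptions, not taken from the paper.
import numpy as np
from statsmodels.stats.power import TTestIndPower

# (a) power-based: smallest effect deemed meaningful (assumed d = 0.3)
n_power = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05, power=0.80)

# (b) precision-based: choose n so the 95% CI for d has half-width <= 0.15,
# using the large-sample approximation SE(d) ~= sqrt(2/n + d**2 / (4*n))
d_expected, target_halfwidth = 0.3, 0.15
n = 10
while 1.96 * np.sqrt(2 / n + d_expected**2 / (4 * n)) > target_halfwidth:
    n += 1

print(f"Power-based n per group:     {n_power:.0f}")
print(f"Precision-based n per group: {n}")
```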


2021 ◽  
Author(s):  
Nick J. Broers ◽  
Henry Otgaar

Since the early work of Cohen (1962), psychological researchers have become aware of the importance of conducting a power analysis to ensure that the predicted effect will be detectable with sufficient statistical power. APA guidelines require researchers to justify the chosen sample size with reference to the expected effect size, an expectation that should be based on previous research. However, we argue that a credible estimate of an expected effect size is only reasonable under one of two conditions: either the new study is a direct replication of earlier work, or the outcome scale uses meaningful and familiar units that allow the quantification of a minimal effect of psychological interest. In practice, neither condition is usually met. We propose a different rationale for a power analysis that ensures researchers can justify their sample size as meaningful and adequate.


2020 ◽  
pp. 089011712094336
Author(s):  
Kelly E. Johnson ◽  
Michelle K. Alencar ◽  
Brian Miller ◽  
Elizabeth Gutierrez ◽  
Patricia Dionicio

Purpose: To explore the effect of a telehealth-based lifestyle therapeutics (THBC) program on weight loss (WL) and program satisfaction in an employer population. Design: This study was a collaboration between inHealth Lifestyle Therapeutics and a large national employer group, including 685 participants (296 women [64% obese] and 389 men [62% obese]). Measures: Percent WL and subjective rating (Perceived Program Value, measured by a questionnaire) were assessed. Intervention: The average number of visits was 3.1 ± 0.4; each visit lasted between 20 and 45 minutes. Analysis: This study utilized a 2 × 2 block design analyzed with analysis of variance based on sex (male and female) and initial body mass index (BMI) category (overweight and obese), tested at P ≤ .05. Results: There was no statistically significant difference in %WL by sex (F(1, 681) = 0.398, P = .528) nor an interaction between sex and BMI (F(1, 681) = 0.809, P = .369). There was a statistically significant difference in %WL from pre- to post-program across initial BMI category (F(1, 681) = 13.707, P ≤ .001), with obese participants losing an average of 1.1% (0.5%-1.6%) more than overweight participants (overweight 2.5% [2.1%-3.0%] vs obese 3.6% [3.2%-3.9%]). Obese participants were 1.15 (1.07-1.25) times more likely to lose weight than overweight participants. An analysis of variance power analysis indicated sufficient power for the smallest factor combination (n = 106, effect size = 0.282). Conclusion: Results support the efficacy of THBC for WL, with no differences reported between men and women, and a high perceived value among employee participants.
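A hedged sketch of the kind of ANOVA power check reported above, written in Python with statsmodels; treating the quoted effect size of 0.282 as Cohen's f and n = 106 as the total sample size for the comparison are assumptions made purely for illustration.

```python
# Sketch of an ANOVA power check like the one reported in the abstract.
# Assumptions (not confirmed by the source): effect size 0.282 is Cohen's f,
# n = 106 is the total number of observations in the smallest comparison,
# and the factor of interest has two levels (one arm of the 2 x 2 design).
from statsmodels.stats.power import FTestAnovaPower

power = FTestAnovaPower().power(effect_size=0.282,
                                nobs=106,      # assumed total n for the comparison
                                alpha=0.05,
                                k_groups=2)    # two-level factor (assumption)
print(f"Achieved power: {power:.2f}")
```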


2016 ◽  
Author(s):  
Joke Durnez ◽  
Jasper Degryse ◽  
Beatrijs Moerkerke ◽  
Ruth Seurinck ◽  
Vanessa Sochat ◽  
...  

Highlights: The manuscript presents a method to calculate sample sizes for fMRI experiments. The power analysis is based on the estimation of the mixture distribution of null and active peaks. The methodology is validated with simulated and real data.

Abstract: Mounting evidence over the last few years suggests that published neuroscience research, and especially published fMRI experiments, suffers from low power. Not only does low power decrease the chance of detecting a true effect, it also reduces the chance that a statistically significant result reflects a true effect (Ioannidis, 2005). Put another way, findings with the least power will be the least reproducible, and thus a (prospective) power analysis is a critical component of any paper. In this work we present a simple way to characterize the spatial signal in an fMRI study with just two parameters, and a direct way to estimate these two parameters from an existing study. Specifically, using just (1) the proportion of the brain activated and (2) the average effect size in activated brain regions, we can produce closed-form power calculations for a given sample size, brain volume, and smoothness. This procedure allows one to minimize the cost of an fMRI experiment while preserving a predefined statistical power. The method is evaluated and illustrated using simulations and real neuroimaging data from the Human Connectome Project. The procedures presented in this paper are made publicly available in an online web-based toolbox at www.neuropowertools.org.
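The sketch below is a greatly simplified, hypothetical version of a peak-level power calculation in the spirit of this approach; it is not the authors' model (their toolbox is at www.neuropowertools.org) and assumes a single average effect size at active peaks, an approximately normal group-level statistic, and a fixed voxel-wise threshold.

```python
# Greatly simplified sketch of a peak-level power calculation, inspired by
# (but not equivalent to) the approach described above. Assumptions: active
# peaks carry an average standardized effect size d, the group-level test
# statistic is approximately normal, and a fixed voxel-wise threshold is used.
from scipy.stats import norm

def peak_power(d, n, z_threshold=3.09):   # z ~ 3.09 corresponds to p < .001
    """P(an active peak exceeds the threshold) with n subjects."""
    noncentrality = d * n ** 0.5           # shift of the statistic under activation
    return 1 - norm.cdf(z_threshold - noncentrality)

for n in (10, 20, 30, 40):
    print(f"n = {n:2d}: peak-level power ~ {peak_power(d=0.6, n=n):.2f}")
```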


2021 ◽  
Author(s):  
Daniel Lakens

An important step when designing a study is to justify the sample size that will be collected. The key aim of a sample size justification is to explain how the collected data are expected to provide valuable information given the inferential goals of the researcher. In this overview article, six approaches to justifying the sample size in a quantitative empirical study are discussed: 1) collecting data from (almost) the entire population, 2) choosing a sample size based on resource constraints, 3) performing an a priori power analysis, 4) planning for a desired accuracy, 5) using heuristics, or 6) explicitly acknowledging the absence of a justification. An important question to consider when justifying sample sizes is which effect sizes are deemed interesting, and the extent to which the collected data inform inferences about these effect sizes. Depending on the sample size justification chosen, researchers could consider 1) what the smallest effect size of interest is, 2) which minimal effect size will be statistically significant, 3) which effect sizes they expect (and what they base these expectations on), 4) which effect sizes would be rejected based on a confidence interval around the effect size, 5) which ranges of effects a study has sufficient power to detect based on a sensitivity power analysis, and 6) which effect sizes are plausible in a specific research area. Researchers can use the guidelines presented in this article to improve their sample size justification and, hopefully, align the informational value of a study with their inferential goals.
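As a rough illustration of point 5) in the second list above (a sensitivity power analysis), the sketch below uses Python's statsmodels to ask which standardized effect sizes a two-group study of a given, fixed size can detect with 80% power; the sample sizes shown are arbitrary.

```python
# Sketch of a sensitivity power analysis: for a fixed, already-determined
# sample size, find the smallest standardized effect size (Cohen's d) the
# study can detect with 80% power. Sample sizes are arbitrary illustrations.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_group in (20, 50, 100, 200):
    d_min = analysis.solve_power(nobs1=n_per_group, alpha=0.05, power=0.80,
                                 alternative='two-sided')
    print(f"n = {n_per_group:3d} per group -> detectable d at 80% power = {d_min:.2f}")
```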

