Power and sample size calculations for fMRI studies based on the prevalence of active peaks

2016 ◽  
Author(s):  
Joke Durnez ◽  
Jasper Degryse ◽  
Beatrijs Moerkerke ◽  
Ruth Seurinck ◽  
Vanessa Sochat ◽  
...  

Highlights: The manuscript presents a method to calculate sample sizes for fMRI experiments. The power analysis is based on the estimation of the mixture distribution of null and active peaks. The methodology is validated with simulated and real data.

Abstract: Mounting evidence over the last few years suggests that published neuroscience research suffers from low power, especially published fMRI experiments. Not only does low power decrease the chance of detecting a true effect, it also reduces the chance that a statistically significant result indicates a true effect (Ioannidis, 2005). Put another way, findings with the least power will be the least reproducible, and thus a (prospective) power analysis is a critical component of any paper. In this work we present a simple way to characterize the spatial signal in an fMRI study with just two parameters, and a direct way to estimate these two parameters from an existing study. Specifically, using just (1) the proportion of the brain activated and (2) the average effect size in activated brain regions, we can produce closed-form power calculations for a given sample size, brain volume and smoothness. This procedure allows one to minimize the cost of an fMRI experiment while preserving a predefined statistical power. The method is evaluated and illustrated using simulations and real neuroimaging data from the Human Connectome Project. The procedures presented in this paper are made publicly available in an online web-based toolbox at www.neuropowertools.org.
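The closed-form calculation itself is not reproduced here, but the following minimal sketch illustrates the underlying idea under simplifying assumptions: a truly active peak is detected when the group z-statistic exceeds a screening threshold, and that statistic grows roughly as the effect size times the square root of the sample size. The function names, the threshold, and the effect size below are illustrative, not values or code from the paper or the neuropowertools toolbox.

```python
# Minimal sketch of a peak-level power approximation in the spirit of the
# approach described above. This is NOT the authors' exact method (which
# models the full mixture distribution of null and active peak heights);
# it only approximates the chance that a truly active peak exceeds a
# screening threshold u, given an average effect size d and sample size n.
import numpy as np
from scipy import stats

def approx_peak_power(d, n, u):
    """Approximate power to detect a truly active peak.

    d : average effect size (Cohen's d) in activated regions (assumption)
    n : number of subjects in a one-sample group analysis
    u : screening threshold on the group z-statistic
    """
    # The expected z-value at an active peak grows roughly as d * sqrt(n).
    noncentrality = d * np.sqrt(n)
    return 1 - stats.norm.cdf(u - noncentrality)

def smallest_n(d, u, target_power=0.8, n_max=500):
    """Smallest sample size whose approximate power reaches the target."""
    for n in range(2, n_max + 1):
        if approx_peak_power(d, n, u) >= target_power:
            return n
    return None

# Example with illustrative numbers: d = 0.5 at active peaks, threshold z = 3.1.
print(approx_peak_power(0.5, 30, 3.1))   # power at n = 30
print(smallest_n(0.5, 3.1))              # n needed for ~80% power
```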

2019 ◽  
Author(s):  
Maximilien Chaumon ◽  
Aina Puce ◽  
Nathalie George

Abstract: Statistical power is key for robust, replicable science. Here, we systematically explored how the numbers of trials and subjects affect statistical power in MEG sensor-level data. More specifically, we simulated "experiments" using the MEG resting-state dataset of the Human Connectome Project (HCP). We divided the data into two conditions, injected a dipolar source at a known anatomical location in the "signal condition", but not in the "noise condition", and detected significant differences at sensor level with classical paired t-tests across subjects. Group-level detectability of these simulated effects varied drastically with anatomical origin. We thus examined in detail which spatial properties of the sources affected detectability, looking specifically at the distance from the closest sensor and the orientation of the source, and at the variability of these parameters across subjects. In line with previous single-subject studies, we found that the most detectable effects originate from source locations that are closest to the sensors and oriented tangentially with respect to the head surface. In addition, cross-subject variability in orientation also affected group-level detectability, boosting detection in regions where this variability was small and hindering detection in regions where it was large. Incidentally, we observed a considerable covariation of source position, orientation, and their cross-subject variability in individual brain anatomical space, making it difficult to assess the impact of each of these variables independently of one another. We thus also performed simulations where we controlled spatial properties independently of individual anatomy. These additional simulations confirmed the strong impact of distance and orientation and further showed that orientation variability across subjects affects detectability, whereas position variability does not. Importantly, our study indicates that strict, unequivocal recommendations as to the ideal number of trials and subjects for any experiment cannot be realistically provided for neurophysiological studies. Rather, it highlights the importance of considering the spatial constraints underlying expected sources of activity while designing experiments.

Highlights: Adequate sample size (number of subjects and trials) is key to robust neuroscience. We simulated evoked MEG experiments and examined sensor-level detectability. Statistical power varied with source distance, orientation, and between-subject variability. Consider source detectability at sensor level when designing MEG studies. Sample size for MEG studies? Consider the source with the lowest expected statistical power.
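To make the logic of this kind of detectability simulation concrete, here is a toy Monte Carlo sketch. It replaces the real MEG forward model with a crude depth and orientation attenuation factor, so the function, parameters, and numbers below are illustrative assumptions, not the authors' pipeline (which uses the HCP resting-state data and a full lead field).

```python
# Illustrative Monte Carlo sketch of a group-level sensor-space detectability
# simulation. Real MEG forward modelling (e.g. with MNE-Python) is far more
# involved; here the lead field is reduced to a toy attenuation with source
# depth and a subject-specific orientation jitter, purely to show how
# detection rates can be estimated with paired t-tests across subjects.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def detection_rate(n_subjects, n_trials, depth_cm, orient_sd,
                   n_experiments=1000, alpha=0.05):
    """Fraction of simulated experiments with a significant paired t-test."""
    hits = 0
    for _ in range(n_experiments):
        # Toy sensor gain: decays with source depth and is scaled by the
        # cosine of a per-subject orientation jitter (an assumption, not a
        # real lead field).
        gain = (1.0 / depth_cm**2) * np.cos(rng.normal(0, orient_sd, n_subjects))
        # Trial-averaged sensor amplitude per subject in each condition.
        noise_sd = 1.0 / np.sqrt(n_trials)
        signal = gain + rng.normal(0, noise_sd, n_subjects)   # "signal" condition
        noise = rng.normal(0, noise_sd, n_subjects)           # "noise" condition
        _, p = stats.ttest_rel(signal, noise)
        hits += p < alpha
    return hits / n_experiments

# Example: a superficial, stably oriented source is detected far more often
# than a deep source with large cross-subject orientation variability.
print(detection_rate(n_subjects=25, n_trials=50, depth_cm=3, orient_sd=0.2))
print(detection_rate(n_subjects=25, n_trials=50, depth_cm=6, orient_sd=0.6))
```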


1990 ◽  
Vol 47 (1) ◽  
pp. 2-15 ◽  
Author(s):  
Randall M. Peterman

Ninety-eight percent of recently surveyed papers in fisheries and aquatic sciences that did not reject some null hypothesis (H0) failed to report β, the probability of making a type II error (not rejecting H0 when it should have been), or statistical power (1 – β). However, 52% of those papers drew conclusions as if H0 were true. A false H0 could have been missed because of a low-power experiment, caused by small sample size or large sampling variability. Costs of type II errors can be large (for example, for cases that fail to detect harmful effects of some industrial effluent or a significant effect of fishing on stock depletion). Past statistical power analyses show that abundance estimation techniques usually have high β and that only large effects are detectable. I review relationships among β, power, detectable effect size, sample size, and sampling variability. I show how statistical power analysis can help interpret past results and improve designs of future experiments, impact assessments, and management regulations. I make recommendations for researchers and decision makers, including routine application of power analysis, more cautious management, and reversal of the burden of proof to put it on industry, not management agencies.
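The relationships reviewed here (among β, power, detectable effect size, and sample size) can be explored numerically with standard tools. The brief sketch below uses statsmodels for a two-sample t-test; the effect sizes and sample sizes are placeholders, not values from the review.

```python
# Sketch of the trade-offs described above: for a two-sample t-test, fixing
# any three of effect size, sample size, alpha, and power determines the
# fourth. Numbers are illustrative only.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power (1 - beta) achieved for a given standardized effect and sample size.
power = analysis.solve_power(effect_size=0.4, nobs1=30, alpha=0.05)
print(f"power = {power:.2f}, so beta = {1 - power:.2f}")

# Minimum detectable effect size (Cohen's d) at 80% power with n = 30 per group.
mde = analysis.solve_power(nobs1=30, alpha=0.05, power=0.80)
print(f"minimum detectable effect = {mde:.2f}")
```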


2019 ◽  
Author(s):  
Rob Cribbie ◽  
Nataly Beribisky ◽  
Udi Alter

Many bodies recommend that a sample planning procedure, such as a traditional NHST a priori power analysis, be conducted during the planning stages of a study. Power analysis allows the researcher to estimate how many participants are required in order to detect a minimally meaningful effect size at a specific level of power and Type I error rate. However, there are several drawbacks to the procedure that render it "a mess." Specifically, identifying the minimally meaningful effect size is often difficult yet unavoidable if the procedure is to be conducted properly, the procedure is not precision oriented, and it does not guide the researcher to collect as many participants as is feasible. In this study, we explore how these three theoretical issues are reflected in applied psychological research in order to better understand whether they are concerns in practice. To investigate how power analysis is currently used, we reviewed the reporting of 443 power analyses in high-impact psychology journals in 2016 and 2017. We found that researchers rarely use a minimally meaningful effect size as the rationale for the effect chosen in a power analysis. Further, precision-based approaches and collecting the maximum feasible sample size are almost never used in tandem with power analyses. In light of these findings, we suggest that researchers focus on tools beyond traditional power analysis when planning samples, such as collecting the maximum sample size feasible.
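As a point of contrast with the traditional calculation, the sketch below shows one simple precision-oriented alternative of the kind the abstract alludes to: choosing n so that the confidence interval on the effect estimate is acceptably narrow ("accuracy in parameter estimation" style planning). This is not the authors' procedure; the function, the normal approximation, and the half-width target are all assumptions for illustration.

```python
# Minimal sketch of precision-based sample size planning: pick n so that the
# expected 95% CI on a standardized mean difference (Cohen's d) has at most a
# chosen half-width. Uses the rough approximation Var(d_hat) ~ 2/n for small d.
import math
from scipy import stats

def n_for_ci_halfwidth(halfwidth, conf=0.95):
    """Approximate per-group n for a target CI half-width on Cohen's d."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    # Half-width ~ z * sqrt(2/n)  =>  n ~ 2 * (z / halfwidth)^2.
    return math.ceil(2 * (z / halfwidth) ** 2)

# Planning for a half-width of 0.15 on d requires far more participants than a
# conventional "80% power to detect d = 0.5" calculation would suggest.
print(n_for_ci_halfwidth(0.15))   # per group
```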


2008 ◽  
Vol 90 (1) ◽  
pp. 58-61 ◽  
Author(s):  
SA Sexton ◽  
N Ferguson ◽  
C Pearce ◽  
DM Ricketts

INTRODUCTION
Many studies published in medical journals do not consider the statistical power required to detect a meaningful difference between study groups. As a result, these studies are often underpowered: the sample size may not be large enough to pick up a statistically significant difference (or other effect of interest) of a given size between the study groups. Therefore, the conclusion that there is no statistically significant difference between groups cannot be made unless a study has been shown to have sufficient power. The aim of this study was to establish the prevalence of negative studies with inadequate statistical power in British journals to which orthopaedic surgeons regularly submit.

MATERIALS AND METHODS
We assessed all papers in the six consecutive issues immediately prior to the start of the study (April 2005) of The Journal of Bone and Joint Surgery (British), Injury, and the Annals of the Royal College of Surgeons of England. We sought published evidence that a power analysis had been performed in association with the main hypothesis of the paper.

RESULTS
There were a total of 170 papers in which a statistical comparison of two or more groups was undertaken. Of these 170 papers, 49 (28.8%) stated as their primary conclusion that there was no statistically significant difference between the groups studied. Of these 49 papers, only 3 (6.1%) had performed a power analysis demonstrating adequate sample size.

CONCLUSIONS
These results demonstrate that the majority of the negative studies we examined in the British orthopaedic literature did not perform the statistical analysis necessary to support their stated conclusions. To remedy this, we recommend that the journals sampled include the following guidance in their instructions to authors: the statement 'no statistically significant difference was found between study groups' should be accompanied by the results of a power analysis.


1973 ◽  
Vol 10 (3) ◽  
pp. 225-229 ◽  
Author(s):  
Jacob Cohen

I was most pleased by the recent publication by Brewer (1972), "On the Power of Statistical Tests in the American Educational Research Journal", and understandably delighted with his heavy reliance, in accomplishing his survey, on my power handbook (Cohen, 1969). I strongly agree with his stress on the importance of power analysis. Further, his survey's confirmation of my finding of a decade ago (Cohen, 1962; 1965) that the neglect of power analysis results in generally low power is very useful, although not surprising. Unfortunately, however, some conceptual errors in the article may seriously mislead educational researchers and undermine our shared goal of promulgating power analysis. Hence, this note.


2021 ◽  
Author(s):  
Nick J. Broers ◽  
Henry Otgaar

Since the early work of Cohen (1962), psychological researchers have become aware of the importance of conducting a power analysis to ensure that the predicted effect will be detectable with sufficient statistical power. APA guidelines require researchers to provide a justification of the chosen sample size with reference to the expected effect size, an expectation that should be based on previous research. However, we argue that a credible estimate of an expected effect size is reasonable only under two conditions: either the new study is a direct replication of earlier work, or the outcome scale uses meaningful and familiar units that allow the quantification of a minimal effect of psychological interest. In practice, neither of these conditions is usually met. We propose a different rationale for a power analysis that will ensure that researchers are able to justify their sample size as meaningful and adequate.


2019 ◽  
Author(s):  
Marjan Bakker ◽  
Coosje Lisabet Sterre Veldkamp ◽  
Olmo Van den Akker ◽  
Marcel A. L. M. van Assen ◽  
Elise Anne Victoire Crompvoets ◽  
...  

In this preregistered study, we investigated whether the statistical power of a study is higher when researchers are asked to conduct a formal power analysis before collecting data. We compared the sample size descriptions from two sources: (i) a sample of pre-registrations created according to the guidelines for the Center for Open Science Preregistration Challenge (PCRs) and a sample of institutional review board (IRB) proposals from the Tilburg School of Social and Behavioral Sciences, both of which include a recommendation to conduct a formal power analysis, and (ii) a sample of pre-registrations created according to the guidelines for Open Science Framework Standard Pre-Data Collection Registrations (SPRs), in which no guidance on sample size planning is given. We found that the PCRs and IRB proposals (72%) more often included sample size decisions based on power analyses than the SPRs (45%). However, this did not result in larger planned sample sizes. The determined sample size of the PCRs and IRB proposals (Md = 90.50) was not higher than the determined sample size of the SPRs (Md = 126.00; W = 3389.5, p = 0.936). Typically, power analyses in the registrations were conducted with G*Power, assuming a medium effect size, α = .05, and a power of .80. Only 20% of the power analyses contained enough information to fully reproduce the results, and only 62% of these power analyses pertained to the main hypothesis test in the pre-registration. Therefore, we see ample room for improvement in the quality of the registrations, and we offer several recommendations to this end.
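Reproducing a reported power analysis requires at least the test, the assumed effect size, α, and the target power. The sketch below redoes the "typical" calculation described above with statsmodels, assuming a two-sided independent-samples t-test (an assumption on our part; the registrations used a variety of tests). For a medium effect of d = 0.5, α = .05, and power = .80, G*Power returns essentially the same answer, about 64 participants per group.

```python
# Reproducing a typical "medium effect, alpha = .05, power = .80" calculation,
# assuming a two-sided independent-samples t-test (illustrative assumption).
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                          power=0.80, alternative='two-sided')
print(f"required sample size per group: {n_per_group:.1f}")  # ~63.8
```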

