Estimating the statistical power to detect set size effects in contralateral delay activity

2020
Author(s):  
William Xiang Quan Ngiam ◽  
Kirsten C. S. Adam ◽  
Colin Quirk ◽  
Edward K. Vogel ◽  
Ed Awh

The contralateral delay activity (CDA) is an event-related potential component commonly used to examine the online processes of visual working memory. Here, we provide a robust analysis of the statistical power that is needed to achieve reliable and reproducible results with the CDA. Using two very large EEG datasets that contrasted CDA amplitude between set sizes of 2 and 6 items (Unsworth et al., 2015) and set sizes of 2 and 4 items (Hakim et al., 2019), we present a subsampling analysis that estimates the statistical power achieved with varying numbers of subjects and trials, based on the proportion of significant tests in 10,000 iterations. We also generated simulated data using Bayesian multilevel modelling to estimate power beyond the bounds of the original datasets. The number of trials and subjects required depends critically on the effect size. Detecting the presence of the CDA (a reliable difference between contralateral and ipsilateral electrodes during the memory period) required only 30-50 clean trials with a sample of 25 subjects to achieve approximately 80% statistical power. However, detecting a difference in CDA amplitude between two set sizes required substantially more trials and subjects: approximately 400 clean trials with 25 subjects to achieve 80% power. Thus, to achieve robust tests of how CDA activity differs across conditions, it is essential to be mindful of the estimated effect size. We recommend that researchers designing experiments to detect set size differences in the CDA collect substantially more trials per subject.
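The subsampling logic described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the effect size, trial noise, and between-subject variability below are invented placeholder values, and a simple one-sample t-test on subject-level difference scores stands in for the full CDA analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Placeholder values for illustration only (not estimates from the paper):
TRUE_DIFF = 0.15   # microvolt CDA set-size difference per subject
SUBJ_SD = 0.3      # between-subject variability of that difference
TRIAL_SD = 3.0     # single-trial noise

def subject_mean_diff(n_trials):
    """One subject's mean set-size difference, averaged over noisy trials."""
    subject_effect = rng.normal(TRUE_DIFF, SUBJ_SD)
    return rng.normal(subject_effect, TRIAL_SD, size=n_trials).mean()

def estimated_power(n_subjects, n_trials, n_iterations=500, alpha=0.05):
    """Proportion of simulated experiments yielding a significant effect."""
    hits = 0
    for _ in range(n_iterations):
        diffs = [subject_mean_diff(n_trials) for _ in range(n_subjects)]
        t, p = stats.ttest_1samp(diffs, 0.0)
        hits += (p < alpha) and (t > 0)
    return hits / n_iterations

power = estimated_power(n_subjects=25, n_trials=400)
```

Sweeping `n_subjects` and `n_trials` over a grid and plotting the resulting power surface reproduces the shape of the trade-off the abstract describes: trial counts matter most when the per-subject effect is small relative to trial noise.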

2021
Vol 58 (5)
Author(s):  
William X. Q. Ngiam ◽  
Kirsten C. S. Adam ◽  
Colin Quirk ◽  
Edward K. Vogel ◽  
Edward Awh

2012
Vol 60 (6)
pp. 381
Author(s):  
Evan Watkins ◽  
Julian Di Stefano

Hypotheses relating to the annual frequency distribution of mammalian births are commonly tested using a goodness-of-fit procedure. Several interacting factors influence the statistical power of these tests, but no power studies have been conducted using scenarios derived from biological hypotheses. Corresponding to theories relating reproductive output to seasonal resource fluctuation, we simulated data reflecting a winter reduction in birth frequency to test the effect of four factors (sample size, maximum effect size, the temporal pattern of response and the number of categories used for analysis) on the power of three goodness-of-fit procedures: the G-test, the chi-square test and Watson's U² test. Analyses resulting in high power all had a large maximum effect size (60%) and were associated with a sample size of 200 on most occasions. The G-test was the most powerful when data were analysed using two temporal categories (winter and other) while Watson's U² test achieved the highest power when 12 monthly categories were used. Overall, the power of most modelled scenarios was low. Consequently, we recommend using power analysis as a research planning tool, and have provided a spreadsheet enabling a priori power calculations for the three tests considered.
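One of the scenarios above (a 60% winter reduction in birth frequency, tested with a chi-square goodness-of-fit procedure) can be sketched as a short simulation. The placement of the winter months and the flat shape of the reduction are assumptions for illustration, not the exact response patterns modelled in the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical monthly birth probabilities with a 60% reduction in three
# winter months (indices 5-7 here); the shape is an assumption.
probs = np.full(12, 1.0 / 12.0)
probs[5:8] *= 0.4                  # 60% maximum effect size
probs /= probs.sum()

def gof_power(n_births, probs, n_sims=2000, alpha=0.05):
    """Power of a chi-square goodness-of-fit test against a uniform null."""
    hits = 0
    for _ in range(n_sims):
        counts = rng.multinomial(n_births, probs)
        _, p = stats.chisquare(counts)   # expected counts are equal by default
        hits += p < alpha
    return hits / n_sims

power_200 = gof_power(200, probs)
```

Re-running with smaller samples or a weaker reduction shows how quickly power collapses, which is consistent with the abstract's conclusion that most modelled scenarios had low power.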


2021
Author(s):  
Enrico Fucci ◽  
Arnaud Poublan-couzardot ◽  
Oussama Abdoun ◽  
Antoine Lutz

The auditory mismatch negativity (MMN) is a well-characterized event-related potential component that has gained recent attention in theoretical models describing the impact of various styles of mindfulness meditation on attentional processes and perceptual inference. Previous findings highlighted a differential modulation of MMN amplitude by meditation states and degrees of expertise. In the present study, we attempted to replicate results from the recent literature with a data sample that allowed for increased statistical power compared to previous experiments. Relying on traditional frequentist analysis, we found no effects of meditation states or expertise on auditory MMN amplitude, failing to replicate our previous work (Fucci et al., 2018). Using a Bayesian approach, we found strong evidence against an interaction effect between expertise groups and meditation states on MMN amplitude, and only moderate evidence in favour of a weak effect of expertise during focused-attention practice. On the other hand, we replicated previous evidence of increased alpha oscillatory power during meditation practices compared to a control state. We discuss our null findings in relation to factors that could undermine the replicability of previous research on this subject, namely low statistical power, flexible analysis methods, and a possible publication bias leading to a misrepresentation of the available evidence.
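Quantifying evidence *for* a null, as the abstract does, requires a Bayesian quantity rather than a p-value. A minimal sketch, using the BIC approximation to the Bayes factor for a paired t-test (Wagenmakers, 2007), is shown below; this is a generic device, not the multilevel Bayesian analysis the authors ran, and the data are invented.

```python
import numpy as np
from scipy import stats

def bf01_from_t(t, n):
    """BIC approximation to the Bayes factor in favour of the null (BF01)
    for a one-sample / paired t-test on n observations."""
    return np.sqrt(n) * (1.0 + t**2 / (n - 1)) ** (-n / 2.0)

rng = np.random.default_rng(1)
# Invented paired MMN amplitude differences (state A minus state B, in
# microvolts) with no true effect.
diffs = rng.normal(0.0, 1.0, size=60)
t, p = stats.ttest_1samp(diffs, 0.0)
bf01 = bf01_from_t(float(t), len(diffs))
# Conventionally, bf01 > 3 is read as moderate evidence for the null.
```

Note that BF01 is capped at sqrt(n) under this approximation (reached when t = 0), which is one reason large samples are needed to make a convincing case for a null effect.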


2018
Author(s):  
Simon Leipold ◽  
Chantal Oderbolz ◽  
Marielle Greber ◽  
Lutz Jaencke

Musicians with absolute pitch effortlessly identify the pitch of a sound without an external reference. Previous neuroscientific studies of absolute pitch have typically had small sample sizes and low statistical power, making them susceptible to false positive findings. In a seminal study, Itoh et al. (2005) reported the elicitation of an absolute pitch-specific event-related potential component during tone listening: the AP negativity. Additionally, they identified several components as correlates of relative pitch, the ability to identify relations between pitches. Here, we attempted to replicate the main findings of Itoh et al.'s study in a large sample of musicians (n = 104) using both frequentist and Bayesian inference. We were not able to replicate the presence of an AP negativity during tone listening in individuals with high levels of absolute pitch, but we partially replicated the findings concerning the correlates of relative pitch. Our results are consistent with several previous studies reporting an absence of differences between musicians with and without absolute pitch in early auditory evoked potential components. We conclude that replication studies are a crucial part of assessing extraordinary findings, all the more so in small fields where a single finding can have a large impact on further research.


2019
Author(s):  
Kyla D. Gibney ◽  
George Kypriotakis ◽  
Paul M. Cinciripini ◽  
Jason D. Robinson ◽  
Jennifer A. Minnix ◽  
...  

The late positive potential (LPP) is a common measurement used to study emotional processes of subjects in event-related potential (ERP) paradigms. Despite its extensive use in affective neuroscience, there is presently no gold standard for how to appropriately power ERP studies using the LPP in within-subject and between-subjects experimental designs. The present study investigates how the number of trials, number of subjects, and magnitude of the effect size affect statistical power in analyses of the LPP. Using Monte Carlo simulations of ERP experiments with varying numbers of trials, subjects, and effect sizes, we measured the probability of obtaining a statistically significant effect in 1,489 different experiments repeated 1,000 times each. Predictably, our results showed that statistical power increases with increasing numbers of trials and subjects and at larger effect sizes. In addition, we found that higher levels of statistical power can be achieved with lower numbers of subjects and trials and at lower effect sizes in within-subject than in between-subjects designs. Furthermore, we found that, as subjects are added to an experiment, the slope of the relationship between effect size and statistical power increases and shifts to the left until the power asymptotes to nearly 100% at higher effect sizes. This suggests that adding more subjects greatly increases statistical power at lower effect sizes (<1 µV) compared with more robust (>1.5 µV) effect sizes.
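The within- vs between-subjects contrast in the abstract follows from a standard Monte Carlo argument, sketched below. The parameter values (a 1 µV condition difference, 2 µV between-subject SD, unit noise on each subject's condition mean) are illustrative assumptions, not values from the study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def lpp_power(design, n_per_group=30, effect=1.0, subj_sd=2.0,
              n_sims=2000, alpha=0.05):
    """Simulated power for a hypothetical 1 microvolt LPP effect."""
    hits = 0
    for _ in range(n_sims):
        if design == "within":
            # Same subjects in both conditions: baseline cancels out.
            base = rng.normal(0.0, subj_sd, n_per_group)
            a = base + rng.normal(0.0, 1.0, n_per_group)
            b = base + effect + rng.normal(0.0, 1.0, n_per_group)
            p = stats.ttest_rel(a, b).pvalue
        else:
            # Different subjects per condition: baseline adds to the noise.
            a = rng.normal(0.0, subj_sd, n_per_group) + rng.normal(0.0, 1.0, n_per_group)
            b = rng.normal(effect, subj_sd, n_per_group) + rng.normal(0.0, 1.0, n_per_group)
            p = stats.ttest_ind(a, b).pvalue
        hits += p < alpha
    return hits / n_sims

within_power = lpp_power("within")
between_power = lpp_power("between")
```

The within-subject design wins because subtracting each subject's own baseline removes the between-subject variance from the error term, which is exactly why it reaches a given power level with fewer subjects and trials.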


2018
Author(s):  
Blair Saunders ◽  
Michael Inzlicht

Many clinical neuroscience investigations have suggested that trait anxiety is associated with increased neural reactivity to mistakes in the form of an event-related potential called the error-related negativity (ERN). Several recent meta-analyses indicated that the anxiety-ERN association was of small-to-medium effect size; however, these prior investigations did not comprehensively adjust effect sizes for publication bias. Here, in an updated meta-analysis (k = 58, N = 3819), we found support for an uncorrected effect size of r = -.19, and applied a range of methods to test for and correct publication bias (trim-and-fill, PET, PEESE, Peters' test, three-parameter selection model). The majority of bias-correction methods suggested that the correlation between anxiety and the ERN is non-zero but smaller than the uncorrected effect size (average adjusted effect size: r = -.12; range: r = -.05 to -.18). Moderation analyses also revealed more robust effects for clinical anxiety and for anxious samples characterised by worry; however, these larger effects were also associated with elevated indicators of publication bias relative to the overall analysis. Mixed anxiety and sub-clinical anxiety were not associated with the amplitude of the ERN. Our results suggest that the anxiety-ERN relationship survives multiple corrections for publication bias, albeit not among all sub-types and populations of anxiety. Nevertheless, only 50% of the studies included in our analysis reported significant results, indicating that future research exploring the anxiety-ERN relationship would benefit from increased statistical power.
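Of the correction methods listed, PET and PEESE are simple enough to sketch: both are precision-weighted meta-regressions of effect size on the standard error (PET) or its square (PEESE), with the intercept read as the estimate at SE = 0, i.e. adjusted for small-study effects. The toy literature below is invented and only illustrates the mechanics.

```python
import numpy as np

def pet_peese_intercepts(effects, ses):
    """Minimal PET and PEESE: weighted least-squares regressions of effect
    size on SE (PET) or SE squared (PEESE); returns the two intercepts."""
    w = 1.0 / ses**2                      # inverse-variance weights

    def wls_intercept(predictor):
        X = np.column_stack([np.ones_like(predictor), predictor])
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ effects)
        return float(beta[0])

    return wls_intercept(ses), wls_intercept(ses**2)

# Invented toy literature: a true effect around -.12 plus a small-study
# component that pulls imprecise studies further from zero.
rng = np.random.default_rng(3)
ses = rng.uniform(0.05, 0.25, size=58)
effects = -0.12 - 0.5 * ses + rng.normal(0.0, ses)
pet_est, peese_est = pet_peese_intercepts(effects, ses)
```

In practice PET-PEESE is used conditionally (PEESE only when PET rejects a zero effect), and full implementations also report intercept standard errors; this sketch shows only the core regression step.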


2019
Vol 227 (4)
pp. 261-279
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
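The core of such an evaluation is a generator of biased literatures. A minimal sketch of one such generator is below; the selection rule (positive significant results always publish, everything else rarely) and all parameter values are assumptions for illustration, far simpler than the simulation conditions described above.

```python
import numpy as np

rng = np.random.default_rng(11)

def simulate_literature(true_d=0.0, n_studies=40, pub_prob_nonsig=0.05):
    """One crude model of publication bias: positive significant results
    are always published, everything else only with a small probability."""
    effects, ses = [], []
    while len(effects) < n_studies:
        n = int(rng.integers(20, 100))     # per-group sample size
        se = float(np.sqrt(2.0 / n))       # rough SE of Cohen's d
        d = float(rng.normal(true_d, se))
        if d / se > 1.645 or rng.random() < pub_prob_nonsig:
            effects.append(d)
            ses.append(se)
    return np.array(effects), np.array(ses)

effects, ses = simulate_literature()
# Despite a true effect of zero, the published record looks positive:
naive_mean = float(np.average(effects, weights=1.0 / ses**2))
share_sig = float((effects / ses > 1.645).mean())
```

Feeding many such simulated literatures to each candidate detection method, under varying true effects and selection strengths, yields the Type I error and power estimates the study reports.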


2019
Author(s):  
Curtis David Von Gunten ◽  
Bruce D Bartholow

A primary psychometric concern with laboratory-based inhibition tasks has been their reliability. However, a reliable measure may not be necessary or sufficient for reliably detecting effects (statistical power). The current study used a bootstrap sampling approach to systematically examine how the number of participants, the number of trials, the magnitude of an effect, and study design (between- vs. within-subject) jointly contribute to power in five commonly used inhibition tasks. The results demonstrate the shortcomings of relying solely on measurement reliability when determining the number of trials to use in an inhibition task: high internal reliability can be accompanied by low power, and low reliability can be accompanied by high power. For instance, adding trials once sufficient reliability has been reached can still produce large gains in power. The dissociation between reliability and power was particularly apparent in between-subject designs, where the number of participants contributed greatly to power but little to reliability, and where the number of trials contributed greatly to reliability but only modestly (depending on the task) to power. For between-subject designs, the probability of detecting small-to-medium-sized effects with 150 participants (total) was generally less than 55%. However, effect size was positively associated with number of trials. Thus, researchers have some control over effect size, and this needs to be considered when conducting power analyses using analytic methods that take such effect sizes as an argument. Results are discussed in the context of recent claims regarding the role of inhibition tasks in experimental and individual difference designs.
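The dissociation described above (high power alongside low reliability) is easy to reproduce in a toy simulation. All parameter values below are invented to make the point and do not come from the five tasks studied: a large group-level congruency cost with little between-subject variation gives an easy group test but nothing stable for subject means to correlate on.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Invented parameters: 30 ms mean congruency cost, only 10 ms
# between-subject SD, but very noisy single trials (SD 150 ms).
N_SUBJECTS, TRUE_EFFECT, SUBJ_SD, TRIAL_SD = 40, 30.0, 10.0, 150.0

def simulate(n_trials):
    """Subjects x trials matrix of single-trial congruency costs."""
    subj = rng.normal(TRUE_EFFECT, SUBJ_SD, N_SUBJECTS)
    return subj[:, None] + rng.normal(0.0, TRIAL_SD, (N_SUBJECTS, n_trials))

def split_half_reliability(data):
    """Spearman-Brown-corrected correlation of even- vs odd-trial means."""
    even = data[:, ::2].mean(axis=1)
    odd = data[:, 1::2].mean(axis=1)
    r = np.corrcoef(even, odd)[0, 1]
    return 2.0 * r / (1.0 + r)

def group_power(n_trials, n_sims=500, alpha=0.05):
    """Power of a one-sample t-test on subject mean congruency costs."""
    hits = 0
    for _ in range(n_sims):
        means = simulate(n_trials).mean(axis=1)
        hits += stats.ttest_1samp(means, 0.0).pvalue < alpha
    return hits / n_sims

reliability_64 = split_half_reliability(simulate(64))
power_64 = group_power(64)
```

With these settings the group effect is detected almost every time while split-half reliability stays low, because reliability depends on true between-subject variance whereas the group test depends on the mean effect relative to the noise on subject means.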

