Estimating statistical power for event-related potential studies using the late positive potential

2019
Vol 57 (2)
Author(s):
Kyla D. Gibney
George Kypriotakis
Paul M. Cinciripini
Jason D. Robinson
Jennifer A. Minnix
...

Abstract The late positive potential (LPP) is a common measurement used to study emotional processes of subjects in event-related potential (ERP) paradigms. Despite its extensive use in affective neuroscience, there is presently no gold standard for how to appropriately power ERP studies using the LPP in within-subject and between-subjects experimental designs. The present study investigates how the number of trials, number of subjects, and magnitude of the effect size affect statistical power in analyses of the LPP. Using Monte Carlo simulations of ERP experiments with varying numbers of trials, subjects, and effect sizes, we measured the probability of obtaining a statistically significant effect in 1,489 different experiments repeated 1,000 times each. Predictably, our results showed that statistical power increases with increasing numbers of trials and subjects and at larger effect sizes. In addition, we found that higher levels of statistical power can be achieved with lower numbers of subjects and trials and at lower effect sizes in within-subject than in between-subjects designs. Furthermore, we found that, as subjects are added to an experiment, the slope of the relationship between effect size and statistical power increases and shifts to the left until the power asymptotes to nearly 100% at higher effect sizes. This suggests that adding more subjects greatly increases statistical power at lower effect sizes (<1 µV) compared with more robust (>1.5 µV) effect sizes.
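
As a rough illustration of the kind of Monte Carlo power simulation described above, the Python sketch below generates within-subject LPP-style experiments, averages simulated single-trial noise into subject-level condition means, and counts how often a paired t-test reaches significance. The trial-level and subject-level noise parameters are assumptions for illustration, not the values used by the authors.

```python
import numpy as np
from scipy import stats

def simulate_power(n_subjects, n_trials, effect_uv, n_sims=1000,
                   subject_sd=2.0, trial_sd=8.0, alpha=0.05):
    """Monte Carlo power estimate for a within-subject LPP-style contrast.

    effect_uv  -- true condition difference in microvolts
    subject_sd -- between-subject SD of the true difference (assumed)
    trial_sd   -- single-trial noise SD in microvolts (assumed)
    """
    rng = np.random.default_rng(1)
    hits = 0
    for _ in range(n_sims):
        # each subject's true effect varies around the population effect
        true_diff = rng.normal(effect_uv, subject_sd, n_subjects)
        # subject-level condition means: trial noise shrinks with sqrt(n_trials)
        sem = trial_sd / np.sqrt(n_trials)
        cond_a = rng.normal(0.0, sem, n_subjects)
        cond_b = true_diff + rng.normal(0.0, sem, n_subjects)
        # paired t-test on subject-level means
        _, p = stats.ttest_rel(cond_b, cond_a)
        hits += p < alpha
    return hits / n_sims

# e.g. 30 subjects, 40 trials per condition, 1 microvolt effect
print(simulate_power(n_subjects=30, n_trials=40, effect_uv=1.0))
```

Sweeping n_subjects, n_trials, and effect_uv over a grid reproduces the qualitative pattern reported above: power rises with all three and saturates near 100% at large effects.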


2019
Vol 50 (12)
pp. 2085-2095
Author(s):  
William Vallet ◽  
Antoine Hone-Blanchet ◽  
Jerome Brunelin

Abstract Background: Individuals with psychopathic traits display deficits in emotional processing. A key event-related potential component involved in emotional processing is the late positive potential (LPP). In healthy controls, LPP amplitude is greater in response to negative stimuli than to positive or neutral stimuli. In the current study, we aimed to compare LPP amplitudes between individuals with psychopathic traits and control subjects when presented with negative, positive or neutral stimuli. We hypothesized that LPP amplitude evoked by emotional stimuli would be reduced in individuals with psychopathic traits compared to healthy controls. Methods: After a systematic review of the literature, we conducted a meta-analysis to compare LPP amplitude elicited by emotional stimuli in individuals with psychopathic traits and healthy controls. Results: Individuals with psychopathic traits showed significantly reduced LPP amplitude evoked by negative stimuli (mean effect size = −0.47; 95% CI −0.60 to −0.33; p < 0.005) compared to healthy controls. No significant differences between groups were observed for the processing of positive (mean effect size = −0.15; 95% CI −0.42 to 0.12; p = 0.28) and neutral stimuli (mean effect size = −0.12; 95% CI −0.31 to 0.07; p = 0.21). Conclusions: As measured by LPP amplitude, individuals with psychopathic traits displayed abnormalities in the processing of emotional stimuli with negative valence, whereas processing of stimuli with positive and neutral valence was unchanged as compared with healthy controls.
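
For readers unfamiliar with the pooling step behind numbers like those above, here is a minimal Python sketch of generic inverse-variance random-effects pooling (DerSimonian-Laird). The per-study effect sizes and standard errors are invented for illustration, and this is not necessarily the exact estimator used in this meta-analysis.

```python
import numpy as np
from scipy import stats

def random_effects_pool(d, se):
    """DerSimonian-Laird random-effects pooling of per-study effect sizes.
    d, se -- effect sizes and their standard errors (illustrative values)."""
    d, se = np.asarray(d, float), np.asarray(se, float)
    w = 1.0 / se**2                                 # fixed-effect weights
    d_fixed = np.sum(w * d) / np.sum(w)
    q = np.sum(w * (d - d_fixed)**2)                # heterogeneity statistic Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(d) - 1)) / c)         # between-study variance
    w_re = 1.0 / (se**2 + tau2)                     # random-effects weights
    pooled = np.sum(w_re * d) / np.sum(w_re)
    se_pooled = np.sqrt(1.0 / np.sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    p = 2 * stats.norm.sf(abs(pooled / se_pooled))
    return pooled, ci, p

# hypothetical per-study effect sizes and standard errors
print(random_effects_pool([-0.5, -0.3, -0.6, -0.4], [0.15, 0.20, 0.18, 0.22]))
```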


2021
Vol 3 (1)
pp. 61-89
Author(s):  
Stefan Geiß

Abstract This study uses Monte Carlo simulation techniques to estimate the minimum required levels of intercoder reliability in content analysis data for testing correlational hypotheses, depending on sample size, effect size and coder behavior under uncertainty. The ensuing procedure is analogous to power calculations for experimental designs. In the most widespread sample size/effect size settings, the rule of thumb that chance-adjusted agreement should be ≥.800 or ≥.667 corresponds to the simulation results, yielding acceptable α and β error rates. However, the simulation allows precise power calculations that can consider the specifics of each study's context, moving beyond one-size-fits-all recommendations. Studies with low sample sizes and/or low expected effect sizes may need coder agreement above .800 to test a hypothesis with sufficient statistical power. In studies with high sample sizes and/or high expected effect sizes, coder agreement below .667 may suffice. Such calculations can help both in evaluating and in designing studies. Particularly in pre-registered research, higher sample sizes may be used to compensate for low expected effect sizes and/or borderline coding reliability (e.g. when constructs are hard to measure). I supply equations, easy-to-use tables and R functions to facilitate use of this framework, along with example code as an online appendix.
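
A minimal Python sketch of the underlying logic: imperfect coding adds measurement error to the coded variable, which attenuates the observed correlation and lowers the power of the hypothesis test. The way reliability is converted into noise here is a simplification for illustration, not Geiß's coder-behavior model.

```python
import numpy as np
from scipy import stats

def power_with_unreliable_coding(n, true_r, reliability, n_sims=2000, alpha=0.05):
    """Power of a correlation test when the coded variable is measured with
    error; 'reliability' is the share of observed variance that is true-score
    variance (a simplification of coder behavior)."""
    rng = np.random.default_rng(7)
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n)
        y = true_r * x + np.sqrt(1 - true_r**2) * rng.normal(size=n)
        # imperfect coding = attenuating noise added to the coded variable
        noise_sd = np.sqrt((1 - reliability) / reliability)
        y_coded = y + noise_sd * rng.normal(size=n)
        _, p = stats.pearsonr(x, y_coded)
        hits += p < alpha
    return hits / n_sims

# e.g. n = 200 units, true r = .20, coder reliability of .67 versus .80
print(power_with_unreliable_coding(200, 0.20, 0.67))
print(power_with_unreliable_coding(200, 0.20, 0.80))
```

Comparing the two calls shows how much statistical power is lost to borderline coding reliability at a fixed sample size and expected effect.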


2017
Author(s):  
Clarissa F. D. Carneiro ◽  
Thiago C. Moulin ◽  
Malcolm R. Macleod ◽  
Olavo B. Amaral

Abstract Proposals to increase research reproducibility frequently call for focusing on effect sizes instead of p values, as well as for increasing the statistical power of experiments. However, it is unclear to what extent these two concepts are indeed taken into account in basic biomedical science. To study this in a real-case scenario, we performed a systematic review of effect sizes and statistical power in studies on learning of rodent fear conditioning, a widely used behavioral task to evaluate memory. Our search criteria yielded 410 experiments comparing control and treated groups in 122 articles. Interventions had a mean effect size of 29.5%, and amnesia caused by memory-impairing interventions was nearly always partial. Mean statistical power to detect the average effect size observed in well-powered experiments with significant differences (37.2%) was 65%, and was lower among studies with non-significant results. Only one article reported a sample size calculation, and our estimated sample size to achieve 80% power considering typical effect sizes and variances (15 animals per group) was reached in only 12.2% of experiments. Actual effect sizes correlated with effect size inferences made by readers on the basis of textual descriptions of results only when findings were non-significant, and neither effect size nor power correlated with study quality indicators, number of citations or impact factor of the publishing journal. In summary, effect sizes and statistical power have a wide distribution in the rodent fear conditioning literature, but do not seem to have a large influence on how results are described or cited. Failure to take these concepts into consideration might limit attempts to improve reproducibility in this field of science.
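
Power and sample-size figures like those above come from standard two-group calculations; a minimal sketch using statsmodels is shown below. The standardized effect size (Cohen's d = 1.0) and group size are illustrative placeholders, since the abstract reports effects as percentage differences rather than as d.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# power of a two-group comparison with 10 animals per group,
# assuming a standardized effect of d = 1.0 (illustrative value)
power = analysis.power(effect_size=1.0, nobs1=10, ratio=1.0, alpha=0.05)
print(f"power with n=10/group: {power:.2f}")

# group size needed for 80% power at the same assumed effect size
n_needed = analysis.solve_power(effect_size=1.0, power=0.8, ratio=1.0, alpha=0.05)
print(f"animals per group for 80% power: {n_needed:.1f}")
```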


2020
Vol 63 (5)
pp. 1572-1580
Author(s):  
Laura Gaeta ◽  
Christopher R. Brydges

Purpose The purpose was to examine and determine effect size distributions reported in published audiology and speech-language pathology research in order to provide researchers and clinicians with more relevant guidelines for the interpretation of potentially clinically meaningful findings. Method Cohen's d, Hedges' g, Pearson r, and sample sizes (n = 1,387) were extracted from 32 meta-analyses in journals in speech-language pathology and audiology. Percentile ranks (25th, 50th, 75th) were calculated to determine estimates for small, medium, and large effect sizes, respectively. The median sample size was also used to explore statistical power for small, medium, and large effect sizes. Results For individual differences research, effect sizes of Pearson r = .24, .41, and .64 were found. For group differences, Cohen's d/Hedges' g = 0.25, 0.55, and 0.93. These values can be interpreted as small, medium, and large effect sizes in speech-language pathology and audiology. The majority of published research was inadequately powered to detect a medium effect size. Conclusions Effect size interpretations from published research in audiology and speech-language pathology were found to be underestimated based on Cohen's (1988, 1992) guidelines. Researchers in the field should consider using Pearson r = .25, .40, and .65 and Cohen's d/Hedges' g = 0.25, 0.55, and 0.95 as small, medium, and large effect sizes, respectively, and collect larger sample sizes to ensure that both significant and nonsignificant findings are robust and replicable.
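
A minimal Python sketch of the two computations described: percentile ranks of a pool of effect sizes to derive field-specific small/medium/large benchmarks, and power at an assumed median group size. The effect-size values and the group size of 20 are invented for illustration.

```python
import numpy as np
from statsmodels.stats.power import TTestIndPower

# hypothetical collection of Cohen's d values extracted from meta-analyses
d_values = np.array([0.10, 0.22, 0.31, 0.48, 0.55, 0.61, 0.88, 0.95, 1.20])

# field-specific small / medium / large benchmarks = 25th, 50th, 75th percentiles
small, medium, large = np.percentile(d_values, [25, 50, 75])
print(f"small={small:.2f}  medium={medium:.2f}  large={large:.2f}")

# power to detect the medium benchmark at an assumed median group size of 20
power = TTestIndPower().power(effect_size=medium, nobs1=20, ratio=1.0, alpha=0.05)
print(f"power at median n: {power:.2f}")
```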


2021
Author(s):  
Kleber Neves ◽  
Pedro Batista Tan ◽  
Olavo Bohrer Amaral

Diagnostic screening models for the interpretation of null hypothesis significance test (NHST) results have been influential in highlighting the effect of selective publication on the reproducibility of the published literature, leading to John Ioannidis' much-cited claim that most published research findings are false. These models, however, are typically based on the assumption that hypotheses are dichotomously true or false, without considering that effect sizes for different hypotheses are not the same. To address this limitation, we develop a simulation model that represents effect sizes explicitly, using different continuous distributions, while retaining other aspects of previous models such as publication bias and the pursuit of statistical significance. Our results show that the combination of selective publication, bias, low statistical power and unlikely hypotheses consistently leads to high proportions of false positives, irrespective of the effect size distribution assumed. Using continuous effect sizes also allows us to evaluate the degree of effect size overestimation and the prevalence of estimates with the wrong sign in the literature, showing that the same factors that drive false-positive results also lead to errors in estimating effect size direction and magnitude. Nevertheless, the relative influence of these factors on different metrics varies depending on the distribution assumed for effect sizes. The model is made available as an R Shiny app interface, allowing one to explore features of the literature under various scenarios.
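
A minimal Python sketch of a screening-style simulation with continuous effect sizes: each simulated experiment draws a true effect from a continuous distribution concentrated near zero, runs a two-group comparison, and passes through a simple publication filter favoring significance; sign errors and exaggeration are then tallied among the published significant estimates. The distributions and publication rates here are illustrative assumptions, not the authors' model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_exp, n_per_group = 20000, 20

# continuous effect-size distribution: most true effects near zero (assumed)
true_d = rng.normal(0.0, 0.2, n_exp)

a = rng.normal(0.0, 1.0, (n_exp, n_per_group))
b = rng.normal(true_d[:, None], 1.0, (n_exp, n_per_group))
t, p = stats.ttest_ind(b, a, axis=1)
obs_d = b.mean(axis=1) - a.mean(axis=1)   # observed mean difference (≈ d, since SD = 1)

# simple publication filter: significant results always published,
# null results published 10% of the time (illustrative rates)
published = (p < 0.05) | (rng.random(n_exp) < 0.10)

sig_pub = published & (p < 0.05)
wrong_sign = np.sign(obs_d[sig_pub]) != np.sign(true_d[sig_pub])
exaggeration = np.abs(obs_d[sig_pub]) / np.abs(true_d[sig_pub])

print(f"published results that are significant: {sig_pub.sum()}")
print(f"share with wrong sign: {wrong_sign.mean():.2f}")
print(f"median exaggeration factor: {np.median(exaggeration):.1f}")
```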


2017
Vol 28 (11)
pp. 1547-1562
Author(s):  
Samantha F. Anderson ◽  
Ken Kelley ◽  
Scott E. Maxwell

The sample size necessary to obtain a desired level of statistical power depends in part on the population value of the effect size, which is, by definition, unknown. A common approach to sample-size planning uses the sample effect size from a prior study as an estimate of the population value of the effect to be detected in the future study. Although this strategy is intuitively appealing, effect-size estimates, taken at face value, are typically not accurate estimates of the population effect size because of publication bias and uncertainty. We show that the use of this approach often results in underpowered studies, sometimes to an alarming degree. We present an alternative approach that adjusts sample effect sizes for bias and uncertainty, and we demonstrate its effectiveness for several experimental designs. Furthermore, we discuss an open-source R package, BUCSS, and user-friendly Web applications that we have made available to researchers so that they can easily implement our suggested methods.
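
BUCSS implements the authors' bias- and uncertainty-adjusted method; the Python sketch below does not reproduce that adjustment, but illustrates the problem it addresses: planning a follow-up study from the face-value effect size of a prior significant study yields less than the nominal 80% power when the true effect is smaller. All parameter values are assumptions for illustration.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestIndPower

rng = np.random.default_rng(3)
true_d, n_pilot, n_sims = 0.3, 25, 500
planner = TTestIndPower()
achieved = []

for _ in range(n_sims):
    # keep drawing "prior studies" until one is significant (publication filter)
    while True:
        a = rng.normal(0.0, 1.0, n_pilot)
        b = rng.normal(true_d, 1.0, n_pilot)
        _, p = stats.ttest_ind(b, a)
        if p < 0.05:
            break
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    d_obs = abs(b.mean() - a.mean()) / pooled_sd   # face-value effect size
    # plan the follow-up for nominal 80% power using that estimate
    n_plan = int(np.ceil(planner.solve_power(effect_size=d_obs, power=0.8, alpha=0.05)))
    # actual power of the planned study under the true effect
    achieved.append(planner.power(effect_size=true_d, nobs1=n_plan, alpha=0.05))

print(f"nominal power: 0.80, median achieved power: {np.median(achieved):.2f}")
```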


2007
Vol 25 (23)
pp. 3482-3487
Author(s):  
Philippe L. Bedard ◽  
Monika K. Krzyzanowska ◽  
Melania Pintilie ◽  
Ian F. Tannock

Purpose To investigate the prevalence of underpowered randomized controlled trials (RCTs) presented at American Society of Clinical Oncology (ASCO) annual meetings. Methods We surveyed all two-arm phase III RCTs presented at ASCO annual meetings from 1995 to 2003 for which negative results were obtained. Post hoc calculations were performed using a power of 80% and an α level of .05 (two-sided) to determine sample sizes required to detect small, medium, and large effect sizes. For studies reporting a proportion or time-to-event as primary end point, effect size was expressed as an odds ratio (OR) or hazard ratio (HR), respectively, with a small effect size defined as OR/HR ≥ 1.3, medium effect size defined as OR/HR ≥ 1.5, and large effect size defined as OR/HR ≥ 2.0. Logistic regression was used to identify factors associated with lack of statistical power. Results Of 423 negative RCTs for which post hoc sample size calculations could be performed, 45 (10.6%), 138 (32.6%), and 233 (55.1%) had adequate sample size to detect small, medium, and large effect sizes, respectively. Only 35 negative RCTs (7.1%) reported a reason for inadequate sample size. In a multivariable model, studies that were presented at oral sessions (P = .0038), multicenter studies supported by a cooperative group (P < .0001), and studies with time to event as primary outcome (P < .0001) were more likely to have adequate sample size. Conclusion More than half of negative RCTs presented at ASCO annual meetings do not have an adequate sample size to detect a medium-sized treatment effect.
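
One standard way to perform the time-to-event part of such a post hoc calculation is Schoenfeld's approximation for the number of events required to detect a given hazard ratio; the Python sketch below applies it to the three effect-size thresholds in the abstract. This is a generic formula for illustration, not necessarily the exact method the authors used.

```python
import numpy as np
from scipy import stats

def events_for_hazard_ratio(hr, power=0.80, alpha=0.05):
    """Schoenfeld's approximation: total events needed to detect a hazard
    ratio 'hr' with 1:1 allocation and a two-sided alpha."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return 4 * (z_a + z_b) ** 2 / np.log(hr) ** 2

for hr, label in [(1.3, "small"), (1.5, "medium"), (2.0, "large")]:
    print(f"{label:6s} (HR = {hr}): ~{events_for_hazard_ratio(hr):.0f} events")
```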


2021
Author(s):  
Ymkje Anna de Vries ◽  
Robert A Schoevers ◽  
Julian Higgins ◽  
Marcus Munafo ◽  
Jojanneke Bastiaansen

Background: Previous research has suggested that statistical power is suboptimal in many biomedical disciplines, but it is unclear whether power is better in trials for particular interventions, disorders, or outcome types. We therefore performed a detailed examination of power in trials of psychotherapy, pharmacotherapy, and complementary and alternative medicine (CAM) for mood, anxiety, and psychotic disorders. Methods: We extracted data from the Cochrane Database of Systematic Reviews (Mental Health). We focused on continuous efficacy outcomes and estimated power to detect standardized effect sizes (SMD=0.20-0.80, primary effect size SMD=0.40) and the meta-analytic effect size (ESMA). We performed meta-regression to estimate the influence of including underpowered studies in meta-analyses. Results: We included 216 reviews with 8,809 meta-analyses and 36,540 studies. Statistical power for continuous efficacy outcomes was very low across intervention and disorder types (overall median [IQR] power for SMD=0.40: 0.33 [0.19-0.54]; for ESMA: 0.15 [0.07-0.44]), only reaching conventionally acceptable levels (80%) for SMD=0.80. Median power to detect the ESMA was higher in treatment-as-usual (TAU)/waitlist-controlled (0.54-0.66) or placebo-controlled (0.15-0.40) trials than in trials comparing active treatments (0.07-0.10). Meta-regression indicated that adequately powered studies produced smaller effect sizes than underpowered studies (B=-0.06, p=0.008). Conclusions: Power to detect both fixed and meta-analytic effect sizes in clinical trials in psychiatry was low across all interventions and disorders examined. As underpowered studies produced larger effect sizes than adequately powered studies, these results confirm the need to increase sample sizes and to reduce reporting bias against studies reporting null results to improve the reliability of the published literature.
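
A minimal Python sketch of the per-trial calculation behind medians like those above: the power of each trial, given its two arm sizes, to detect a fixed standardized mean difference of 0.40, summarized as a median and interquartile range. The arm sizes listed here are invented for illustration.

```python
import numpy as np
from statsmodels.stats.power import TTestIndPower

# hypothetical per-trial arm sizes (treatment, control)
arm_sizes = [(25, 25), (40, 38), (60, 61), (15, 16), (120, 118), (30, 32)]

calc = TTestIndPower()
powers = [calc.power(effect_size=0.40, nobs1=n1, ratio=n2 / n1, alpha=0.05)
          for n1, n2 in arm_sizes]

q25, q50, q75 = np.percentile(powers, [25, 50, 75])
print(f"median power for SMD=0.40: {q50:.2f} (IQR {q25:.2f}-{q75:.2f})")
```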

