An N-Pact Factor for Clinical Psychological Research

2018 ◽  
Author(s):  
Kathleen Wade Reardon ◽  
Avante J Smack ◽  
Kathrin Herzhoff ◽  
Jennifer L Tackett

Although an emphasis on adequate sample size and statistical power has a long history in clinical psychological science (Cohen, 1992), increased attention to the replicability of scientific findings has renewed the focus on statistical power (Bakker, van Dijk, & Wicherts, 2012). These recent efforts have not yet circled back to modern clinical psychological research, despite the continued importance of sample size and power in producing a credible body of evidence. As one step in this process of scientific self-examination, the present study estimated an N-pact Factor (the statistical power of published empirical studies to detect typical effect sizes; Fraley & Vazire, 2014) in two leading clinical journals, the Journal of Abnormal Psychology (JAP) and the Journal of Consulting and Clinical Psychology (JCCP), for the years 2000, 2005, 2010, and 2015. Study sample size, as one proxy for statistical power, is a useful focus because it allows direct comparisons with other subfields and may highlight some of the core methodological differences between clinical and other areas (e.g., hard-to-reach populations, greater emphasis on correlational designs). Averaged across the years examined, the median sample size in clinical research was 179 participants (175 for JAP and 182 for JCCP). The power to detect a small-to-medium effect size of .20 is just below 80% for both journals. Although the clinical N-pact Factor was higher than that estimated for social psychology, statistical power in clinical journals remains limited for detecting many effects of interest to clinical psychologists, with little evidence of improvement in sample sizes over time.
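As a rough check on the reported figure, the sketch below computes the approximate power to detect a correlation of r = .20 with the median sample size of 179, using base R and the Fisher z approximation (an assumption; the published N-pact estimates may be based on a slightly different procedure).

```r
# Approximate two-sided power for a correlation test via the Fisher z
# transformation (illustrative; not the authors' exact code).
power_r <- function(r, n, alpha = 0.05) {
  z_r    <- atanh(r)               # Fisher z transform of the effect
  se     <- 1 / sqrt(n - 3)        # standard error of z
  z_crit <- qnorm(1 - alpha / 2)   # two-sided critical value
  pnorm(z_r / se - z_crit) + pnorm(-z_r / se - z_crit)
}

power_r(r = 0.20, n = 179)  # roughly 0.77, i.e., "just below 80%"
```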

2007 ◽  
Vol 25 (18_suppl) ◽  
pp. 6516-6516
Author(s):  
P. Bedard ◽  
M. K. Krzyzanowska ◽  
M. Pintilie ◽  
I. F. Tannock

6516 Background: Underpowered randomized clinical trials (RCTs) may expose participants to the risks and burdens of research without scientific merit. We investigated the prevalence of underpowered RCTs presented at ASCO annual meetings. Methods: We surveyed all two-arm parallel phase III RCTs presented at the ASCO annual meetings from 1995 to 2003 in which differences for the primary endpoint were not statistically significant. Post hoc calculations were performed using a power of 80% and α = 0.05 (two-sided) to determine the sample size required to detect a small, medium, or large effect size between the two groups. For studies reporting a proportion or time-to-event as the primary endpoint, effect size was expressed as an odds ratio (OR) or hazard ratio (HR), respectively, with a small effect size defined as OR/HR = 1.3, a medium effect size as OR/HR = 1.5, and a large effect size as OR/HR = 2.0. Logistic regression was used to identify factors associated with lack of statistical power. Results: Of 423 negative RCTs for which post hoc sample size calculations could be performed, 45 (10.6%), 138 (32.6%), and 333 (78.7%) had adequate sample size to detect small, medium, and large effect sizes, respectively. Only 35 negative RCTs (7.1%) reported a reason for inadequate sample size. In a multivariable model, studies presented at plenary or oral sessions (p < 0.0001) and multicenter studies supported by a cooperative group (p < 0.0001) were more likely to have adequate sample size. Conclusion: Two thirds of negative RCTs presented at the ASCO annual meeting do not have an adequate sample size to detect a medium-sized treatment effect. Most underpowered negative RCTs do not report a sample size calculation or reasons for inadequate patient accrual. No significant financial relationships to disclose.
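For illustration, the sketch below uses Schoenfeld's approximation (an assumption; the abstract does not specify the authors' exact formula) to compute the number of events a 1:1 two-arm trial needs to detect each benchmark hazard ratio with 80% power at α = 0.05, two-sided.

```r
# Required number of events under Schoenfeld's approximation for a
# two-arm trial with equal allocation (p1 = 0.5).
events_needed <- function(hr, power = 0.80, alpha = 0.05, p1 = 0.5) {
  z <- qnorm(1 - alpha / 2) + qnorm(power)
  ceiling(z^2 / (p1 * (1 - p1) * log(hr)^2))
}

sapply(c(small = 1.3, medium = 1.5, large = 2.0), events_needed)
# small ~457, medium ~191, large ~66 events
```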


2019 ◽  
Author(s):  
Peter E Clayson ◽  
Kaylie Amanda Carbine ◽  
Scott Baldwin ◽  
Michael J. Larson

Methodological reporting guidelines for studies of event-related potentials (ERPs) were updated in Psychophysiology in 2014. These guidelines facilitate the communication of key methodological parameters (e.g., preprocessing steps). Failing to report key parameters represents a barrier to replication efforts, and difficulty with replicability increases in the presence of small sample sizes and low statistical power. We assessed whether guidelines are followed and estimated the average sample size and power in recent research. Reporting behavior, sample sizes, and statistical designs were coded for 150 randomly sampled articles from five high-impact journals that frequently publish ERP research from 2011 to 2017. An average of 63% of guidelines were reported, and reporting behavior was similar across journals, suggesting that gaps in reporting are a shortcoming of the field rather than of any specific journal. Publication of the guidelines paper had no impact on reporting behavior, suggesting that editors and peer reviewers are not enforcing these recommendations. The average sample size per group was 21. Statistical power was conservatively estimated as .72-.98 for a large effect size, .35-.73 for a medium effect, and .10-.18 for a small effect. These findings indicate that failure to report key parameters is ubiquitous and that ERP studies are primarily powered to detect large effects. Such low power and insufficient adherence to reporting guidelines represent substantial barriers to replication efforts. The methodological transparency and replicability of studies can be improved by the open sharing of processing code and experimental tasks and by a priori sample size calculations to ensure adequately powered studies.
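The lower, between-subjects end of those power ranges can be roughly reproduced with a standard t-test power calculation; the sketch below assumes an independent-groups design with n = 21 per group (the higher ends presumably reflect within-subject comparisons, which this sketch does not cover).

```r
# Power of an independent-groups t-test with 21 participants per group
# at Cohen's benchmark effect sizes (base R).
sapply(c(small = 0.2, medium = 0.5, large = 0.8), function(d) {
  power.t.test(n = 21, delta = d, sd = 1, sig.level = 0.05,
               type = "two.sample")$power
})
# approximately .10 (small), .35 (medium), .72 (large)
```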


2020 ◽  
pp. 28-63
Author(s):  
A. G. Vinogradov

This article belongs to the tutorial genre of scholarly publication: articles that present the latest methods of design, modeling, or analysis in an accessible format in order to disseminate best practices. It acquaints Ukrainian psychologists with the basics of using the R programming language for the analysis of empirical research data. The article discusses the current state of world psychology in connection with the crisis of confidence that arose from the low reproducibility of empirical research. This problem is caused by the poor quality of psychological measurement tools, insufficient attention to adequate sample planning, typical statistical hypothesis-testing practices, and so-called "questionable research practices." The tutorial demonstrates how to determine sample size as a function of the expected effect size and desired statistical power, and how to perform basic variable transformations and statistical analyses of psychological research data using the R language and environment. It presents the minimal set of R functions required to carry out a modern analysis of the reliability of measurement scales, sample size calculation, and point and interval estimation of effect sizes for the four designs most widespread in psychology for analyzing the interdependence of two variables. These typical problems include finding differences between means and variances in two or more samples and correlations between continuous and categorical variables. Practical information on data preparation, import, basic transformations, and the application of basic statistical methods in the cloud version of RStudio is provided.
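As a minimal illustration of the tasks listed above, the base-R sketch below determines a sample size and produces point and interval estimates for two common two-variable designs; the simulated data are hypothetical, and these particular functions are one possible choice rather than the function set used in the article.

```r
# 1. Sample size per group for an expected medium standardized
#    difference (d = 0.5) with alpha = .05 and power = .80:
power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)

# 2. Point and interval estimation on simulated data:
set.seed(1)
group   <- factor(rep(c("control", "treatment"), each = 50))
outcome <- rnorm(100, mean = ifelse(group == "treatment", 0.5, 0))
x <- rnorm(100)
y <- 0.3 * x + rnorm(100)

t.test(outcome ~ group)  # mean difference with 95% CI (categorical ~ continuous)
cor.test(x, y)           # correlation r with 95% CI (continuous ~ continuous)
```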


2019 ◽  
Author(s):  
Marjan Bakker ◽  
Coosje Lisabet Sterre Veldkamp ◽  
Olmo Van den Akker ◽  
Marcel A. L. M. van Assen ◽  
Elise Anne Victoire Crompvoets ◽  
...  

In this preregistered study, we investigated whether the statistical power of a study is higher when researchers are asked to perform a formal power analysis before collecting data. We compared the sample size descriptions from two sources: (i) a sample of pre-registrations created according to the guidelines of the Center for Open Science Preregistration Challenge (PCRs) and a sample of institutional review board (IRB) proposals from the Tilburg School of Social and Behavioral Sciences, both of which include a recommendation to conduct a formal power analysis, and (ii) a sample of pre-registrations created according to the guidelines for Open Science Framework Standard Pre-Data Collection Registrations (SPRs), in which no guidance on sample size planning is given. We found that the PCRs and IRB proposals (72%) more often included sample size decisions based on power analyses than the SPRs (45%). However, this did not result in larger planned sample sizes: the determined sample size of the PCRs and IRB proposals (Md = 90.50) was not higher than that of the SPRs (Md = 126.00; W = 3389.5, p = 0.936). Typically, power analyses in the registrations were conducted with G*Power, assuming a medium effect size, α = .05, and a power of .80. Only 20% of the power analyses contained enough information to fully reproduce the results, and only 62% of these power analyses pertained to the main hypothesis test in the pre-registration. We therefore see ample room for improvement in the quality of the registrations and offer several recommendations for doing so.
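The "typical" registered power analysis described here can be reproduced outside G*Power; the sketch below assumes an independent-groups t-test, although the registrations varied in the exact test used.

```r
# Medium effect (d = 0.5), alpha = .05, power = .80, two-sided,
# independent groups: the default scenario reported above.
power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80,
             type = "two.sample", alternative = "two.sided")
# n ~ 64 per group, i.e., about 128 participants in total
```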


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Anderson Souza Oliveira ◽  
Cristina Ioana Pirscoveanu

Low reproducibility and non-optimal sample sizes are current concerns in scientific research, especially within human movement studies. Therefore, this study aimed to examine the implications of different sample sizes and numbers of steps on data variability and statistical outcomes for kinematic and kinetic running biomechanical variables. Forty-four participants ran overground using their preferred technique (normal) and while minimizing contact sound volume (silent). Running speed, peak vertical and braking forces, and vertical average loading rate were extracted from >40 steps per runner. Data stability was computed using a sequential estimation technique. Statistical outcomes (p values and effect sizes) for the normal vs. silent comparison were extracted from 100,000 random samples, using various combinations of sample size (from 10 to 40 runners) and number of steps (from 5 to 40 steps). The results showed that only 35% of the study sample could reach average stability using up to 10 steps across all biomechanical variables. The loading rate was consistently and significantly lower during silent running compared to normal running, with large effect sizes across all combinations. However, variables presenting small or medium effect sizes (running speed and peak braking force) required >20 runners to reach significant differences. Varying sample sizes and numbers of steps therefore influence the normal vs. silent statistical outcomes in a variable-dependent manner. Based on our results, we recommend that studies analyzing traditional running biomechanical variables use a minimum of 25 participants and 25 steps from each participant to provide appropriate data stability and statistical power.
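A sketch of this kind of resampling procedure is shown below; the data frame `steps` (one row per runner, condition, and step) and its column names are hypothetical, and the paired t-test on per-runner step averages is an illustrative choice rather than the authors' exact analysis.

```r
# Draw a random subset of runners and steps, average the sampled steps
# within each runner and condition, and compare conditions.
resample_once <- function(steps, n_runners, n_steps) {
  runners <- sample(as.character(unique(steps$runner)), n_runners)
  mean_of_sampled_steps <- function(id, cond) {
    v <- steps$loading_rate[steps$runner == id & steps$condition == cond]
    mean(sample(v, min(n_steps, length(v))))
  }
  normal <- sapply(runners, mean_of_sampled_steps, cond = "normal")
  silent <- sapply(runners, mean_of_sampled_steps, cond = "silent")
  test <- t.test(normal, silent, paired = TRUE)
  c(p = test$p.value, dz = mean(normal - silent) / sd(normal - silent))
}

# e.g., 1,000 of the 100,000 random draws for one combination:
# res <- replicate(1000, resample_once(steps, n_runners = 20, n_steps = 10))
```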


2007 ◽  
Vol 25 (23) ◽  
pp. 3482-3487 ◽  
Author(s):  
Philippe L. Bedard ◽  
Monika K. Krzyzanowska ◽  
Melania Pintilie ◽  
Ian F. Tannock

Purpose To investigate the prevalence of underpowered randomized controlled trials (RCTs) presented at American Society of Clinical Oncology (ASCO) annual meetings. Methods We surveyed all two-arm phase III RCTs presented at ASCO annual meetings from 1995 to 2003 for which negative results were obtained. Post hoc calculations were performed using a power of 80% and an α level of .05 (two sided) to determine sample sizes required to detect small, medium, and large effect sizes. For studies reporting a proportion or time-to-event as primary end point, effect size was expressed as an odds ratio (OR) or hazard ratio (HR), respectively, with a small effect size defined as OR/HR ≥ 1.3, medium effect size defined as OR/HR ≥ 1.5, and large effect size defined as OR/HR ≥ 2.0. Logistic regression was used to identify factors associated with lack of statistical power. Results Of 423 negative RCTs for which post hoc sample size calculations could be performed, 45 (10.6%), 138 (32.6%), and 233 (55.1%) had adequate sample size to detect small, medium, and large effect sizes, respectively. Only 35 negative RCTs (7.1%) reported a reason for inadequate sample size. In a multivariable model, studies that were presented at oral sessions (P = .0038), multicenter studies supported by a cooperative group (P < .0001), and studies with time to event as primary outcome (P < .0001) were more likely to have adequate sample size. Conclusion More than half of negative RCTs presented at ASCO annual meetings do not have an adequate sample to detect a medium-size treatment effect.


2021 ◽  
Vol 12 ◽  
Author(s):  
Guoxia Wang ◽  
Yi Wang ◽  
Xiaosong Gai

Mental contrasting with implementation intentions (MCII) is a self-regulation strategy that enhances goal attainment. This meta-analysis evaluated the efficacy of MCII for goal attainment and explored potential moderators. A total of 21 empirical studies with 24 independent effect sizes (15,907 participants) were included in the analysis. Results showed MCII to be effective for goal attainment, with a small to medium effect size (g = 0.336). The effect was mainly moderated by intervention style: studies with interventions based on interactions between participants and experimenters (g = 0.465) had stronger effects than studies with interventions based on interactions between participants and documents (g = 0.277). The results indicate that MCII is a brief and effective strategy for goal attainment with a small to moderate effect; however, because of some publication bias, the actual effect sizes may be smaller. Due to the small number of studies in this meta-analysis, additional studies are needed to determine the role of moderator variables.
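A random-effects model with intervention style as a moderator, of the kind summarized above, might be specified as follows with the metafor package; the data frame `mcii` and its columns (`yi` for Hedges' g, `vi` for its sampling variance, `style` for the interaction format) are hypothetical.

```r
library(metafor)

overall    <- rma(yi, vi, data = mcii)                  # pooled g across the 24 effects
moderation <- rma(yi, vi, mods = ~ style, data = mcii)  # intervention style as moderator
summary(overall)
summary(moderation)
```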


2008 ◽  
Vol 51 (1) ◽  
pp. 84-88
Author(s):  
I. Doğan ◽  
N. Doğan

Abstract. This study aims to evaluate the statistical power of 40 articles published in Turkey. To maintain consistency, power was analyzed only for the most common statistical test (the t-test), alpha was held at a uniform value of 0.05, and Cohen's (1977) definitions of small, medium, and large effect sizes were adopted. The Sample Power software was used in all analyses. A total of 40 articles were used: 24 examined the effect of gender and 16 the effect of birth type on the birth weight of lambs. The average power estimates (for gender and birth type, respectively) were 0.062 and 0.05 for a small effect size, 0.534 and 0.347 for a medium effect size, and 0.598 and 0.854 for a large effect size. Statistical power in these studies appears to be adequate only under the large effect size criterion. If 0.80 is taken as adequate power, the results indicate that none of the studies had adequate power to detect a small effect size for either the gender or the birth type factor, that 57.1% of the gender studies and 83.3% of the birth type studies did not have adequate power to detect a medium effect size, and that 60.9% of the gender studies and 28.2% of the birth type studies did not have adequate power to detect even a large effect size. The application of power analysis is recommended.
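A retrospective power survey of this kind can also be run directly in base R; in the sketch below the per-group sample sizes are hypothetical placeholders, not values from the surveyed articles.

```r
# Post hoc power of a two-sample t-test at Cohen's benchmark effect
# sizes, tabulated across a set of (hypothetical) per-group sample sizes.
group_n <- c(18, 25, 32, 40, 55)                      # hypothetical n per group
effects <- c(small = 0.2, medium = 0.5, large = 0.8)  # Cohen's d benchmarks

power_table <- sapply(effects, function(d)
  sapply(group_n, function(n)
    power.t.test(n = n, delta = d, sig.level = 0.05)$power))

round(power_table, 2)
colMeans(power_table >= 0.80)  # share of studies reaching 80% power per effect size
```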


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
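One simulated "literature" of the kind evaluated here might be generated as follows; the selection rule, study parameters, and the use of an Egger-type regression test (one of several detection tools) are illustrative assumptions rather than the article's exact simulation design.

```r
library(metafor)

# Simulate k published two-group studies with true effect delta, where
# nonsignificant results are published only with a given probability.
simulate_literature <- function(k = 30, delta = 0.2, n_per_group = 40,
                                p_publish_null = 0.2) {
  yi <- vi <- numeric(0)
  while (length(yi) < k) {
    g1 <- rnorm(n_per_group, mean = delta)
    g0 <- rnorm(n_per_group)
    d  <- (mean(g1) - mean(g0)) / sqrt((var(g1) + var(g0)) / 2)
    v  <- 2 / n_per_group + d^2 / (4 * n_per_group)   # approx. variance of d
    significant <- abs(d / sqrt(v)) > qnorm(0.975)
    if (significant || runif(1) < p_publish_null) {   # publication filter
      yi <- c(yi, d)
      vi <- c(vi, v)
    }
  }
  data.frame(yi = yi, vi = vi)
}

set.seed(42)
lit <- simulate_literature()
fit <- rma(yi, vi, data = lit)  # random-effects estimate (inflated by the filter)
regtest(fit)                    # Egger-type regression test for small-study effects
```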


2017 ◽  
Vol 27 (4) ◽  
pp. 439-457 ◽  
Author(s):  
John Wettersten

Two sharply separated traditions in the philosophy of science and in thought psychology began with Otto Selz’s psychology. The first tradition began with Karl Popper; it has been developed by many others. The developers of the second tradition have included Julius Bahle, Adriaan de Groot, Herbert Simon, and Gerd Gigerenzer. The first tradition has ignored empirical studies of thought processes. The second tradition is widely based on Simon’s inductivist philosophy. The first tradition can be improved by integrating empirical studies of rationality into its research. The second tradition can be improved by replacing its inductivist assumptions with a fallibilist framework.

