Do Effect Sizes in Psychology Laboratory Experiments Mean Anything in Reality?

Author(s):  
Roy Baumeister

The artificial environment of a psychological laboratory experiment offers an excellent method for testing whether a causal relationship exists, but it is mostly useless for predicting the size and power of such effects in normal life. In comparison with effects out in the world, laboratory effects are often artificially large, because the laboratory situation is set up precisely to capture this effect, with extraneous factors screened out. Equally problematic, laboratory effects are often artificially small, given practical and ethical constraints that make laboratory situations watered-down echoes of what happens in life. Furthermore, in many cases the very notion of a true effect size (as if it were constant across different manipulations and dependent variables) is absurd. These problems are illustrated with examples from the author’s own research programs. It is also revealing that experimental effect sizes, though often quite precisely calculated and proudly integrated into meta-analyses, have attracted almost zero attention in terms of substantive theory about human mental processes and behavior. At best, effect sizes from laboratory experiments provide information that could help other researchers design their experiments, but that means effect sizes are shop talk, not information about reality. It is recommended that researchers shift toward a more realistic appreciation of how little can be learned about human mind and behavior from effect sizes in laboratory studies.

2020 ◽  
Author(s):  
Roy Baumeister

The artificial environment of the psychological laboratory experiment offers an excellent method for testing whether a causal relationship exists — but it is mostly useless for predicting the size and power of such effects out in the world. A laboratory effect may be artificially inflated or deflated in comparison with the same causal process outside the laboratory. Indeed, in many cases the very notion of a true effect size (regardless of type of manipulation or dependent variable) is absurd. At best, effect sizes from laboratory experiments provide information that could help other researchers to design their experiments — but that means effect sizes are shop talk, not information about reality. NOTE: Comment articles are invited at the journal.


2017 ◽  
Author(s):  
Robbie Cornelis Maria van Aert ◽  
Jelte M. Wicherts ◽  
Marcel A. L. M. van Assen

Publication bias is a substantial problem for the credibility of research in general and of meta-analyses in particular, as it yields overestimated effects and may suggest the existence of effects that do not exist. Although there is consensus that publication bias exists, how strongly it affects different scientific literatures is currently less well known. We examined evidence of publication bias in a large-scale data set of primary studies that were included in 83 meta-analyses published in Psychological Bulletin (representing meta-analyses from psychology) and 499 systematic reviews from the Cochrane Database of Systematic Reviews (CDSR; representing meta-analyses from medicine). Publication bias was assessed on all homogeneous subsets of primary studies included in meta-analyses (3.8% of all subsets of meta-analyses published in Psychological Bulletin), because publication bias methods do not have good statistical properties if the true effect size is heterogeneous. A Monte Carlo simulation study revealed that the creation of homogeneous subsets resulted in challenging conditions for publication bias methods, since the number of effect sizes in a subset was rather small (the median number of effect sizes was 6). No evidence of bias was obtained using the publication bias tests. Overestimation was minimal but statistically significant, providing evidence of publication bias that appeared to be similar in both fields. These and other findings, in combination with the small percentages of statistically significant primary effect sizes (28.9% and 18.9% for subsets published in Psychological Bulletin and CDSR, respectively), led to the conclusion that evidence for publication bias in the studied homogeneous subsets is weak, but suggestive of mild publication bias in both psychology and medicine.
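For readers unfamiliar with how such publication bias tests operate, the sketch below (hypothetical numbers, not the authors' data or code) implements one widely used test, Egger's regression: each study's standardized effect is regressed on its precision, and an intercept far from zero signals funnel-plot asymmetry that is often read as publication bias.

```python
# Minimal sketch of Egger's regression test (hypothetical data, not the
# authors' code): regress each study's standardized effect (effect / SE)
# on its precision (1 / SE); an intercept far from zero indicates
# funnel-plot asymmetry, commonly interpreted as publication bias.
import numpy as np
from scipy import stats

def eggers_test(effects, std_errors):
    """Return the Egger regression intercept and its two-sided p-value."""
    effects = np.asarray(effects, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    z = effects / std_errors                  # standardized effects
    precision = 1.0 / std_errors              # predictor
    X = np.column_stack([np.ones_like(precision), precision])
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    df = len(z) - 2
    sigma2 = resid @ resid / df
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_intercept = beta[0] / np.sqrt(cov[0, 0])
    p_value = 2 * stats.t.sf(abs(t_intercept), df)
    return beta[0], p_value

# Hypothetical homogeneous subset of six standardized mean differences
d = [0.42, 0.35, 0.51, 0.12, 0.30, 0.48]
se = [0.10, 0.12, 0.15, 0.20, 0.11, 0.18]
intercept, p = eggers_test(d, se)
print(f"Egger intercept = {intercept:.2f}, p = {p:.3f}")
```

With only around six effect sizes per homogeneous subset, as reported above, such regression-based tests have very little power, which is part of why the authors' conclusions are cautious.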


2018 ◽  
pp. 53-72
Author(s):  
Melissa K. Holt ◽  
Alana M. Vivolo-Kantor ◽  
Joshua R. Polanin ◽  
Kristin M. Holland ◽  
Sarah DeGue ◽  
...  

BACKGROUND AND OBJECTIVES Over the last decade there has been increased attention to the association between bullying involvement (as a victim, perpetrator, or bully-victim) and suicidal ideation/behaviors. We conducted a meta-analysis to estimate the association between bullying involvement and suicidal ideation and behaviors. METHODS We searched multiple online databases and reviewed reference sections of articles derived from searches to identify cross-sectional studies published through July 2013. Using search terms associated with bullying, suicide, and youth, 47 studies (38.3% from the United States, 61.7% from non-US samples) met inclusion criteria. Seven observers independently coded studies and met in pairs to reach consensus. RESULTS Six different meta-analyses were conducted by using 3 predictors (bullying victimization, bullying perpetration, and bully/victim status) and 2 outcomes (suicidal ideation and suicidal behaviors). A total of 280 effect sizes were extracted, and multilevel, random-effects meta-analyses were performed. Results indicated that each of the predictors was associated with risk for suicidal ideation and behavior (range, 2.12 [95% confidence interval (CI), 1.67–2.69] to 4.02 [95% CI, 2.39–6.76]). Significant heterogeneity remained across each analysis. The bullying perpetration and suicidal behavior effect sizes were moderated by the study’s country of origin; the bully/victim status and suicidal ideation results were moderated by bullying assessment method. CONCLUSIONS Findings demonstrated that involvement in bullying in any capacity is associated with suicidal ideation and behavior. Future research should address the mental health implications of bullying involvement to prevent suicidal ideation/behavior.
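To illustrate how summary estimates such as "2.12 [95% CI, 1.67–2.69]" are produced, the sketch below pools hypothetical log odds ratios with a simple DerSimonian-Laird random-effects model; the study itself used multilevel random-effects models, so this is a simplified stand-in, not the authors' code.

```python
# Illustrative sketch (hypothetical numbers, not the study data) of how
# odds ratios from individual studies are pooled with a DerSimonian-Laird
# random-effects model, the kind of aggregation behind summary estimates
# such as "OR 2.12 [95% CI 1.67-2.69]".
import numpy as np

def pool_log_odds_ratios(log_or, var):
    log_or, var = np.asarray(log_or, float), np.asarray(var, float)
    w = 1.0 / var                                  # fixed-effect weights
    fixed = np.sum(w * log_or) / np.sum(w)
    q = np.sum(w * (log_or - fixed) ** 2)          # Cochran's Q
    df = len(log_or) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                  # between-study variance
    w_star = 1.0 / (var + tau2)                    # random-effects weights
    pooled = np.sum(w_star * log_or) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    ci = (np.exp(pooled - 1.96 * se), np.exp(pooled + 1.96 * se))
    return np.exp(pooled), ci

# Hypothetical log odds ratios and sampling variances from five studies
log_or = np.log([1.8, 2.4, 2.0, 3.1, 1.6])
var = [0.04, 0.09, 0.06, 0.12, 0.05]
or_pooled, (lo, hi) = pool_log_odds_ratios(log_or, var)
print(f"Pooled OR = {or_pooled:.2f} [95% CI {lo:.2f}, {hi:.2f}]")
```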


2020 ◽  
Author(s):  
Anton Olsson-Collentine ◽  
Marcel A. L. M. van Assen ◽  
Jelte M. Wicherts

We examined the evidence for heterogeneity (of effect sizes) when only minor changes to sample population and settings were made between studies, and explored the association between heterogeneity and average effect size in a sample of 68 meta-analyses from thirteen pre-registered multi-lab direct replication projects in social and cognitive psychology. Amongst the many examined effects, examples include the Stroop effect, the “verbal overshadowing” effect, and various priming effects such as “anchoring” effects. We found limited heterogeneity; 48/68 (71%) meta-analyses had non-significant heterogeneity, and most (49/68; 72%) were most likely to have zero to small heterogeneity. Power to detect small heterogeneity (as defined by Higgins, 2003) was low for all projects (mean 43%), but good to excellent for medium and large heterogeneity. Our findings thus show little evidence of widespread heterogeneity in direct replication studies in social and cognitive psychology, suggesting that minor changes in sample population and settings are unlikely to affect research outcomes in these fields of psychology. We also found strong correlations between observed average effect sizes (standardized mean differences and log odds ratios) and heterogeneity in our sample. Our results suggest that heterogeneity and moderation of effects are unlikely for a zero average true effect size, but increasingly likely for larger average true effect sizes.
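As a rough illustration of the heterogeneity statistics referred to above, the following sketch (hypothetical lab-level data, not the project data) computes Cochran's Q with its chi-square test and Higgins' I², where values around 25%, 50%, and 75% are conventionally read as small, medium, and large heterogeneity.

```python
# Minimal sketch (hypothetical lab-level data) of the heterogeneity
# statistics behind statements like "non-significant heterogeneity":
# Cochran's Q with its chi-square test and Higgins' I^2.
import numpy as np
from scipy import stats

def heterogeneity(effects, variances):
    effects, variances = np.asarray(effects, float), np.asarray(variances, float)
    w = 1.0 / variances
    pooled = np.sum(w * effects) / np.sum(w)        # fixed-effect mean
    q = np.sum(w * (effects - pooled) ** 2)         # Cochran's Q
    df = len(effects) - 1
    p = stats.chi2.sf(q, df)                        # heterogeneity test
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, p, i2

# Hypothetical standardized mean differences from ten replication labs
d = [0.21, 0.18, 0.25, 0.30, 0.15, 0.22, 0.19, 0.27, 0.24, 0.20]
v = [0.02] * 10
q, p, i2 = heterogeneity(d, v)
print(f"Q = {q:.2f}, p = {p:.3f}, I^2 = {i2:.1f}%")
```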


2018 ◽  
Author(s):  
Robbie Cornelis Maria van Aert

More and more scientific research gets published nowadays, calling for statistical methods that enable researchers to get an overview of the literature in a particular research field. For that purpose, meta-analysis methods were developed that can be used for statistically combining the effect sizes of independent primary studies on the same topic. My dissertation focuses on two issues that are crucial when conducting a meta-analysis: publication bias and heterogeneity in primary studies’ true effect sizes. Accurate estimation of both the meta-analytic effect size and the between-study variance in true effect size is crucial, since the results of meta-analyses are often used for policy making.

Publication bias distorts the results of a meta-analysis, since it refers to situations where publication of a primary study depends on its results. We developed new meta-analysis methods, p-uniform and p-uniform*, which estimate effect sizes corrected for publication bias and also test for publication bias. Although the methods perform well in many conditions, these and the other existing methods are shown not to perform well when researchers use questionable research practices. Additionally, when publication bias is absent or limited, traditional methods that do not correct for publication bias outperform p-uniform and p-uniform*. Surprisingly, we found no strong evidence for the presence of publication bias in our pre-registered study on a large-scale data set consisting of 83 meta-analyses and 499 systematic reviews published in the fields of psychology and medicine.

We also developed two methods for meta-analyzing a statistically significant published original study and a replication of that study, which reflects a situation often encountered by researchers. One is a frequentist method, whereas the other is Bayesian. Both methods are shown to perform better than traditional meta-analytic methods that do not take the statistical significance of the original study into account. Analytical studies of both methods also show that the original study is sometimes better discarded for optimal estimation of the true effect size. In addition, we developed a program for determining the required sample size in a replication, analogous to power analysis in null hypothesis testing. Computing the required sample size with this method revealed that large sample sizes (approximately 650 participants) are required to be able to distinguish a zero from a small true effect.

Finally, in the last two chapters we derived a new multi-step estimator for the between-study variance in primary studies’ true effect sizes, and examined the statistical properties of two methods (the Q-profile and generalized Q-statistic methods) for computing the confidence interval of the between-study variance in true effect size. We proved that the multi-step estimator converges to the Paule-Mandel estimator, which is nowadays one of the recommended methods for estimating the between-study variance in true effect sizes. Two Monte Carlo simulation studies showed that the coverage probabilities of the Q-profile and generalized Q-statistic methods can be substantially below the nominal coverage rate if the assumptions underlying the random-effects meta-analysis model are violated.
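As an illustration of the between-study variance estimation discussed in the final chapters, the sketch below (not the dissertation's code; hypothetical inputs) implements the iterative Paule-Mandel estimator: tau² is adjusted until the generalized Q statistic equals its expected value of k − 1.

```python
# Minimal sketch of the Paule-Mandel estimator of the between-study
# variance tau^2 (hypothetical inputs, not the dissertation's code):
# iterate on tau^2 until the generalized Q statistic equals k - 1.
import numpy as np

def paule_mandel(effects, variances, tol=1e-8, max_iter=200):
    y, v = np.asarray(effects, float), np.asarray(variances, float)
    k = len(y)
    tau2 = 0.0
    for _ in range(max_iter):
        w = 1.0 / (v + tau2)
        mu = np.sum(w * y) / np.sum(w)
        q = np.sum(w * (y - mu) ** 2)        # generalized Q at current tau^2
        if abs(q - (k - 1)) < tol:
            break
        # Newton-type update; tau^2 is truncated at zero because a
        # variance cannot be negative
        step = (q - (k - 1)) / np.sum(w ** 2 * (y - mu) ** 2)
        if tau2 == 0.0 and step < 0.0:
            break                            # Q already below k - 1
        tau2 = max(0.0, tau2 + step)
    return tau2

# Hypothetical effect sizes and within-study variances from eight studies
y = [0.10, 0.35, 0.22, 0.50, 0.05, 0.40, 0.28, 0.15]
v = [0.03, 0.05, 0.04, 0.06, 0.03, 0.05, 0.04, 0.04]
print(f"Paule-Mandel tau^2 = {paule_mandel(y, v):.4f}")
```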


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
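The flavor of such a simulation can be conveyed with a short sketch (illustrative assumptions only: two-group designs, a true effect of d = 0.2, significant results always published, non-significant ones published with 10% probability); it shows how selective publication inflates the naive meta-analytic mean that bias detection methods must then diagnose.

```python
# Rough sketch of a simulated literature with publication bias
# (assumptions: true d = 0.2, n = 30 per group, significant results
# always published, non-significant results published with prob. 0.1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_d, n_per_group, n_studies = 0.2, 30, 10_000

# Approximate sampling SE of Cohen's d for two groups of size n
se = np.sqrt(2 / n_per_group + true_d ** 2 / (4 * n_per_group))
d_obs = rng.normal(true_d, se, n_studies)       # observed effect sizes
p = 2 * stats.norm.sf(np.abs(d_obs) / se)       # two-sided p-values

# Selection step: significant studies always published, others rarely
published = (p < 0.05) | (rng.random(n_studies) < 0.1)
naive_mean = d_obs[published].mean()
print(f"true d = {true_d}, naive meta-analytic mean = {naive_mean:.2f}")
```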


2019 ◽  
Author(s):  
Shinichi Nakagawa ◽  
Malgorzata Lagisz ◽  
Rose E O'Dea ◽  
Joanna Rutkowska ◽  
Yefeng Yang ◽  
...  

‘Classic’ forest plots show the effect sizes from individual studies and the aggregate effect from a meta-analysis. However, in ecology and evolution meta-analyses routinely contain over 100 effect sizes, making the classic forest plot of limited use. We surveyed 102 meta-analyses in ecology and evolution, finding that only 11% use the classic forest plot. Instead, most used a ‘forest-like plot’, showing point estimates (with 95% confidence intervals; CIs) from a series of subgroups or categories in a meta-regression. We propose a modification of the forest-like plot, which we name the ‘orchard plot’. Orchard plots, in addition to showing overall mean effects and CIs from meta-analyses/regressions, also include 95% prediction intervals (PIs) and the individual effect sizes scaled by their precision. The PI allows the user and reader to see the range in which an effect size from a future study may be expected to fall. The PI therefore provides an intuitive interpretation of any heterogeneity in the data. Supplementing the PI, the inclusion of underlying effect sizes also allows the user to see any influential or outlying effect sizes. We showcase the orchard plot with example datasets from ecology and evolution, using the R package orchard, which includes several functions for visualizing meta-analytic data with forest-plot derivatives. We consider the orchard plot a variant of the classic forest plot, cultivated to the needs of meta-analysts in ecology and evolution. Hopefully, the orchard plot will prove fruitful for visualizing large collections of heterogeneous effect sizes, regardless of the field of study.
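A bare-bones sketch of the idea (matplotlib only; the authors' own implementation is the R package named in the abstract) is shown below: individual effect sizes are scaled by precision, and the pooled mean is drawn with both its 95% CI and a wider 95% prediction interval. The prediction interval here is a crude approximation, included purely for illustration.

```python
# Crude, hypothetical-data sketch of the orchard plot idea: effect sizes
# scaled by precision, plus the pooled mean with 95% CI and 95% PI.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
d = rng.normal(0.3, 0.25, 120)                    # hypothetical effect sizes
se = rng.uniform(0.05, 0.4, 120)                  # hypothetical standard errors
w = 1 / se ** 2                                   # precision weights
mean = np.sum(w * d) / np.sum(w)
ci_hw = 1.96 * np.sqrt(1 / np.sum(w))             # CI half-width for the mean
pi_hw = 1.96 * np.sqrt(1 / np.sum(w) + d.var())   # crude PI adds between-study spread

fig, ax = plt.subplots(figsize=(6, 2))
ax.scatter(d, rng.uniform(-0.3, 0.3, 120), s=w / w.max() * 200, alpha=0.4)
ax.errorbar(mean, 0, xerr=pi_hw, fmt="none", lw=2, capsize=4)   # prediction interval
ax.errorbar(mean, 0, xerr=ci_hw, fmt="none", lw=6)              # confidence interval
ax.plot(mean, 0, "o", ms=8)                                     # pooled mean
ax.set_xlabel("Effect size (d)")
ax.set_yticks([])
plt.tight_layout()
plt.show()
```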


2019 ◽  
Author(s):  
Amanda Kvarven ◽  
Eirik Strømland ◽  
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the results of 15 meta-analyses and compare the adjusted results to 15 large-scale multi-lab replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


2014 ◽  
pp. 626-635 ◽  
Author(s):  
Florian Emerstorfer ◽  
Christer Bergwall ◽  
Walter Hein ◽  
Mats Bengtsson ◽  
John P. Jensen

The investigations presented in this work were carried out in order to further deepen the knowledge about nitrite pathways in the area of sugar beet extraction. The article consists of two parts with different experimental set-ups. The first part focuses on laboratory trials in which the fate of nitrate and nitrite was studied in a so-called mini-fermenter. These trials were carried out using juice from the hot part of the cossette mixer of an Agrana sugar factory in Austria. In the experiments, two common sugar factory disinfectants were used in order to study microbial as well as microbial-chemical effects on nitrite formation and degradation caused by bacteria present in the juice. The trials demonstrated that the direct microbial effect (denitrification) on nitrite degradation is more pronounced than the indirect microbial-chemical effect coming from the pH decrease caused by these bacteria and the subsequent nitrite loss. The second part describes the findings from laboratory experiments and full-scale factory trials using a mobile laboratory set-up based on insulated stainless steel containers and spectrophotometric detection of nitrite in various factory juices. The trials were carried out at two Nordzucker factories located in Finland (factory A) and Sweden (factory B). The inhibiting effect of the two common sugar factory disinfectants on nitrite formation was evaluated in laboratory trials, whereas the full-scale trials focused on one disinfectant. Other trials, evaluating potential contamination sources of thermophilic nitrite-producing bacteria in the extraction system, reactivation of nitrite-producing bacteria in raw juice, and the effect of a pH gradient on bacterial nitrite activity in cossette mixer juice, are also reported.


Author(s):  
Donald L. Bliwise ◽  
Michael K. Scullin

Possible associations between sleep and cognition are provocative across different domains and hold the promise of prevention or reversibility. A vast array of studies has been reported. Evidence is suggestive but hardly definitive. We provide an overview of this literature, adopting the framework of Hill’s perspective on epidemiological causation. With rare exception, formal meta-analyses have yet to appear. Apparent consistency of findings suggests relationships, but the diversity of findings involving specific components of cognitive function raises interpretative caution. Large effect sizes have been noted, but small-to-moderate effects predominate. Natural history data are similarly enticing, and studies of biological plausibility and gradient indicate likely neurobiological substrates. Perhaps the ultimate population-health criterion, demonstration of reversibility of impairment, remains elusive at best. This area offers an exciting topic for future work.

