Estimating statistical power, posterior probability and publication bias of psychological research using the observed replication rate

2018, Vol 5 (9), pp. 181190
Author(s):  
Michael Ingre ◽  
Gustav Nilsonne

In this paper, we show how Bayes' theorem can be used to better understand the implications of the 36% reproducibility rate of published psychological findings reported by the Open Science Collaboration. We demonstrate a method to assess publication bias and show that the observed reproducibility rate was not consistent with an unbiased literature. We estimate a plausible range for the prior probability of this body of research, suggesting expected statistical power in the original studies of 48–75%, producing (positive) findings that were expected to be true 41–62% of the time. Publication bias was large, assuming a literature with 90% positive findings, indicating that negative evidence was expected to have been observed 55–98 times before one negative result was published. These findings imply that even when studied associations are truly null, we expect the literature to be dominated by statistically significant findings.
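The core Bayes'-theorem calculation behind this kind of argument can be sketched in a few lines. The snippet below is not the authors' code; the prior and power values are illustrative assumptions, with alpha set to the conventional 0.05.

```python
# Posterior probability that a positive finding is true, given an assumed
# prior probability of a real effect, statistical power, and alpha.
# A minimal sketch of the Bayes'-theorem logic described above.

def prob_true_given_positive(prior: float, power: float, alpha: float = 0.05) -> float:
    """P(effect is real | significant result) via Bayes' theorem."""
    true_positives = prior * power          # real effects that reach significance
    false_positives = (1 - prior) * alpha   # null effects that cross the threshold
    return true_positives / (true_positives + false_positives)

if __name__ == "__main__":
    for prior in (0.2, 0.5, 0.8):
        for power in (0.48, 0.75):  # power range discussed in the abstract
            p = prob_true_given_positive(prior, power)
            print(f"prior={prior:.2f} power={power:.2f} -> P(true | positive)={p:.2f}")
```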

2019, Vol 227 (4), pp. 261-279
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
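To make the simulation design concrete, here is a toy version of a biased literature of the kind the abstract describes. It is not the authors' simulation code and does not reproduce the six detection methods they evaluate; the effect size, sample sizes, and publication probability are placeholder assumptions.

```python
# Toy simulation: significant positive studies are always "published",
# non-significant studies are published only with some probability,
# which inflates the mean published effect size.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulate_literature(true_d=0.2, n_studies=50, n_per_group=30, pub_prob_nonsig=0.2):
    published_d = []
    while len(published_d) < n_studies:
        g1 = rng.normal(true_d, 1.0, n_per_group)
        g2 = rng.normal(0.0, 1.0, n_per_group)
        _, p = stats.ttest_ind(g1, g2)
        d = (g1.mean() - g2.mean()) / np.sqrt((g1.var(ddof=1) + g2.var(ddof=1)) / 2)
        significant = (p < 0.05) and (d > 0)
        if significant or rng.random() < pub_prob_nonsig:
            published_d.append(d)
    return np.array(published_d)

d_published = simulate_literature()
print(f"true d = 0.20, mean published d = {d_published.mean():.2f}")  # inflated estimate
```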


Author(s):  
Eitel J.M. Lauria

Bayesian methods provide a probabilistic approach to machine learning. The Bayesian framework allows us to make inferences from data using probability models for the values we observe and about which we want to draw hypotheses. Bayes' theorem provides the means of calculating the probability of a hypothesis (the posterior probability) based on its prior probability, the probability of the observations, and the likelihood that the observational data fit the hypothesis.
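As a minimal illustration of the posterior = prior × likelihood relationship described here (a generic example, not taken from the text), consider three discrete hypotheses about a coin's bias:

```python
# Posterior over mutually exclusive hypotheses via Bayes' theorem.
import numpy as np

hypotheses = np.array([0.3, 0.5, 0.7])   # candidate values of P(heads)
prior = np.array([1/3, 1/3, 1/3])        # equal prior belief in each

heads, flips = 8, 10                      # observed data
likelihood = hypotheses**heads * (1 - hypotheses)**(flips - heads)

posterior = prior * likelihood
posterior /= posterior.sum()              # normalize by P(data)
print(dict(zip(hypotheses, posterior.round(3))))
```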


2004, Vol 20 (4), pp. 488-492
Author(s):  
Gert Jan van der Wilt ◽  
Maroeska Rovers ◽  
Huub Straatman ◽  
Sjoukje van der Bij ◽  
Paul van den Broek ◽  
...  

Objectives: To compare the observed posterior probability distributions regarding the benefits of surgery for otitis media with effusion (OME) with the posterior distributions expected under Bayes' theorem. Methods: Postal questionnaires were used to assess prior and posterior probability distributions among ear-nose-throat (ENT) surgeons in the Netherlands. Results: In their prior probability estimates, ENT surgeons were quite optimistic with respect to the effectiveness of tube insertion in the treatment of OME. The trial showed no meaningful benefit of tubes on hearing and language development. Posterior probabilities calculated on the basis of prior probability estimates and trial results differed widely from those elicited empirically 1 year after completion of the trial and dissemination of the results. Conclusions: ENT surgeons did not adjust their opinion about the benefits of surgical treatment of glue ears to the extent that they should have done according to Bayes' theorem. Users of the results of Bayesian analyses, notably policy-makers, should realize that Bayes' theorem is prescriptive and not necessarily descriptively correct. Health policy decisions should not be based on the untested assumption that health-care professionals use new evidence to adjust their subjective beliefs in a Bayesian manner.
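The prescriptive update the study tests can be sketched as a conjugate normal-normal calculation: an elicited prior for the treatment effect is combined with the trial estimate to give the posterior surgeons "should" hold. This is a generic sketch, not the study's analysis; all numbers below are purely illustrative.

```python
# Conjugate normal-normal Bayesian update of a belief about a treatment effect.

def normal_update(prior_mean, prior_sd, trial_mean, trial_se):
    """Return posterior mean and SD after combining prior and trial result."""
    prior_prec = 1 / prior_sd**2
    data_prec = 1 / trial_se**2
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * trial_mean)
    return post_mean, post_var**0.5

# Optimistic prior about benefit vs. an essentially null trial result (illustrative)
print(normal_update(prior_mean=10.0, prior_sd=5.0, trial_mean=0.5, trial_se=1.5))
```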


Author(s):  
Giacomo Bignardi ◽  
Edwin S. Dalmaijer ◽  
Alexander Anwyl-Irvine ◽  
Duncan E. Astle

Abstract. Collecting experimental cognitive data with young children usually requires undertaking one-on-one assessments, which can be both expensive and time-consuming. In addition, there is increasing acknowledgement of the importance of collecting larger samples for improving statistical power (Button et al., 2013, Nature Reviews Neuroscience, 14(5), 365–376) and reproducing exploratory findings (Open Science Collaboration, 2015, Science, 349(6251), aac4716). One way both of these goals can be achieved more easily, even with a small team of researchers, is to utilize group testing. In this paper, we evaluate the results from a novel tablet application developed for the Resilience in Education and Development (RED) Study. The RED-app includes 12 cognitive tasks designed for groups of children aged 7 to 13 to complete independently during a 1-h school lesson. The quality of the data collected was high despite the lack of one-on-one engagement with participants. Most outcomes from the tablet showed moderate or high reliability, estimated using internal consistency metrics. Tablet-measured cognitive abilities also explained more than 50% of variance in teacher-rated academic achievement. Overall, the results suggest that tablet-based, group cognitive assessments of children are an efficient, reliable, and valid method of collecting the large datasets that modern psychology requires. We have open-sourced the scripts and materials used to make the application, so that they can be adapted and used by others.
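The two statistics mentioned above, internal consistency and variance explained, can be computed as follows. This is generic analysis code rather than the RED study's open-sourced scripts, and the simulated scores are stand-ins for real task data.

```python
# Cronbach's alpha for a participants-by-items score matrix, and R^2 for
# variance in an outcome explained by a predictor (OLS).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

def r_squared(y: np.ndarray, X: np.ndarray) -> float:
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

rng = np.random.default_rng(0)
ability = rng.normal(size=100)                                   # latent trait
items = ability[:, None] + rng.normal(scale=0.8, size=(100, 10)) # correlated items
print(round(cronbach_alpha(items), 2))
print(round(r_squared(ability, items.mean(axis=1, keepdims=True)), 2))
```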


2019
Author(s):  
Roger W. Strong ◽  
George Alvarez

The replication crisis has brought about an increased focus on improving the reproducibility of psychological research (Open Science Collaboration, 2015). Although some failed replications reflect false-positives in original research findings, many are likely the result of low statistical power, which can cause failed replications even when an effect is real, no questionable research practices are used, and an experiment’s methodology is repeated perfectly. The present paper describes a simulation method (bootstrap resampling) that can be used in combination with pilot data or synthetic data to produce highly powered experimental designs. Unlike other commonly used power analysis approaches (e.g., G*Power), bootstrap resampling can be adapted to any experimental design to account for various factors that influence statistical power, including sample size, number of trials per condition, and participant exclusion criteria. Ignoring some of these factors (e.g., by using G*Power) can overestimate the power of a study or replication, increasing the likelihood that your findings will not replicate. By demonstrating how these factors influence the consistency of experimental findings, this paper provides examples of how simulation can be used to improve statistical power and reproducibility. Further, we provide a MATLAB toolbox that can be used to implement these simulation-based methods on existing pilot data (https://harvard-visionlab.github.io/power-sim).
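The authors' toolbox is implemented in MATLAB; the sketch below shows the general bootstrap-resampling idea in Python. It repeatedly resamples pilot data at a candidate sample size and counts how often the effect of interest reaches significance. The pilot data, test, and thresholds here are placeholder assumptions, not the toolbox's interface.

```python
# Bootstrap-based power estimation from pilot data (conceptual sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def bootstrap_power(pilot_scores, n_participants, n_sims=2000, alpha=0.05):
    """Estimate power to detect a nonzero mean at a given sample size."""
    hits = 0
    for _ in range(n_sims):
        sample = rng.choice(pilot_scores, size=n_participants, replace=True)
        _, p = stats.ttest_1samp(sample, popmean=0.0)
        hits += p < alpha
    return hits / n_sims

pilot = rng.normal(0.3, 1.0, size=20)   # stand-in for real pilot data
for n in (20, 40, 80):
    print(n, bootstrap_power(pilot, n))
```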


Author(s):  
Therese M. Donovan ◽  
Ruth M. Mickey

Chapter 4 introduces the concept of Bayesian inference. The chapter discusses the scientific method, and illustrates how Bayes’ Theorem can be used for scientific inference. Bayesian inference is the use of Bayes’ Theorem to draw conclusions about a set of mutually exclusive and exhaustive alternative hypotheses by linking prior knowledge about each hypothesis with new data. The result is updated probabilities for each hypothesis of interest. By the end of this chapter, the reader will understand the concepts of induction and deduction, prior probability of a hypothesis, likelihood of the observed data, and posterior probability of a hypothesis, given the data.
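In symbols, the updating rule described here is standard Bayes' theorem over a set of mutually exclusive and exhaustive hypotheses H_1, ..., H_k given data D:

```latex
P(H_i \mid D) = \frac{P(D \mid H_i)\, P(H_i)}{\sum_{j=1}^{k} P(D \mid H_j)\, P(H_j)}
```

where P(H_i) is the prior probability of hypothesis i, P(D | H_i) is the likelihood of the observed data under that hypothesis, and P(H_i | D) is the updated (posterior) probability.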


Paleobiology, 10.1666/13074, 2014, Vol 40 (4), pp. 584-607
Author(s):  
John Alroy

Determining whether a species has gone extinct is a central problem in both paleobiology and conservation biology. Past literature has mostly employed equations that yield confidence intervals around the endpoints of temporal ranges. These frequentist methods calculate the chance of not having seen a species lately given that it is still alive (a conditional probability). However, any reasonable person would instead want to know the chance that a species is extinct given that it has not been seen (the posterior probability). Here, I present a simple Bayesian equation that estimates posteriors. It uninterestingly assumes that the sampling probability equals the frequency of sightings within the range. It interestingly sets the prior probability of going extinct during any one time interval (E) by assuming that extinction is an exponential decay process and there is a 50% chance a species has gone extinct by the end of its observed range. The range is first adjusted for undersampling by using a routine equation. Bayes' theorem is then used to compute the posterior for interval 1 (ε1), which becomes the prior for interval 2. The following posterior ε2 again incorporates E because extinction might have happened instead during interval 2. The posteriors are called “creeping-shadow-of-a-doubt values” to emphasize the uniquely iterative nature of the calculation. Simulations show that the method is highly accurate and precise given moderate to high sampling probabilities and otherwise conservative, robust to random variation in sampling, and able to detect extinction pulses after a short lag. Improving the method by having it consider clustering of sightings makes it highly resilient to trends in sampling. Example calculations involving recently extinct Costa Rican frogs and Maastrichtian ammonites show that the method helps to evaluate the status of critically endangered species and identify species likely to have gone extinct below some stratigraphic horizon.
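The iterative structure of the calculation can be sketched as follows. This is a rough approximation of the logic described in the abstract, not Alroy's exact equations: in each interval after the last sighting, extinction may occur with prior probability E, and the absence of a sighting (given sampling probability s) then shifts the posterior further toward extinction. E and s are assumed inputs here.

```python
# Iterative Bayesian posterior for extinction given successive intervals
# with no sightings; each interval's posterior becomes the next prior.

def creeping_posterior(E, s, n_intervals):
    """Posterior P(extinct) after each of n_intervals with no sightings."""
    p_extinct = 0.0
    history = []
    for _ in range(n_intervals):
        p_extinct = p_extinct + (1 - p_extinct) * E        # extinction may occur this interval
        p_alive = 1 - p_extinct
        # No sighting observed: likelihood (1 - s) if alive, 1 if already extinct
        p_extinct = p_extinct / (p_extinct + p_alive * (1 - s))
        history.append(p_extinct)
    return history

print([round(p, 3) for p in creeping_posterior(E=0.5, s=0.6, n_intervals=5)])
```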


2019
Author(s):  
Jennifer L Tackett ◽  
Josh Miller

As psychological research comes under increasing fire for the crisis of replicability, attention has turned to methods and practices that facilitate (or hinder) a more replicable and veridical body of empirical evidence. These trends have focused on “open science” initiatives, including an emphasis on replication, transparency, and data sharing. Despite this broader movement in psychology, clinical psychologists and psychiatrists have been largely absent from the broader conversation on documenting the extent of existing problems as well as generating solutions to problematic methods and practices in our area (Tackett et al., 2017). The goal of the current special section was to bring together psychopathology researchers to explore these and related areas as they pertain to the types of research conducted in clinical psychology and allied disciplines.


2018, Vol 46 (1), pp. 72-79
Author(s):  
W. Burt Thompson

When a psychologist announces a new research finding, it is often based on a rejected null hypothesis. However, if that hypothesis is true, the claim is a false alarm. Many students mistakenly believe that the probability of committing a false alarm equals alpha, the criterion for statistical significance, which is typically set at 5%. Instructors should take specific steps to dispel this belief because it leads students to misinterpret statistical test results and it reinforces the more general misconception that results can be interpreted in isolation, without reference to theory or prior research. In the present study, students worked with a web app that shows how the false alarm rate is a function of the prior probability of an effect, statistical power, and alpha. Quiz scores suggest the activity helps correct the misconception, which can improve how students conduct and interpret research.
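The relationship the web app conveys is that the false alarm rate among significant results, unlike alpha itself, depends on the prior probability of an effect and on power. A compact illustration (not the app's code; the values are illustrative):

```python
# False alarm rate among significant results as a function of prior,
# power, and alpha; note it rarely equals alpha.

def false_alarm_rate(prior, power, alpha=0.05):
    """P(null is true | result is significant)."""
    false_pos = (1 - prior) * alpha
    true_pos = prior * power
    return false_pos / (false_pos + true_pos)

for prior in (0.1, 0.5, 0.9):
    print(f"prior={prior:.1f}: false alarm rate = {false_alarm_rate(prior, 0.8):.2f}")
```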


2016
Author(s):  
Frank Bosco ◽  
Joshua Carp ◽  
James G. Field ◽  
Hans IJzerman ◽  
Melissa Lewis ◽  
...  

Open Science Collaboration (in press). Maximizing the reproducibility of your research. In S. O. Lilienfeld & I. D. Waldman (Eds.), Psychological Science Under Scrutiny: Recent Challenges and Proposed Solutions. New York, NY: Wiley.

