Bayesian Statistics in Archaeology

2018 ◽  
Vol 47 (1) ◽  
pp. 435-453 ◽  
Author(s):  
Erik Otárola-Castillo ◽  
Melissa G. Torquato

Null hypothesis significance testing (NHST) is the most common statistical framework used by scientists, including archaeologists. Owing to increasing dissatisfaction with this framework, however, Bayesian inference has become an alternative to it. In this article, we review the application of Bayesian statistics to archaeology. We begin with a simple example to demonstrate the differences in applying NHST and Bayesian inference to an archaeological problem. Next, we formally define NHST and Bayesian inference, provide a brief historical overview of their development, and discuss the advantages and limitations of each method. A review of Bayesian inference and archaeology follows, highlighting the applications of Bayesian methods to chronological, bioarchaeological, zooarchaeological, ceramic, lithic, and spatial analyses. We close by considering the future applications of Bayesian statistics to archaeological research.
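The NHST-versus-Bayesian contrast the abstract describes can be sketched with a hypothetical Beta-Binomial example (the sherd counts, the 0.25 cutoff, and the flat prior below are illustrative assumptions, not the authors' example):

```python
from scipy import stats

# Hypothetical data: 18 of 50 excavated sherds are of a decorated type.
# Question: is the true proportion of decorated sherds greater than 0.25?
n, k = 50, 18

# NHST: one-sided exact binomial test against the point null p = 0.25.
# The p-value is P(data at least this extreme | null), not P(hypothesis | data).
p_value = stats.binomtest(k, n, p=0.25, alternative="greater").pvalue

# Bayesian inference: a flat Beta(1, 1) prior updates to a
# Beta(1 + k, 1 + n - k) posterior for the proportion theta.
posterior = stats.beta(1 + k, 1 + n - k)
prob_gt_25 = 1 - posterior.cdf(0.25)  # direct statement: P(theta > 0.25 | data)

print(f"p-value: {p_value:.3f}")
print(f"posterior P(theta > 0.25): {prob_gt_25:.3f}")
```

The Bayesian output answers the question archaeologists typically ask (how probable is the hypothesis given the data), whereas the p-value answers the reverse conditional.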

2021 ◽  
Author(s):  
Erik Otárola-Castillo ◽  
Melissa G. Torquato ◽  
Caitlin E. Buck

Archaeologists often use data and quantitative statistical methods to evaluate their ideas. Although there are various statistical frameworks for decision-making in archaeology and science in general, in this chapter, we provide a simple explanation of Bayesian statistics. To contextualize the Bayesian statistical framework, we briefly compare it to the more widespread null hypothesis significance testing (NHST) approach. We also provide a simple example to illustrate how archaeologists use data and the Bayesian framework to compare hypotheses and evaluate their uncertainty. We then review how archaeologists have applied Bayesian statistics to solve research problems related to radiocarbon dating and chronology, lithic, ceramic, zooarchaeological, bioarchaeological, and spatial analyses. Because recent work has reviewed Bayesian applications in archaeology from the 1990s up to 2017, this work considers the relevant literature published since 2017.


2019 ◽  
Author(s):  
Felipe Romero ◽  
Jan Sprenger

The enduring replication crisis in many scientific disciplines casts doubt on the ability of science to self-correct its findings and to produce reliable knowledge. Amongst a variety of possible methodological, social, and statistical reforms to address the crisis, we focus on replacing null hypothesis significance testing (NHST) with Bayesian inference. On the basis of a simulation study for meta-analytic aggregation of effect sizes, we study the relative advantages of this Bayesian reform and its interaction with widespread limitations in experimental research. Moving to Bayesian statistics will not solve the replication crisis single-handedly, but it would eliminate important sources of effect size overestimation for the conditions we study.
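The kind of effect-size overestimation the authors attribute to significance filtering can be illustrated with a minimal simulation sketch (the true effect, sample sizes, and publication filter below are illustrative assumptions, not the authors' simulation design):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_d, n = 0.2, 20          # small true effect, small per-group samples
published = []
for _ in range(2000):        # simulate 2000 underpowered two-group studies
    a = rng.normal(true_d, 1.0, n)
    b = rng.normal(0.0, 1.0, n)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:             # publication filter: only significant results survive
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        published.append((a.mean() - b.mean()) / pooled_sd)  # Cohen's d

# Naively aggregating only published effects overestimates true_d = 0.2,
# because only studies with inflated sample effects cross the threshold.
print(f"mean published Cohen's d: {np.mean(published):.2f}")
```

A meta-analysis fed only the `published` list would report a far larger effect than the one actually simulated.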


2019 ◽  
Author(s):  
Jan Sprenger

The replication crisis poses an enormous challenge to the epistemic authority of science and the logic of statistical inference in particular. Two prominent features of Null Hypothesis Significance Testing (NHST) arguably contribute to the crisis: the lack of guidance for interpreting non-significant results and the impossibility of quantifying support for the null hypothesis. In this paper, I argue that popular alternatives to NHST, such as confidence intervals and Bayesian inference, also fail to provide a satisfactory logic for evaluating hypothesis tests. As an alternative, I motivate and explicate the concept of corroboration of the null hypothesis. Finally, I show how degrees of corroboration give an interpretation to non-significant results, combat publication bias, and mitigate the replication crisis.


Author(s):  
Prathiba Natesan Batley ◽  
Peter Boedeker ◽  
Anthony J. Onwuegbuzie

In this editorial, we introduce the multimethod concept of thinking meta-generatively, which we define as directly integrating findings from the extant literature during the data collection, analysis, and interpretation phases of primary studies. We demonstrate that meta-generative thinking goes further than do other research synthesis techniques (e.g., meta-analysis) because it involves meta-synthesis not only across studies but also within studies—thereby representing a multimethod approach. We describe how meta-generative thinking can be maximized/optimized with respect to quantitative research data/findings via the use of Bayesian methodology that has been shown to be superior to the inherently flawed null hypothesis significance testing. We contend that Bayesian meta-generative thinking is essential, given the potential for divisiveness and far-reaching sociopolitical, educational, and health policy implications of findings that lack generativity in a post-truth and COVID-19 era.


2015 ◽  
Vol 37 (4) ◽  
pp. 449-461 ◽  
Author(s):  
Andreas Ivarsson ◽  
Mark B. Andersen ◽  
Andreas Stenling ◽  
Urban Johnson ◽  
Magnus Lindwall

Null hypothesis significance testing (NHST) is like an immortal horse that some researchers have been trying to beat to death for over 50 years but without any success. In this article we discuss the flaws in NHST, the historical background in relation to both Fisher's and Neyman and Pearson's statistical ideas, the common misunderstandings of what p < .05 actually means, and the 2010 APA publication manual's clear, but most often ignored, instructions to report effect sizes and to interpret what they mean in the real world. In addition, we discuss how Bayesian statistics can be used to overcome some of the problems with NHST. We then analyze quantitative articles published over the past three years (2012–2014) in two top-rated sport and exercise psychology journals to determine whether we have learned what we should have learned decades ago about our use and meaningful interpretation of statistics.


2021 ◽  
Vol 15 ◽  
Author(s):  
Ruslan Masharipov ◽  
Irina Knyazeva ◽  
Yaroslav Nikolaev ◽  
Alexander Korotkov ◽  
Michael Didur ◽  
...  

Classical null hypothesis significance testing is limited to the rejection of the point-null hypothesis; it does not allow the interpretation of non-significant results. This leads to a bias against the null hypothesis. Herein, we discuss statistical approaches to ‘null effect’ assessment, focusing on Bayesian parameter inference (BPI). Although Bayesian methods have been theoretically elaborated and implemented in common neuroimaging software packages, they are not widely used for ‘null effect’ assessment. BPI considers the posterior probability of finding the effect within or outside the region of practical equivalence to the null value. It can be used to find both ‘activated/deactivated’ and ‘not activated’ voxels, or to indicate, using a single decision rule, that the obtained data are not sufficient. It also allows one to evaluate the data as the sample size increases and to stop the experiment once the obtained data are sufficient to make a confident inference. To demonstrate the advantages of using BPI for fMRI data group analysis, we compare it with classical null hypothesis significance testing on empirical data. We also use simulated data to show how BPI performs under different effect sizes, noise levels, noise distributions and sample sizes. Finally, we consider the problem of defining the region of practical equivalence for BPI and discuss possible applications of BPI in fMRI studies. To facilitate ‘null effect’ assessment for fMRI practitioners, we provide a Statistical Parametric Mapping 12 based toolbox for Bayesian inference.
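The single decision rule described above can be sketched for one effect with an assumed Normal posterior (the ROPE bounds, 95% threshold, and labels below are illustrative assumptions, not the toolbox's implementation):

```python
from scipy import stats

def rope_decision(post_mean, post_sd, rope=(-0.1, 0.1), threshold=0.95):
    """Classify one effect from its (assumed Normal) posterior via a ROPE rule."""
    post = stats.norm(post_mean, post_sd)
    p_below = post.cdf(rope[0])                  # P(effect below the ROPE)
    p_above = 1 - post.cdf(rope[1])              # P(effect above the ROPE)
    p_inside = 1 - p_below - p_above             # P(effect practically null)
    if p_above > threshold:
        return "activated"
    if p_below > threshold:
        return "deactivated"
    if p_inside > threshold:
        return "not activated"
    return "data insufficient"                   # no confident inference yet

print(rope_decision(0.5, 0.1))    # posterior mass well above the ROPE
print(rope_decision(0.01, 0.02))  # posterior concentrated inside the ROPE
print(rope_decision(0.12, 0.15))  # too diffuse: collect more data
```

Unlike a p-value, the rule can positively assert a ‘null effect’ ("not activated") and can separately flag that the data are simply insufficient.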


2021 ◽  
Author(s):  
Ruslan Masharipov ◽  
Yaroslav Nikolaev ◽  
Alexander Korotkov ◽  
Michael Didur ◽  
Denis Cherednichenko ◽  
...  

Classical null hypothesis significance testing is limited to the rejection of the point-null hypothesis; it does not allow the interpretation of non-significant results. Moreover, studies with a sufficiently large sample size will find statistically significant results even when the effect is negligible and may be considered practically equivalent to the null effect. This leads to a publication bias against the null hypothesis. There are two main approaches to assessing null effects: shifting from the point-null to the interval-null hypothesis and considering practical significance within the frequentist framework, or using Bayesian parameter inference based on posterior probabilities or Bayesian model inference based on Bayes factors. Herein, we discuss these statistical methods with a particular focus on the application of Bayesian parameter inference, as it is conceptually connected to both frequentist and Bayesian model inferences. Although Bayesian methods have been theoretically elaborated and implemented in commonly used neuroimaging software, they are not widely used for null effect assessment. To demonstrate the advantages of using Bayesian parameter inference, we compared it with classical null hypothesis significance testing for fMRI data group analysis. We also consider the problem of choosing a threshold for a practically significant effect and discuss possible applications of Bayesian parameter inference in fMRI studies. We argue that Bayesian inference, which directly provides evidence for both the null and alternative hypotheses, may be more intuitive and convenient for practical use than frequentist inference, which only provides evidence against the null hypothesis. Moreover, it may indicate that the obtained data are not sufficient to make a confident inference. Because interim analysis is easy to perform using Bayesian inference, one can evaluate the data as the sample size increases and decide to terminate the experiment if the obtained data are sufficient to make a confident inference. To facilitate the application of Bayesian parameter inference to null effect assessment, we developed scripts with a simple GUI.
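The interim-analysis idea can be sketched with a hypothetical sequential Beta-Binomial design (the prior, the ROPE, the stopping threshold, and the data-generating probability below are illustrative assumptions, not the authors' scripts):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a, b = 1, 1                        # Beta(1, 1) prior on a success probability
true_p, rope = 0.65, (0.45, 0.55)  # ROPE: effects practically equal to 0.5

# Interim analysis: update the posterior after every observation and stop
# as soon as 95% of the posterior mass falls below, above, or inside the ROPE.
for trial in range(1, 501):
    success = int(rng.random() < true_p)
    a, b = a + success, b + 1 - success
    post = stats.beta(a, b)
    p_below = post.cdf(rope[0])
    p_above = 1 - post.cdf(rope[1])
    p_inside = 1 - p_below - p_above
    if max(p_below, p_above, p_inside) > 0.95:
        break                      # data sufficient for a confident inference

print(f"stopped after {trial} trials "
      f"(P below/inside/above ROPE = {p_below:.3f}/{p_inside:.3f}/{p_above:.3f})")
```

Because posterior probabilities remain valid under such optional stopping, the experiment can end as soon as the data warrant a decision, rather than at a pre-fixed sample size.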


2017 ◽  
Vol 157 (6) ◽  
pp. 915-918 ◽  
Author(s):  
Farrel J. Buchinsky ◽  
Neil K. Chadha

In biomedical research, it is imperative to differentiate chance variation from truth before we generalize what we see in a sample of subjects to the wider population. For decades, we have relied on null hypothesis significance testing, where we calculate P values for our data to decide whether to reject a null hypothesis. This methodology is subject to substantial misinterpretation and errant conclusions. Instead of working backward by calculating the probability of our data if the null hypothesis were true, Bayesian statistics allow us instead to work forward, calculating the probability of our hypothesis given the available data. This methodology gives us a mathematical means of incorporating our “prior probabilities” from previous study data (if any) to produce new “posterior probabilities.” Bayesian statistics tell us how confidently we should believe what we believe. It is time to embrace and encourage their use in our otolaryngology research.
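The prior-to-posterior updating described here is Bayes' theorem applied directly; a minimal worked example with hypothetical diagnostic-test numbers (prevalence, sensitivity, and specificity below are invented for illustration):

```python
# Hypothetical numbers for illustration only.
prior = 0.10   # P(disease): prevalence before testing
sens = 0.90    # P(test positive | disease)
spec = 0.95    # P(test negative | no disease)

# Bayes' theorem: posterior = likelihood * prior / evidence
p_positive = sens * prior + (1 - spec) * (1 - prior)  # P(test positive)
posterior = sens * prior / p_positive                 # P(disease | test positive)

print(f"posterior probability of disease given a positive test: {posterior:.2f}")
```

Working "forward" like this, yesterday's posterior becomes tomorrow's prior as new study data arrive, which is the mathematical sense in which Bayesian statistics tell us how confidently we should believe what we believe.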


2017 ◽  
Vol 8 (2) ◽  
pp. 140-157 ◽  
Author(s):  
Angelos-Miltiadis Krypotos ◽  
Tessa F. Blanken ◽  
Inna Arnaudova ◽  
Dora Matzke ◽  
Tom Beckers

The principal goals of experimental psychopathology (EPP) research are to offer insights into the pathogenic mechanisms of mental disorders and to provide a stable ground for the development of clinical interventions. The main message of the present article is that those goals are better served by the adoption of Bayesian statistics than by the continued use of null-hypothesis significance testing (NHST). In the first part of the article we list the main disadvantages of NHST and explain why those disadvantages limit the conclusions that can be drawn from EPP research. Next, we highlight the advantages of Bayesian statistics. To illustrate, we then pit NHST and Bayesian analysis against each other using an experimental data set from our lab. Finally, we discuss some challenges when adopting Bayesian statistics. We hope that the present article will encourage experimental psychopathologists to embrace Bayesian statistics, which could strengthen the conclusions drawn from EPP research.

