Bayesian Statistics for Beginners
Published By Oxford University Press

ISBN: 9780198841296, 9780191876820

Author(s): Therese M. Donovan, Ruth M. Mickey

In the “Once-ler Problem,” the decision tree is introduced as a versatile technique for answering a variety of questions and assisting in decision-making. This chapter builds on the “Lorax Problem” of Chapter 19, where Bayesian networks were introduced. A decision tree is a graphical representation of the alternatives in a decision. It is closely related to a Bayesian network, except that the decision problem takes the shape of a tree. The tree itself consists of decision nodes, chance nodes, and end nodes, which provide the outcomes. In a decision tree, the probabilities associated with chance nodes are conditional probabilities, which Bayes’ Theorem can be used to estimate or update. The calculation of expected values (or expected utility) for competing alternative decisions is presented step by step with an example from The Lorax.
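The expected-value step described above can be sketched in a few lines of Python. The alternatives, branch probabilities, and payoffs below are purely hypothetical placeholders, not the book’s Once-ler numbers; the point is only that each alternative’s expected value is the probability-weighted sum of its end-node outcomes.

```python
# Decision-tree expected values: a minimal sketch with made-up numbers.
# Each alternative (decision-node branch) leads to a chance node whose branches
# carry a conditional probability and an end-node payoff.
alternatives = {
    "harvest_truffula": [(0.6, 100.0), (0.4, -50.0)],   # (probability, payoff)
    "leave_truffula":   [(0.9, 40.0), (0.1, 10.0)],
}

def expected_value(branches):
    """Expected value of a chance node: sum of probability * payoff."""
    return sum(p * payoff for p, payoff in branches)

evs = {name: expected_value(branches) for name, branches in alternatives.items()}
best = max(evs, key=evs.get)
print(evs, "-> highest expected value:", best)
```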


Author(s): Therese M. Donovan, Ruth M. Mickey

This chapter introduces Markov Chain Monte Carlo (MCMC) with Gibbs sampling, revisiting the “Maple Syrup Problem” of Chapter 12, where the goal was to estimate the two parameters of a normal distribution, μ and σ. Chapter 12 used the normal-normal conjugate to derive the posterior distribution for the unknown parameter μ; the parameter σ was assumed to be known. This chapter uses MCMC with Gibbs sampling to estimate the joint posterior distribution of both μ and σ. Gibbs sampling is a special case of the Metropolis–Hastings algorithm. The chapter describes MCMC with Gibbs sampling step by step, which requires (1) computing the posterior distribution of a given parameter, conditional on the value of the other parameter, and (2) drawing a sample from that conditional distribution. In this chapter, Gibbs sampling makes use of the conjugate solutions to decompose the joint posterior distribution into full conditional distributions for each parameter.
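A minimal Gibbs-sampling sketch for this kind of two-parameter normal problem is shown below, written in Python with numpy. The simulated data, prior settings, and iteration count are assumptions for illustration only, not the book’s Maple Syrup figures; the sampler simply alternates between the two full conditional distributions given by the conjugate solutions.

```python
import numpy as np

# Gibbs sampling for the mean (mu) and standard deviation (sigma) of a normal
# distribution, using a normal prior on mu and a gamma prior on the precision
# tau = 1/sigma^2. All numbers below are illustrative placeholders.
rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=0.5, size=30)   # stand-in data
n = len(y)

mu0, kappa0 = 0.0, 0.01                       # prior on mu: mean and precision
a0, b0 = 0.01, 0.01                           # prior on tau: shape and rate

mu, tau = y.mean(), 1.0                       # starting values
draws = []
for _ in range(5000):
    # (1) full conditional of mu given tau (normal-normal conjugate step)
    prec = kappa0 + n * tau
    mean = (kappa0 * mu0 + tau * y.sum()) / prec
    mu = rng.normal(mean, np.sqrt(1.0 / prec))
    # (2) full conditional of tau given mu (gamma conjugate step)
    rate = b0 + 0.5 * np.sum((y - mu) ** 2)
    tau = rng.gamma(a0 + n / 2.0, 1.0 / rate)  # numpy parameterizes by scale
    draws.append((mu, 1.0 / np.sqrt(tau)))     # store mu and sigma

mus, sigmas = np.array(draws).T
print(mus.mean(), sigmas.mean())               # posterior means of mu and sigma
```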


Author(s): Therese M. Donovan, Ruth M. Mickey

In this chapter, the “Shark Attack Problem” (Chapter 11) is revisited. Markov Chain Monte Carlo (MCMC) is introduced as another way to determine a posterior distribution of λ, the mean number of shark attacks per year. The MCMC approach is so versatile that it can be used to solve almost any kind of parameter estimation problem. The chapter highlights the Metropolis algorithm in detail and illustrates its application, step by step, for the “Shark Attack Problem.” The posterior distribution generated in Chapter 11 using the gamma-Poisson conjugate is compared with the MCMC posterior distribution to show how successful the MCMC method can be. By the end of the chapter, the reader should also understand the following concepts: tuning parameter, MCMC inference, traceplot, and moment matching.
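The Metropolis update itself is compact enough to sketch in Python. The yearly counts, gamma-prior settings, and tuning parameter below are hypothetical, not the book’s Shark Attack data; the essential pattern is propose, compare posterior densities, and accept or reject.

```python
import numpy as np

# Metropolis algorithm for a Poisson rate lambda with a gamma(shape a, rate b)
# prior. Data and prior values are placeholders for illustration.
rng = np.random.default_rng(7)
attacks = np.array([5, 3, 6, 4, 5, 2, 4])      # hypothetical yearly counts
a, b = 2.0, 0.5                                # gamma prior: shape and rate

def log_post(lam):
    """Unnormalized log posterior: gamma log prior + Poisson log likelihood."""
    if lam <= 0:
        return -np.inf
    log_prior = (a - 1) * np.log(lam) - b * lam
    log_lik = attacks.sum() * np.log(lam) - len(attacks) * lam  # constants dropped
    return log_prior + log_lik

lam, tune = 4.0, 0.5                           # current value and tuning parameter
chain = []
for _ in range(10000):
    proposal = rng.normal(lam, tune)           # symmetric proposal
    if np.log(rng.uniform()) < log_post(proposal) - log_post(lam):
        lam = proposal                         # accept; otherwise keep current value
    chain.append(lam)

kept = np.array(chain[2000:])                  # discard burn-in
print(kept.mean(), kept.std())                 # summary of the MCMC posterior
```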


Author(s): Therese M. Donovan, Ruth M. Mickey

This chapter provides a very brief introduction to Bayesian model selection. The “Survivor Problem” is expanded in this chapter, where the focus is now on comparing two models that predict how long a contestant will last in a game of Survivor: one model uses years of formal education as a predictor, and a second model uses grit as a predictor. Gibbs sampling is used for parameter estimation. The Deviance Information Criterion (DIC) is used as a guide for model selection, and details of how this measure is computed are described. The chapter also discusses model assessment (model fit) and Occam’s razor.
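One common way to compute DIC from MCMC output is sketched below. The data and “posterior draws” are simulated placeholders (a single-parameter normal model with σ treated as known), not the Survivor models; the point is the bookkeeping: mean deviance over the draws, deviance at the posterior mean, and their difference as the effective number of parameters.

```python
import numpy as np

# DIC from MCMC draws for a toy normal model. Everything here is a placeholder:
# `y` is simulated data and `mu_draws` stands in for draws from a real sampler.
rng = np.random.default_rng(3)
y = rng.normal(10.0, 2.0, size=25)
sigma = 2.0                                    # treat sigma as known for brevity
mu_draws = rng.normal(y.mean(), sigma / np.sqrt(len(y)), size=4000)

def deviance(mu):
    """D(theta) = -2 * log likelihood of the data at parameter value mu."""
    loglik = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                    - (y - mu) ** 2 / (2 * sigma**2))
    return -2.0 * loglik

d_bar = np.mean([deviance(m) for m in mu_draws])   # mean deviance over the draws
d_hat = deviance(mu_draws.mean())                  # deviance at the posterior mean
p_d = d_bar - d_hat                                # effective number of parameters
dic = d_hat + 2 * p_d                              # equivalently d_bar + p_d
print(p_d, dic)                                    # smaller DIC = preferred model
```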


Author(s): Therese M. Donovan, Ruth M. Mickey

While one of the most common uses of Bayes’ Theorem is in the statistical analysis of a dataset (i.e., statistical modeling), this chapter examines another application of Gibbs sampling: parameter estimation for simple linear regression. In the “Survivor Problem,” the chapter models how many days a contestant lasts in a reality-show competition as a function of how many years of formal education they have. This chapter is a bit more complicated than the previous one because it involves estimating the joint posterior distribution of three parameters. As in earlier chapters, the estimation process is described in detail on a step-by-step basis. Finally, the posterior predictive distribution is estimated and discussed. By the end of the chapter, the reader will have a firm understanding of the following concepts: linear equation, sums of squares, posterior predictive distribution, and linear regression with Markov Chain Monte Carlo and Gibbs sampling.
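A Gibbs-sampling sketch for this three-parameter regression is given below, with simulated education/days data, vague priors, and a single hypothetical new x value for the posterior predictive draw; none of these numbers come from the book.

```python
import numpy as np

# Gibbs sampling for simple linear regression: intercept b0, slope b1, and
# residual precision tau = 1/sigma^2. All data and prior values are placeholders.
rng = np.random.default_rng(5)
x = rng.uniform(8, 20, size=40)                    # e.g. years of formal education
y = 2.0 + 1.5 * x + rng.normal(0, 3.0, size=40)    # e.g. days lasted in the game
n = len(y)

m0, p0 = 0.0, 1e-4                                 # normal prior on b0: mean, precision
m1, p1 = 0.0, 1e-4                                 # normal prior on b1: mean, precision
a_g, b_g = 0.01, 0.01                              # gamma prior on tau: shape, rate
b0, b1, tau = 0.0, 0.0, 1.0
x_new = 16.0                                       # hypothetical new contestant
pred = []

for _ in range(5000):
    # intercept | slope, tau
    prec = p0 + n * tau
    mean = (p0 * m0 + tau * np.sum(y - b1 * x)) / prec
    b0 = rng.normal(mean, np.sqrt(1 / prec))
    # slope | intercept, tau
    prec = p1 + tau * np.sum(x ** 2)
    mean = (p1 * m1 + tau * np.sum(x * (y - b0))) / prec
    b1 = rng.normal(mean, np.sqrt(1 / prec))
    # precision | intercept, slope (uses the residual sum of squares)
    resid = y - b0 - b1 * x
    tau = rng.gamma(a_g + n / 2, 1.0 / (b_g + 0.5 * np.sum(resid ** 2)))
    # posterior predictive draw for the new contestant
    pred.append(rng.normal(b0 + b1 * x_new, np.sqrt(1 / tau)))

pred = np.array(pred[500:])                        # drop burn-in
print(pred.mean(), np.percentile(pred, [2.5, 97.5]))
```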


Author(s): Therese M. Donovan, Ruth M. Mickey

The purpose of this chapter is to illustrate some of the things that can go wrong in Markov Chain Monte Carlo (MCMC) analysis and to introduce some diagnostic tools that help identify whether the results of such an analysis can be trusted. The goal of a Bayesian MCMC analysis is to estimate the posterior distribution while skipping the integration required in the denominator of Bayes’ Theorem. The MCMC approach does this by breaking the problem into small, bite-sized pieces, allowing the posterior distribution to be built bit by bit. The main challenge, however, is that several things might go wrong in the process. Several diagnostic tests can be applied to ensure that an MCMC analysis provides an adequate estimate of the posterior distribution. Such diagnostics are required of all MCMC analyses and include tuning, burn-in, and pruning (also known as thinning).
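The routine housekeeping steps named above (inspecting a traceplot, discarding burn-in, and pruning/thinning) can be sketched in a few lines of Python; the `chain` array below is a fabricated stand-in for draws from any sampler, and the burn-in length and thinning interval are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# MCMC housekeeping sketch: traceplot, burn-in, and pruning (thinning).
# `chain` is a fake sequence of draws standing in for real sampler output.
rng = np.random.default_rng(11)
chain = rng.normal(5.0, 1.0, 20000) + 0.01 * np.cumsum(rng.normal(0, 0.1, 20000))

plt.plot(chain)                       # traceplot: look for a stable, well-mixed trace
plt.xlabel("iteration")
plt.ylabel("parameter value")
plt.savefig("traceplot.png")

burn_in = 5000
kept = chain[burn_in:]                # discard draws taken before apparent convergence
pruned = kept[::10]                   # keep every 10th draw to reduce autocorrelation
print(len(pruned), pruned.mean(), pruned.std())
```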


Author(s): Therese M. Donovan, Ruth M. Mickey

In this chapter, Bayesian methods are used to estimate the two parameters that identify a normal distribution, μ and σ. Many Bayesian analyses consider alternative parameter values as hypotheses. The prior distribution for an unknown parameter can be represented by a continuous probability density function when the number of hypotheses is infinite. In the “Maple Syrup Problem,” a normal distribution is used as the prior distribution of μ, the mean number of millions of gallons of maple syrup produced in Vermont in a year. The amount of syrup produced in each of several years is then observed, and these data are assumed to follow a normal distribution with known σ. The prior distribution is updated to the posterior distribution in light of this new information. In short, a normal prior distribution + normally distributed data → normal posterior distribution.
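The normal-normal update has a closed form, sketched below with made-up syrup figures and prior settings (not the book’s data): the posterior precision is the sum of the prior precision and the data precision, and the posterior mean is the corresponding precision-weighted average.

```python
import numpy as np

# Normal prior + normal data (sigma known) -> normal posterior, in closed form.
# The observations and prior settings are hypothetical placeholders.
y = np.array([1.9, 2.2, 2.0, 1.8, 2.1])   # millions of gallons per year (made up)
sigma = 0.3                                # known data standard deviation
mu0, tau0 = 2.5, 1.0                       # prior on mu: mean and standard deviation
n = len(y)

post_prec = 1 / tau0**2 + n / sigma**2     # precisions add
post_var = 1 / post_prec
post_mean = post_var * (mu0 / tau0**2 + y.sum() / sigma**2)
print(post_mean, np.sqrt(post_var))        # mean and sd of the normal posterior for mu
```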


Author(s): Therese M. Donovan, Ruth M. Mickey

This chapter introduces the beta-binomial conjugate. There are special cases where a Bayesian prior probability distribution for an unknown parameter of interest can be quickly updated to a posterior distribution of the same form as the prior. In the “White House Problem,” a beta distribution is used to set the prior probabilities for all hypotheses of p, the probability that a famous person can get into the White House without an invitation. Binomial data are then collected, giving the number of times a famous person gained entry out of a fixed number of attempts. The prior distribution is updated to a posterior distribution (also a beta distribution) in light of this new information. In short, a beta prior distribution for the unknown parameter + binomial data → a beta posterior distribution for the unknown parameter, p. The beta distribution is said to be “conjugate to” the binomial distribution.
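The beta-binomial update is a one-line calculation once the counts are in hand. The sketch below uses a uniform Beta(1, 1) prior and hypothetical entry counts (not the book’s White House numbers), and relies on scipy.stats only to summarize the resulting beta posterior.

```python
from scipy import stats

# Beta prior + binomial data -> beta posterior, with made-up counts.
a_prior, b_prior = 1.0, 1.0            # Beta(1, 1): uniform prior on p
successes, attempts = 3, 10            # hypothetical: entered 3 times in 10 attempts

a_post = a_prior + successes                # add successes to the first shape parameter
b_post = b_prior + (attempts - successes)   # add failures to the second

posterior = stats.beta(a_post, b_post)
print(posterior.mean(), posterior.interval(0.95))   # posterior mean and 95% interval
```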


Author(s): Therese M. Donovan, Ruth M. Mickey

This chapter focuses on probability mass functions. One of the primary uses of Bayesian inference is to estimate parameters. To do so, it is necessary to first build a good understanding of probability distributions. This chapter introduces the idea of a random variable and presents general concepts associated with probability distributions for discrete random variables. It starts off by discussing the concept of a function and goes on to describe how a random variable is a type of function. The binomial distribution and the Bernoulli distribution are then used as examples of probability mass functions (pmfs). Pmfs can be used to specify prior distributions, likelihoods, likelihood profiles, and/or posterior distributions in Bayesian inference.
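As a small illustration of what a probability mass function is, the binomial pmf can be written as an ordinary Python function, with the Bernoulli pmf falling out as the special case of a single trial; the n and p values used at the end are arbitrary.

```python
from math import comb

# The binomial pmf as a function: P(Y = y) for y successes in n trials.
def binomial_pmf(y, n, p):
    return comb(n, y) * p**y * (1 - p)**(n - y)

# The Bernoulli pmf is the binomial pmf with a single trial (n = 1).
def bernoulli_pmf(y, p):
    return binomial_pmf(y, 1, p)

# A pmf assigns a probability to every possible outcome, and those probabilities sum to 1.
probs = [binomial_pmf(y, n=5, p=0.3) for y in range(6)]
print(probs, sum(probs))
```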


Author(s): Therese M. Donovan, Ruth M. Mickey

The “Birthday Problem” expands consideration from two hypotheses to multiple, discrete hypotheses. In this chapter, interest is in determining the posterior probability that a woman named Mary was born in a given month; there are twelve alternative hypotheses. Furthermore, consideration is given to assigning prior probabilities. The priors represent a priori probabilities that each alternative hypothesis is correct (a priori meaning “prior to data collection”), and they can be “informative” or “non-informative.” A Bayesian analysis cannot be conducted without a prior distribution, whether informative or non-informative. The chapter discusses objective priors, subjective priors, and prior sensitivity analysis. In addition, the concept of likelihood is explored more deeply.
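With a finite set of hypotheses, Bayes’ Theorem reduces to elementwise multiplication and normalization. The sketch below uses a non-informative prior over the twelve months and a made-up likelihood vector (not the book’s Birthday data) to show the mechanics.

```python
import numpy as np

# Discrete Bayes' Theorem over twelve birth-month hypotheses.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

prior = np.full(12, 1 / 12)                   # non-informative prior: all months equal
# hypothetical likelihood of the observed data under each month hypothesis
likelihood = np.array([0.02, 0.02, 0.05, 0.10, 0.15, 0.20,
                       0.20, 0.12, 0.06, 0.04, 0.02, 0.02])

posterior = prior * likelihood / np.sum(prior * likelihood)   # Bayes' Theorem
for month, prob in zip(months, posterior):
    print(month, round(prob, 3))
```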

