Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures

2016
Vol 113 (34)
pp. 9569-9574
Author(s):
Brian R. Moore
Sebastian Höhna
Michael R. May
Bruce Rannala
John P. Huelsenbeck

Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM.
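The prior-sensitivity pathology described above lends itself to a simple empirical check. The sketch below is illustrative only (the shift-count draws are placeholders, not BAMM output): it assumes hypothetical posterior samples obtained under compound Poisson priors with different expected numbers of shifts, and asks whether the posterior mean simply tracks the prior mean, which would indicate that the data contribute little information.

```python
# Illustrative prior-sensitivity check (placeholder draws, not BAMM output).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior samples of the number of rate shifts, keyed by the
# expected number of shifts under the compound Poisson prior used for that run.
posterior_samples = {
    0.1: rng.poisson(0.2, 5000),
    1.0: rng.poisson(1.1, 5000),
    10.0: rng.poisson(9.5, 5000),
}

for prior_mean, draws in sorted(posterior_samples.items()):
    print(f"prior mean = {prior_mean:5.1f}  ->  posterior mean shifts = {draws.mean():.2f}")
# A posterior mean that scales with the prior mean signals prior-driven inference.
```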

2020
Author(s):  
Craig Stow

The historical adoption of Bayesian approaches was limited by two main impediments: 1) the requirement for subjective prior information, and 2) the unavailability of analytical solutions for all but a few simple model forms. However, water quality modeling has always been subjective; selecting point values for model parameters, undertaking some “judicious diddling” to adjust them so that model output more closely matches observed data, and declaring the model to be “reasonable” is a long-standing practice. Water quality modeling in a Bayesian framework can actually reduce this subjectivity, as it provides a rigorous and transparent approach for model parameter estimation. The second impediment, the lack of analytical solutions, has, for many applications, been largely removed by the increasing availability of fast, cheap computing and the concurrent development of efficient algorithms for sampling the posterior distribution. In water quality modeling, however, this increasing computational capacity may be reinforcing the dichotomy between probabilistic and “process-based” models. When I was a graduate student, we could not do both process and probability because computers were not fast enough. Current computers are unimaginably faster, yet we still rarely do both. It seems that our increasing computational capacity has been absorbed either in more complex and highly resolved, but still deterministic, process models, or in more structurally complex probabilistic models (such as hierarchical models) that are still light on process. In principle, Bayes' Theorem is quite general; any model could constitute the likelihood function. Practically, however, running Monte Carlo-based methods on simulation models that require hours, days, or even longer per run is not feasible. Developing models that capture the essential (and best understood) processes while still allowing a meaningful uncertainty analysis is an area that invites renewed attention.
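As a concrete illustration of the point that, in principle, any model can supply the likelihood, here is a minimal sketch (not from the paper) that calibrates a single rate constant of a toy first-order decay process model with a random-walk Metropolis sampler; the model, data, and prior are all hypothetical.

```python
# Minimal sketch: Bayesian calibration of a toy first-order decay process model.
import numpy as np

rng = np.random.default_rng(1)

def process_model(k, t):
    """Hypothetical process model: concentration C(t) = C0 * exp(-k t)."""
    return 10.0 * np.exp(-k * t)

t_obs = np.linspace(0.0, 10.0, 20)
c_obs = process_model(0.3, t_obs) + rng.normal(0.0, 0.5, t_obs.size)  # synthetic data

def log_posterior(k, sigma=0.5):
    if k <= 0.0:
        return -np.inf
    log_prior = -0.5 * ((k - 0.5) / 0.5) ** 2       # truncated normal prior on k
    resid = c_obs - process_model(k, t_obs)
    return log_prior - 0.5 * np.sum((resid / sigma) ** 2)

k, samples = 0.5, []
for _ in range(20000):                              # random-walk Metropolis
    k_new = k + rng.normal(0.0, 0.05)
    if np.log(rng.uniform()) < log_posterior(k_new) - log_posterior(k):
        k = k_new
    samples.append(k)

print("posterior mean of k:", np.mean(samples[5000:]))
```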


2021
Vol 9
Author(s):
Kyle T. Spikes
Mrinal K. Sen

Rock-physics models relate rock properties to elastic properties through non-unique relationships, and often in the presence of seismic data that contain significant noise. A set of inputs defines the rock-physics model, and any errors in that model map directly into uncertainty in target seismic-scale amplitudes, velocities, or inverted impedances. An important aspect of using rock-physics models in this manner is to determine and understand the significance of the inputs to the rock-physics model under consideration. Such analysis enables the design of prior distributions that are informative within a reservoir-characterization formulation. We use the framework of Bayesian analysis to find internal dependencies and correlations among the inputs. This process requires the assignment of prior distributions and the calculation of the likelihood function, whose product is proportional to the posterior distribution. The data are well-log data from a hydrocarbon-bearing set of sands in the Gulf of Mexico. The rock-physics model selected is the soft-sand model, which is applicable to the data from the reservoir sands. Results from the Bayesian algorithm are multivariate histograms that show the most frequent values of the inputs given the data. Four analyses are applied to different subsets of the reservoir sands, and each reveals correlations among certain model inputs. This quantitative approach points out the significance of individual or joint sets of rock-physics model parameters.
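In the same spirit as the multivariate histograms described above, the following sketch (not the authors' code; the forward model, bounds, and data are hypothetical stand-ins rather than the soft-sand model) samples the joint posterior of two rock-physics inputs and reports their posterior correlation.

```python
# Hypothetical two-input example: correlated posterior from a non-unique forward model.
import numpy as np

rng = np.random.default_rng(2)

def forward_model(phi, c):
    """Stand-in for a rock-physics prediction of velocity from porosity and a cement term."""
    return 3.5 - 4.0 * phi + 0.8 * c

v_obs = forward_model(0.25, 0.6) + rng.normal(0.0, 0.05, 200)    # synthetic "well-log" data

def log_post(phi, c, sigma=0.05):
    if not (0.0 < phi < 0.4 and 0.0 < c < 1.0):                  # uniform priors
        return -np.inf
    return -0.5 * np.sum(((v_obs - forward_model(phi, c)) / sigma) ** 2)

theta, chain = np.array([0.20, 0.50]), []
for _ in range(30000):                                           # random-walk Metropolis
    prop = theta + rng.normal(0.0, [0.01, 0.05])
    if np.log(rng.uniform()) < log_post(*prop) - log_post(*theta):
        theta = prop
    chain.append(theta.copy())

chain = np.array(chain[10000:])
print("posterior correlation(phi, c):", np.corrcoef(chain.T)[0, 1])
```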


Motor Control
2016
Vol 20 (3)
pp. 255-265
Author(s):
Yin-Hua Chen
Isabella Verdinelli
Paola Cesari

This paper carries out a full Bayesian analysis for a data set examined in Chen & Cesari (2015). These data were collected for assessing people's ability in evaluating short intervals of time. Chen & Cesari (2015) showed evidence of the existence of two independent internal clocks for evaluating time intervals below and above the second. We reexamine the same question here by performing a complete Bayesian statistical analysis of the data. The Bayesian approach can be used to analyze these data thanks to the specific trial design. Data on time-interval evaluation were obtained from two groups of individuals. More specifically, information gathered from a non-trained group (considered as the baseline) allowed us to build a prior distribution for the parameter(s) of interest, while data from the trained group determined the likelihood function. This paper's main goals are (i) showing how the Bayesian inferential method can be used in statistical analyses and (ii) showing that the Bayesian methodology gives additional support to the findings presented in Chen & Cesari (2015) regarding the existence of two internal clocks in assessing the duration of time intervals.
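A minimal sketch of the prior/likelihood split described above, under simplifying assumptions that are mine rather than the authors' (synthetic data, a normal model with known variance, and a conjugate normal-normal update): the baseline group defines the prior for the mean timing error, and the trained group supplies the likelihood.

```python
# Conjugate normal-normal update with a prior built from baseline-group data (illustrative).
import numpy as np

rng = np.random.default_rng(3)
baseline = rng.normal(0.05, 0.10, 30)   # hypothetical non-trained-group timing errors (s)
trained = rng.normal(0.02, 0.08, 30)    # hypothetical trained-group timing errors (s)

# Prior for the mean error, derived from the baseline group
mu0, tau2 = baseline.mean(), baseline.var(ddof=1) / baseline.size

# Likelihood contribution from the trained group (variance treated as known)
n, ybar, sigma2 = trained.size, trained.mean(), trained.var(ddof=1)

post_var = 1.0 / (1.0 / tau2 + n / sigma2)
post_mean = post_var * (mu0 / tau2 + n * ybar / sigma2)
print(f"posterior mean = {post_mean:.3f} s, posterior sd = {post_var ** 0.5:.3f} s")
```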


Complexity
2021
Vol 2021
pp. 1-19
Author(s):
Yen Liang Tung
Zubair Ahmad
Omid Kharazmi
Clement Boateng Ampadu
E.H. Hafez
...  

Modelling data in applied areas, particularly in reliability engineering, is a prominent research topic. Statistical models play a vital role in modelling reliability data and are useful for further decision-making policies. In this paper, we study a new class of distributions with one additional shape parameter, called the new generalized exponential-X family. Some of its properties are investigated. The maximum likelihood approach is adopted to obtain estimates of the model parameters. To assess the performance of these estimators, a comprehensive Monte Carlo simulation study is carried out. The usefulness of the proposed family is demonstrated by means of a real-life application representing the failure times of electronic components. The fitted results show that the new generalized exponential-X family provides a close fit to the data. Finally, for the failure-times data, the Bayesian analysis and the performance of Gibbs sampling are discussed. Diagnostic measures such as the Raftery–Lewis, Geweke, and Gelman–Rubin statistics are applied to check the convergence of the algorithm.
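Of the convergence checks named above, the Gelman–Rubin statistic is the simplest to compute by hand. The sketch below implements a basic (non-split) version of R-hat from several independent chains; the draws are placeholders rather than output from the Gibbs sampler discussed in the paper.

```python
# Basic Gelman-Rubin R-hat from m chains of n draws each (values near 1 suggest convergence).
import numpy as np

def gelman_rubin(chains):
    """chains: array of shape (m, n) -- m chains, n draws each."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)        # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()          # within-chain variance
    var_hat = (n - 1) / n * W + B / n              # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(4)
chains = rng.normal(0.0, 1.0, size=(4, 2000))      # placeholder posterior draws
print("R-hat:", gelman_rubin(chains))
```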


2014
Author(s):
Michael R May
Brian R Moore

Evolutionary biologists have long been fascinated by the extreme differences in species numbers across branches of the Tree of Life. This has motivated the development of statistical phylogenetic methods for detecting shifts in the rate of lineage diversification (speciation minus extinction). One of the most frequently used methods, implemented in the program MEDUSA, explores a set of diversification-rate models, where each model uniquely assigns branches of the phylogeny to a set of one or more diversification-rate categories. Each candidate model is first fit to the data, and the Akaike Information Criterion (AIC) is then used to identify the optimal diversification model. Surprisingly, the statistical behavior of this popular method is completely unknown, which is a concern in light of the poor performance of the AIC as a means of choosing among models in other phylogenetic comparative contexts, and also because of the ad hoc algorithm used to visit models. Here, we perform an extensive simulation study demonstrating that, as implemented, MEDUSA (1) has an extremely high Type I error rate (on average, spurious diversification-rate shifts are identified 42% of the time), and (2) provides severely biased parameter estimates (on average, estimated net-diversification and relative-extinction rates are 183% and 20% of their true values, respectively). We performed simulation experiments to reveal the source(s) of these pathologies, which include (1) the use of incorrect critical thresholds for model selection, and (2) errors in the likelihood function. Understanding the statistical behavior of MEDUSA is critical both to empirical researchers, in order to clarify whether these methods can reliably be applied to empirical datasets, and to theoretical biologists, in order to clarify whether new methods are required, and to reveal the specific problems that need to be solved in order to develop more reliable approaches for detecting shifts in the rate of lineage diversification.
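The Type I error estimation described above can be mimicked on a much simpler nested-model problem. The sketch below (not MEDUSA, and with a fixed, pre-specified split rather than a search over shift placements) simulates data under a one-parameter model and counts how often a chosen delta-AIC threshold spuriously favors the two-parameter model.

```python
# Empirical Type I error of a fixed delta-AIC threshold on a toy nested comparison.
import numpy as np

rng = np.random.default_rng(5)
threshold = 4.0            # hypothetical delta-AIC cutoff for preferring the richer model
n_sims, false_positives = 2000, 0

for _ in range(n_sims):
    y = rng.normal(0.0, 1.0, 50)                  # data truly from the one-mean model
    half1, half2 = y[:25], y[25:]
    # log-likelihoods with known unit variance (constants cancel between models)
    ll1 = -0.5 * np.sum((y - y.mean()) ** 2)                        # single mean
    ll2 = -0.5 * (np.sum((half1 - half1.mean()) ** 2)
                  + np.sum((half2 - half2.mean()) ** 2))            # two means ("shift")
    aic1, aic2 = 2 * 1 - 2 * ll1, 2 * 2 - 2 * ll2
    if aic1 - aic2 > threshold:
        false_positives += 1

print("empirical Type I error:", false_positives / n_sims)
```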


2001
Vol 17 (1)
pp. 114-122
Author(s):  
Steven H. Sheingold

Decision making in health care has become increasingly reliant on information technology, evidence-based processes, and performance measurement. It is therefore a time at which it is of critical importance to make data and analyses more relevant to decision makers. Those who support Bayesian approaches contend that their analyses provide more relevant information for decision making than do classical or “frequentist” methods, and that a paradigm shift to the former is long overdue. While formal Bayesian analyses may eventually play an important role in decision making, there are several obstacles to overcome if these methods are to gain acceptance in an environment dominated by frequentist approaches. Supporters of Bayesian statistics must find more accommodating approaches to making their case, especially in finding ways to make these methods more transparent and accessible. Moreover, they must better understand the decision-making environment they hope to influence. This paper discusses these issues and provides some suggestions for overcoming some of these barriers to greater acceptance.


1997
Vol 77 (3)
pp. 333-344
Author(s):
M. I. Sheppard
D. E. Elrick
S. R. Peterson

The nuclear industry uses computer models to calculate and assess the impact of its present and future releases to the environment, both from operating reactors and from existing licensed and planned waste management facilities. We review four soil models, varying in complexity, that could be useful for environmental impact assessment. The goal of this comparison is to direct the combined use of these models in order to preserve simplicity, yet increase the rigor of Canadian environmental assessment calculations involving soil transport pathways. The four models chosen are: the Soil Chemical Exchange and Migration of Radionuclides (SCEMR1) model; the Baes and Sharp/Preclosure PREAC soil model, both used in Canada's nuclear fuel waste management program; the Convection-Dispersion Equation (CDE) model, commonly used in contaminant transport applications; and the Canadian Standards Association (CSA) derived release limit model used for normal operations at nuclear facilities. We discuss how each model operates, its timestep and depth-increment options, and its limitations. Major model assumptions are discussed, and the performance of these models is compared quantitatively for a scenario involving surface deposition or irrigation. A sensitivity analysis of the CDE model illustrates the influence of the important model parameters: the amount of infiltrating water, V; the hydrodynamic dispersion coefficient, D; and the soil retention or partition coefficient, Kd. The important parameters in the other models are also identified. This work shows that we need tested, robust, mechanistic unsaturated soil models with easily understood and measurable inputs, including data for the sensitive or important model parameters for Canada's priority contaminants. Soil scientists need to assist industry and its regulators by recommending a selection of models and supporting them with validation data to ensure that high-quality environmental risk assessments are carried out in Canada. Key words: Soil transport models, environmental impact assessments, model structure, complexity and performance, radionuclides 137Cs, 90Sr, 129I
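For the CDE model specifically, the role of V, D, and Kd can be illustrated with a one-dimensional convection-dispersion calculation that includes linear sorption through a retardation factor. The sketch below uses an Ogata-Banks-type solution for a continuous surface input; the parameter values, bulk density, and water content are illustrative assumptions, not values from the paper.

```python
# Illustrative 1-D convection-dispersion profile with linear sorption (retardation).
import numpy as np
from scipy.special import erfc

def cde_profile(x, t, v, D, Kd, rho_b=1.4, theta=0.4, C0=1.0):
    """Relative concentration C/C0 at depth x (m) and time t (d), Ogata-Banks form."""
    R = 1.0 + rho_b * Kd / theta                         # retardation factor
    arg1 = (R * x - v * t) / (2.0 * np.sqrt(D * R * t))
    arg2 = (R * x + v * t) / (2.0 * np.sqrt(D * R * t))
    return 0.5 * C0 * (erfc(arg1) + np.exp(v * x / D) * erfc(arg2))

# Simple sensitivity check on Kd: stronger sorption delays breakthrough at 0.5 m depth.
# v ~ pore-water velocity (m/d) standing in for infiltration, D ~ dispersion (m^2/d).
for Kd in (0.0, 1.0, 10.0):                              # L/kg, illustrative values
    c = cde_profile(x=0.5, t=365.0, v=0.003, D=1e-4, Kd=Kd)
    print(f"Kd = {Kd:5.1f}  ->  C/C0 at 0.5 m after 1 yr = {c:.3g}")
```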


2007
Vol 97 (3)
pp. 2516-2524
Author(s):
Anne C. Smith
Sylvia Wirth
Wendy A. Suzuki
Emery N. Brown

Accurate characterizations of behavior during learning experiments are essential for understanding the neural bases of learning. Whereas learning experiments often give subjects multiple tasks to learn simultaneously, most analyses consider subject performance separately on each individual task. This analysis strategy ignores the true interleaved presentation order of the tasks and cannot distinguish learning behavior from response preferences that may represent a subject's biases or strategies. We present a Bayesian analysis of a state-space model for characterizing simultaneous learning of multiple tasks and for assessing behavioral biases in learning experiments with interleaved task presentations. Under the Bayesian analysis, the posterior probability densities of the model parameters and the learning state are computed using Markov chain Monte Carlo methods. Measures of learning, including the learning curve, the ideal observer curve, and the learning trial translate directly from our previous likelihood-based state-space model analyses. We compare the Bayesian and current likelihood-based approaches in the analysis of a simulated conditioned T-maze task and of an actual object–place association task. Modeling the interleaved learning feature of the experiments along with the animal's response sequences allows us to disambiguate actual learning from response biases. The implementation of the Bayesian analysis using the WinBUGS software provides an efficient way to test different models without developing a new algorithm for each model. The new state-space model and the Bayesian estimation procedure suggest an improved, computationally efficient approach for accurately characterizing learning in behavioral experiments.
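The following sketch conveys the state-space idea with a deliberately simpler technique than the MCMC/WinBUGS analysis described above: a grid-based forward filter for a random-walk learning state with Bernoulli responses. It is a hand-rolled illustration, not the authors' model, and it ignores the interleaved-task and bias components.

```python
# Grid-based forward filter for a random-walk learning state x_t with Bernoulli responses.
import numpy as np

def filter_learning_state(responses, sigma=0.2, grid=np.linspace(-5.0, 5.0, 401)):
    """Return the filtered mean of p_t = logistic(x_t) at every trial."""
    p_grid = 1.0 / (1.0 + np.exp(-grid))
    # transition kernel for x_t = x_{t-1} + Normal(0, sigma^2), column-normalized
    K = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / sigma) ** 2)
    K /= K.sum(axis=0, keepdims=True)
    belief = np.exp(-0.5 * grid ** 2)                  # diffuse prior on the initial state
    belief /= belief.sum()
    p_means = []
    for y in responses:
        belief = K @ belief                            # predict
        belief *= p_grid if y == 1 else (1.0 - p_grid) # update with the Bernoulli likelihood
        belief /= belief.sum()
        p_means.append(np.sum(p_grid * belief))
    return np.array(p_means)

responses = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]       # illustrative correct/incorrect trials
print(np.round(filter_learning_state(responses), 2))
```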


2017
Vol 14 (18)
pp. 4295-4314
Author(s):
Dan Lu
Daniel Ricciuto
Anthony Walker
Cosmin Safta
William Munger

Abstract. Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented with Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters of the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. Calibration with DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions, while AM identifies only one mode. The application suggests that DREAM is well suited to calibrating complex terrestrial ecosystem models, where the number of uncertain parameters is usually large and the existence of local optima is always a concern. In addition, this effort uses residual analysis to evaluate the assumptions of the error model employed in the Bayesian calibration. The results indicate that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and that the likelihood function constructed from it can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
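The error model mentioned above can be made concrete with a small log-likelihood sketch for heteroscedastic, lag-1 (AR(1)) correlated Gaussian residuals. The parameterization (standard deviation a + b*|prediction|, autocorrelation phi with |phi| < 1) is an assumption of mine for illustration, not necessarily the exact form used in the paper.

```python
# Gaussian log-likelihood with heteroscedastic sd and AR(1)-correlated residuals (sketch).
import numpy as np

def log_likelihood(obs, pred, a, b, phi):
    """sd_t = a + b*|pred_t|; standardized residuals follow a stationary AR(1) with |phi| < 1."""
    n = obs.size
    sigma = a + b * np.abs(pred)                   # heteroscedastic error sd
    e = (obs - pred) / sigma                       # standardized residuals (unit marginal variance)
    innov = e[1:] - phi * e[:-1]                   # AR(1) innovations, variance 1 - phi^2
    ll = -0.5 * e[0] ** 2
    ll += -0.5 * np.sum(innov ** 2) / (1.0 - phi ** 2) - 0.5 * (n - 1) * np.log(1.0 - phi ** 2)
    ll += -np.sum(np.log(sigma)) - 0.5 * n * np.log(2.0 * np.pi)
    return ll

# Illustrative call with synthetic values (not the Harvard Forest data):
rng = np.random.default_rng(6)
pred = rng.normal(2.0, 1.0, 100)
obs = pred + 0.3 * rng.normal(size=100)
print(log_likelihood(obs, pred, a=0.1, b=0.1, phi=0.4))
```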


2016
Author(s):
Kassian Kobert
Alexandros Stamatakis
Tomáš Flouri

The phylogenetic likelihood function is the major computational bottleneck in several applications of evolutionary biology, such as phylogenetic inference, species delimitation, model selection, and divergence-time estimation. Given the alignment, a tree, and the evolutionary model parameters, the likelihood function computes the conditional likelihood vectors for every node of the tree. Vector entries for which all input data are identical result in redundant likelihood operations which, in turn, yield identical conditional values. Such operations can be omitted to improve run time and, using appropriate data structures, to reduce memory usage. We present a fast, novel method for identifying and omitting such redundant operations in phylogenetic likelihood calculations, and assess the performance improvement and memory savings attained by our method. Using empirical and simulated data sets, we show that a prototype implementation of our method yields up to 10-fold speedups and uses up to 78% less memory than one of the fastest and most highly tuned implementations of the phylogenetic likelihood function currently available. Our method is generic and can seamlessly be integrated into any phylogenetic likelihood implementation.
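A toy version of the redundancy idea, under assumptions of my own (two tip children, a Jukes-Cantor model, no ambiguity codes) rather than the authors' implementation: sites that present identical data at a node's children yield identical conditional likelihood vectors at that node, so each distinct site pattern needs to be computed only once.

```python
# Caching repeated site patterns while computing a parent's conditional likelihood vectors.
import numpy as np

STATES = "ACGT"

def tip_vector(base):
    v = np.zeros(4)
    v[STATES.index(base)] = 1.0
    return v

def jc_P(t):
    """Jukes-Cantor transition probability matrix for branch length t."""
    p_same = 0.25 + 0.75 * np.exp(-4.0 * t / 3.0)
    p_diff = 0.25 - 0.25 * np.exp(-4.0 * t / 3.0)
    return np.full((4, 4), p_diff) + np.eye(4) * (p_same - p_diff)

def parent_conditionals(seq_left, seq_right, P_left, P_right):
    """Conditional likelihood vectors at the parent of two tips, computed once per site pattern."""
    cache, out = {}, []
    for a, b in zip(seq_left, seq_right):
        if (a, b) not in cache:
            cache[(a, b)] = (P_left @ tip_vector(a)) * (P_right @ tip_vector(b))
        out.append(cache[(a, b)])
    return np.array(out), len(cache)

seq1, seq2 = "ACGTACGTAC", "ACGTACGAAC"
cl, distinct = parent_conditionals(seq1, seq2, jc_P(0.1), jc_P(0.2))
print(f"{len(seq1)} sites, {distinct} distinct patterns actually computed")
```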

