Using Parsimony-Guided Tree Proposals to Accelerate Convergence in Bayesian Phylogenetic Inference

2019 ◽  
Author(s):  
Chi Zhang ◽  
John P. Huelsenbeck ◽  
Fredrik Ronquist

Abstract Sampling across tree space is one of the major challenges in Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) algorithms. Standard MCMC tree moves consider small random perturbations of the topology, and select from candidate trees at random or based on the distance between the old and new topologies. MCMC algorithms using such moves tend to get trapped in tree space, making them slow in finding the globally most probable trees (known as ‘convergence’) and in estimating the correct proportions of the different types of them (known as ‘mixing’). Here, we introduce a new class of moves, which propose trees based on their parsimony scores. The proposal distribution derived from the parsimony scores is a quickly computable albeit rough approximation of the conditional posterior distribution over candidate trees. We demonstrate with simulations that parsimony-guided moves correctly sample the uniform distribution of topologies from the prior. We then evaluate their performance against standard moves using six challenging empirical datasets, for which we were able to obtain accurate reference estimates of the posterior using long MCMC runs, a mix of topology proposals, and Metropolis coupling. On these datasets, ranging in size from 357 to 934 taxa and from 1,740 to 5,681 sites, we find that single chains using parsimony-guided moves usually converge an order of magnitude faster than chains using standard moves. They also exhibit better mixing, that is, they cover the most probable trees more quickly. Our results show that tree moves based on quick and dirty estimates of the posterior probability can significantly outperform standard moves. Future research will have to show to what extent the performance of such moves can be improved further by finding better ways of approximating the posterior probability, taking the trade-off between accuracy and speed into account.

2020 ◽  
Vol 69 (5) ◽  
pp. 1016-1032 ◽  
Author(s):  
Chi Zhang ◽  
John P Huelsenbeck ◽  
Fredrik Ronquist

Abstract Sampling across tree space is one of the major challenges in Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) algorithms. Standard MCMC tree moves consider small random perturbations of the topology, and select from candidate trees at random or based on the distance between the old and new topologies. MCMC algorithms using such moves tend to get trapped in tree space, making them slow in finding the globally most probable trees (known as “convergence”) and in estimating the correct proportions of the different types of them (known as “mixing”). Here, we introduce a new class of moves, which propose trees based on their parsimony scores. The proposal distribution derived from the parsimony scores is a quickly computable albeit rough approximation of the conditional posterior distribution over candidate trees. We demonstrate with simulations that parsimony-guided moves correctly sample the uniform distribution of topologies from the prior. We then evaluate their performance against standard moves using six challenging empirical data sets, for which we were able to obtain accurate reference estimates of the posterior using long MCMC runs, a mix of topology proposals, and Metropolis coupling. On these data sets, ranging in size from 357 to 934 taxa and from 1740 to 5681 sites, we find that single chains using parsimony-guided moves usually converge an order of magnitude faster than chains using standard moves. They also exhibit better mixing, that is, they cover the most probable trees more quickly. Our results show that tree moves based on quick and dirty estimates of the posterior probability can significantly outperform standard moves. Future research will have to show to what extent the performance of such moves can be improved further by finding better ways of approximating the posterior probability, taking the trade-off between accuracy and speed into account. [Bayesian phylogenetic inference; MCMC; parsimony; tree proposal.]
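
The idea of proposing trees "based on their parsimony scores" can be illustrated with a minimal sketch. The abstract does not state how scores are turned into proposal probabilities, so the exponential weighting below, the `weight` parameter, and all function names are assumptions made for illustration only, not the authors' implementation. The sketch picks one candidate tree with probability decreasing in its parsimony score and computes the Hastings correction that a Metropolis-Hastings sampler needs when the proposal is not symmetric.

```python
import math
import random


def proposal_probs(parsimony_scores, weight=1.0):
    """Turn parsimony scores (lower is better) into proposal probabilities.

    Assumed form: probability proportional to exp(-weight * score).
    Subtracting the best score first keeps the exponentials numerically stable.
    """
    best = min(parsimony_scores)
    weights = [math.exp(-weight * (s - best)) for s in parsimony_scores]
    total = sum(weights)
    return [w / total for w in weights]


def propose(parsimony_scores, weight=1.0, rng=random):
    """Sample the index of one candidate tree; return (index, its probability)."""
    probs = proposal_probs(parsimony_scores, weight)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs[index]


def log_hastings_ratio(forward_prob, reverse_prob):
    """Log of q(old | new) / q(new | old), added to the log acceptance ratio."""
    return math.log(reverse_prob) - math.log(forward_prob)
```

With `weight=0` every candidate gets the same probability, which corresponds to selecting among candidate trees at random as the abstract describes for standard moves; larger values concentrate proposals on the most parsimonious candidates.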


2017 ◽  
Author(s):  
R. Biczok ◽  
P. Bozsoky ◽  
P. Eisenmann ◽  
J. Ernst ◽  
T. Ribizel ◽  
...  

Abstract Motivation: The presence of terraces in phylogenetic tree space, that is, a potentially large number of distinct tree topologies that have exactly the same analytical likelihood score, was first described by Sanderson et al. (2011). However, popular software tools for maximum likelihood and Bayesian phylogenetic inference do not yet routinely report whether inferred phylogenies reside on a terrace. We believe this is due to the unavailability of an efficient library implementation to (i) determine if a tree resides on a terrace, (ii) calculate how many trees reside on a terrace, and (iii) enumerate all trees on a terrace. Results: In our bioinformatics programming practical we developed two efficient and independent C++ implementations of the SUPERB algorithm by Constantinescu and Sankoff (1995) for counting and enumerating the trees on a terrace. Both implementations yield exactly the same results and are more than one order of magnitude faster and require one order of magnitude less memory than a previous third-party Python implementation. Availability: The source codes are available under GNU GPL at https://github.com/[email protected]


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Ralf Buckley ◽  
Paula Brough ◽  
Leah Hague ◽  
Alienor Chauvenet ◽  
Chris Fleming ◽  
...  

Abstract We evaluate methods to calculate the economic value of protected areas derived from the improved mental health of visitors. A conservative global estimate using quality-adjusted life years, a standard measure in health economics, is US$6 trillion p.a. This is an order of magnitude greater than the global value of protected area tourism, and two to three orders greater than global aggregate protected area management agency budgets. Future research should: refine this estimate using more precise methods; consider interactions between health and conservation policies and budgets at national scales; and examine links between personalities and protected area experiences at individual scale.


2002 ◽  
Vol 51 (5) ◽  
pp. 740-753 ◽  
Author(s):  
Richard E. Miller ◽  
Thomas R. Buckley ◽  
Paul S. Manos

2000 ◽  
Vol 42 (7-8) ◽  
pp. 315-322 ◽  
Author(s):  
B. N. Jacobsen ◽  
T. Guildal

Management aspects for control of environmental contaminants have widened from a focus on heavy metals to a broader approach including specific organic compounds, inhibition of sensitive bacteria or algae, and newly identified environmental issues, e.g., endocrine disruption and antibiotic resistance. Studies conducted at the Avedøre WWTP confirm the relevance of such newly discovered environmental problems; however, the order of magnitude of the effects does not seem alarming. It is recommended that future research establish links between the occurrence of specific organic compounds and heavy metals and various measures of toxicity and bioaccumulation. Also, data for specific biodegradation rates in WWTPs represent a bottleneck for simulating the fate of specific organic compounds in the plants.


2012 ◽  
Vol 03 (01) ◽  
pp. 1250002 ◽  
Author(s):  
LUKE M. BRANDER ◽  
KATRIN REHDANZ ◽  
RICHARD S. J. TOL ◽  
PIETER J. H. VAN BEUKERING

Because ocean acidification has only recently been recognized as a problem caused by CO2 emissions, impact studies are still rare and estimates of the economic impact are absent. This paper estimates the economic impact of ocean acidification on coral reefs, which are generally considered to be economically as well as ecologically important ecosystems. First, we conduct an impact assessment in which atmospheric concentration of CO2 is linked to ocean acidity causing coral reef area loss. Next, a meta-analytic value transfer is applied to determine the economic value of coral reefs around the world. Finally, these two analyses are combined to estimate the economic impact of ocean acidification on coral reefs for the four IPCC marker scenarios. We find that the annual economic impact rapidly escalates over time, because the scenarios have rapid economic growth in the relevant countries and coral reefs are a luxury good. Nonetheless, the annual value in 2100 is still only a fraction of total income, one order of magnitude smaller than the previously estimated impact of climate change. Although the estimated impact is uncertain, the estimated confidence interval spans one order of magnitude only. Future research should seek to extend the estimates presented here to other impacts of ocean acidification and investigate the implications of our findings for climate policy.


Author(s):  
Lídia Kuan ◽  
Frederico Pratas ◽  
Leonel Sousa ◽  
Pedro Tomás

MrBayes is a popular software package for Bayesian phylogenetic inference, which uses an iterative approach to derive an evolutionary tree for a collection of species whose DNA sequences are known. Computationally, MrBayes is characterized by a large number of iterations, each composed of a set of tasks that in isolation are not very time-consuming, but are globally computationally demanding. To accelerate the latest MrBayes 3.2, this paper presents MrBayes sMC3, which relies on the computational power of a heterogeneous CPU+GPU platform. For this, MrBayes sMC3 exploits both task and data-level parallelism while minimizing the overheads associated with kernel launches and CPU-GPU data transfers. Experimental results indicate that the proposed parallel approach, together with the proposed set of optimizations, allows for an application acceleration of up to 10× relative to the original MrBayes, and up to 3× relative to the Beagle Library. Furthermore, by comparing the convergence rate of MrBayes sMC3 with that of the state-of-the-art approaches, a significant reduction in execution time is observed.

