Stepwise Bayesian Phylogenetic Inference
Abstract
The ideal approach to Bayesian phylogenetic inference is to estimate all parameters of interest jointly in a single hierarchical model. However, this is often not feasible in practice due to the high computational cost that would be incurred. Instead, phylogenetic pipelines generally consist of chained analyses, whereby a single point estimate from a given analysis is used as input for the next analysis in the chain (e.g., a single multiple sequence alignment is used to estimate a gene tree). In this framework, uncertainty is not propagated from step to step in the chain, which can lead to inaccurate or spuriously certain results. Here, we formally develop and test the stepwise approach to Bayesian inference, which uses importance sampling to generate observations for the next step of an analysis pipeline from the posterior produced in the previous step. We show that this approach is identical to the joint approach given sufficient information in the data and in the importance sample. This is demonstrated using both a toy example and an analysis pipeline for inferring divergence times using a relaxed clock model. The stepwise approach presented here not only accounts for uncertainty between analysis steps, but also allows for greater flexibility in program choice (and hence model availability) and can be more computationally efficient than the traditional joint approach when multiple models are being tested.