The Past Sure Is Tense: On Interpreting Phylogenetic Divergence Time Estimates

2017
Author(s): Joseph W. Brown, Stephen A. Smith

Abstract Divergence time estimation, the calibration of a phylogeny to geological time, is an integral first step in modelling the tempo of biological evolution (traits and lineages). However, despite increasingly sophisticated methods to infer divergence times from molecular genetic sequences, the estimated ages of many nodes across the tree of life contrast significantly and consistently with timeframes conveyed by the fossil record. This is perhaps best exemplified by crown angiosperms, where molecular clock (Triassic) estimates predate the oldest (Early Cretaceous) undisputed angiosperm fossils by tens of millions of years or more. While the incompleteness of the fossil record is a common concern, issues of data limitation and model inadequacy are viable (if underexplored) alternative explanations. In this vein, Beaulieu et al. (2015) convincingly demonstrated how methods of divergence time inference can be misled by both (i) extreme state-dependent molecular substitution rate heterogeneity and (ii) biased sampling of representative major lineages. These results demonstrate the impact of (potentially common) model violations. Here, we suggest another potential challenge: that the configuration of the statistical inference problem (i.e., the parameters, their relationships, and associated priors) alone may preclude the reconstruction of the paleontological timeframe for the crown age of angiosperms. We demonstrate, through sampling from the joint prior (formed by combining the tree (diversification) prior with the calibration densities specified for fossil-calibrated nodes), that even with no data present at all, an Early Cretaceous crown age for angiosperms is rejected (i.e., has essentially zero probability). More worrisome, however, is that, for the 24 nodes calibrated by fossils, almost all have indistinguishable marginal prior and posterior age distributions when employing routine lognormal fossil calibration priors.
These results indicate that there is inadequate information in the data to overrule the joint prior. Given that these calibrated nodes are strategically placed in disparate regions of the tree, they act to anchor the tree scaffold, and so the posterior inference for the tree as a whole is largely determined by the pseudo-data present in the (often arbitrary) calibration densities. We recommend, as for any Bayesian analysis, that marginal prior and posterior distributions be carefully compared to determine whether signal is coming from the data or prior belief, especially for parameters of direct interest. This recommendation is not novel. However, given how rarely such checks are carried out in evolutionary biology, it bears repeating. Our results demonstrate the fundamental importance of prior/posterior comparisons in any Bayesian analysis, and we hope that they further encourage both researchers and journals to consistently adopt this crucial step as standard practice. Finally, we note that the results presented here do not refute the biological modelling concerns identified by Beaulieu et al. (2015). Both sets of issues remain apposite to the goals of accurate divergence time estimation, and only by considering them in tandem can we move forward more confidently. [marginal priors; information content; diptych; divergence time estimation; fossil record; BEAST; angiosperms.]
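
The prior/posterior comparison recommended above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes two lists of node-age samples (one from a run that samples the joint prior only, one from the full analysis) and summarises their similarity with a histogram overlap coefficient. The function name `overlap_coefficient`, the bin count, and the toy samples are all assumptions made for illustration.

```python
import random


def overlap_coefficient(prior_samples, posterior_samples, n_bins=50):
    """Histogram overlap of two samples: 1.0 = identical, 0.0 = disjoint."""
    lo = min(min(prior_samples), min(posterior_samples))
    hi = max(max(prior_samples), max(posterior_samples))
    if hi == lo:
        return 1.0
    width = (hi - lo) / n_bins

    def proportions(samples):
        # Fraction of samples falling in each shared bin.
        counts = [0] * n_bins
        for x in samples:
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
        return [c / len(samples) for c in counts]

    p = proportions(prior_samples)
    q = proportions(posterior_samples)
    return sum(min(a, b) for a, b in zip(p, q))


# Hypothetical node-age samples (Ma); the distributions are invented.
rng = random.Random(1)
prior = [120.0 + rng.lognormvariate(2.0, 0.5) for _ in range(5000)]
posterior_like_prior = [120.0 + rng.lognormvariate(2.0, 0.5) for _ in range(5000)]
posterior_informed = [120.0 + rng.lognormvariate(3.5, 0.2) for _ in range(5000)]

print(overlap_coefficient(prior, posterior_like_prior))  # close to 1
print(overlap_coefficient(prior, posterior_informed))    # much smaller
```

An overlap near 1 for a calibrated node would suggest, in the sense of the abstract, that the posterior for that node is being driven by the joint prior rather than by the sequence data.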

2020
Author(s): Tom Carruthers, Robert W Scotland

Abstract Understanding and representing uncertainty is crucial in academic research, because it enables studies to build on the conclusions of previous studies, leading to robust advances in a particular field. Here, we evaluate the nature of uncertainty and the manner by which it is represented in divergence time estimation, a field that is fundamental to many aspects of macroevolutionary research, and where there is evidence that uncertainty has been seriously underestimated. We address this issue in the context of methods used in divergence time estimation, and with respect to the manner by which time-calibrated phylogenies are interpreted. With respect to methods, we discuss how the assumptions underlying different methods may not adequately reflect uncertainty about molecular evolution, the fossil record, or diversification rates. Therefore, divergence time estimates may not adequately reflect uncertainty, and may be directly contradicted by subsequent findings. For the interpretation of time-calibrated phylogenies, we discuss how the use of time-calibrated phylogenies for reconstructing general evolutionary timescales leads to inferences about macroevolution that are highly sensitive to methodological limitations in how uncertainty is accounted for. By contrast, we discuss how the use of time-calibrated phylogenies to test specific hypotheses leads to inferences about macroevolution that are less sensitive to methodological limitations. Given that many biologists wish to use time-calibrated phylogenies to reconstruct general evolutionary timescales, we conclude that the development of methods of divergence time estimation that adequately account for uncertainty is necessary.


2019, Vol 69 (4), pp. 660-670
Author(s): Tom Carruthers, Michael J Sanderson, Robert W Scotland

Abstract Rate variation adds considerable complexity to divergence time estimation in molecular phylogenies. Here, we evaluate the impact of lineage-specific rates, which we define as among-branch rate variation that acts consistently across the entire genome. We compare its impact to that of residual rates (defined as among-branch rate variation that shows a different pattern at each sampled locus) and gene-specific rates (defined as variation in the average rate across all branches at each sampled locus). We show that lineage-specific rates lead to erroneous divergence time estimates, regardless of how many loci are sampled. Further, we show that stronger lineage-specific rates lead to increasing error. This contrasts with residual rates and gene-specific rates, where sampling more loci significantly reduces error. If divergence times are inferred in a Bayesian framework, we highlight that error caused by lineage-specific rates significantly reduces the probability that the 95% highest posterior density includes the correct value, and leads to sensitivity to the prior. Use of a more complex rate prior, which has recently been proposed to model rate variation more accurately, does not affect these conclusions. Finally, we show that the scale of lineage-specific rates used in our simulation experiments is comparable to that of an empirical data set for the angiosperm genus Ipomoea. Taken together, our findings demonstrate that lineage-specific rates cause error in divergence time estimates, and that this error is not overcome by analyzing genomic scale multilocus data sets. [Divergence time estimation; error; rate variation.]
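
The contrast between lineage-specific and residual rate variation can be illustrated with a deliberately simple toy. It is not the authors' simulation design. It assumes a single node of true age 1.0 dated under a strict clock with a known rate of 1, so the mean branch length across loci is the age estimate. Lineage-specific variation is modelled as one multiplier shared by every locus; residual variation is independent lognormal noise per locus. The function name and all parameter values are hypothetical.

```python
import random


def estimated_age(n_loci, lineage_multiplier, residual_sigma, rng, true_age=1.0):
    """Strict-clock age estimate from the mean branch length across loci.

    Each locus contributes true_age * lineage_multiplier * noise, where the
    noise is lognormal with mean 1 and is drawn independently per locus.
    The assumed clock rate is 1, so branch length equals estimated age.
    """
    mu = -0.5 * residual_sigma ** 2  # makes the expected noise equal to 1
    total = 0.0
    for _ in range(n_loci):
        noise = rng.lognormvariate(mu, residual_sigma)
        total += true_age * lineage_multiplier * noise
    return total / n_loci


rng = random.Random(42)
for n_loci in (10, 100, 2000):
    residual_only = estimated_age(n_loci, 1.0, 0.3, rng)
    with_lineage = estimated_age(n_loci, 1.5, 0.3, rng)
    # Absolute error without and with a shared lineage-specific multiplier.
    print(n_loci, abs(residual_only - 1.0), abs(with_lineage - 1.0))
```

In this toy, the residual noise averages out as loci are added, while the shared multiplier does not, which mirrors the qualitative claim of the abstract under much simpler assumptions.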


2004, Vol 359 (1450), pp. 1477-1483
Author(s): Thomas J. Near, Michael J. Sanderson

Estimates of species divergence times using DNA sequence data are playing an increasingly important role in studies of evolution, ecology and biogeography. Most work has centred on obtaining appropriate kinds of data and developing optimal estimation procedures, whereas somewhat less attention has focused on the calibration of divergences using fossils. Case studies with multiple fossil calibration points provide important opportunities to examine the divergence time estimation problem in new ways. We discuss two cross-validation procedures that address different aspects of inference in divergence time estimation. ‘Fossil cross-validation’ is a procedure used to identify the impact of different individual calibrations on overall estimation. This can identify fossils that have an exceptionally large error effect and may warrant further scrutiny. ‘Fossil-based model cross-validation’ is an entirely different procedure that uses fossils to identify the optimal model of molecular evolution in the context of rate smoothing or other inference methods. Both procedures were applied to two recent studies: an analysis of monocot angiosperms with eight fossil calibrations and an analysis of placental mammals with nine fossil calibrations. In each case, fossil calibrations could be ranked from most to least influential, and in one of the two studies, the fossils provided decisive evidence about the optimal molecular evolutionary model.
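
The idea behind fossil cross-validation can be sketched as a leave-one-in loop: calibrate with a single fossil, predict the ages of the other calibrated nodes, and score the disagreement. The sketch below is a simplified stand-in, not the published procedure. It swaps in a strict clock where the abstract refers to rate smoothing or other inference methods, and it assumes a sum-of-squared-differences score. The function name, node labels, and data are hypothetical.

```python
def fossil_cross_validation(node_depths, fossil_ages):
    """Score each calibration by how badly it predicts the other fossils.

    node_depths: node -> molecular depth (substitutions per site)
    fossil_ages: node -> fossil age (Ma) for the same nodes
    Returns node -> sum of squared (molecular - fossil) age differences
    obtained when that node alone calibrates a strict clock.
    """
    scores = {}
    for cal in fossil_ages:
        rate = node_depths[cal] / fossil_ages[cal]  # subs/site/Myr
        ss = 0.0
        for other in fossil_ages:
            if other == cal:
                continue
            molecular_age = node_depths[other] / rate
            ss += (molecular_age - fossil_ages[other]) ** 2
        scores[cal] = ss
    return scores


# Hypothetical data: B's fossil age (60 Ma) conflicts with the other three.
depths = {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40}
fossils = {"A": 10.0, "B": 60.0, "C": 30.0, "D": 40.0}

scores = fossil_cross_validation(depths, fossils)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0])  # calibration with the largest disagreement
```

The ranking only flags calibrations for further scrutiny; it does not by itself show which fossil is wrong.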


PLoS ONE, 2019, Vol 14 (5), pp. e0217959
Author(s): Hussam Zaher, Robert W. Murphy, Juan Camilo Arredondo, Roberta Graboski, Paulo Roberto Machado-Filho, ...

PLoS ONE, 2011, Vol 6 (11), pp. e27138
Author(s): Sebastián Duchêne, Frederick I. Archer, Julia Vilstrup, Susana Caballero, Phillip A. Morin

2017
Author(s): Mario dos Reis, Gregg F. Gunnell, José Barba-Montoya, Alex Wilkins, Ziheng Yang, ...

Abstract Primates have long been a test case for the development of phylogenetic methods for divergence time estimation. Despite a large number of studies, however, the timing of origination of crown Primates relative to the K-Pg boundary and the timing of diversification of the main crown groups remain controversial. Here we analysed a dataset of 372 taxa (367 Primates and 5 outgroups, 61 thousand base pairs) that includes nine complete primate genomes (3.4 million base pairs). We systematically explore the effect of different interpretations of fossil calibrations and molecular clock models on primate divergence time estimates. We find that even small differences in the construction of fossil calibrations can have a noticeable impact on estimated divergence times, especially for the oldest nodes in the tree. Notably, choice of molecular rate model (auto-correlated or independently distributed rates) has an especially strong effect on estimated times, with the independent rates model producing considerably more ancient estimates for the deeper nodes in the phylogeny. We implement thermodynamic integration, combined with Gaussian quadrature, in the program MCMCTree, and use it to calculate Bayes factors for clock models. Bayesian model selection indicates that the auto-correlated rates model fits the primate data substantially better, and we conclude that time estimates under this model should be preferred. We show that for eight core nodes in the phylogeny, uncertainty in time estimates is close to the theoretical limit imposed by fossil uncertainties. Thus, these estimates are unlikely to be improved by collecting additional molecular sequence data. All analyses place the origin of Primates close to the K-Pg boundary, either in the Cretaceous or straddling the boundary into the Palaeogene.
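
Thermodynamic integration with Gaussian quadrature can be illustrated on a toy problem. The sketch below is not the MCMCTree implementation. It assumes a conjugate normal model (y_i ~ Normal(theta, 1), theta ~ Normal(0, 1)), chosen because the expected log-likelihood under each power posterior is available in closed form, so the quadrature result can be checked against the exact log marginal likelihood. In a real analysis that expectation would be estimated by MCMC at each quadrature point. All function names and the data are hypothetical.

```python
import math


def gauss_legendre(k):
    """Nodes and weights of the k-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, k + 1):
        x = math.cos(math.pi * (i - 0.25) / (k + 0.5))  # initial guess
        for _ in range(100):  # Newton iterations on P_k(x) = 0
            p_prev, p = 1.0, x
            for j in range(2, k + 1):
                p_prev, p = p, ((2 * j - 1) * x * p - (j - 1) * p_prev) / j
            dp = k * (x * p - p_prev) / (x * x - 1.0)  # derivative of P_k
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def expected_log_lik(beta, y):
    """E[log L] under the power posterior (prior times L**beta)."""
    n, s = len(y), sum(y)
    var = 1.0 / (1.0 + beta * n)   # power-posterior variance of theta
    mean = beta * s * var          # power-posterior mean of theta
    sq = sum((yi - mean) ** 2 for yi in y) + n * var
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * sq


def log_marginal_ti(y, k=16):
    """log marginal likelihood = integral of E_beta[log L] over beta in [0, 1]."""
    nodes, weights = gauss_legendre(k)
    # Map [-1, 1] to [0, 1]: beta = (x + 1) / 2, weights scaled by 1/2.
    return sum(0.5 * w * expected_log_lik(0.5 * (x + 1.0), y)
               for x, w in zip(nodes, weights))


def log_marginal_exact(y):
    """Closed-form log marginal likelihood for the same toy model."""
    n, s = len(y), sum(y)
    quad = sum(yi * yi for yi in y) - s * s / (1.0 + n)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n) - 0.5 * quad)


y = [0.3, -1.2, 0.8, 1.5, 0.1]
print(log_marginal_ti(y), log_marginal_exact(y))
```

A log Bayes factor between two models is then the difference of their log marginal likelihoods, each computed in this way.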


2018
Author(s): Joëlle Barido-Sottani, Gabriel Aguirre-Fernández, Melanie Hopkins, Tanja Stadler, Rachel Warnock

Abstract Fossil information is essential for estimating species divergence times, and can be integrated into Bayesian phylogenetic inference using the fossilized birth-death (FBD) process. An important aspect of palaeontological data is the uncertainty surrounding specimen ages, which can be handled in different ways during inference. The most common approach is to fix fossil ages to a point estimate within the known age interval. Alternatively, age uncertainty can be incorporated by using priors, and fossil ages are then directly sampled as part of the inference. This study presents a comparison of alternative approaches for handling fossil age uncertainty in analysis using the FBD process. Based on simulations, we find that fixing fossil ages to the midpoint or a random point drawn from within the stratigraphic age range leads to biases in divergence time estimates, while sampling fossil ages leads to estimates that are similar to inferences that employ the correct ages of fossils. Second, we show a comparison using an empirical dataset of extant and fossil cetaceans, which confirms that different methods of handling fossil age uncertainty lead to large differences in estimated node ages. Stratigraphic age uncertainty should thus not be ignored in divergence time estimation and instead should be incorporated explicitly.


2015, Vol 282 (1798), pp. 20141013
Author(s): Rachel C. M. Warnock, James F. Parham, Walter G. Joyce, Tyler R. Lyson, Philip C. J. Donoghue

Calibration is the rate-determining step in every molecular clock analysis and, hence, considerable effort has been expended in the development of approaches to distinguish good from bad calibrations. These can be categorized into a priori evaluation of the intrinsic fossil evidence, and a posteriori evaluation of congruence through cross-validation. We contrasted these competing approaches and explored the impact of different interpretations of the fossil evidence upon Bayesian divergence time estimation. The results demonstrate that a posteriori approaches can lead to the selection of erroneous calibrations. Bayesian posterior estimates are also shown to be extremely sensitive to the probabilistic interpretation of temporal constraints. Furthermore, the effective time priors implemented within an analysis differ for individual calibrations when employed alone and in differing combination with others. This compromises the implicit assumption of all calibration consistency methods, that the impact of an individual calibration is the same when used alone or in unison with others. Thus, the most effective means of establishing the quality of fossil-based calibrations is through a priori evaluation of the intrinsic palaeontological, stratigraphic, geochronological and phylogenetic data. However, effort expended in establishing calibrations will not be rewarded unless they are implemented faithfully in divergence time analyses.


2011, Vol 8 (1), pp. 156-159
Author(s): Rachel C. M. Warnock, Ziheng Yang, Philip C. J. Donoghue

Calibration is a critical step in every molecular clock analysis but it has been the least considered. Bayesian approaches to divergence time estimation make it possible to incorporate the uncertainty in the degree to which fossil evidence approximates the true time of divergence. We explored the impact of different approaches in expressing this relationship, using arthropod phylogeny as an example for which we established novel calibrations. We demonstrate that the parameters distinguishing calibration densities have a major impact upon the prior and posterior of the divergence times, and it is critically important that users evaluate the joint prior distribution of divergence times used by their dating programmes. We illustrate a procedure for deriving calibration densities in Bayesian divergence dating through the use of soft maximum constraints.


Zootaxa, 2009, Vol 2216 (1), pp. 22-36
Author(s): Jessica L. Ware, John P. Simaika, Michael J. Samways

Syncordulia (Odonata: Anisoptera: Libelluloidea) inhabits mostly cool mountainous streams in the Cape Floristic Region of South Africa. It is found at low densities in geographically restricted areas. Syncordulia is endemic to South Africa and, until recently, only two species were known, S. venator (Barnard, 1933) and S. gracilis (Burmeister, 1839), both considered Vulnerable by the World Conservation Union (IUCN). Two new species, S. serendipator Dijkstra, Samways & Simaika 2007 and S. legator Dijkstra, Samways & Simaika 2007, were described from previously unrecognized museum specimens and new field collections. Here we corroborate the validity of these two new species using multiple genes and propose intergeneric relationships within Syncordulia. Molecular data from two independent gene fragments (nuclear 28S ribosomal and mitochondrial cytochrome oxidase subunit I data) were sequenced and/or downloaded from GenBank for 7 libelluloid families, including 12 Syncordulia specimens (2 Syncordulia gracilis, 4 S. serendipator, 2 S. legator and 4 S. venator). The lower libelluloid group GSI (sensu Ware et al. 2007), a diverse group of non-corduliine taxa, is strongly supported as monophyletic. Syncordulia is well supported by both methods of phylogenetic analysis as a monophyletic group deeply nested within the GSI clade. A DIVA biogeographical analysis suggests that the ancestor to the genus Syncordulia may have arisen consequent to the break-up of Gondwana (>120 Mya). Divergence time estimates suggest that Syncordulia diverged well after the break-up of Gondwana, approximately 60 million years ago (Mya), which coincides with the divergence of several Cape fynbos taxa, between 86 and 60 Mya. DIVA analyses suggest that the present distributions of Syncordulia may be the result of dispersal events. We relate these phylogenetic data to the historical biogeography of the genus and to the importance of conservation action.

