Strategies for Partitioning Clock Models in Phylogenomic Dating: Application to the Angiosperm Evolutionary Timescale

2017
Author(s):
Charles S. P. Foster
Simon Y. W. Ho

Abstract. Evolutionary timescales can be inferred from molecular sequence data using a Bayesian phylogenetic approach. In these methods, the molecular clock is often calibrated using fossil data. The uncertainty in these fossil calibrations is important because it determines the limiting posterior distribution for divergence-time estimates as the sequence length tends to infinity. Here we investigate how the accuracy and precision of Bayesian divergence-time estimates improve with the increased clock-partitioning of genome-scale data into clock-subsets. We focus on a data set comprising plastome-scale sequences of 52 angiosperm taxa. There was little difference among the Bayesian date estimates whether we chose clock-subsets based on patterns of among-lineage rate heterogeneity or relative rates across genes, or by random assignment. Increasing the degree of clock-partitioning usually led to an improvement in the precision of divergence-time estimates, but this increase was asymptotic to a limit presumably imposed by fossil calibrations. Our clock-partitioning approaches yielded highly precise age estimates for several key nodes in the angiosperm phylogeny. For example, when partitioning the data into 20 clock-subsets based on patterns of among-lineage rate heterogeneity, we inferred crown angiosperms to have arisen 198–178 Ma. This demonstrates that judicious clock-partitioning can improve the precision of molecular dating based on phylogenomic data, but the meaning of this increased precision should be considered critically.

2006
Vol 2 (4)
pp. 543-547
Author(s):
Per G.P Ericson
Cajsa L Anderson
Tom Britton
Andrzej Elzanowski
Ulf S Johansson
...

Patterns of diversification and timing of evolution within Neoaves, which includes almost 95% of all bird species, are virtually unknown. On the other hand, molecular data consistently indicate a Cretaceous origin of many neoavian lineages and the fossil record seems to support an Early Tertiary diversification. Here, we present the first well-resolved molecular phylogeny for Neoaves, together with divergence time estimates calibrated with a large number of stratigraphically and phylogenetically well-documented fossils. Our study defines several well-supported clades within Neoaves. The calibration results suggest that Neoaves, after an initial split from Galloanseres in Mid-Cretaceous, diversified around or soon after the K/T boundary. Our results thus do not contradict palaeontological data and show that there is no solid molecular evidence for an extensive pre-Tertiary radiation of Neoaves.


2017
Author(s):
Mario dos Reis
Gregg F. Gunnell
José Barba-Montoya
Alex Wilkins
Ziheng Yang
...

Abstract. Primates have long been a test case for the development of phylogenetic methods for divergence time estimation. Despite a large number of studies, however, the timing of origination of crown Primates relative to the K-Pg boundary and the timing of diversification of the main crown groups remain controversial. Here we analysed a dataset of 372 taxa (367 Primates and 5 outgroups, 61 thousand base pairs) that includes nine complete primate genomes (3.4 million base pairs). We systematically explore the effect of different interpretations of fossil calibrations and molecular clock models on primate divergence time estimates. We find that even small differences in the construction of fossil calibrations can have a noticeable impact on estimated divergence times, especially for the oldest nodes in the tree. Notably, choice of molecular rate model (auto-correlated or independently distributed rates) has an especially strong effect on estimated times, with the independent rates model producing considerably more ancient estimates for the deeper nodes in the phylogeny. We implement thermodynamic integration, combined with Gaussian quadrature, in the program MCMCTree, and use it to calculate Bayes factors for clock models. Bayesian model selection indicates that the auto-correlated rates model fits the primate data substantially better, and we conclude that time estimates under this model should be preferred. We show that for eight core nodes in the phylogeny, uncertainty in time estimates is close to the theoretical limit imposed by fossil uncertainties. Thus, these estimates are unlikely to be improved by collecting additional molecular sequence data. All analyses place the origin of Primates close to the K-Pg boundary, either in the Cretaceous or straddling the boundary into the Palaeogene.
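The idea of thermodynamic integration with Gaussian quadrature mentioned in this abstract can be sketched on a toy problem. The log marginal likelihood equals the integral over β from 0 to 1 of the expected log-likelihood under the power posterior (prior × likelihood^β), and Gauss-Legendre quadrature approximates that integral from a few β values. The sketch below is not MCMCTree's implementation: it assumes a conjugate normal model (known `sigma`, normal prior on the mean) so that the expectation at each β is available in closed form instead of being estimated by MCMC. All function names, the data values, and the two competing priors are hypothetical.

```python
import numpy as np

def expected_loglik(beta, y, sigma, m0, s0):
    """E[log L] under the power posterior prior * L**beta (closed form here)."""
    n = y.size
    prec = 1.0 / s0**2 + beta * n / sigma**2      # power-posterior precision
    var = 1.0 / prec
    mean = (m0 / s0**2 + beta * y.sum() / sigma**2) * var
    # E[sum (y_i - mu)^2] for mu ~ N(mean, var)
    sq = np.sum((y - mean) ** 2) + n * var
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - sq / (2 * sigma**2)

def log_marginal_ti(y, sigma, m0, s0, n_points=32):
    """Thermodynamic integration with Gauss-Legendre nodes mapped to [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(n_points)  # nodes/weights on [-1, 1]
    betas = 0.5 * (x + 1.0)
    weights = 0.5 * w
    vals = np.array([expected_loglik(b, y, sigma, m0, s0) for b in betas])
    return float(np.sum(weights * vals))

def log_marginal_exact(y, sigma, m0, s0):
    """Analytic log marginal likelihood of the same model, used as a check."""
    n = y.size
    d = y - m0
    logdet = n * np.log(sigma**2) + np.log1p(n * s0**2 / sigma**2)
    quad = (np.sum(d**2) - s0**2 * d.sum() ** 2 / (sigma**2 + n * s0**2)) / sigma**2
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

# Hypothetical data and two competing "models" (priors centred at 0 and at 3).
y = np.array([1.2, 0.7, 1.9, 1.4, 0.3, 1.1, 2.2, 0.9, 1.6, 1.0])
log_ml_a = log_marginal_ti(y, sigma=1.0, m0=0.0, s0=1.0)
log_ml_b = log_marginal_ti(y, sigma=1.0, m0=3.0, s0=1.0)
print("log Bayes factor (A vs B):", log_ml_a - log_ml_b)
print("quadrature error, model A:", abs(log_ml_a - log_marginal_exact(y, 1.0, 0.0, 1.0)))
```

In a real clock-model comparison the expectation at each β would come from an MCMC run targeting the corresponding power posterior, so the number of quadrature points directly sets the number of runs required.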


2008
Vol 22 (3)
pp. 345
Author(s):
Alejandro Zaldivar-Riverón
Sergey A. Belokobylskij
Virginia León-Regagnon
Rosa Briceño-G.
Donald L. J. Quicke

The phylogenetic relationships among representatives of 64 genera of the cosmopolitan parasitic wasps of the subfamily Doryctinae were investigated based on nuclear 28S ribosomal (r) DNA (~650 bp of the D2–3 region) and cytochrome c oxidase I (COI) mitochondrial (mt) DNA (603 bp) sequence data. The molecular dating of selected clades and the biogeography of the subfamily were also inferred. The partitioned Bayesian analyses did not recover a monophyletic Doryctinae, though the relationships involved were only weakly supported. Strong evidence was found for rejecting the monophylies of both Doryctes Haliday, 1836 and Spathius Nees, 1818. Our results also support the recognition of the Rhaconotini as a valid tribe. A dispersal–vicariance analysis showed a strong geographical signal for the taxa included, with molecular dating estimates for the origin of Doryctinae and its subsequent radiation both occurring during the late Paleocene–early Eocene. The divergence time estimates suggest that diversification in the subfamily could have in part occurred as a result of continental break-up events that took place in the southern hemisphere, though more recent dispersal events account for the current distribution of several widespread taxa.


2020
Vol 36 (Supplement_2)
pp. i884-i894
Author(s):
Jose Barba-Montoya
Qiqing Tao
Sudhir Kumar

Abstract
Motivation: As the number and diversity of species and genes grow in contemporary datasets, two common assumptions made in all molecular dating methods, namely the time-reversibility and stationarity of the substitution process, become untenable. No software tools for molecular dating allow researchers to relax these two assumptions in their data analyses. Frequently the same General Time Reversible (GTR) model across lineages along with a gamma (+Γ) distributed rates across sites is used in relaxed clock analyses, which assumes time-reversibility and stationarity of the substitution process. Many reports have quantified the impact of violations of these underlying assumptions on molecular phylogeny, but none have systematically analyzed their impact on divergence time estimates.
Results: We quantified the bias on time estimates that resulted from using the GTR + Γ model for the analysis of computer-simulated nucleotide sequence alignments that were evolved with non-stationary (NS) and non-reversible (NR) substitution models. We tested Bayesian and RelTime approaches that do not require a molecular clock for estimating divergence times. Divergence times obtained using a GTR + Γ model differed only slightly (∼3% on average) from the expected times for NR datasets, but the difference was larger for NS datasets (∼10% on average). The use of only a few calibrations reduced these biases considerably (∼5%). Confidence and credibility intervals from GTR + Γ analysis usually contained correct times. Therefore, the bias introduced by the use of the GTR + Γ model to analyze datasets, in which the time-reversibility and stationarity assumptions are violated, is likely not large and can be reduced by applying multiple calibrations.
Availability and implementation: All datasets are deposited in Figshare: https://doi.org/10.6084/m9.figshare.12594638.


2013
Vol 2013
pp. 1-12
Author(s):  
James A. Schulte

Methods for estimating divergence times from molecular data have improved dramatically over the past decade, yet there are few studies examining alternative taxon sampling effects on node age estimates. Here, I investigate the effect of undersampling species diversity on node ages of the South American lizard clade Liolaemini using several alternative subsampling strategies for both time calibrations and taxa numbers. Penalized likelihood (PL) and Bayesian molecular dating analyses were conducted on a densely sampled (202 taxa) mtDNA-based phylogenetic hypothesis of Iguanidae, including 92 Liolaemini species. Using all calibrations and penalized likelihood, clades with very low taxon sampling had node age estimates younger than clades with more complete taxon sampling. The effect of Bayesian and PL methods differed when either one or two calibrations only were used with dense taxon sampling. Bayesian node ages were always older when fewer calibrations were used, whereas PL node ages were always younger. This work reinforces two important points: (1) whenever possible, authors should strongly consider adding as many taxa as possible, including numerous outgroups, prior to node age estimation to avoid considerable node age underestimation and (2) using more, critically assessed, and accurate fossil calibrations should yield improved divergence time estimates.


Fossil Record
2017
Vol 20 (2)
pp. 147-157
Author(s):
Kathrin Feldberg
Jiří Váňa
Alfons Schäfer-Verwimp
Michael Krings
Carsten Gröhn
...

Abstract. A revision of the Baltic and Bitterfeld amber fossils assigned to Cylindrocolea dimorpha (Cephaloziellaceae) has yielded evidence of the presence of multicellular, bifid underleaves, which have not previously been reported for this species and conflict with the current circumscription of the family. We transfer the fossil species to Odontoschisma (sect. Iwatsukia) and propose the new combination O. dimorpha of the Cephaloziaceae. Characteristics of the fossil include an overall small size of the plant, entire-margined, bifid leaves and underleaves, more or less equally thickened leaf cell walls, ventral branching that includes stoloniform branches with reduced leaves, and the lack of a stem hyalodermis and gemmae. Placement of the fossil in Cephaloziaceae profoundly affects divergence time estimates for liverworts based on DNA sequence variation with integrated information from the fossil record. Our reclassification concurs with hypotheses on the divergence times of Cephaloziaceae derived from DNA sequence data that provide evidence of a late Early Cretaceous to early Eocene age of the Odontoschisma crown group and an origin of O. sect. Iwatsukia in the Late Cretaceous to Oligocene.


2019
Vol 69 (4)
pp. 660-670
Author(s):
Tom Carruthers
Michael J Sanderson
Robert W Scotland

Abstract Rate variation adds considerable complexity to divergence time estimation in molecular phylogenies. Here, we evaluate the impact of lineage-specific rates—which we define as among-branch-rate-variation that acts consistently across the entire genome. We compare its impact to residual rates—defined as among-branch-rate-variation that shows a different pattern of rate variation at each sampled locus, and gene-specific rates—defined as variation in the average rate across all branches at each sampled locus. We show that lineage-specific rates lead to erroneous divergence time estimates, regardless of how many loci are sampled. Further, we show that stronger lineage-specific rates lead to increasing error. This contrasts to residual rates and gene-specific rates, where sampling more loci significantly reduces error. If divergence times are inferred in a Bayesian framework, we highlight that error caused by lineage-specific rates significantly reduces the probability that the 95% highest posterior density includes the correct value, and leads to sensitivity to the prior. Use of a more complex rate prior—which has recently been proposed to model rate variation more accurately—does not affect these conclusions. Finally, we show that the scale of lineage-specific rates used in our simulation experiments is comparable to that of an empirical data set for the angiosperm genus Ipomoea. Taken together, our findings demonstrate that lineage-specific rates cause error in divergence time estimates, and that this error is not overcome by analyzing genomic scale multilocus data sets. [Divergence time estimation; error; rate variation.]
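The contrast drawn in this abstract between lineage-specific and residual rates can be illustrated with a deliberately simple simulation. This is a sketch under strong assumptions, not the authors' simulation design: a single node of known age is dated locus by locus under a strict clock at a reference rate, so each locus's age estimate is the true age times a rate multiplier. The lineage-specific multiplier is shared by every locus, the residual multiplier is drawn independently per locus with mean 1, and gene-specific rates are omitted. All names and parameter values are hypothetical.

```python
import numpy as np

def lognormal_mean_one(rng, sd, size):
    """Draw lognormal rate multipliers whose expected value is 1."""
    return rng.lognormal(mean=-0.5 * sd**2, sigma=sd, size=size)

def estimated_age(true_age, lineage_multiplier, residual_sd, n_loci, rng):
    """Average the per-locus age estimates for one node.

    lineage_multiplier is shared across all loci (lineage-specific rate);
    the residual multipliers differ from locus to locus (residual rates).
    """
    residual = lognormal_mean_one(rng, residual_sd, n_loci)
    per_locus_age = true_age * lineage_multiplier * residual
    return float(per_locus_age.mean())

rng = np.random.default_rng(1)
true_age = 100.0
for n_loci in (10, 100, 2000):
    residual_only = estimated_age(true_age, 1.0, 0.3, n_loci, rng)
    with_lineage = estimated_age(true_age, 1.3, 0.3, n_loci, rng)
    print(n_loci,
          "residual-only error:", round(abs(residual_only - true_age), 2),
          "with lineage-specific rate:", round(abs(with_lineage - true_age), 2))
```

Because the residual multipliers average out across loci while the shared multiplier does not, the first error shrinks as loci are added and the second settles near 30% of the true age, which is the qualitative pattern the abstract describes.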


2019
Vol 69 (1)
pp. 1-16
Author(s):
Yuan Nie
Charles S P Foster
Tianqi Zhu
Ru Yao
David A Duchêne
...  

Abstract Establishing an accurate evolutionary timescale for green plants (Viridiplantae) is essential to understanding their interaction and coevolution with the Earth’s climate and the many organisms that rely on green plants. Despite being the focus of numerous studies, the timing of the origin of green plants and the divergence of major clades within this group remain highly controversial. Here, we infer the evolutionary timescale of green plants by analyzing 81 protein-coding genes from 99 chloroplast genomes, using a core set of 21 fossil calibrations. We test the sensitivity of our divergence-time estimates to various components of Bayesian molecular dating, including the tree topology, clock models, clock-partitioning schemes, rate priors, and fossil calibrations. We find that the choice of clock model affects date estimation and that the independent-rates model provides a better fit to the data than the autocorrelated-rates model. Varying the rate prior and tree topology had little impact on age estimates, with far greater differences observed among calibration choices and clock-partitioning schemes. Our analyses yield date estimates ranging from the Paleoproterozoic to Mesoproterozoic for crown-group green plants, and from the Ediacaran to Middle Ordovician for crown-group land plants. We present divergence-time estimates of the major groups of green plants that take into account various sources of uncertainty. Our proposed timeline lays the foundation for further investigations into how green plants shaped the global climate and ecosystems, and how embryophytes became dominant in terrestrial environments.


2007
Vol 20 (4)
pp. 287
Author(s):  
Michael J. Sanderson

Broad availability of molecular sequence data allows construction of phylogenetic trees with 1000s or even 10 000s of taxa. This paper reviews methodological, technological and empirical issues raised in phylogenetic inference at this scale. Numerous algorithmic and computational challenges have been identified surrounding the core problem of reconstructing large trees accurately from sequence data, but many other obstacles, both upstream and downstream of this step, are less well understood. Before phylogenetic analysis, data must be generated de novo or extracted from existing databases, compiled into blocks of homologous data with controlled properties, aligned, examined for the presence of gene duplications or other kinds of complicating factors, and finally, combined with other evidence via supermatrix or supertree approaches. After phylogenetic analysis, confidence assessments are usually reported, along with other kinds of annotations, such as clade names, or annotations requiring additional inference procedures, such as trait evolution or divergence time estimates. Prospects for partial automation of large-tree construction are also discussed, as well as risks associated with ‘outsourcing’ phylogenetic inference beyond the systematics community.


2013
Vol 280 (1755)
pp. 20122686
Author(s):
Sophie Cardinal
Bryan N. Danforth

Reliable estimates on the ages of the major bee clades are needed to further understand the evolutionary history of bees and their close association with flowering plants. Divergence times have been estimated for a few groups of bees, but no study has yet provided estimates for all major bee lineages. To date the origin of bees and their major clades, we first perform a phylogenetic analysis of bees including representatives from every extant family, subfamily and almost all tribes, using sequence data from seven genes. We then use this phylogeny to place 14 time calibration points based on information from the fossil record for an uncorrelated relaxed clock divergence time analysis taking into account uncertainties in phylogenetic relationships and the fossil record. We explore the effect of placing a hard upper age bound near the root of the tree and the effect of different topologies on our divergence time estimates. We estimate that crown bees originated approximately 123 Ma (million years ago) (113–132 Ma), concurrently with the origin or diversification of the eudicots, a group comprising 75 per cent of angiosperm species. All of the major bee clades are estimated to have originated during the Middle to Late Cretaceous, which is when angiosperms became the dominant group of land plants.

