Bayesian Tip-Dated Phylogenetics in Paleontology: Topological Effects and Stratigraphic Fit

2020 ◽  
Author(s):  
Benedict King

Abstract The incorporation of stratigraphic data into phylogenetic analysis has a long history of debate but is not currently standard practice for paleontologists. Bayesian tip-dated (or morphological clock) phylogenetic methods have returned these arguments to the spotlight, but how tip dating affects the recovery of evolutionary relationships has yet to be fully explored. Here I show, through analysis of several data sets with multiple phylogenetic methods, that topologies produced by tip dating are outliers as compared to topologies produced by parsimony and undated Bayesian methods, which retrieve broadly similar trees. Unsurprisingly, trees recovered by tip dating have better fit to stratigraphy than trees recovered by other methods under both the Gap Excess Ratio (GER) and the Stratigraphic Completeness Index (SCI). This is because trees with better stratigraphic fit are assigned a higher likelihood by the fossilized birth-death tree model. However, the degree to which the tree model favors tree topologies with high stratigraphic fit metrics is modulated by the diversification dynamics of the group under investigation. In particular, when net diversification rate is low, the tree model favors trees with a higher GER compared to when net diversification rate is high. Differences in stratigraphic fit and tree topology between tip dating and other methods are concentrated in parts of the tree with weaker character signal, as shown by successive deletion of the most incomplete taxa from two data sets. These results show that tip dating incorporates stratigraphic data in an intuitive way, with good stratigraphic fit an expectation that can be overturned by strong evidence from character data. [fossilized birth-death; fossils; missing data; morphological clock; morphology; parsimony; phylogenetics.]
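As a rough illustration of one of the stratigraphic fit metrics named in the abstract above, the Gap Excess Ratio can be sketched in Python. The sketch assumes the commonly cited definition, GER = 1 - (MIG - Gmin) / (Gmax - Gmin), where MIG is the minimum implied gap (the summed ghost ranges on a tree), Gmin is the gap implied by perfect stratigraphic ordering, and Gmax is the gap implied when every lineage extends back to the oldest first appearance. It is not taken from the paper itself; the function name, argument names, and example ages are hypothetical.

```python
def gap_excess_ratio(mig, first_appearances):
    """Return the Gap Excess Ratio for a tree (illustrative sketch).

    mig               -- minimum implied gap: summed ghost-range length on the tree
    first_appearances -- first appearance ages of the taxa (e.g. in Ma; larger = older)
    """
    oldest = max(first_appearances)
    youngest = min(first_appearances)
    # Smallest possible total gap: taxa branch off in strict stratigraphic order.
    g_min = oldest - youngest
    # Largest possible total gap: every taxon's lineage reaches the oldest age.
    g_max = sum(oldest - fad for fad in first_appearances)
    if g_max == g_min:
        raise ValueError("GER is undefined when Gmax equals Gmin")
    return 1.0 - (mig - g_min) / (g_max - g_min)


# Hypothetical first appearance ages (Ma) for four taxa.
fads = [100.0, 90.0, 80.0, 70.0]
print(gap_excess_ratio(45.0, fads))
```

Under this definition a value of 1 means the tree implies no more gap than the best possible ordering, and a value of 0 means it implies the maximum possible gap.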

2019 ◽  
Author(s):  
Benedict King ◽  
Robin Beck

Abstract The incorporation of stratigraphic data into phylogenetic analysis has a long history of debate, but is not currently standard practice for palaeontologists. Bayesian tip-dating (or morphological clock) phylogenetic methods have returned these arguments to the spotlight, but how tip-dating affects the recovery of evolutionary relationships has yet to be fully explored. Here we show, through analysis of several datasets with multiple phylogenetic methods, that topologies produced by tip-dating are outliers when compared to topologies produced by parsimony and undated Bayesian methods, which retrieve broadly similar trees. Unsurprisingly, trees recovered by tip-dating have better fit to stratigraphy than trees recovered by other methods, due to trees with better stratigraphic fit being assigned a higher prior probability. Differences in stratigraphic fit and tree topology between tip-dating and other methods appear to be concentrated in parts of the tree with weaker character signal and a stronger influence of the prior, as shown by successive deletion of the most incomplete taxa from a sauropod dataset. Tip-dating applied to Mesozoic mammals firmly rejects a monophyletic Allotheria, and strongly supports diphyly of haramiyidans, with the late Triassic Haramiyavia and Thomasia forming a clade with tritylodontids, which is distant from the middle Jurassic euharamiyidans. This result is not sensitive to the controversial age of the eutherian Juramaia. A test of the age of Juramaia using a less restrictive prior reveals strong support from the data for an Early Cretaceous age. Our results suggest that tip-dating incorporates stratigraphic data in an intuitive way, with good stratigraphic fit a prior expectation that can be overturned by strong evidence from character data.


2019 ◽  
Vol 42 ◽  
pp. 2019001
Author(s):  
Jennifer Nowak ◽  
Andrew Sweet ◽  
Jason Weckstein ◽  
Kevin Johnson

Fruit doves and their allies are a diverse group within the pigeon and dove family (Aves: Columbidae). Progress towards subfamilial classification of Columbidae relies on identifying major groups and the phylogenetic relationships within these groups. One such recently proposed group is the Raphinae based on previous evidence that the extinct dodo is potentially within what was formerly recognized as the Treroninae (fruit doves and allies). Although several studies have explored the phylogenetic relationships within Columbidae, most have focused either on broad-scale, familial level relationships or finer scale, species level relationships. Here we use mitochondrial and nuclear gene sequences from a diverse taxonomic sample to identify relationships among the genera and species of fruit doves and their allies. In particular, our goal is to identify which of these genera should be included within Raphinae (the name which has taxonomic priority over Treroninae), focusing on an inclusive, well-supported monophyletic group. We also use dense taxon sampling to explore relationships among genera and species in this group, expanding on previous studies. In addition, we use resulting phylogenetic hypotheses to reconstruct the ancestral evolutionary history of foraging mode and biogeographic patterns of dispersal within the group. We used two data sets for our phylogenetic analysis: the first consisting of novel sequences generated for this project and the second with additional, previously published sequences from the fruit dove genus (Ptilinopus). Our analyses found support for the monophyly of a clade that contains a large fraction of the genera currently classified within Raphinae and also found several well-supported clades within this group of pigeons and doves. Character reconstruction methods based on the resulting phylogeny recover multiple transitions from a terrestrial to an arboreal foraging mode and evidence for multiple dispersal events from Asia to Africa throughout the history of the clade.


2019 ◽  
Vol 15 (2) ◽  
pp. 20180632 ◽  
Author(s):  
Martin R. Smith

Phylogenetic analysis aims to establish the true relationships between taxa. Different analytical methods, however, can reach different conclusions. In order to establish which approach best reconstructs true relationships, previous studies have simulated datasets from known tree topologies, and identified the method that reconstructs the generative tree most accurately. On this basis, researchers have argued that morphological datasets should be analysed by Bayesian approaches, which employ an explicit probabilistic model of evolution, rather than parsimony methods—with implied weights parsimony sometimes identified as particularly inaccurate. Accuracy alone, however, is an inadequate measure of a tree's utility: a fully unresolved tree is perfectly accurate, yet contains no phylogenetic information. The highly resolved trees recovered by implied weights parsimony in fact contain as much useful information as the more accurate, but less resolved, trees recovered by Bayesian methods. By collapsing poorly supported groups, this superior resolution can be traded for accuracy, resulting in trees as accurate as those obtained by a Bayesian approach. By contrast, equally weighted parsimony analysis produces trees that are less resolved and less accurate, leading to less reliable evolutionary conclusions.


2019 ◽  
Vol 4 (2) ◽  
pp. 108-123 ◽  
Author(s):  
Andrew M Ritchie ◽  
Simon Y W Ho

Abstract Bayesian phylogenetic methods derived from evolutionary biology can be used to reconstruct the history of human languages using databases of cognate words. These analyses have produced exciting results regarding the origins and dispersal of linguistic and cultural groups through prehistory. Bayesian lexical dating requires the specification of priors on all model parameters. This includes the use of a prior on divergence times, often combined with a prior on tree topology and referred to as a tree prior. Violation of the underlying assumptions of the tree prior can lead to an erroneous estimate of the timescale of language evolution. To investigate these impacts, we tested the sensitivity of Bayesian dating to the tree prior in analyses of four lexical data sets. Our results show that estimates of the origin times of language families are robust to the choice of tree prior for lexical data, though less so than when Bayesian phylogenetic methods are used to analyse genetic data sets. We also used the relative fit of speciation and coalescent tree priors to determine the ability of speciation models to describe language diversification at four different taxonomic levels. We found that speciation priors were preferred over a constant-size coalescent prior regardless of taxonomic scale. However, data sets with narrower taxonomic and geographic sampling exhibited a poorer fit to ideal birth–death model expectations. Our results encourage further investigation into the nature of language diversification at different sampling scales.


2019 ◽  
Vol 11 (12) ◽  
pp. 3341-3352 ◽  
Author(s):  
Suha Naser-Khdour ◽  
Bui Quang Minh ◽  
Wenqi Zhang ◽  
Eric A Stone ◽  
Robert Lanfear

Abstract In phylogenetic inference, we commonly use models of substitution which assume that sequence evolution is stationary, reversible, and homogeneous (SRH). Although the use of such models is often criticized, the extent of SRH violations and their effects on phylogenetic inference of tree topologies and edge lengths are not well understood. Here, we introduce and apply the maximal matched-pairs tests of homogeneity to assess the scale and impact of SRH model violations on 3,572 partitions from 35 published phylogenetic data sets. We show that roughly one-quarter of all the partitions we analyzed (23.5%) reject the SRH assumptions, and that for 25% of data sets, tree topologies inferred from all partitions differ significantly from topologies inferred using the subset of partitions that do not reject the SRH assumptions. This proportion increases when comparing trees inferred using the subset of partitions that rejects the SRH assumptions, to those inferred from partitions that do not reject the SRH assumptions. These results suggest that the extent and effects of model violation in phylogenetics may be substantial. They highlight the importance of testing for model violations and possibly excluding partitions that violate models prior to tree reconstruction. Our results also suggest that further effort in developing models that do not require SRH assumptions could lead to large improvements in the accuracy of phylogenomic inference. The scripts necessary to perform the analysis are available in https://github.com/roblanf/SRHtests, and the new tests we describe are available as a new option in IQ-TREE (http://www.iqtree.org).
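The idea behind a matched-pairs test of symmetry can be sketched for a single pair of aligned sequences. The sketch below computes Bowker's classical symmetry statistic, which is offered here only as an assumption about the general approach; it is not the authors' implementation, and it leaves out how the "maximal" pair is chosen as well as the related marginal and internal symmetry tests. The function name and the example sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def bowker_symmetry(seq_a, seq_b, alphabet="ACGT"):
    """Return (statistic, degrees_of_freedom) for Bowker's test of symmetry.

    Sites where either sequence has a character outside `alphabet`
    (gaps, ambiguity codes) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Divergence matrix: counts[(i, j)] = sites with state i in A and j in B.
    counts = Counter(
        (a, b) for a, b in zip(seq_a, seq_b)
        if a in alphabet and b in alphabet
    )
    statistic = 0.0
    df = 0
    for i, j in combinations(alphabet, 2):
        n_ij = counts[(i, j)]
        n_ji = counts[(j, i)]
        # Only off-diagonal cell pairs with at least one observation contribute.
        if n_ij + n_ji > 0:
            statistic += (n_ij - n_ji) ** 2 / (n_ij + n_ji)
            df += 1
    return statistic, df


# Hypothetical toy alignment of two sequences.
print(bowker_symmetry("AAAACCGT", "CCCAACGT"))
```

The statistic would then be compared against a chi-square distribution with the returned degrees of freedom; that step is omitted to keep the sketch free of third-party dependencies.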


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3055 ◽  
Author(s):  
Andrea Cau

Bayesian phylogenetic methods that simultaneously integrate morphological and stratigraphic information have been applied increasingly among paleontologists. Most of these studies have used Bayesian methods as an alternative to the widely-used parsimony analysis, to infer macroevolutionary patterns and relationships among species-level or higher taxa. Among recently introduced Bayesian methodologies, the Fossilized Birth-Death (FBD) model allows incorporation of hypotheses on ancestor-descendant relationships in phylogenetic analyses including fossil taxa. Here, the FBD model is used to infer the relationships among an ingroup formed exclusively by fossil individuals, i.e., dipnoan tooth plates from four localities in the Ain el Guettar Formation of Tunisia. Previous analyses of this sample compared the results of phylogenetic analysis using parsimony with stratigraphic methods, inferred a high diversity (five or more genera) in the Ain el Guettar Formation, and interpreted it as an artifact inflated by depositional factors. In the analysis performed here, the uncertainty on the chronostratigraphic relationships among the specimens was included among the prior settings. The results of the analysis confirm the referral of most of the specimens to the taxa Asiatoceratodus, Equinoxiodus, Lavocatodus and Neoceratodus, but reject those to Ceratodus and Ferganoceratodus. The resulting phylogeny constrained the evolution of the Tunisian sample exclusively in the Early Cretaceous, contrasting with the previous scenario inferred by the stratigraphically-calibrated topology resulting from parsimony analysis. The phylogenetic framework also suggests that (1) the sampled localities are laterally equivalent, but (2) three localities are restricted to the youngest part of the section; both results are in agreement with previous stratigraphic analyses of these localities. The FBD model of specimen-level units provides a novel tool for phylogenetic inference among fossils but also for independent tests of stratigraphic scenarios.


Author(s):  
Konstantin Hoffmann ◽  
Remco Bouckaert ◽  
Simon J Greenhill ◽  
Denise Kühnert

Abstract Bayesian phylogenetic methods provide a set of tools to efficiently evaluate large linguistic datasets by reconstructing phylogenies—family trees—that represent the history of language families. These methods provide a powerful way to test hypotheses about prehistory, regarding the subgrouping, origins, expansion, and timing of the languages and their speakers. Through phylogenetics, we gain insights into the process of language evolution in general and into how fast individual features change in particular. This article introduces Bayesian phylogenetics as applied to languages. We describe substitution models for cognate evolution, molecular clock models for the evolutionary rate along the branches of a tree, and tree generating processes suitable for linguistic data. We explain how to find the best-suited model using path sampling or nested sampling. The theoretical background of these models is supplemented by a practical tutorial describing how to set up a Bayesian phylogenetic analysis using the software tool BEAST2.


Genetics ◽  
2000 ◽  
Vol 156 (3) ◽  
pp. 1249-1257
Author(s):  
Ilya Ruvinsky ◽  
Lee M Silver ◽  
Jeremy J Gibson-Brown

Abstract The duplication of preexisting genes has played a major role in evolution. To understand the evolution of genetic complexity it is important to reconstruct the phylogenetic history of the genome. A widely held view suggests that the vertebrate genome evolved via two successive rounds of whole-genome duplication. To test this model we have isolated seven new T-box genes from the primitive chordate amphioxus. We find that each amphioxus gene generally corresponds to two or three vertebrate counterparts. A phylogenetic analysis of these genes supports the idea that a single whole-genome duplication took place early in vertebrate evolution, but cannot exclude the possibility that a second duplication later took place. The origin of additional paralogs evident in this and other gene families could be the result of subsequent, smaller-scale chromosomal duplications. Our findings highlight the importance of amphioxus as a key organism for understanding evolution of the vertebrate genome.


Genetics ◽  
1997 ◽  
Vol 147 (4) ◽  
pp. 1855-1861 ◽  
Author(s):  
Montgomery Slatkin ◽  
Bruce Rannala

Abstract A theory is developed that provides the sampling distribution of low frequency alleles at a single locus under the assumption that each allele is the result of a unique mutation. The number of copies of each allele is assumed to follow a linear birth-death process with sampling. If the population is of constant size, standard results from the theory of birth-death processes show that the distribution of numbers of copies of each allele is logarithmic and that the joint distribution of numbers of copies of k alleles found in a sample of size n follows the Ewens sampling distribution. If the population from which the sample was obtained was increasing in size, if there are different selective classes of alleles, or if there are differences in penetrance among alleles, the Ewens distribution no longer applies. Likelihood functions for a given set of observations are obtained under different alternative hypotheses. These results are applied to published data from the BRCA1 locus (associated with early onset breast cancer) and the factor VIII locus (associated with hemophilia A) in humans. In both cases, the sampling distribution of alleles allows rejection of the null hypothesis, but relatively small deviations from the null model can account for the data. In particular, roughly the same population growth rate appears consistent with both data sets.
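For readers unfamiliar with the null model mentioned in the abstract above, the standard Ewens sampling formula can be sketched as follows. This is the textbook form of the distribution, with `theta` assumed to be the scaled mutation parameter; it is not the likelihood machinery developed in the paper, and the function name and argument layout are hypothetical.

```python
from math import factorial


def ewens_probability(allele_counts, theta):
    """Probability of an allele-count configuration under the Ewens formula.

    allele_counts[j - 1] is a_j, the number of distinct alleles that appear
    exactly j times in the sample; the sample size is n = sum(j * a_j).
    """
    n = sum(j * a_j for j, a_j in enumerate(allele_counts, start=1))
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = 1.0
    for i in range(n):
        rising *= theta + i
    prob = factorial(n) / rising
    for j, a_j in enumerate(allele_counts, start=1):
        prob *= theta ** a_j / (j ** a_j * factorial(a_j))
    return prob


# All three possible configurations for a sample of size 3, with theta = 1.
configs = [(3, 0, 0), (1, 1, 0), (0, 0, 1)]
print([ewens_probability(c, 1.0) for c in configs])
```

Summing the probabilities over every configuration for a fixed sample size gives 1, which is a quick sanity check on any implementation.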


2019 ◽  
Vol 11 (10) ◽  
pp. 2824-2849 ◽  
Author(s):  
Paweł Mackiewicz ◽  
Adam Dawid Urantówka ◽  
Aleksandra Kroczak ◽  
Dorota Mackiewicz

Abstract Mitochondrial genes are located on a single molecule, which implies that they should carry consistent phylogenetic information. Taking advantage of this, we present a well-supported phylogeny based on mitochondrial genomes from almost 300 representatives of Passeriformes, the most numerous and differentiated Aves order. The analyses resolved the phylogenetic position of paraphyletic Basal and Transitional Oscines. Passerida was divided into two groups, one containing Paroidea and Sylvioidea and the other containing Passeroidea and Muscicapoidea. Analyses of mitogenomes showed four types of rearrangements including a duplicated control region (CR) with adjacent genes. Mapping the presence and absence of duplications onto the phylogenetic tree revealed that the duplication was the ancestral state for passerines and was maintained in early diverged lineages. Subsequently, the duplication could be lost, and it arose independently at least four times according to the most parsimonious scenario. In some lineages, two CR copies have been inherited from an ancient duplication and highly diverged, whereas in others, the second copy became similar to the first one due to concerted evolution. The second CR copies accumulated over twice as many substitutions as the first ones. However, the second CRs were not completely eliminated and were retained for a long time, which suggests that both regions can fulfill an important role in mitogenomes. Phylogenetic analyses based on CR sequences subject to this complex evolution can produce tree topologies inconsistent with real evolutionary relationships between species. Passerines with two CRs showed a higher metabolic rate in relation to their body mass.

