Bayesian Evaluation of Temporal Signal in Measurably Evolving Populations

2020 ◽  
Vol 37 (11) ◽  
pp. 3363-3379 ◽  
Author(s):  
Sebastian Duchene ◽  
Philippe Lemey ◽  
Tanja Stadler ◽  
Simon Y W Ho ◽  
David A Duchene ◽  
...  

Abstract Phylogenetic methods can use the sampling times of molecular sequence data to calibrate the molecular clock, enabling the estimation of evolutionary rates and timescales for rapidly evolving pathogens and data sets containing ancient DNA samples. A key aspect of such calibrations is whether a sufficient amount of molecular evolution has occurred over the sampling time window, that is, whether the data can be treated as having come from a measurably evolving population. Here, we investigate the performance of a fully Bayesian evaluation of temporal signal (BETS) in sequence data. The method involves comparing the fit to the data of two models: a model in which the data are accompanied by the actual (heterochronous) sampling times, and a model in which the samples are constrained to be contemporaneous (isochronous). We conducted simulations under a wide range of conditions to demonstrate that BETS accurately classifies data sets according to whether they contain temporal signal or not, even when there is substantial among-lineage rate variation. We explore the behavior of this classification in analyses of five empirical data sets: modern samples of A/H1N1 influenza virus, the bacterium Bordetella pertussis, coronaviruses from mammalian hosts, ancient DNA from Hepatitis B virus, and mitochondrial genomes of dog species. Our results indicate that BETS is an effective alternative to other tests of temporal signal. In particular, this method has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior which are essential aspects of Bayesian phylodynamic analyses.
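The model comparison at the heart of BETS can be sketched as a log Bayes factor between the heterochronous and isochronous models. This is a minimal illustration, not the authors' implementation: it assumes the log marginal likelihoods have already been estimated elsewhere, and the function names, the numeric values, and the decision threshold are all hypothetical.

```python
def log_bayes_factor(log_ml_heterochronous: float, log_ml_isochronous: float) -> float:
    """Log Bayes factor in favour of the model that uses the actual sampling times."""
    return log_ml_heterochronous - log_ml_isochronous


def has_temporal_signal(log_ml_heterochronous: float,
                        log_ml_isochronous: float,
                        threshold: float = 0.0) -> bool:
    """Classify a data set as having temporal signal.

    `threshold` is an assumption of this sketch: with 0.0 the data set is
    classified by whichever model has the higher marginal likelihood; a
    stricter analysis could demand a larger log Bayes factor.
    """
    return log_bayes_factor(log_ml_heterochronous, log_ml_isochronous) > threshold


if __name__ == "__main__":
    # Hypothetical log marginal likelihoods, e.g. from a separate estimation step.
    log_ml_het = -10250.3
    log_ml_iso = -10262.8
    print("log BF:", log_bayes_factor(log_ml_het, log_ml_iso))
    print("temporal signal:", has_temporal_signal(log_ml_het, log_ml_iso))
```

Because the comparison is between whole models, the same function could in principle be applied to any pair of clock model and tree prior combinations, provided each marginal likelihood is estimated under its full model.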

2019 ◽  
Author(s):  
Sebastian Duchene ◽  
Philippe Lemey ◽  
Tanja Stadler ◽  
Simon YW Ho ◽  
David A Duchene ◽  
...  

Abstract Phylogenetic methods can use the sampling times of molecular sequence data to calibrate the molecular clock, enabling the estimation of evolutionary rates and timescales for rapidly evolving pathogens and data sets containing ancient DNA samples. A key aspect of such calibrations is whether a sufficient amount of molecular evolution has occurred over the sampling time window, that is, whether the data can be treated as having come from a measurably evolving population. Here we investigate the performance of a fully Bayesian evaluation of temporal signal (BETS) in sequence data. The method involves comparing the fit to the data of two models: a model in which the data are accompanied by the actual (heterochronous) sampling times, and a model in which the samples are constrained to be contemporaneous (isochronous). We conducted simulations under a wide range of conditions to demonstrate that BETS accurately classifies data sets according to whether they contain temporal signal or not, even when there is substantial among-lineage rate variation. We explore the behaviour of this classification in analyses of five empirical data sets: modern samples of A/H1N1 influenza virus, the bacterium Bordetella pertussis, coronaviruses from mammalian hosts, ancient DNA from Hepatitis B virus and mitochondrial genomes of dog species. Our results indicate that BETS is an effective alternative to other tests of temporal signal. In particular, this method has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior which are essential aspects of Bayesian phylodynamic analyses.


2017 ◽  
Author(s):  
K. Jun Tong ◽  
David A. Duchêne ◽  
Sebastián Duchêne ◽  
Jemma L. Geoghegan ◽  
Simon Y.W. Ho

Abstract The estimation of evolutionary rates from ancient DNA sequences can be negatively affected by among-lineage rate variation and non-random sampling. Using a simulation study, we compared the performance of three phylogenetic methods for inferring evolutionary rates from time-structured data sets: root-to-tip regression, least-squares dating, and Bayesian inference. Our results show that these methods produce reliable estimates when the substitution rate is high, rate variation is low, and samples of similar ages are not phylogenetically clustered. The interaction of these factors is particularly important for Bayesian estimation of evolutionary rates. We also inferred rates for time-structured mitogenomic data sets from six vertebrate species. Root-to-tip regression estimated a different rate from least-squares dating and Bayesian inference for mitogenomes from the horse, which has high levels of among-lineage rate variation. We recommend using multiple methods of inference and testing data for temporal signal, among-lineage rate variation, and phylo-temporal clustering.
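Of the three methods compared in this abstract, root-to-tip regression is the simplest to illustrate. The sketch below fits an ordinary least-squares line of root-to-tip distance against sampling date; the slope is read as a rate estimate and the x-intercept as an estimate of the root date. It assumes the distances were already measured on a rooted tree, and the function name and data values are hypothetical.

```python
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance (subs/site) on sampling date.

    Returns (slope, intercept, root_date). `root_date` is the x-intercept,
    or None when the slope is zero and no intercept exists.
    """
    x_bar, y_bar = mean(dates), mean(distances)
    sxx = sum((x - x_bar) ** 2 for x in dates)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    root_date = -intercept / slope if slope != 0 else None
    return slope, intercept, root_date


if __name__ == "__main__":
    # Hypothetical sampling years and root-to-tip distances for four tips.
    dates = [2000.0, 2005.0, 2010.0, 2015.0]
    distances = [0.010, 0.015, 0.020, 0.025]
    rate, _, root_date = root_to_tip_regression(dates, distances)
    print("rate estimate:", rate, "root date estimate:", root_date)
```

A slope that is negative or close to zero in such a fit is one informal indication that a data set may lack temporal signal, which is why the abstract recommends testing for it alongside other methods.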


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Eleanor F. Miller ◽  
Andrea Manica

Abstract Background: Today an unprecedented amount of genetic sequence data is stored in publicly available repositories. For decades now, mitochondrial DNA (mtDNA) has been the workhorse of genetic studies, and as a result, there is a large volume of mtDNA data available in these repositories for a wide range of species. Indeed, whilst whole genome sequencing is an exciting prospect for the future, for most non-model organisms, classical markers such as mtDNA remain widely used. By compiling existing data from multiple original studies, it is possible to build powerful new datasets capable of exploring many questions in ecology, evolution and conservation biology. One key question that these data can help inform is what happened in a species’ demographic past. However, compiling data in this manner is not trivial: there are many complexities associated with data extraction, data quality and data handling. Results: Here we present the mtDNAcombine package, a collection of tools developed to manage some of the major decisions associated with handling multi-study sequence data with a particular focus on preparing sequence data for Bayesian skyline plot demographic reconstructions. Conclusions: There is now more genetic information available than ever before and large meta-data sets offer great opportunities to explore new and exciting avenues of research. However, compiling multi-study datasets remains a technically challenging prospect. The mtDNAcombine package provides a pipeline to streamline the process of downloading, curating, and analysing sequence data, guiding the process of compiling data sets from the online database GenBank.


2009 ◽  
Vol 364 (1527) ◽  
pp. 2197-2207 ◽  
Author(s):  
Peter G. Foster ◽  
Cymon J. Cox ◽  
T. Martin Embley

The three-domains tree, which depicts eukaryotes and archaebacteria as monophyletic sister groups, is the dominant model for early eukaryotic evolution. By contrast, the ‘eocyte hypothesis’, where eukaryotes are proposed to have originated from within the archaebacteria as sister to the Crenarchaeota (also called the eocytes), has been largely neglected in the literature. We have investigated support for these two competing hypotheses from molecular sequence data using methods that attempt to accommodate the across-site compositional heterogeneity and across-tree compositional and rate matrix heterogeneity that are manifest features of these data. When ribosomal RNA genes were analysed using standard methods that do not adequately model these kinds of heterogeneity, the three-domains tree was supported. However, this support was eroded or lost when composition-heterogeneous models were used, with concomitant increase in support for the eocyte tree for eukaryotic origins. Analysis of combined amino acid sequences from 41 protein-coding genes supported the eocyte tree, whether or not composition-heterogeneous models were used. The possible effects of substitutional saturation of our data were examined using simulation; these results suggested that saturation is delayed by among-site rate variation in the sequences, and that phylogenetic signal for ancient relationships is plausibly present in these data.


2015 ◽  
Author(s):  
Michael R. May ◽  
Sebastian Höhna ◽  
Brian R. Moore

The paleontological record chronicles numerous episodes of mass extinction that severely culled the Tree of Life. Biologists have long sought to assess the extent to which these events may have impacted particular groups. We present a novel method for detecting mass-extinction events from phylogenies estimated from molecular sequence data. We develop our approach in a Bayesian statistical framework, which enables us to harness prior information on the frequency and magnitude of mass-extinction events. The approach is based on an episodic stochastic-branching process model in which rates of speciation and extinction are constant between rate-shift events. We model three types of events: (1) instantaneous tree-wide shifts in speciation rate; (2) instantaneous tree-wide shifts in extinction rate; and (3) instantaneous tree-wide mass-extinction events. Each of the events is described by a separate compound Poisson process (CPP) model, where the waiting times between each event are exponentially distributed with event-specific rate parameters. The magnitude of each event is drawn from an event-type specific prior distribution. Parameters of the model are then estimated using a reversible-jump Markov chain Monte Carlo (rjMCMC) algorithm. We demonstrate via simulation that this method has substantial power to detect the number of mass-extinction events and provides unbiased estimates of the timing of mass-extinction events, while exhibiting an appropriate (i.e., below 5%) false discovery rate even in the case of background diversification rate variation. Finally, we provide an empirical application of this approach to conifers, which reveals that this group has experienced two major episodes of mass extinction. This new approach, the CPP on Mass Extinction Times (CoMET) model, provides an effective tool for identifying mass-extinction events from molecular phylogenies, even when the history of those groups includes more prosaic temporal variation in diversification rate.
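The compound Poisson process described in this abstract can be illustrated by simulating one event type: waiting times between events are exponential, and each event receives a magnitude drawn from a prior. This is a sketch only and does not reproduce the CoMET model or its rjMCMC inference. The Beta prior on survival probability, its hyperparameters, and all function names are assumptions made for the example.

```python
import random


def simulate_event_times(rate, duration, rng):
    """Event times on [0, duration) with exponentially distributed waiting times."""
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def simulate_mass_extinctions(rate, duration, rng, alpha=2.0, beta=18.0):
    """Pair each event time with a survival probability.

    The Beta(alpha, beta) prior on survival probability is a hypothetical
    choice for this sketch; the abstract only states that magnitudes are
    drawn from an event-type specific prior.
    """
    return [(t, rng.betavariate(alpha, beta))
            for t in simulate_event_times(rate, duration, rng)]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    for time, survival in simulate_mass_extinctions(0.02, 200.0, rng):
        print(f"event at t={time:.1f}, survival probability={survival:.3f}")
```

Shifts in speciation rate and extinction rate could be simulated with the same `simulate_event_times` helper, each with its own rate parameter and its own magnitude prior.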


MycoKeys ◽  
2021 ◽  
Vol 81 ◽  
pp. 69-138
Author(s):  
Yalemwork Meswaet ◽  
Ralph Mangelsdorff ◽  
Nourou S. Yorou ◽  
Meike Piepenbring

Cercosporoid fungi (Mycosphaerellaceae, Mycosphaerellales, Ascomycota) are one of the largest and most diverse groups of hyphomycetes causing a wide range of diseases of economically important plants as well as of plants in the wild. Although more than 6000 species are known for this group, the documentation of this fungal group is far from complete. Especially in the tropics, the diversity of cercosporoid fungi is poorly known. The present study aims to identify and characterise cercosporoid fungi collected on host plants belonging to Fabaceae in Benin, West Africa. Information on their morphology, host species and DNA sequence data (18S rDNA, 28S rDNA, ITS and tef1) is provided. DNA sequence data were obtained by a simple and non-culture-based method for DNA isolation which has been applied for cercosporoid fungi for the first time in the context of the present study. Among the loci used for the phylogenetic analysis, tef1 provided the best resolution together with the multigene dataset. Species delimitation in many cases, however, was only possible by combining molecular sequence data with morphological characteristics. Based on forty specimens recently collected in Benin, 18 species are presented with morphological descriptions, illustrations and sequence data. Among these, six species in the genus Cercospora and two species in Pseudocercospora are proposed as species new to science. The newly described species are Cercospora (C.) beninensis on Crotalaria macrocalyx, C. parakouensis on Desmodium tortuosum, C. rhynchophora on Vigna unguiculata, C. vignae-subterraneae on Vigna subterranea, C. tentaculifera on Vigna unguiculata, C. zorniicola on Zornia glochidiata, Pseudocercospora sennicola on Senna occidentalis and Pseudocercospora tabei on Vigna unguiculata. Eight species of cercosporoid fungi are reported for Benin for the first time; three of them, namely C. cf. canscorina, C. cf. fagopyri and C. phaseoli-lunati, are new for West Africa.
The presence of two species of cercosporoid fungi on Fabaceae previously reported from Benin, namely Nothopassalora personata and Passalora arachidicola, is confirmed.


2015 ◽  
Author(s):  
Remco Bouckaert ◽  
Peter Lockhart

Most methods for performing a phylogenetic analysis based on sequence alignments of gene data assume that the mechanism of evolution is constant through time. It is recognised that some sites do evolve somewhat faster than others, and this can be captured using a (gamma) rate heterogeneity model. Further, some species have shorter replication times than others, and this results in faster rates of substitution in some lineages. This feature of lineage specific rate variation can be captured to some extent, by using relaxed clock models. However, it is also clear that there are additional poorly characterised features of sequence data that can sometimes lead to extreme differences in lineage specific rates. This variation is poorly captured by constant time reversible substitution models. The significance of extreme lineage specific rate differences is that they lead both to errors in reconstructing evolutionary relationships as well as biased estimates for the age of ancestral nodes. We propose a new model that allows gamma rate heterogeneity to change on branches, thus offering a more realistic model of sequence evolution. It adds negligible computational cost to likelihood calculations. We illustrate its effectiveness with an example of green algae and land-plants. For many real world data sets, we find a much better fit with multi-gamma sites models as well as substantial differences in ancestral node date estimates.


2016 ◽  
Author(s):  
Sebastian Duchêne ◽  
Kathryn E. Holt ◽  
François-Xavier Weill ◽  
Simon Le Hello ◽  
Jane Hawkey ◽  
...  

ABSTRACT Estimating the rates at which bacterial genomes evolve is critical to understanding major evolutionary and ecological processes such as disease emergence, long-term host-pathogen associations, and short-term transmission patterns. The surge in bacterial genomic data sets provides a new opportunity to estimate these rates and reveal the factors that shape bacterial evolutionary dynamics. For many organisms, estimates of evolutionary rate display an inverse association with the time-scale over which the data are sampled. However, this relationship remains unexplored in bacteria due to the difficulty in estimating genome-wide evolutionary rates, which are impacted by the extent of temporal structure in the data and the prevalence of recombination. We collected 36 whole genome sequence data sets from 16 species of bacterial pathogens to systematically estimate and compare their evolutionary rates and assess the extent of temporal structure in the absence of recombination. The majority (28/36) of data sets possessed sufficient clock-like structure to robustly estimate evolutionary rates. However, in some species reliable estimates were not possible even with “ancient DNA” data sampled over many centuries, suggesting that they evolve very slowly or that they display extensive rate variation among lineages. The robustly estimated evolutionary rates spanned several orders of magnitude, from 10⁻⁶ to 10⁻⁸ nucleotide substitutions site⁻¹ year⁻¹. This variation was largely attributable to sampling time, which was strongly negatively associated with estimated evolutionary rates, with this relationship best described by an exponential decay curve. To avoid potential estimation biases, such time-dependency should be considered when inferring evolutionary time-scales in bacteria.
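The exponential decay relationship between estimated rate and sampling time can be sketched as a log-linear least-squares fit. The abstract does not give the fitted functional form or any parameter values, so the form rate = a · exp(−b · t), the function name, and the data below are all assumptions made purely for illustration.

```python
import math
from statistics import mean


def fit_exponential_decay(time_spans, rates):
    """Fit rate = a * exp(-b * t) by least squares on log(rate).

    Returns (a, b). Assumes every rate is strictly positive.
    """
    logs = [math.log(r) for r in rates]
    t_bar, l_bar = mean(time_spans), mean(logs)
    stt = sum((t - t_bar) ** 2 for t in time_spans)
    stl = sum((t - t_bar) * (l - l_bar) for t, l in zip(time_spans, logs))
    slope = stl / stt
    a = math.exp(l_bar - slope * t_bar)
    return a, -slope


if __name__ == "__main__":
    # Hypothetical sampling time spans (years) and rate estimates
    # (substitutions per site per year), generated from a known curve.
    spans = [10.0, 50.0, 100.0, 500.0]
    rates = [1e-6 * math.exp(-0.01 * t) for t in spans]
    a, b = fit_exponential_decay(spans, rates)
    print("a =", a, "b =", b)
```

Fitting on the log scale weights each data set by relative rather than absolute error, which seems reasonable when the rates span several orders of magnitude.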


2021 ◽  
Vol 95 ◽  
Author(s):  
N.Q.-X. Wee ◽  
S.C. Cutmore ◽  
T.H. Cribb

Abstract Of over 250 species of Monorchiidae Odhner, 1911, just four are known from gerreid fishes. Here, we report adult specimens of a new species infecting Gerres oyena (Forsskål) and Gerres subfasciatus Cuvier from off Heron Island and North Stradbroke Island, Queensland, Australia. The species is morphologically most similar to the concept of Lasiotocus Looss, 1907, which currently comprises eight species, in the possession of an unspined genital atrium, bipartite terminal organ, round oral sucker and unlobed ovary. However, phylogenetic analyses of the 28S ribosomal DNA gene region show the species to be distantly related to the two sequenced species of Lasiotocus – Lasiotocus mulli (Stossich, 1883) Odhner, 1911 and Lasiotocus trachinoti Overstreet & Brown, 1970 – and that it clearly requires a distinct genus; thus, we propose Gerricola queenslandensis n. g., n. sp. Morphologically, G. queenslandensis n. g., n. sp. differs significantly from L. mulli and L. trachinoti only in the possession of distinctly longer caeca, which terminate in the post-testicular region, and in the absence of a distinct gap in the terminal organ spines. The remaining species of Lasiotocus possess caeca that also terminate in the post-testicular region, which might warrant their transfer to Gerricola n. g. However, doubt about their monophyly, due to a combination of significant morphological variation, a lack of information on some features and infection of a wide range of hosts, leads us to retain these taxa as species of Lasiotocus until molecular sequence data are available to better inform their phylogenetic and taxonomic positions. Sporocysts and cercariae of G. queenslandensis n. g., n. sp. were found in a lucinid bivalve, Codakia paytenorum (Iredale), from Heron Island. Sexual adult and intramolluscan stages were genetically matched with the ITS2 ribosomal DNA and cox1 mitochondrial DNA regions.
This is the second record of the Lucinidae as a first intermediate host for the Monorchiidae. Additionally, we report sporocysts and cercariae of another monorchiid infection in a tellinid bivalve, Jactellina clathrata (Deshayes), from Heron Island. Molecular sequence data for this species do not match any sequenced species and phylogenetic analyses do not suggest any generic position.


2020 ◽  
pp. 73-95
Author(s):  
Eveling Gabot-Rodríguez ◽  
Sixto J. Incháustegui ◽  
Markus Pfenninger ◽  
Barbara Feldmeyer ◽  
Gunther Köhler

Anoles are a group of lizards that offer a wide range of opportunities to study different biological topics. In this work, we examined some aspects of the morphology of 139 individuals of green anoles collected in urban parks of Santo Domingo and the Distrito Nacional. We investigated evidence of hybridization between the two Hispaniola endemic species Anolis chlorocyanus and A. cyanostictus and the introduced species A. porcatus. We categorized the individuals as pure species or intermediates based on their phenotype. Additionally, mitochondrial 16S sequence data were generated from the collected specimens to compare molecular and phenotypic species assignments. The two data sets were congruent for most specimens but inconsistent for a few, which we consider evidence for hybridization between the two endemic species. However, we did not find evidence of hybridization between any of these species and the introduced species A. porcatus. Nevertheless, the continuous expansion of the distribution of this invasive species possibly will have drastic negative consequences for the populations of the endemic species.

