A Comparison of Methods for Estimating Substitution Rates from Ancient DNA Sequence Data

2017
Author(s):
K. Jun Tong
David A. Duchêne
Sebastián Duchêne
Jemma L. Geoghegan
Simon Y.W. Ho

Abstract The estimation of evolutionary rates from ancient DNA sequences can be negatively affected by among-lineage rate variation and non-random sampling. Using a simulation study, we compared the performance of three phylogenetic methods for inferring evolutionary rates from time-structured data sets: root-to-tip regression, least-squares dating, and Bayesian inference. Our results show that these methods produce reliable estimates when the substitution rate is high, rate variation is low, and samples of similar ages are not phylogenetically clustered. The interaction of these factors is particularly important for Bayesian estimation of evolutionary rates. We also inferred rates for time-structured mitogenomic data sets from six vertebrate species. Root-to-tip regression estimated a different rate from least-squares dating and Bayesian inference for mitogenomes from the horse, which has high levels of among-lineage rate variation. We recommend using multiple methods of inference and testing data for temporal signal, among-lineage rate variation, and phylo-temporal clustering.
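As a rough illustration of the simplest of the three methods named above, root-to-tip regression fits a straight line to root-to-tip genetic distance against sampling date; the slope is read as the substitution rate and the x-intercept as the date of the root. The sketch below is a minimal pure-Python version and not the authors' implementation. The function name `root_to_tip_regression`, its inputs, and the toy numbers are assumptions made for illustration; it presumes the distances were already measured on a rooted tree and that dates increase forward in time.

```python
from typing import Sequence, Tuple


def root_to_tip_regression(
    dates: Sequence[float], distances: Sequence[float]
) -> Tuple[float, float, float]:
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, x_intercept, r_squared). The slope is interpreted as a
    substitution rate (substitutions/site/time unit) and the x-intercept as
    the inferred date of the root. Hypothetical helper, for illustration only.
    """
    n = len(dates)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three (date, distance) pairs")

    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all samples share one date; no temporal structure")

    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    # R^2 is commonly used as an informal indicator of clock-like behaviour.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    x_intercept = -intercept / slope if slope != 0 else float("nan")
    return slope, x_intercept, r_squared


# Toy, made-up data: four tips sampled a decade apart.
toy_dates = [1990.0, 2000.0, 2010.0, 2020.0]
toy_distances = [0.010, 0.020, 0.030, 0.040]
rate, root_date, r2 = root_to_tip_regression(toy_dates, toy_distances)
```

A negative slope or a low R² from a fit like this is usually taken as a warning that the data carry little temporal signal, which is consistent with the abstract's advice to test for it before trusting any rate estimate.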

2016
Author(s):
Sebastian Duchêne
Kathryn E. Holt
François-Xavier Weill
Simon Le Hello
Jane Hawkey
...  

Abstract Estimating the rates at which bacterial genomes evolve is critical to understanding major evolutionary and ecological processes such as disease emergence, long-term host-pathogen associations, and short-term transmission patterns. The surge in bacterial genomic data sets provides a new opportunity to estimate these rates and reveal the factors that shape bacterial evolutionary dynamics. For many organisms, estimates of evolutionary rate display an inverse association with the time-scale over which the data are sampled. However, this relationship remains unexplored in bacteria due to the difficulty in estimating genome-wide evolutionary rates, which are impacted by the extent of temporal structure in the data and the prevalence of recombination. We collected 36 whole genome sequence data sets from 16 species of bacterial pathogens to systematically estimate and compare their evolutionary rates and assess the extent of temporal structure in the absence of recombination. The majority (28/36) of data sets possessed sufficient clock-like structure to robustly estimate evolutionary rates. However, in some species reliable estimates were not possible even with “ancient DNA” data sampled over many centuries, suggesting that they evolve very slowly or that they display extensive rate variation among lineages. The robustly estimated evolutionary rates spanned several orders of magnitude, from 10⁻⁶ to 10⁻⁸ nucleotide substitutions site⁻¹ year⁻¹. This variation was largely attributable to sampling time, which was strongly negatively associated with estimated evolutionary rates, with this relationship best described by an exponential decay curve. To avoid potential estimation biases, such time-dependency should be considered when inferring evolutionary time-scales in bacteria.


Genetics
2001
Vol 159 (1)
pp. 401-411
Author(s):  
Rasmus Nielsen

Abstract This article describes a new Markov chain Monte Carlo (MCMC) method applicable to DNA sequence data, which treats mutations in the genealogy as missing data. The method facilitates inferences regarding the age and identity of specific mutations while taking the full complexities of the mutational process in DNA sequences into account. We demonstrate the utility of the method in three applications. First, we demonstrate how the method can be used to make inferences regarding population genetical parameters such as θ (the effective population size times the mutation rate). Second, we show how the method can be used to estimate the ages of mutations in finite sites models and for making inferences regarding the distribution and ages of nonsynonymous and synonymous mutations. The method is applied to two previously published data sets and we demonstrate that in one of the data sets the average age of nonsynonymous mutations is significantly lower than the average age of synonymous mutations, suggesting the presence of slightly deleterious mutations. Third, we demonstrate how the method in general can be used to evaluate the posterior distribution of a function of a mapping of mutations on a gene genealogy. This application is useful for evaluating the uncertainty associated with methods that rely on mapping mutations on a phylogeny or a gene genealogy.


2020
Vol 37 (11)
pp. 3363-3379
Author(s):
Sebastian Duchene
Philippe Lemey
Tanja Stadler
Simon Y W Ho
David A Duchene
...  

Abstract Phylogenetic methods can use the sampling times of molecular sequence data to calibrate the molecular clock, enabling the estimation of evolutionary rates and timescales for rapidly evolving pathogens and data sets containing ancient DNA samples. A key aspect of such calibrations is whether a sufficient amount of molecular evolution has occurred over the sampling time window, that is, whether the data can be treated as having come from a measurably evolving population. Here, we investigate the performance of a fully Bayesian evaluation of temporal signal (BETS) in sequence data. The method involves comparing the fit to the data of two models: a model in which the data are accompanied by the actual (heterochronous) sampling times, and a model in which the samples are constrained to be contemporaneous (isochronous). We conducted simulations under a wide range of conditions to demonstrate that BETS accurately classifies data sets according to whether they contain temporal signal or not, even when there is substantial among-lineage rate variation. We explore the behavior of this classification in analyses of five empirical data sets: modern samples of A/H1N1 influenza virus, the bacterium Bordetella pertussis, coronaviruses from mammalian hosts, ancient DNA from Hepatitis B virus, and mitochondrial genomes of dog species. Our results indicate that BETS is an effective alternative to other tests of temporal signal. In particular, this method has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior which are essential aspects of Bayesian phylodynamic analyses.
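The model comparison at the heart of BETS can be sketched in a few lines: given a log marginal likelihood for the heterochronous model (real sampling times) and one for the isochronous model (all samples treated as contemporaneous), the log Bayes factor is simply their difference. The snippet below is only a sketch under stated assumptions. The function names, the idea of passing in pre-computed log marginal likelihoods, and the `threshold` parameter (default 0.0) are all hypothetical; the abstract does not specify how the marginal likelihoods are estimated or what cut-off is used.

```python
def log_bayes_factor(log_ml_heterochronous: float, log_ml_isochronous: float) -> float:
    """Log Bayes factor in favour of the model that uses real sampling times.

    Both arguments are log marginal likelihoods, assumed to have been
    estimated elsewhere for the same data under the two competing models.
    """
    return log_ml_heterochronous - log_ml_isochronous


def has_temporal_signal(
    log_ml_heterochronous: float,
    log_ml_isochronous: float,
    threshold: float = 0.0,  # hypothetical cut-off; choose per your own criteria
) -> bool:
    """Classify a data set as having temporal signal when the heterochronous
    model is favoured by more than `threshold` log units."""
    return log_bayes_factor(log_ml_heterochronous, log_ml_isochronous) > threshold


# Made-up log marginal likelihoods, purely for illustration.
example_log_bf = log_bayes_factor(-1000.0, -1010.0)
example_call = has_temporal_signal(-1000.0, -1010.0)
```

Because the comparison is between whole models, the same subtraction applies whichever clock model and tree prior were used, which is the "coherent assessment of the entire model" the abstract highlights.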


2019
Author(s):
Sebastian Duchene
Philippe Lemey
Tanja Stadler
Simon YW Ho
David A Duchene
...  

Abstract Phylogenetic methods can use the sampling times of molecular sequence data to calibrate the molecular clock, enabling the estimation of evolutionary rates and timescales for rapidly evolving pathogens and data sets containing ancient DNA samples. A key aspect of such calibrations is whether a sufficient amount of molecular evolution has occurred over the sampling time window, that is, whether the data can be treated as having come from a measurably evolving population. Here we investigate the performance of a fully Bayesian evaluation of temporal signal (BETS) in sequence data. The method involves comparing the fit to the data of two models: a model in which the data are accompanied by the actual (heterochronous) sampling times, and a model in which the samples are constrained to be contemporaneous (isochronous). We conducted simulations under a wide range of conditions to demonstrate that BETS accurately classifies data sets according to whether they contain temporal signal or not, even when there is substantial among-lineage rate variation. We explore the behaviour of this classification in analyses of five empirical data sets: modern samples of A/H1N1 influenza virus, the bacterium Bordetella pertussis, coronaviruses from mammalian hosts, ancient DNA from Hepatitis B virus and mitochondrial genomes of dog species. Our results indicate that BETS is an effective alternative to other tests of temporal signal. In particular, this method has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior which are essential aspects of Bayesian phylodynamic analyses.


Genetics
1993
Vol 134 (4)
pp. 1195-1204
Author(s):  
S Tarès ◽  
J M Cornuet ◽  
P Abad

Abstract An AluI family of highly reiterated nontranscribed sequences has been found in the genome of the honeybee Apis mellifera. This repeated sequence is shown to be present at approximately 23,000 copies per haploid genome constituting about 2% of the total genomic DNA. The nucleotide sequence of 10 monomers was determined. The consensus sequence is 176 nucleotides long and has an A + T content of 58%. There are clusters of both direct and inverted repeats. Internal subrepeating units ranging from 11 to 17 nucleotides are observed, suggesting that it could have evolved from a shorter sequence. DNA sequence data reveal that this repeat class is unusually homogeneous compared to the other class of invertebrate highly reiterated DNA sequences. The average pairwise sequence divergence between the repeats is 2.5%. In spite of this unusual homogeneity, divergence has been found in the repeated sequence hybridization ladder between four different honeybee subspecies. Therefore, the AluI highly reiterated sequences provide a new probe for fingerprinting in A. m. mellifera.
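A figure such as "average pairwise sequence divergence" is typically the mean of a distance taken over every pair of aligned sequences. The abstract does not say which distance was used, so the sketch below assumes the uncorrected p-distance (proportion of differing sites, skipping gap positions); the function names and the toy sequences are hypothetical and are not the honeybee monomers.

```python
from itertools import combinations
from typing import Sequence


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != "-" and y != "-"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_pairwise_divergence(sequences: Sequence[str]) -> float:
    """Mean p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned sequences, invented for illustration.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
toy_divergence = mean_pairwise_divergence(toy_alignment)
```

With 10 monomers, as in the study, the same function would average over 45 pairs.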


2009
Vol 34 (3)
pp. 580-594
Author(s):
Anthony R. Magee
Ben-Erik van Wyk
Patricia M. Tilney
Stephen R. Downie

Generic circumscriptions and phylogenetic relationships of the Cape genera Capnophyllum, Dasispermum, and Sonderina are explored through parsimony and Bayesian inference analyses of nrDNA ITS and cpDNA rps16 intron sequences, morphology, and combined molecular and morphological data. The relationship of these genera with the North African genera Krubera and Stoibrax is also assessed. Analyses of both molecular data sets place Capnophyllum, Dasispermum, Sonderina, and the only southern African species of Stoibrax (S. capense) within the newly recognized Lefebvrea clade of tribe Tordylieae. Capnophyllum is strongly supported as monophyletic and is distantly related to Krubera. The monotypic genus Dasispermum and Stoibrax capense are embedded within a paraphyletic Sonderina. This complex is distantly related to the North African species of Stoibrax in tribe Apieae, in which the type species, Stoibrax dichotomum, occurs. Consequently, Dasispermum is expanded to include both Sonderina and Stoibrax capense. New combinations are formalized for Dasispermum capense, D. hispidum, D. humile, and D. tenue. An undescribed species from the Tanqua Karoo in South Africa is also closely related to Capnophyllum and the Dasispermum–Sonderina complex. The genus Scaraboides is described herein to accommodate the new species, S. manningii. This monotypic genus shares the dorsally compressed fruit and involute marginal wings with Capnophyllum, but is easily distinguished by its erect branching habit, green leaves, scabrous umbels, and fruit with indistinct median and lateral ribs, additional solitary vittae in each marginal wing, and parallel, closely spaced commissural vittae. Despite the marked fruit similarities with Capnophyllum, analyses of DNA sequence data place Scaraboides closer to the Dasispermum–Sonderina complex, with which it shares the erect habit, green (nonglaucous) leaves, and scabrous umbels.


2020
Author(s):
Patrick J. Brownsey
Daniel J. Ohlsen
Lara D. Shepherd
Whitney L. M. Bouma
Erin L. May
...  

Five indigenous species of Pellaea in Australasia belong to section Platyloma. Their taxonomic history is outlined, morphological, cytological and genetic evidence for their recognition reviewed, and new morphological and chloroplast DNA-sequence data provided. Australian plants of P. falcata (R.Br.) Fée are diploid and have longer, narrower pinnae than do New Zealand plants previously referred to P. falcata, which are tetraploid. Evidence indicates that P. falcata does not occur in New Zealand, and that collections so-named are P. rotundifolia (G.Forst.) Hook. Chloroplast DNA sequences are uninformative in distinguishing Australian P. falcata from New Zealand P. rotundifolia, but show that Australian P. nana is distinct from both. Sequence data also show that Australian and New Zealand populations of P. calidirupium Brownsey & Lovis are closely related, and that Australian P. paradoxa (R.Br.) Hook. is distinct from other Australian species. Although P. falcata is diploid and P. rotundifolia tetraploid, P. calidirupium, P. nana (Hook.) Bostock and P. paradoxa each contain multiple ploidy levels. Diploid populations of Pellaea species are confined to Australia, and only tetraploids are known in New Zealand. Evolution of the group probably involved hybridisation, autoploidy, alloploidy, and possibly apomixis. Further investigation is required to resolve the status of populations from Mount Maroon, Queensland and the Kermadec Islands.


Phytotaxa
2019
Vol 408 (1)
pp. 77-84
Author(s):  
YING-LI PENG ◽  
ZHUANG ZHOU ◽  
SI-REN LAN ◽  
ZHONG-JIAN LIU

A new orchid species, Cymbidium jiangchengense, from Yunnan Province, China, is described and illustrated. Its distinctiveness is evaluated with morphology and molecular analyses. A detailed comparison between the newly discovered orchid and other members of Cymbidium was performed. The new plant is characterized by stem-like pseudobulbs, narrowly oblong, coriaceous leaves with an acute apex, a 2-flowered inflorescence, a purplish pink flower, narrowly elliptic sepals and petals, an obovate-lanceolate lip with a cordate midlobe, a yellow central callus, and a disc with a trough-shaped longitudinal lamella extending from the base to the base of the midlobe, with the lamella apex inflated to form two calluses that are not confluent apically. These features distinguish this new orchid from all other known species of Cymbidium. A molecular study based on nuclear ribosomal ITS and plastid matK and rbcL DNA sequence data indicates that C. jiangchengense is a distinct species that is sister to C. wadae and a member of section Eburnea, subgenus Cyperorchis.


2020
Vol 6 (2)
Author(s):
Sebastian Duchene
Leo Featherstone
Melina Haritopoulou-Sinanidou
Andrew Rambaut
Philippe Lemey
...  

Abstract The ongoing SARS-CoV-2 outbreak marks the first time that large amounts of genome sequence data have been generated and made publicly available in near real time. Early analyses of these data revealed low sequence variation, a finding that is consistent with a recently emerging outbreak, but which raises the question of whether such data are sufficiently informative for phylogenetic inferences of evolutionary rates and time scales. The phylodynamic threshold is a key concept that refers to the point in time at which sufficient molecular evolutionary change has accumulated in available genome samples to obtain robust phylodynamic estimates. For example, before the phylodynamic threshold is reached, genomic variation is so low that even large amounts of genome sequences may be insufficient to estimate the virus’s evolutionary rate and the time scale of an outbreak. We collected genome sequences of SARS-CoV-2 from public databases at eight different points in time and conducted a range of tests of temporal signal to determine if and when the phylodynamic threshold was reached, and the range of inferences that could be reliably drawn from these data. Our results indicate that by 2 February 2020, estimates of evolutionary rates and time scales had become possible. Analyses of subsequent data sets that included between 47 and 122 genomes converged at an evolutionary rate of about 1.1 × 10⁻³ subs/site/year and a time of origin of around late November 2019. Our study provides guidelines to assess the phylodynamic threshold and demonstrates that establishing this threshold constitutes a fundamental step for understanding the power and limitations of early data in outbreak genome surveillance.


Zootaxa
2012
Vol 3361 (1)
pp. 56-62
Author(s):
JOSEFINA CURIEL
JUAN J. MORRONE

Insect life stages are known imperfectly in many cases, and classifications are usually based on adult morphology. This is unfortunate as information on other life stages may be useful for biomonitoring. The major impediment to using elmid (Coleoptera) larvae for freshwater biomonitoring is the lack of larval descriptions and illustrations. Reliable molecular protocols may be used to associate larvae and adults. After adults of seven species of Mexican Macrelmis were identified morphologically, seven larval specimens were associated with them based on two gene fragments: Cox1 and Cob. The phylogenetic analysis allowed identifying the larval specimens as Macrelmis leonilae, M. scutellaris, M. species 7, M. species 10, and M. species 11. Two species based on adults associated uncertainly with one larva, and one larva did not match with any adult. Adult/larval association in elmids using DNA sequence data seems to be promising in terms of speed and reliability.

