Variation in the molecular clock of primates

2016 ◽  
Author(s):  
Priya Moorjani ◽  
Carlos Eduardo G. Amorim ◽  
Peter F. Arndt ◽  
Molly Przeworski

Events in primate evolution are often dated by assuming a "molecular clock", i.e., a constant rate of substitution per unit time, but the validity of this assumption remains unclear. Among mammals, it is well known that there exists substantial variation in yearly substitution rates. Such variation is to be expected from differences in life-history traits, suggesting that it should also be found among primates. Motivated by these considerations, we analyze whole genomes from ten primate species, including Old World Monkeys (OWMs), New World Monkeys (NWMs) and apes, focusing on putatively neutral autosomal sites and controlling for possible effects of biased gene conversion and methylation at CpG sites. We find that substitution rates are ~65% higher in lineages leading from the hominoid-NWM ancestor to NWMs than to apes. Within apes, rates are ~2% higher in chimpanzees and ~7% higher in the gorilla than in humans. Substitution types subject to biased gene conversion show no more variation among species than those not subject to it. Not all mutation types behave similarly, however: in particular, transitions at CpG sites exhibit a more clock-like behavior than do other types, presumably due to their non-replicative origin. Thus, not only the total rate, but also the mutational spectrum varies among primates. This finding suggests that events in primate evolution are most reliably dated using CpG transitions. Taking this approach, we estimate that the average time to the most recent common ancestor of human and chimpanzee is 12.1 million years and their split time 7.9 million years.

2016 ◽  
Vol 113 (38) ◽  
pp. 10607-10612 ◽  
Author(s):  
Priya Moorjani ◽  
Carlos Eduardo G. Amorim ◽  
Peter F. Arndt ◽  
Molly Przeworski

Events in primate evolution are often dated by assuming a constant rate of substitution per unit time, but the validity of this assumption remains unclear. Among mammals, it is well known that there exists substantial variation in yearly substitution rates. Such variation is to be expected from differences in life history traits, suggesting it should also be found among primates. Motivated by these considerations, we analyze whole genomes from 10 primate species, including Old World Monkeys (OWMs), New World Monkeys (NWMs), and apes, focusing on putatively neutral autosomal sites and controlling for possible effects of biased gene conversion and methylation at CpG sites. We find that substitution rates are up to 64% higher in lineages leading from the hominoid–NWM ancestor to NWMs than to apes. Within apes, rates are ∼2% higher in chimpanzees and ∼7% higher in the gorilla than in humans. Substitution types subject to biased gene conversion show no more variation among species than those not subject to it. Not all mutation types behave similarly, however; in particular, transitions at CpG sites exhibit a more clocklike behavior than do other types, presumably because of their nonreplicative origin. Thus, not only the total rate, but also the mutational spectrum, varies among primates. This finding suggests that events in primate evolution are most reliably dated using CpG transitions. Taking this approach, we estimate that the human and chimpanzee divergence time is 12.1 million years, and the human and gorilla divergence time is 15.1 million years.


2010 ◽  
Vol 7 (11) ◽  
pp. 3387-3402 ◽  
Author(s):  
S. Trajanovski ◽  
C. Albrecht ◽  
K. Schreiber ◽  
R. Schultheiß ◽  
T. Stadler ◽  
...  

Abstract. Ancient Lake Ohrid on the Balkan Peninsula is considered to be the oldest ancient lake in Europe with a suggested Plio-/Pleistocene age. Its exact geological age, however, remains unknown. Therefore, molecular clock data of Lake Ohrid biota may serve as an independent constraint of available geological data, and may thus help to refine age estimates. Such evolutionary data may also help unravel potential biotic and abiotic factors that promote speciation events. Here, mitochondrial sequencing data of one of the largest groups of endemic taxa in the Ohrid watershed, the leech genus Dina, is used to test whether it represents an ancient lake species flock, to study the role of potential horizontal and vertical barriers in the watershed for evolutionary events, to estimate the onset of diversification in this group based on molecular clock analyses, and to compare this data with data from other endemic species for providing an approximate time frame for the origin of Lake Ohrid. Based on the criteria speciosity, monophyly and endemicity, it can be concluded that Dina spp. from the Ohrid watershed, indeed, represents an ancient lake species flock. Lineage sorting of its species, however, does not seem to be complete and/or hybridization may occur. Analyses of population structures of Dina spp. in the Ohrid watershed indicate a horizontal zonation of haplotypes from spring and lake populations, corroborating the role of lake-side springs, particularly the southern feeder springs, for evolutionary processes in endemic Ohrid taxa. Vertical differentiation of lake taxa, however, appears to be limited, though differences between populations from the littoral and the profundal are apparent. Molecular clock analyses indicate that the most recent common ancestor of extant species of this flock is approximately 1.99 ± 0.83 million years (Ma) old, whereas the split of the Ohrid Dina flock from a potential sister taxon outside the lake is estimated at 8.30 ± 3.60 Ma. 
Comparisons with other groups of endemic Ohrid species indicated that in all cases, diversification within the watershed started ≤2 Ma ago. Thus, this estimate may provide information on a minimum age for the origin of Lake Ohrid. Maximum ages are less consistent and generally less reliable, but a maximum age of 3 Ma is cautiously suggested. Interestingly, this time frame of approximately 2–3 Ma ago for the origin of Lake Ohrid, generated based on genetic data, fits well with the time frame most often used in the literature by geologists.


2010 ◽  
Vol 7 (4) ◽  
pp. 5011-5045 ◽  
Author(s):  
S. Trajanovski ◽  
C. Albrecht ◽  
K. Schreiber ◽  
R. Schultheiß ◽  
T. Stadler ◽  
...  

Abstract. Ancient Lake Ohrid on the Balkan Peninsula is considered to be the oldest ancient lake in Europe with a suggested Plio-Pleistocene age. Its exact geological age, however, remains unknown. Therefore, molecular clock data of Lake Ohrid biota may serve as an independent constraint of available geological data, and may thus also help to refine age estimates. Such evolutionary data may also help unravel potential biotic and abiotic factors that promote speciation events. Here, mitochondrial sequencing data of one of the largest groups of endemic taxa in Lake Ohrid, the leech genus Dina, is used to test whether it represents an ancient lake species flock, to study the role of horizontal and vertical barriers in Lake Ohrid for evolutionary events, to estimate the onset of intralacustrine diversification in this group based on molecular clock analyses, and to compare this data with data from other endemic species for providing an approximate time frame for the origin of Lake Ohrid. Based on the criteria speciosity, monophyly and endemicity, it can be concluded that Lake Ohrid Dina, indeed, represents an ancient lake species flock. Lineage sorting of its species, however, does not seem to be complete. Analyses of population structures of Dina spp. in the Ohrid watershed indicate a horizontal zonation of haplotypes from spring and lake populations, corroborating the role of lake-side springs, particularly the southern feeder springs, for evolutionary processes in endemic Ohrid taxa. Vertical differentiation of lake taxa, however, appears to be limited, though differences between populations from the littoral and the profundal are apparent. Molecular clock analyses indicate that the most recent common ancestor of extant species of this flock is approximately 1.99±0.83 Ma old, whereas the split of the Lake Ohrid Dina flock from a potential sister taxon outside the lake is estimated at 8.30±3.60 Ma. 
Comparisons with other groups of endemic Ohrid species indicated that in all cases, intralacustrine diversification started ≤2 Ma ago. Thus, this estimate may provide information on a minimum age for the origin of Lake Ohrid. Maximum ages are less consistent and generally less reliable, but a maximum age of 3 Ma is cautiously suggested. Interestingly, this time frame of approximately 2–3 Ma for the origin of Lake Ohrid, generated based solely on evolutionary data, fits well with the time frame most often used in the literature by geologists. Future studies must show whether this concurrence holds true.


2005 ◽  
Vol 79 (3) ◽  
pp. 1595-1604 ◽  
Author(s):  
Leen Vijgen ◽  
Els Keyaerts ◽  
Elien Moës ◽  
Inge Thoelen ◽  
Elke Wollants ◽  
...  

ABSTRACT Coronaviruses are enveloped, positive-stranded RNA viruses with a genome of approximately 30 kb. Based on genetic similarities, coronaviruses are classified into three groups. Two group 2 coronaviruses, human coronavirus OC43 (HCoV-OC43) and bovine coronavirus (BCoV), show remarkable antigenic and genetic similarities. In this study, we report the first complete genome sequence (30,738 nucleotides) of the prototype HCoV-OC43 strain (ATCC VR759). Complete genome and open reading frame (ORF) analyses were performed in comparison to the BCoV genome. In the region between the spike and membrane protein genes, a 290-nucleotide deletion is present, corresponding to the absence of BCoV ORFs ns4.9 and ns4.8. Nucleotide and amino acid similarity percentages were determined for the major HCoV-OC43 ORFs and for those of other group 2 coronaviruses. The highest degree of similarity is demonstrated between HCoV-OC43 and BCoV in all ORFs with the exception of the E gene. Molecular clock analysis of the spike gene sequences of BCoV and HCoV-OC43 suggests a relatively recent zoonotic transmission event and dates their most recent common ancestor to around 1890. An evolutionary rate in the order of 4 × 10⁻⁴ nucleotide changes per site per year was estimated. This is the first animal-human zoonotic pair of coronaviruses that can be analyzed in order to gain insights into the processes of adaptation of a nonhuman coronavirus to a human host, which is important for understanding the interspecies transmission events that led to the origin of the severe acute respiratory syndrome outbreak.
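
As a rough illustration of how a substitution rate translates into a date for a common ancestor, the sketch below applies a strict (constant-rate) clock to a single pair of dated sequences. It is not the method used in the study, whose abstract does not describe the inference procedure. The function name `tmrca_year`, the pairwise distances, and the sampling years are hypothetical; only the order of magnitude of the rate comes from the abstract.

```python
def tmrca_year(distance, rate, year_a, year_b):
    """Strict-clock estimate of the calendar year of the most recent
    common ancestor (MRCA) of two dated sequences.

    distance: substitutions per site separating the two sequences
    rate:     substitutions per site per year, assumed equal on both branches
    year_a, year_b: sampling years of the two sequences
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Both branches together span (year_a - T) + (year_b - T) years,
    # and under a strict clock that span equals distance / rate.
    total_years = distance / rate
    return (year_a + year_b - total_years) / 2.0


# Rate of the order reported in the abstract. The distance and sampling
# years are hypothetical inputs chosen for illustration, not study data.
RATE = 4e-4
print(tmrca_year(distance=0.08, rate=RATE, year_a=2000, year_b=2000))
```

Because the two branches are handled together, the same function covers sequences sampled in different years. A real analysis would also model rate variation and uncertainty, which this sketch ignores.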


2014 ◽  
Author(s):  
Sylvain Glemin ◽  
Peter F Arndt ◽  
Philipp W Messer ◽  
Dmitri Petrov ◽  
Nicolas Galtier ◽  
...  

Many lines of evidence indicate GC-biased gene conversion (gBGC) has a major impact on the evolution of mammalian genomes. However, up to now, this process had not been properly quantified. In principle, the strength of gBGC can be measured from the analysis of derived allele frequency spectra. However, this approach is sensitive to a number of confounding factors. In particular, we show by simulations that the inference is pervasively affected by polymorphism polarization errors, especially at hypermutable sites, and spatial heterogeneity in gBGC strength. Here we propose a new method to quantify gBGC from DAF spectra, incorporating polarization errors and taking spatial heterogeneity into account. This method is very general in that it does not require any prior knowledge about the source of polarization errors and also provides information about mutation patterns. We apply this approach to human polymorphism data from the 1000 genomes project. We show that the strength of gBGC does not differ between hypermutable CpG sites and non-CpG sites, suggesting that in humans gBGC is not caused by the base-excision repair machinery. We further find that the impact of gBGC is concentrated primarily within recombination hotspots: genome-wide, the strength of gBGC is in the nearly neutral area, but 2% of the human genome is subject to strong gBGC, with population-scaled gBGC coefficients above 5. Given that the location of recombination hotspots evolves very rapidly, our analysis predicts that in the long term, a large fraction of the genome is affected by short episodes of strong gBGC.


2005 ◽  
Vol 86 (5) ◽  
pp. 1467-1474 ◽  
Author(s):  
Gareth J. Hughes ◽  
Lillian A. Orciari ◽  
Charles E. Rupprecht

Throughout North America, rabies virus (RV) is endemic in bats. Distinct RV variants exist that are closely associated with infection of individual host species, such that there is little or no sustained spillover infection away from the primary host. Using Bayesian methodology, nucleotide substitution rates were estimated from alignments of partial nucleoprotein (N) gene sequences of nine distinct bat RV variants from North America. Substitution rates ranged from 2.32 × 10⁻⁴ to 1.38 × 10⁻³ substitutions per site per year. A maximum-likelihood (ML) molecular clock model was rejected for only two of the nine datasets. In addition, using sequences from bat RV variants across the Americas, the evolutionary rate for the complete N gene was estimated to be 2.32 × 10⁻⁴. This rate was used to scale trees using Bayesian and ML methods, and the time of the most recent common ancestor for current bat RV variant diversity in the Americas was estimated to be 1660 (range 1267–1782) and 1651 (range 1254–1773), respectively. Our reconstructions suggest that RV variants currently associated with infection of bats from Latin America (Desmodus and Tadarida) share the earliest common ancestor with the progenitor RV. In addition, from the ML tree, times were estimated for the emergence of the three major lineages responsible for bat rabies cases in North America. Adaptation to infection of the colonial bat species analysed (Eptesicus fuscus, Myotis spp.) appears to have occurred much more quickly than for the solitary species analysed (Lasionycteris noctivagans, Pipistrellus subflavus, Lasiurus borealis, Lasiurus cinereus), suggesting that the process of virus adaptation may be dependent on host biology.


1992 ◽  
Vol 6 ◽  
pp. 100-100
Author(s):  
John J. Flynn

Calculations of “rates of evolution” have been applied to a variety of indicators of change within populations, species, or higher taxa. This has led to confusion about taxonomic and temporal scaling, particularly when rates are calculated for supposedly “equivalent” taxonomic ranks, or “higher-level” taxa that are not monophyletic groups. All calculations of rates of evolutionary change require accurate temporal calibration. Even in studies of molecular evolution that assume a “molecular clock”, the rate at which any clock ticks must be calibrated empirically by fossil data on the age of divergence of some taxa.
Molecular clock rates for all Mammalia generally have been calculated from the primate fossil record and phylogeny. However, rates of molecular evolution have been shown to vary both within and among different clades. Given a preference for a more rigorous system in which molecular divergence is not assumed to occur at a constant rate, the time of divergence should be determined directly for all clades in studies of molecular “rates of evolution”.
The mammalian order Carnivora is a monophyletic group widely cited in studies of evolutionary tempo and mode. However, few of those rate studies have considered explicitly the roles of fossil taxa and rigorously tested phylogenies. For example, phylogenetic placement of early Cenozoic Carnivora (generally placed in the paraphyletic “stem-group” “Miacoidea”), relative to the two major clades of living Carnivora (Caniformia and Feliformia), profoundly influences estimates of the age of cladogenetic divergence for clades of living carnivorans. If all the taxa placed within the “Miacoidea” lie outside a restricted clade of Carnivora (defined as the most recent common ancestor of extant Carnivora, and all of its descendants), then the oldest Carnivora (“neocarnivorans”) are late Eocene (about 35–40 Ma).
However, if miacid “miacoids” are caniforms and viverravid “miacoids” are feliforms, then the Caniformia/Feliformia (=Carnivora) clade is at least as old as the oldest “miacoid” (middle Paleocene, or >60 Ma). The implications for calculations of rates of evolution within Carnivora are obvious. Similarly, many fossil Carnivora taxa have been assigned to living families, although the phylogenetic relationships of both fossil and living taxa within most of these families have been poorly understood. This presentation will consider: 1) minimum estimates of clade divergence time, based on current hypotheses of carnivoran phylogeny (emphasizing placement of fossil taxa) and oldest occurrence of fossils within a clade or its sister group, noting that traditional taxonomies both underestimate (e.g. Caniformia/Feliformia) and overestimate (e.g. some living families, such as Viverridae) clade divergence times; and 2) calculation of rates of evolution within Carnivora, focusing on taxonomic diversification and molecular divergence, comparison of rates calculated using traditional taxonomies and artificial “higher-taxa” categories versus those using phylogenetic clades (“unranked”), and the effects of fossil taxa.


2019 ◽  
Vol 5 (2) ◽  
Author(s):  
Magda Bletsa ◽  
Marc A Suchard ◽  
Xiang Ji ◽  
Sophie Gryseels ◽  
Bram Vrancken ◽  
...  

Abstract The need to estimate divergence times in evolutionary histories in the presence of various sources of substitution rate variation has stimulated a rich development of relaxed molecular clock models. Viral evolutionary studies frequently adopt an uncorrelated clock model as a generic relaxed molecular clock process, but this may impose considerable estimation bias if discrete rate variation exists among clades or lineages. For HIV-1 group M, rate variation among subtypes has been shown to result in inconsistencies in time to the most recent common ancestor estimation. Although this calls into question the adequacy of available molecular dating methods, no solution to this problem has been offered so far. Here, we investigate the use of mixed effects molecular clock models, which combine both fixed and random effects in the evolutionary rate, to estimate divergence times. Using simulation, we demonstrate that this model outperforms existing molecular clock models in a Bayesian framework for estimating time-measured phylogenies in the presence of mixed sources of rate variation, while also maintaining good performance in simpler scenarios. By analysing a comprehensive HIV-1 group M complete genome data set we confirm considerable rate variation among subtypes that is not adequately modelled by uncorrelated relaxed clock models. The mixed effects clock model can accommodate this rate variation and results in a time to the most recent common ancestor of HIV-1 group M of 1920 (1915–25), which is only slightly earlier than the uncorrelated relaxed clock estimate for the same data set. The use of complete genome data appears to have a more profound impact than the molecular clock model because it reduces the credible intervals by 50 per cent relative to similar estimates based on short envelope gene sequences.


2020 ◽  
Author(s):  
Jonathan Pekar ◽  
Michael Worobey ◽  
Niema Moshiri ◽  
Konrad Scheffler ◽  
Joel O. Wertheim

Abstract Understanding when SARS-CoV-2 emerged is critical to evaluating our current approach to monitoring novel zoonotic pathogens and understanding the failure of early containment and mitigation efforts for COVID-19. We employed a coalescent framework to combine retrospective molecular clock inference with forward epidemiological simulations to determine how long SARS-CoV-2 could have circulated prior to the time of the most recent common ancestor. Our results define the period between mid-October and mid-November 2019 as the plausible interval when the first case of SARS-CoV-2 emerged in Hubei province. By characterizing the likely dynamics of the virus before it was discovered, we show that over two-thirds of SARS-CoV-2-like zoonotic events would be self-limited, dying out without igniting a pandemic. Our findings highlight the shortcomings of zoonosis surveillance approaches for detecting highly contagious pathogens with moderate mortality rates.


2021 ◽  
Author(s):  
Mahan Ghafari ◽  
Peter Simmonds ◽  
Oliver G Pybus ◽  
Aris Katzourakis

Abstract Molecular clock dating is widely used to estimate timescales of phylogenetic histories and to infer rates at which species evolve. One of the major challenges with inferring rates of molecular evolution is the observation of a strong correlation between estimated rates and the timeframe of their measurements. Recent empirical analyses of virus evolutionary rates suggest that a power-law rate decay best explains the time-dependent pattern of substitution rates and that the same pattern is observed regardless of virus type (e.g. groups I-VII in the Baltimore classification). However, there exists no explanation for this trend based on molecular evolutionary mechanisms. We provide a simple predictive mechanistic model of the time-dependent rate phenomenon, incorporating saturation and host constraints on the evolution of some sites. Our model recapitulates the ubiquitous power-law rate decay with a slope of −0.65 (95% HPD: −0.72, −0.52) and can satisfactorily account for the variation in inferred molecular evolutionary rates over a wide range of timeframes. We show that once the saturation of sites starts (typically after hundreds of years in RNA viruses and thousands of years in DNA viruses), standard substitution models fail to correctly estimate divergence times among species, while our model successfully re-creates the observed pattern of rate decay. We apply our model to re-date the diversification of genotypes of hepatitis C virus (HCV) to 396,000 (95% HPD: 326,000 - 425,000) years before present, a time preceding the dispersal of modern humans out of Africa, and also show that the most recent common ancestor of sarbecoviruses dates back to 23,500 (95% HPD: 21,100 - 25,300) years ago, nearly thirty times older than previous estimates. This not only creates a radical new perspective for our understanding of the origins of HCV but also suggests that a substantial revision of evolutionary timescales of other viruses can be similarly achieved.
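
To give a feel for what a power-law rate decay with slope −0.65 implies, the sketch below evaluates the generic form rate(t) = r_ref · (t / t_ref)^slope. This is only the descriptive power law named in the abstract, not the authors' mechanistic model. The function name `apparent_rate`, the reference rate, and the reference timescale are hypothetical placeholders.

```python
def apparent_rate(timescale, ref_rate, ref_timescale, slope=-0.65):
    """Apparent substitution rate measured over `timescale` years, assuming
    a pure power law anchored at (ref_timescale, ref_rate)."""
    if timescale <= 0 or ref_timescale <= 0:
        raise ValueError("timescales must be positive")
    return ref_rate * (timescale / ref_timescale) ** slope


# Hypothetical anchor point: 1e-3 substitutions/site/year measured over 10 years.
REF_RATE, REF_TIMESCALE = 1e-3, 10.0
for years in (10, 100, 1_000, 10_000, 100_000):
    rate = apparent_rate(years, REF_RATE, REF_TIMESCALE)
    print(f"{years:>7} years: {rate:.3e} substitutions/site/year")
```

With this slope, each tenfold increase in the measurement timescale multiplies the apparent rate by 10^−0.65, a bit more than a fourfold drop. Applying a short-term rate to a deep divergence would therefore tend to underestimate its age.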

