strict molecular clock
Recently Published Documents

Total documents: 6 (five years: 4)
H-index: 1 (five years: 0)

2021
Author(s): Roux-Cil Ferreira, Emmanuel Wong, Gopi Gugan, Kaitlyn Wade, Molly Liu, ...

Phylogenetics has played a pivotal role in the genomic epidemiology of SARS-CoV-2, for example in tracking the emergence and global spread of variants, and in scientific communication. However, the rapid accumulation of genomic data from around the world (over two million genomes are currently available in the GISAID database) is testing the limits of standard phylogenetic methods. Here, we describe a new approach to rapidly analyze and visualize large numbers of SARS-CoV-2 genomes. Using Python, genomes are filtered for problematic sites, incomplete coverage, and excessive divergence from a strict molecular clock. All differences from the reference genome, including indels, are extracted using minimap2 and compactly stored as a set of features for each genome. For each Pango lineage (https://cov-lineages.org), we collapse genomes with identical features into 'variants', generate 100 bootstrap samples of the feature set union to generate weights, and compute the symmetric differences between the weighted feature sets for every pair of variants. The resulting distance matrices are used to generate neighbor-joining trees in RapidNJ, which are converted into a majority-rule consensus tree for the lineage. Branches with support values below 50% or mean lengths below 0.5 differences are collapsed, and tip labels on affected branches are mapped to internal nodes as directly sampled ancestral variants. Currently, we process about 1.6 million genomes in approximately nine hours on 34 cores. The resulting trees are visualized using the JavaScript framework D3.js as 'beadplots', in which variants are represented by horizontal line segments, annotated with beads representing samples by collection date. Variants are linked by vertical edges to represent branches in the consensus tree. These visualizations are published at https://filogeneti.ca/CoVizu. All source code was released under an MIT license at https://github.com/PoonLab/covizu.
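
The collapsing, bootstrap weighting, and symmetric-difference steps described in this abstract can be illustrated with a minimal Python sketch. This is not the CoVizu code: the function names, the string encoding of features, and the toy genomes are all assumptions made for illustration, and the weights here are simply the number of times each feature is drawn when the feature set union is resampled with replacement.

```python
import random
from collections import defaultdict


def collapse_variants(genomes):
    """Group genome names that share an identical feature set (hypothetical helper)."""
    variants = defaultdict(list)
    for name, features in genomes.items():
        variants[frozenset(features)].append(name)
    return dict(variants)


def bootstrap_weights(union, rng):
    """Resample the feature union with replacement; a feature's weight is its draw count."""
    features = sorted(union)
    weights = dict.fromkeys(features, 0)
    for feature in rng.choices(features, k=len(features)):
        weights[feature] += 1
    return weights


def weighted_symdiff(a, b, weights):
    """Sum the weights of features present in exactly one of the two variants."""
    return sum(weights[f] for f in a ^ b)


def mean_distance(a, b, union, n_boot=100, seed=0):
    """Average the weighted symmetric difference over n_boot bootstrap replicates."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_boot):
        total += weighted_symdiff(a, b, bootstrap_weights(union, rng))
    return total / n_boot


# Toy input: feature strings are an assumed encoding (type|position|detail).
genomes = {
    "g1": ["~|240|T"],
    "g2": ["~|240|T"],
    "g3": ["~|240|T", "-|11287|9"],
}
variants = collapse_variants(genomes)
union = frozenset().union(*variants)
keys = sorted(variants, key=len)
print(len(variants), mean_distance(keys[0], keys[1], union))
```

In a full pipeline, one distance matrix per bootstrap replicate would be passed to a neighbor-joining program; the averaging above is only a compact way to show how the weights enter the distance.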


2021
Author(s): Edward Susko, Mike Steel, Andrew J. Roger

Two recent high-profile studies have attempted to use edge (branch) length ratios from large sets of phylogenetic trees to determine the relative ages of genes of different origins in the evolution of eukaryotic cells. This approach can be straightforwardly justified if substitution rates are constant over the tree for a given protein. However, such strict molecular clock assumptions are not expected to hold on the billion-year timescale. Here we propose an alternative set of conditions under which comparisons of edge length distributions from multiple sets of phylogenies of proteins with different origins can be validly used to discern the order of their origins. We also point out scenarios where these conditions are not expected to hold and caution is warranted.
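
The strict-clock justification mentioned in this abstract can be written out in one line. The notation is assumed for illustration and is not taken from the paper: $\ell_i$ is the expected length of edge $i$ in substitutions per site, $t_i$ its duration, and $\mu$ the substitution rate of the protein.

```latex
% Strict clock: one rate \mu for every edge of the protein's tree.
\ell_i = \mu\, t_i
\quad\Longrightarrow\quad
\frac{\ell_1}{\ell_2} = \frac{\mu\, t_1}{\mu\, t_2} = \frac{t_1}{t_2}

% Without a strict clock, each edge has its own rate \mu_i.
\ell_i = \mu_i\, t_i
\quad\Longrightarrow\quad
\frac{\ell_1}{\ell_2} = \frac{\mu_1}{\mu_2}\cdot\frac{t_1}{t_2}
```

Under a single rate the edge length ratio equals the ratio of durations, whereas with edge-specific rates it is confounded by the unknown factor $\mu_1/\mu_2$.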


2020
Author(s): Thijs Janzen, Folmer Bokma, Rampal S. Etienne

Although molecular mechanisms associated with the generation of mutations are highly conserved across taxa, there is widespread variation in mutation rates between evolutionary lineages. When phylogenies are reconstructed based on nucleotide sequences, such variation is typically accounted for by the assumption of a relaxed molecular clock, which, however, is just a statistical distribution of mutation rates without any underlying biological mechanism. Here, we propose that variation in accumulated mutations may be partly explained by an elevated mutation rate during speciation. Using simulations, we show how shifting mutations from branches to speciation events impacts inference of branching times in phylogenetic reconstruction. Furthermore, the resulting nucleotide alignments are better described by a relaxed than by a strict molecular clock. Thus, elevated mutation rates during speciation potentially explain part of the variation in substitution rates that is observed across the tree of life.
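
A deterministic toy calculation can show why mutations concentrated at speciation events would look like rate variation among lineages. This is a sketch under stated assumptions, not the authors' simulation: the additive model, the function names, and the parameter values are all hypothetical.

```python
import math


def expected_substitutions(time, n_speciations, base_rate, burst=0.0):
    """Expected substitutions per site on a root-to-tip path (assumed additive model).

    base_rate * time is the clock-like part; burst * n_speciations is the
    extra change added at each speciation event along the path.
    """
    return base_rate * time + burst * n_speciations


def apparent_rate(time, n_speciations, base_rate, burst=0.0):
    """Rate that a strict-clock analysis would attribute to the path."""
    return expected_substitutions(time, n_speciations, base_rate, burst) / time


# Two tips of the same age whose paths pass through different numbers of speciations.
age = 10.0
strict = [apparent_rate(age, n, base_rate=0.001) for n in (2, 8)]
bursty = [apparent_rate(age, n, base_rate=0.001, burst=0.0005) for n in (2, 8)]
print(strict, bursty)
```

With no burst both paths share one apparent rate, which is what a strict clock assumes. With a burst, the path through more speciation events has a higher apparent rate, so a relaxed clock would fit such data better.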


2019
Author(s): Fransiskus Xaverius Ivan, Akhila Deshpande, Chun Wei Lim, Xinrui Zhou, Jie Zheng, ...

Various computational and statistical approaches have been proposed to uncover the mutational patterns of rapidly evolving influenza viral genes. Nonetheless, these approaches mainly rely on sequence alignments, which can yield spurious mutations when sequences from different clades that coexist during particular periods of time are compared. To address this issue, we propose a phylogenetic tree-based pipeline that takes into account the evolutionary structure in the sequence data. Assuming that the sequences evolve progressively under a strict molecular clock, considering a competitive model that is based on a certain Markov model, and using a resampling approach to obtain robust estimates, we could capture statistically significant single mutations and co-mutations during the sequence evolution. Moreover, by considering the results obtained from analyses that consider all paths and the longest path in the resampled trees, we can categorize the mutational sites and suggest their relevance. Here we applied the pipeline to investigate the 50 years of evolution of the HA sequences of influenza A/H3N2 viruses. In addition to confirming previous knowledge on the A/H3N2 HA evolution, we also demonstrate the use of the pipeline to classify mutational sites according to whether they are able to enhance antigenic drift, compensate for other mutations that enhance antigenic drift, or both.


2018, Vol 10 (6), pp. 1631-1636
Author(s): Fabia U Battistuzzi, Qiqing Tao, Lance Jones, Koichiro Tamura, Sudhir Kumar

2017
Author(s): Fabia U. Battistuzzi, Qiqing Tao, Lance Jones, Koichiro Tamura, Sudhir Kumar

The RelTime method estimates divergence times when evolutionary rates vary among lineages. Theoretical analyses show that RelTime relaxes the strict molecular clock throughout a molecular phylogeny, and it performs well in the analysis of empirical and computer-simulated datasets in which evolutionary rates are variable. Lozano-Fernandez et al. (2017) found that the application of RelTime to one metazoan dataset (Erwin et al. 2011) produced equal rates for several ancient lineages, which led them to speculate that RelTime imposes a strict molecular clock for deep animal divergences. RelTime does not impose a strict molecular clock. The pattern observed by Lozano-Fernandez et al. (2017) was a result of the use of an option to assign the same rate to lineages in RelTime when the rates are not statistically significantly different. The median rate difference was 5% for many deep metazoan lineages for the Erwin et al. (2011) dataset, so the rate equality was not rejected. In fact, RelTime analysis with and without the option to test rate differences produced very similar time estimates. We found that the Bayesian time estimates vary widely depending on the root priors assigned, and that the use of less restrictive priors produces Bayesian divergence times that are concordant with those from RelTime for the Erwin et al. (2011) dataset. Therefore, it is prudent to discuss Bayesian estimates obtained under a range of priors in any discourse about molecular dating, including method comparisons.

