coalescent tree
Recently Published Documents



2021 ◽  
Author(s):  
Remi Matthey-Doret

Forward simulations are increasingly important in evolutionary genetics to simulate selection with realistic demography, mating systems and ecology. To reach the performance needed for genome-wide simulations, a number of new simulation techniques have been developed recently. Kelleher et al. (2018) introduced a technique that consists of recording the entire genetic history of the population and placing mutations on the coalescent tree. This method cannot model selection. I recently introduced a simulation technique that speeds up fitness calculation by assuming that fitness effects among haplotypes are multiplicative (Matthey-Doret, 2021). More precisely, fitness measures are stored for subsets of the genome and, at the time of reproduction, if no recombination happens within a given subset, then the fitness of this subset for the offspring haplotype is directly inferred from the parental haplotype. Here, I present a hybrid of the above two techniques. The algorithm records the genetic history of a species, directly places the mutations on the tree and infers the fitness of subsets of the genome from parental haplotypes. At recombinant sites, the algorithm explores the tree to reconstruct the genetic data at the recombining segment. I benchmarked this new technique, implemented in SimBit, and report a substantial performance improvement over previous techniques for simulating selection. This improvement is particularly marked at low recombination rates. Such developments of new simulation techniques are pushing the horizon of the realism with which we can simulate the molecular evolution of species.
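The fitness-caching idea described in this abstract can be illustrated with a minimal Python sketch. It is not SimBit's implementation and it leaves out the tree-recording part entirely. The function names, the 0/1 allele encoding and the per-site effects `s` (each mutated site contributing a factor `1 + s`) are assumptions made for illustration only.

```python
from math import prod

def block_fitness(alleles, effects):
    """Multiplicative fitness of one genome block (hypothetical encoding:
    alleles are 0/1 flags, each mutated site contributes a factor 1 + s)."""
    return prod(1.0 + s for allele, s in zip(alleles, effects) if allele)

def offspring_block(parent_a, parent_b, cached_fitness_a, crossover, effects):
    """Build one offspring block and its fitness.

    crossover is None when no recombination falls inside the block; the
    cached parental fitness is then reused without touching the alleles.
    Otherwise the block is rebuilt from both parents and recomputed.
    """
    if crossover is None:
        return list(parent_a), cached_fitness_a
    child = list(parent_a[:crossover]) + list(parent_b[crossover:])
    return child, block_fitness(child, effects)

def haplotype_fitness(block_fitnesses):
    """Under the multiplicative assumption, block fitnesses simply multiply."""
    return prod(block_fitnesses)
```

The saving comes from the first branch of `offspring_block`: when recombination is rare, most blocks are inherited intact and their fitness is copied rather than recomputed, which is consistent with the larger gains the abstract reports at low recombination rates.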


2020 ◽  
Author(s):  
Niema Moshiri

Abstract Motivation The ability to simulate coalescent viral phylogenies constrained by a given transmission network can enable the benchmarking of computational tools used in molecular epidemiology as well as the ability to gain insights into unobservable aspects of the virology of a novel pathogen. However, such simulation experiments require generating a large number of technical simulation replicates, and existing tools for coalescent simulations along a transmission network are too slow to conduct such experiments at the scale of the global population. Results CoaTran is a massively scalable tool that simulates a coalescent viral phylogeny constrained by a user-provided transmission network. CoaTran is written in highly-optimized C++ code and can generate global population scale phylogenetic coalescent simulations in seconds to minutes. Availability CoaTran is freely available at https://github.com/niemasd/CoaTran as open-source software. [email protected] Supplementary information Supplementary data are available online.


Author(s):  
Jordan Douglas

Abstract Summary Visualization is a vital task in phylogenetics and yet there is a deficit in programs which visualize the multispecies coalescent (MSC) model. UglyTrees (UT) is an easy-to-use program for visualizing multiple gene trees embedded within a single species tree. The mapping between gene and species nodes is automatically detected, allowing for ready access to the program. UT can scrape the contents of a website for MSC analyses, enabling the sharing of interactive MSC figures through optional parameters in the URL. If a posterior distribution is uploaded, the transitions between MSC states are animated, allowing the visual tracking of trees throughout the sequence. Availability and implementation UT runs in all major web browsers including mobile devices, and is hosted at www.uglytrees.nz. The MIT-licensed code is available at https://github.com/UglyTrees/uglytrees.github.io.


Author(s):  
Kris V Parag ◽  
Oliver G Pybus ◽  
Chieh-Hsi Wu

Abstract In Bayesian phylogenetics, the coalescent process provides an informative framework for inferring dynamical changes in the effective size of a population from a sampled phylogeny (or tree) of its sequences. Popular coalescent inference methods such as the Bayesian Skyline Plot, Skyride and Skygrid all model this population size with a discontinuous, piecewise-constant likelihood but apply a smoothing prior to ensure that posterior population size estimates transition gradually with time. These prior distributions implicitly encode extra population size information that is not available from the observed coalescent tree (data). Here we present a novel statistic, Ω, to quantify and disaggregate the relative contributions of the coalescent data and prior assumptions to the resulting posterior estimate precision. Our statistic also measures the additional mutual information introduced by such priors. Using Ω we show that, because it is surprisingly easy to over-parametrise piecewise-constant population models, common smoothing priors can lead to overconfident and potentially misleading conclusions, even under robust experimental designs. We propose Ω as a useful tool for detecting when posterior estimate precision is overly reliant on prior choices.


2019 ◽  
Author(s):  
Filippo Disanto ◽  
Thomas Wiehe

Abstract The Kingman coalescent process is a classical model of gene genealogies in population genetics. It generates Yule-distributed, binary ranked tree topologies (also called histories) with a finite number n of leaves, together with n − 1 exponentially distributed time lengths: one for each layer of the history. Using a discrete approach, we study the lengths of the external branches of Yule-distributed histories, where the length of an external branch is defined as the rank of its parent node. We study the multiplicity of external branches of given length in a random history of n leaves. A correspondence between the external branches of the ordered histories of size n and the non-peak entries of the permutations of size n − 1 provides easy access to the length distributions of the first and second longest external branch in a random Yule history and coalescent tree of size n. The length of the longest external branch is also studied as a function of the root balance of a random tree. As a practical application, we compare the observed and expected number of mutations on the longest external branches in samples from natural populations.
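The discrete notion of external branch length used in this abstract can be made concrete with a short simulation sketch in Python. It draws one random history by merging uniformly chosen pairs of lineages and records, for each leaf, the index of the coalescent event that ends its external branch. Two points are assumptions rather than statements from the abstract: events are numbered here from the leaves backward in time (the paper's ranking convention may differ), and the waiting time with k lineages is drawn with the standard Kingman rate k(k − 1)/2 in coalescent units. The function name is hypothetical.

```python
import random

def simulate_history(n, rng=None):
    """Simulate one Kingman history with n leaves.

    Returns (ext_rank, times):
      ext_rank[leaf] = index (1..n-1, counted from the leaves) of the
                       coalescent event at which the leaf first merges;
      times          = the n-1 exponential waiting times, one per layer.
    """
    rng = rng or random.Random()
    active = list(range(n))      # ids below n are leaves
    ext_rank = {}
    times = []
    next_id = n                  # ids for newly created internal nodes
    for event in range(1, n):
        k = len(active)
        # Waiting time while k lineages remain (rate k choose 2).
        times.append(rng.expovariate(k * (k - 1) / 2))
        i, j = rng.sample(range(k), 2)
        for lineage in (active[i], active[j]):
            if lineage < n:
                ext_rank[lineage] = event
        # Remove the merged pair (larger index first) and add their parent.
        for idx in sorted((i, j), reverse=True):
            active.pop(idx)
        active.append(next_id)
        next_id += 1
    return ext_rank, times
```

Repeating the simulation many times and tabulating `max(ext_rank.values())` gives an empirical view of the longest external branch, which is the quantity the abstract treats analytically.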


2019 ◽  
Vol 4 (2) ◽  
pp. 108-123 ◽  
Author(s):  
Andrew M Ritchie ◽  
Simon Y W Ho

Abstract Bayesian phylogenetic methods derived from evolutionary biology can be used to reconstruct the history of human languages using databases of cognate words. These analyses have produced exciting results regarding the origins and dispersal of linguistic and cultural groups through prehistory. Bayesian lexical dating requires the specification of priors on all model parameters. This includes the use of a prior on divergence times, often combined with a prior on tree topology and referred to as a tree prior. Violation of the underlying assumptions of the tree prior can lead to an erroneous estimate of the timescale of language evolution. To investigate these impacts, we tested the sensitivity of Bayesian dating to the tree prior in analyses of four lexical data sets. Our results show that estimates of the origin times of language families are robust to the choice of tree prior for lexical data, though less so than when Bayesian phylogenetic methods are used to analyse genetic data sets. We also used the relative fit of speciation and coalescent tree priors to determine the ability of speciation models to describe language diversification at four different taxonomic levels. We found that speciation priors were preferred over a constant-size coalescent prior regardless of taxonomic scale. However, data sets with narrower taxonomic and geographic sampling exhibited a poorer fit to ideal birth–death model expectations. Our results encourage further investigation into the nature of language diversification at different sampling scales.


2019 ◽  
Vol 15 ◽  
pp. 117693431988361
Author(s):  
Cortland K Griswold

In polymerase chain reaction (PCR)-based DNA sequencing studies, there is the possibility that mutations at the binding sites of primers result in no primer binding and therefore no amplification. In this article, we call such mutations PCR dropouts and present a coalescent-based theory of the distribution of segregating PCR dropout mutations within a species. We show that dropout mutations typically occur along branch sections that are at or near the base of a coalescent tree, if at all. Given that a dropout mutation occurs along a branch section near the base of a tree, there is a good chance that it causes the alleles of a large fraction of a species to go unamplified, which distorts the tree shape. Expected coalescence times and distributions of pairwise sequence differences in the presence of PCR dropout mutations are derived under the assumptions of both neutrality and background selection. These expectations differ from when PCR dropout mutations are absent and may form the basis of inferential approaches to detect the presence of dropout mutations, as well as the development of unbiased estimators of statistics associated with population-level genetic variation.


2018 ◽  
Author(s):  
Johannes Wirtz ◽  
Martina Rauscher ◽  
Thomas Wiehe

Abstract We revisit the classical concept of two-locus linkage disequilibrium (LD) and introduce a novel way of looking at haplotypes. In contrast to defining haplotypes as allele combinations at two marker loci, we concentrate on the clustering of sampled chromosomes induced by their coalescent genealogy. The root of a binary coalescent tree defines two clusters of chromosomes. At two different loci this assignment may be different as a result of recombination. We show that the amount of shared chromosomes among clusters at two different loci, measured by the squared correlation, constitutes a natural measure of LD. We call this topological LD (tLD) since it is induced by the topology of the coalescent tree. We find that it decays more slowly with distance between loci than conventional LD. Furthermore, tLD has a smaller coefficient of variation, which should render it more accurate for any kind of mapping purposes than conventional LD. We conclude with a practical application to the LCT region in human populations.
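As a rough illustration of the squared-correlation measure described in this abstract, the Python sketch below takes, for each sampled chromosome, a 0/1 flag saying on which side of the root it falls at each of two loci, and computes the usual r² statistic on those flags. How the authors obtain the root clusters from data is not covered here. The function name and the input encoding are assumptions for illustration.

```python
def topological_ld(side_a, side_b):
    """Squared correlation between root-cluster memberships at two loci.

    side_a[i] and side_b[i] are 0/1 flags giving the side of the root on
    which chromosome i falls at locus A and locus B (hypothetical encoding).
    """
    n = len(side_a)
    if n == 0 or n != len(side_b):
        raise ValueError("both loci need the same non-empty sample")
    p_a = sum(side_a) / n
    p_b = sum(side_b) / n
    p_ab = sum(1 for a, b in zip(side_a, side_b) if a and b) / n
    d = p_ab - p_a * p_b                      # covariance of the two flags
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("each locus must split the sample into two clusters")
    return d * d / denom
```

Because the correlation is squared, swapping the 0 and 1 labels at either locus leaves the value unchanged, so the arbitrary naming of the two root clusters does not matter.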


Genetics ◽  
2017 ◽  
Vol 208 (2) ◽  
pp. 791-805 ◽  
Author(s):  
Zongfeng Yang ◽  
Junrui Li ◽  
Thomas Wiehe ◽  
Haipeng Li
