A Revised Model of Anatomically Modern Human Expansions Out of Africa through a Machine Learning Approximate Bayesian Computation Approach

Genes ◽  
2020 ◽  
Vol 11 (12) ◽  
pp. 1510
Author(s):  
Maria Teresa Vizzari ◽  
Andrea Benazzo ◽  
Guido Barbujani ◽  
Silvia Ghirotto

There is a wide consensus in considering Africa as the birthplace of anatomically modern humans (AMH), but the dispersal pattern and the main routes followed by our ancestors to colonize the world are still matters of debate. It is still an open question whether AMH left Africa through a single process, dispersing almost simultaneously over Asia and Europe, or in two main waves, first through the Arabian Peninsula into southern Asia and Australo-Melanesia, and later through a northern route crossing the Levant. The development of new methodologies for inferring population history and the availability of worldwide high-coverage whole-genome sequences have not resolved this debate. In this work, we test the two main out-of-Africa hypotheses through an Approximate Bayesian Computation approach based on the Random Forest algorithm. Using simulated data, we evaluated the ability of the method to discriminate between the alternative models of AMH dispersal out of Africa. Having established that the models are distinguishable, we compared simulated data with real genomic variation from modern and archaic populations. This analysis showed that a model of multiple dispersals is four times as likely as the alternative single-dispersal model. According to our estimates, the two dispersal processes may be placed around 74,000 and around 46,000 years ago, respectively.

2008 ◽  
Vol 24 (23) ◽  
pp. 2713-2719 ◽  
Author(s):  
Jean-Marie Cornuet ◽  
Filipe Santos ◽  
Mark A. Beaumont ◽  
Christian P. Robert ◽  
Jean-Michel Marin ◽  
...  

Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 961
Author(s):  
Mijung Park ◽  
Margarita Vinaroz ◽  
Wittawat Jitkrittum

We developed a novel approximate Bayesian computation (ABC) framework, ABCDP, which produces differentially private (DP) and approximate posterior samples. Our framework takes advantage of the sparse vector technique (SVT), widely studied in the differential privacy literature. SVT incurs the privacy cost only when a condition (whether a quantity of interest is above/below a threshold) is met. If the condition is met only sparsely during the repeated queries, SVT can drastically reduce the cumulative privacy loss, unlike the usual case where every query incurs a privacy loss. In ABC, the quantity of interest is the distance between observed and simulated data, and only when the distance is below a threshold can we take the corresponding prior sample as a posterior sample. Hence, applying SVT to ABC is a natural way to turn an ABC algorithm into a privacy-preserving variant with minimal modification, while still yielding posterior samples with a high privacy level. We theoretically analyzed the interplay between the noise added for privacy and the accuracy of the posterior samples. We applied ABCDP to several data simulators and showed the efficacy of the proposed framework.
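The acceptance step described in this abstract can be illustrated with a minimal sketch. It assumes a toy Gaussian simulator, a uniform prior, and a hypothetical function name `abc_svt_sketch`, none of which come from the paper. It is not the ABCDP algorithm as published, and the Laplace noise scales are placeholders that carry no privacy guarantee; the point is only to show where the noisy threshold and the noisy distance enter a rejection sampler.

```python
import random


def laplace(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def abc_svt_sketch(observed_mean, n_obs, threshold, noise_scale,
                   max_accept, n_proposals, seed=0):
    """Rejection ABC with a sparse-vector style noisy comparison (toy model)."""
    rng = random.Random(seed)
    # Perturb the threshold once, as in the sparse vector pattern.
    noisy_threshold = threshold + laplace(noise_scale, rng)
    accepted = []
    for _ in range(n_proposals):
        theta = rng.uniform(-5.0, 5.0)  # draw from an assumed uniform prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        distance = abs(sum(simulated) / n_obs - observed_mean)
        # Compare a noisy distance with the noisy threshold (placeholder scale).
        if distance + laplace(2.0 * noise_scale, rng) <= noisy_threshold:
            accepted.append(theta)
            # Stop after a fixed number of acceptances, since only these
            # "condition met" events are the ones that would incur a cost.
            if len(accepted) >= max_accept:
                break
    return accepted


samples = abc_svt_sketch(observed_mean=1.0, n_obs=20, threshold=0.3,
                         noise_scale=0.02, max_accept=50, n_proposals=5000)
```

In this sketch the cap `max_accept` plays the role of the number of allowed threshold crossings; calibrating the noise to a target privacy level is left out on purpose.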


Author(s):  
Waleed Almutiry ◽  
Rob Deardon

Infectious disease transmission between individuals in a heterogeneous population is often best modelled through a contact network. However, such contact network data are often unobserved. Such missing data can be accounted for in a Bayesian data-augmented framework using Markov chain Monte Carlo (MCMC). Unfortunately, fitting models in such a framework can be highly computationally intensive. We investigate the fitting of network-based infectious disease models with completely unknown contact networks using approximate Bayesian computation population Monte Carlo (ABC-PMC) methods. This is done in the context of both simulated data and data from the UK 2001 foot-and-mouth disease epidemic. We show that ABC-PMC is able to obtain reasonable approximations of the underlying infectious disease model with huge savings in computation time when compared to a full Bayesian MCMC analysis.
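A generic ABC-PMC loop can be sketched as follows. The sketch assumes a toy model (inferring a Gaussian mean under a uniform prior) rather than a network-based epidemic model, and the function names, tolerance schedule, and kernel width (twice the weighted particle variance, a common choice) are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def simulate_mean(theta, n_obs, rng):
    # Toy simulator: sample mean of n_obs Gaussian draws centred on theta.
    return sum(rng.gauss(theta, 1.0) for _ in range(n_obs)) / n_obs


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def abc_pmc(observed, n_obs, tolerances, n_particles, seed=0):
    rng = random.Random(seed)
    lo, hi = -5.0, 5.0  # assumed uniform prior bounds
    # Generation 0: plain rejection sampling from the prior.
    particles = []
    while len(particles) < n_particles:
        theta = rng.uniform(lo, hi)
        if abs(simulate_mean(theta, n_obs, rng) - observed) <= tolerances[0]:
            particles.append(theta)
    weights = [1.0 / n_particles] * n_particles
    for eps in tolerances[1:]:
        mean = sum(w * p for w, p in zip(weights, particles))
        var = sum(w * (p - mean) ** 2 for w, p in zip(weights, particles))
        sd = math.sqrt(2.0 * var)  # perturbation kernel width
        new_particles, new_weights = [], []
        while len(new_particles) < n_particles:
            # Resample a particle by weight and perturb it.
            theta = rng.gauss(rng.choices(particles, weights=weights)[0], sd)
            if not lo <= theta <= hi:
                continue  # zero prior density
            if abs(simulate_mean(theta, n_obs, rng) - observed) <= eps:
                # Importance weight: prior density over proposal mixture density.
                proposal = sum(w * normal_pdf(theta, p, sd)
                               for w, p in zip(weights, particles))
                new_particles.append(theta)
                new_weights.append((1.0 / (hi - lo)) / proposal)
        total = sum(new_weights)
        particles = new_particles
        weights = [w / total for w in new_weights]
    return particles, weights


particles, weights = abc_pmc(1.0, 20, [1.0, 0.5, 0.2], 200)
posterior_mean = sum(w * p for w, p in zip(weights, particles))
```

Shrinking the tolerance across generations while reusing the previous population as the proposal is what makes this cheaper than rejection sampling at the final tolerance from the prior alone.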


2018 ◽  
Author(s):  
Silvia Ghirotto ◽  
Maria Teresa Vizzari ◽  
Francesca Tassi ◽  
Guido Barbujani ◽  
Andrea Benazzo

Inferring past demographic histories is crucial in population genetics, and the number of complete genomes now available should in principle facilitate this inference. In practice, however, the available inferential methods suffer from severe limitations. Although hundreds of complete genomes can be analyzed simultaneously, complex demographic processes can easily exceed computational constraints, and the procedures for evaluating the reliability of the estimates further increase the computational effort. Here we present an Approximate Bayesian Computation (ABC) framework, based on the Random Forest algorithm, to infer complex past population processes using complete genomes. To do this, we propose to summarize the data by the full genomic distribution of the four mutually exclusive categories of segregating sites (FDSS), a statistic that is fast to compute from unphased genome data. We constructed an efficient ABC pipeline and tested how accurately it allows one to recognize the true model among models of increasing complexity, using simulated data and taking into account different sampling strategies in terms of the number of individuals analyzed and the number and size of the genetic loci considered. We tested the power of the FDSS to be informative about even complex evolutionary histories and compared the results with those obtained by summarizing the data through the unfolded Site Frequency Spectrum, thus identifying, for both statistics, the experimental conditions that maximize inferential power. Finally, we analyzed two datasets, testing models (a) of the dispersal of anatomically modern humans out of Africa and (b) of the evolutionary relationships among the three orangutan species inhabiting Borneo and Sumatra.
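The abstract does not list the four categories. The sketch below assumes the usual two-population classification (sites private to population 1, private to population 2, shared polymorphisms, and fixed differences) and a hypothetical data layout in which each site is a pair of 0/1 allele lists. It counts the categories per locus and then tabulates, for each category, how many loci carry a given count, which is one plausible reading of a "full genomic distribution".

```python
CATEGORIES = ("private_1", "private_2", "shared", "fixed")


def classify_site(pop1, pop2):
    """Assign one site to a category; only allele presence matters, not phase."""
    seen1, seen2 = set(pop1), set(pop2)
    poly1, poly2 = len(seen1) > 1, len(seen2) > 1
    if poly1 and poly2:
        return "shared"
    if poly1:
        return "private_1"
    if poly2:
        return "private_2"
    if seen1 != seen2:
        return "fixed"  # each population fixed for a different allele
    return None  # monomorphic across both populations


def locus_counts(sites):
    counts = dict.fromkeys(CATEGORIES, 0)
    for pop1, pop2 in sites:
        category = classify_site(pop1, pop2)
        if category is not None:
            counts[category] += 1
    return counts


def fdss(loci, max_count):
    """For each category, table[c][k] = number of loci with k sites of type c.

    Counts above max_count are pooled into the last bin.
    """
    table = {c: [0] * (max_count + 1) for c in CATEGORIES}
    for sites in loci:
        for category, k in locus_counts(sites).items():
            table[category][min(k, max_count)] += 1
    return table


loci = [
    [([0, 1], [0, 0]), ([0, 0], [1, 1]), ([0, 1], [0, 1])],
    [([0, 0], [0, 1]), ([0, 0], [0, 1]), ([1, 1], [1, 1])],
]
summary = fdss(loci, max_count=3)
```

Because the classification only asks which alleles are present in each population, it does not need phased haplotypes, consistent with the abstract's remark about unphased data.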


2021 ◽  
Vol 8 (6) ◽  
pp. 202237
Author(s):  
Yunchen Xiao ◽  
Len Thomas ◽  
Mark A. J. Chaplain

We present two different methods to estimate parameters within a partial differential equation model of cancer invasion. The model describes the spatio-temporal evolution of three variables (tumour cell density, extracellular matrix density and matrix degrading enzyme concentration) in a one-dimensional tissue domain. The first method is a likelihood-free approach associated with approximate Bayesian computation; the second is a two-stage gradient matching method based on smoothing the data with a generalized additive model (GAM) and matching gradients from the GAM to those from the model. Both methods performed well on simulated data. To increase realism, we additionally tested the gradient matching scheme with simulated measurement error and found that the ability to estimate some model parameters deteriorated rapidly as measurement error increased.
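The two-stage idea behind gradient matching can be shown on a much simpler problem. The sketch below makes two loudly stated substitutions: it uses the ordinary differential equation dy/dt = -k y instead of the cancer invasion PDE, and central finite differences on noise-free data instead of a GAM smoother. All names are hypothetical. Stage one estimates gradients from the data; stage two picks the parameter whose model gradients best match them in the least-squares sense.

```python
import math


def central_gradients(times, values):
    # Stage 1: estimate dy/dt at interior points directly from the data.
    return [
        (values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1])
        for i in range(1, len(values) - 1)
    ]


def estimate_decay_rate(times, values):
    # Stage 2: minimise sum((g_i + k * y_i)^2) over k, which has the
    # closed-form solution k = -sum(g_i * y_i) / sum(y_i^2).
    grads = central_gradients(times, values)
    interior = values[1:-1]
    numerator = sum(g * y for g, y in zip(grads, interior))
    denominator = sum(y * y for y in interior)
    return -numerator / denominator


times = [0.01 * i for i in range(301)]
values = [math.exp(-0.7 * t) for t in times]  # synthetic data with k = 0.7
k_hat = estimate_decay_rate(times, values)
```

No differential equation is solved during fitting, which is the appeal of the approach; the cost is that noise in the data is amplified by differentiation, in line with the sensitivity to measurement error reported in the abstract.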


Author(s):  
Cecilia Viscardi ◽  
Michele Boreale ◽  
Fabio Corradi

We consider the problem of sample degeneracy in Approximate Bayesian Computation. It arises when proposed values of the parameters, once given as input to the generative model, rarely lead to simulations resembling the observed data and are hence discarded. Such “poor” parameter proposals do not contribute at all to the representation of the parameter’s posterior distribution. This leads to a very large number of required simulations and/or a waste of computational resources, as well as to distortions in the computed posterior distribution. To mitigate this problem, we propose an algorithm, referred to as the Large Deviations Weighted Approximate Bayesian Computation algorithm, where, via Sanov’s Theorem, strictly positive weights are computed for all proposed parameters, thus avoiding the rejection step altogether. In order to derive a computable asymptotic approximation from Sanov’s result, we adopt the information theoretic “method of types” formulation of the method of Large Deviations, thus restricting our attention to models for i.i.d. discrete random variables. Finally, we experimentally evaluate our method through a proof-of-concept implementation.
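For orientation, the standard method-of-types statement of Sanov's theorem is given below. The notation is an assumption of this note rather than the paper's: $P$ is a distribution on a finite alphabet $\mathcal{X}$, $T_n$ is the type (empirical distribution) of $n$ i.i.d. draws from $P$, $E$ is a set of distributions on $\mathcal{X}$, and $D(Q\|P)$ is the Kullback-Leibler divergence in nats. How the paper turns this bound into parameter weights is not specified in the abstract and is not reproduced here.

```latex
\[
  P^n\bigl(T_n \in E\bigr) \;\le\; (n+1)^{|\mathcal{X}|}
  \exp\Bigl(-n \inf_{Q \in E} D(Q \,\|\, P)\Bigr),
  \qquad
  D(Q \,\|\, P) = \sum_{x \in \mathcal{X}} Q(x) \log \frac{Q(x)}{P(x)} .
\]
% If E is the closure of its interior, the bound is tight to first order:
\[
  \lim_{n \to \infty} \frac{1}{n} \log P^n\bigl(T_n \in E\bigr)
  \;=\; -\inf_{Q \in E} D(Q \,\|\, P).
\]
```

The exponent is finite whenever some $Q \in E$ is absolutely continuous with respect to $P$, which is what allows a strictly positive approximate probability to be attached to a proposal instead of a hard accept or reject decision.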

