Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach

PLoS Genetics ◽  
2016 ◽  
Vol 12 (3) ◽  
pp. e1005877 ◽  
Author(s):  
Simon Boitard ◽  
Willy Rodríguez ◽  
Flora Jay ◽  
Stefano Mona ◽  
Frédéric Austerlitz
Author(s):  
Guillaume Laval ◽  
Etienne Patin ◽  
Pierre Boutillier ◽  
Lluis Quintana-Murci

Over the last 100,000 years, humans have spread across the globe and encountered a highly diverse set of environments to which they have had to adapt. Genome-wide scans of selection are a powerful tool for detecting selective sweeps. However, because of unknown fractions of undetected sweeps and false discoveries, the numbers of detected sweeps often poorly reflect actual numbers of selective sweeps in populations. The thousands of soft sweeps on standing variation recently evidenced in humans have also been interpreted as a majority of mis-classified neutral regions. In such a context, the extent of human adaptation remains poorly understood. We present a new rationale to estimate these actual numbers of sweeps expected over the last 100,000 years (denoted by X) from genome-wide population data, considering both hard sweeps and selective sweeps on standing variation. We implemented an approximate Bayesian computation framework and showed, based on computer simulations, that such a method can properly estimate X. We then jointly estimated the number of selective sweeps, their mean intensity and age in several 1000G African, European and Asian populations. Our estimations of X, which were only weakly sensitive to demographic misspecifications, revealed very limited numbers of sweeps regardless of the frequency of the selected alleles at the onset of selection and the completion of sweeps. We estimated ∼80 sweeps on average across fifteen 1000G populations when assuming incomplete sweeps only and ∼140 selective sweeps in non-African populations when incorporating complete sweeps in our simulations. The method proposed may help to address controversies on the number of selective sweeps in populations, guiding further genome-wide investigations of recent positive selection.


2019 ◽  
Vol 36 (11) ◽  
pp. 2451-2461 ◽  
Author(s):  
Bo-Wen Zhang ◽  
Lin-Lin Xu ◽  
Nan Li ◽  
Peng-Cheng Yan ◽  
Xin-Hua Jiang ◽  
...  

Abstract Persian walnut (Juglans regia) is cultivated worldwide for its high-quality wood and nuts, but its origin has remained mysterious because in phylogenies it occupies an unresolved position between American black walnuts and Asian butternuts. Equally unclear is the origin of the only American butternut, J. cinerea. We resequenced the whole genome of 80 individuals from 19 of the 22 species of Juglans and assembled the genome of its relatives Pterocarya stenoptera and Platycarya strobilacea. Using phylogenetic-network analysis of single-copy nuclear genes, genome-wide site pattern probabilities, and Approximate Bayesian Computation, we discovered that J. regia (and its landrace J. sigillata) arose as a hybrid between the American and the Asian lineages and that J. cinerea resulted from massive introgression from an immigrating Asian butternut into the genome of an American black walnut. Approximate Bayesian Computation modeling placed the hybrid origin in the late Pliocene, ∼3.45 My, with both parental lineages since having gone extinct in Europe.


2016 ◽  
Author(s):  
Simon Boitard ◽  
Willy Rodríguez ◽  
Flora Jay ◽  
Stefano Mona ◽  
Frédéric Austerlitz

Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
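The folded allele frequency spectrum mentioned above can be computed directly from unphased, unpolarized genotypes, because it only needs the minor allele count at each SNP. The following is a minimal sketch of that idea, not the PopSizeABC implementation: the function name `folded_afs`, the genotype encoding (0, 1 or 2 copies of an arbitrary allele per diploid individual), and the `min_minor_count` filter are all assumptions made for illustration.

```python
from collections import Counter


def folded_afs(genotypes, min_minor_count=1):
    """Return the folded allele frequency spectrum as proportions.

    genotypes: list of SNPs; each SNP is a list with one entry per diploid
        individual, giving the count (0, 1 or 2) of an arbitrarily chosen
        allele. No phasing or ancestral-state information is needed.
    min_minor_count: hypothetical filter; SNPs whose minor allele count is
        below this value are ignored (1 just drops monomorphic sites).
    """
    if not genotypes:
        return []
    n_chrom = 2 * len(genotypes[0])  # number of sampled chromosomes
    counts = Counter()
    for snp in genotypes:
        allele_count = sum(snp)
        # Folding: use the rarer of the two alleles, whichever it is.
        minor = min(allele_count, n_chrom - allele_count)
        if minor >= min_minor_count:
            counts[minor] += 1
    total = sum(counts.values())
    # One bin per possible minor allele count, from 1 to n_chrom // 2.
    return [counts[k] / total if total else 0.0
            for k in range(1, n_chrom // 2 + 1)]


if __name__ == "__main__":
    # Toy data: 4 SNPs typed in 3 diploid individuals (6 chromosomes).
    toy = [[0, 1, 0], [2, 2, 1], [1, 1, 1], [0, 0, 0]]
    print(folded_afs(toy))
```

Raising `min_minor_count` is one simple way to restrict the spectrum to SNPs with common alleles, which the abstract reports as the condition for robustness to sequencing errors.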


Genetics ◽  
2021 ◽  
Author(s):  
Guillaume Laval ◽  
Etienne Patin ◽  
Pierre Boutillier ◽  
Lluis Quintana-Murci

Abstract During their dispersals over the last 100,000 years, modern humans have been exposed to a large variety of environments, resulting in genetic adaptation. While genome-wide scans for the footprints of positive Darwinian selection have increased knowledge of genes and functions potentially involved in human local adaptation, they have globally produced evidence of a limited contribution of selective sweeps in humans. Conversely, studies based on machine learning algorithms suggest that recent sweeps from standing variation are widespread in humans, an observation that has been recently questioned. Here, we sought to formally quantify the number of recent selective sweeps in humans, by leveraging approximate Bayesian computation and whole-genome sequence data. Our computer simulations revealed suitable ABC estimations, regardless of the frequency of the selected alleles at the onset of selection and the completion of sweeps. Under a model of recent selection from standing variation, we inferred that an average of 68 (from 56 to 79) and 140 (from 94 to 198) sweeps occurred over the last 100,000 years of human history, in African and Eurasian populations, respectively. The former estimation is compatible with human adaptation rates estimated since divergence with chimps, and reveals numbers of sweeps per generation per site in the range of values estimated in Drosophila. Our results confirm the rarity of selective sweeps in humans and show a low contribution of sweeps from standing variation to recent human adaptation.


Author(s):  
Cecilia Viscardi ◽  
Michele Boreale ◽  
Fabio Corradi

Abstract We consider the problem of sample degeneracy in Approximate Bayesian Computation. It arises when proposed values of the parameters, once given as input to the generative model, rarely lead to simulations resembling the observed data and are hence discarded. Such “poor” parameter proposals do not contribute at all to the representation of the parameter’s posterior distribution. This leads to a very large number of required simulations and/or a waste of computational resources, as well as to distortions in the computed posterior distribution. To mitigate this problem, we propose an algorithm, referred to as the Large Deviations Weighted Approximate Bayesian Computation algorithm, where, via Sanov’s Theorem, strictly positive weights are computed for all proposed parameters, thus avoiding the rejection step altogether. In order to derive a computable asymptotic approximation from Sanov’s result, we adopt the information theoretic “method of types” formulation of the method of Large Deviations, thus restricting our attention to models for i.i.d. discrete random variables. Finally, we experimentally evaluate our method through a proof-of-concept implementation.
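The sample degeneracy described above is easy to reproduce with plain rejection ABC on a toy model. The sketch below illustrates the problem only; it is not the Large Deviations Weighted algorithm and does not use Sanov's Theorem. The Bernoulli model, the uniform prior, the frequency-based distance, and the name `rejection_abc` are all assumptions chosen for illustration.

```python
import random


def rejection_abc(observed, n_proposals, epsilon, seed=0):
    """Plain rejection ABC for the success probability of i.i.d. Bernoulli data.

    observed: list of 0/1 observations.
    n_proposals: number of parameter values drawn from the prior.
    epsilon: tolerance on the distance between simulated and observed
        success frequencies.
    Returns (accepted_parameters, acceptance_rate).
    """
    rng = random.Random(seed)
    n = len(observed)
    obs_freq = sum(observed) / n
    accepted = []
    for _ in range(n_proposals):
        p = rng.random()  # proposal from a uniform prior on [0, 1)
        # Run the generative model with the proposed parameter.
        simulated = [1 if rng.random() < p else 0 for _ in range(n)]
        sim_freq = sum(simulated) / n
        # Proposals whose simulation is too far from the data are discarded
        # and contribute nothing to the posterior sample.
        if abs(sim_freq - obs_freq) <= epsilon:
            accepted.append(p)
    return accepted, len(accepted) / n_proposals


if __name__ == "__main__":
    data = [1] * 3 + [0] * 17  # 3 successes out of 20 trials
    for eps in (0.2, 0.05, 0.0):
        kept, rate = rejection_abc(data, n_proposals=2000, epsilon=eps)
        print(f"epsilon={eps}: kept {len(kept)} of 2000 (rate {rate:.3f})")
```

Tightening `epsilon` improves the approximation but discards a growing share of the simulations, which is the waste that a weighting scheme with strictly positive weights for every proposal is meant to avoid.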

