A conditional likelihood is required to estimate the selection coefficient in ancient DNA

2016 ◽  
Author(s):  
Angelo Valleriani

Time series of allele frequencies are a useful and unique set of data for determining the strength of natural selection against the background of genetic drift. Technically, the selection coefficient is estimated by means of a likelihood function built under the hypothesis that the available trajectory spans a sufficiently large portion of the fitness landscape. Especially for ancient DNA, however, often only a single such trajectory is available and the coverage of the fitness landscape is very limited. In fact, a single trajectory is more representative of a process conditioned on both its initial and its final state than of a process free to visit the available fitness landscape. Based on two models of population genetics, here we show how to build a likelihood function for the selection coefficient that takes the statistical peculiarity of single trajectories into account. We show that this conditional likelihood delivers a precise estimate of the selection coefficient even when allele frequencies are close to fixation, whereas the unconditioned likelihood fails. Finally, we discuss the fact that the traditional, unconditioned likelihood always delivers an answer, which is often unfalsifiable and appears reasonable even when it is not correct.
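For orientation, the traditional, unconditioned likelihood that this abstract criticises can be sketched as follows. This is a minimal illustration and not the author's method: it assumes a haploid Wright-Fisher model with genic selection, fully observed allele counts in consecutive generations, and a simple grid search. The function names, the example counts, and the grid are all hypothetical, and the conditional likelihood proposed in the paper is not implemented here.

```python
import math

def next_freq_mean(x, s):
    """Expected allele frequency after one generation of genic selection."""
    return x * (1 + s) / (1 + s * x)

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == n else -math.inf
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def log_likelihood(counts, n, s):
    """Unconditioned log-likelihood of a trajectory of allele counts.

    Each generation is a binomial draw of n gene copies around the
    post-selection frequency of the previous generation.
    """
    total = 0.0
    for prev, cur in zip(counts, counts[1:]):
        total += log_binom_pmf(cur, n, next_freq_mean(prev / n, s))
    return total

def estimate_s(counts, n, grid):
    """Grid-search maximum-likelihood estimate of the selection coefficient."""
    return max(grid, key=lambda s: log_likelihood(counts, n, s))

# Hypothetical trajectory: counts of the allele among n = 100 gene copies.
counts = [10, 12, 15, 18, 22, 26]
grid = [i / 100 for i in range(-50, 51)]  # candidate s values in [-0.5, 0.5]
s_hat = estimate_s(counts, 100, grid)
```

Note that `estimate_s` always returns some value from the grid, whatever the data look like, which is the behaviour the abstract warns about.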

2019 ◽  
Author(s):  
Zhangyi He ◽  
Xiaoyang Dai ◽  
Mark Beaumont ◽  
Feng Yu

Temporally spaced genetic data allow for more accurate inference of population genetic parameters and hypothesis testing on the recent action of natural selection. In this work, we develop a novel likelihood-based method for jointly estimating selection coefficient and allele age from time series data of allele frequencies. Our approach is based on a hidden Markov model where the underlying process is a Wright-Fisher diffusion conditioned to survive until the time of the most recent sample. This formulation circumvents the assumption required in existing methods that the allele is created by mutation at a certain low frequency. We calculate the likelihood by numerically solving the resulting Kolmogorov backward equation backwards in time while re-weighting the solution with the emission probabilities of the observation at each sampling time point. This procedure reduces the two-dimensional numerical search for the maximum of the likelihood surface for both the selection coefficient and the allele age to a one-dimensional search over the selection coefficient only. We illustrate through extensive simulations that our method can produce accurate estimates of the selection coefficient and the allele age under both constant and non-constant demographic histories. We apply our approach to re-analyse ancient DNA data associated with horse base coat colours. We find that ignoring demographic histories or grouping raw samples can significantly bias the inference results.
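The hidden Markov structure described here (a latent allele frequency, re-weighted by emission probabilities at each sampling time) can be illustrated with a much simpler stand-in. The sketch below swaps the conditioned Wright-Fisher diffusion and the Kolmogorov backward equation for a discrete Wright-Fisher chain and a plain forward recursion, uses a flat prior on the initial state, and does not estimate allele age. All names and the sample data are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def transition_matrix(N, s):
    """One-generation Wright-Fisher transition matrix for N gene copies."""
    rows = []
    for i in range(N + 1):
        x = i / N
        p = x * (1 + s) / (1 + s * x)  # frequency after genic selection
        rows.append([binom_pmf(j, N, p) for j in range(N + 1)])
    return rows

def hmm_log_likelihood(samples, N, s):
    """Forward-algorithm log-likelihood of time-spaced samples.

    samples: list of (generations since previous sample, sample size,
    number of sampled copies carrying the allele).
    """
    P = transition_matrix(N, s)
    alpha = [1.0 / (N + 1)] * (N + 1)  # flat prior over the initial count
    loglik = 0.0
    for gap, n, k in samples:
        # Propagate the hidden state forward by `gap` generations.
        for _ in range(gap):
            alpha = [sum(alpha[i] * P[i][j] for i in range(N + 1))
                     for j in range(N + 1)]
        # Re-weight by the emission probability of the observed sample.
        alpha = [a * binom_pmf(k, n, j / N) for j, a in enumerate(alpha)]
        norm = sum(alpha)
        loglik += math.log(norm)
        alpha = [a / norm for a in alpha]
    return loglik

# Hypothetical data: four samples of 50 gene copies, 10 generations apart.
samples = [(0, 50, 5), (10, 50, 15), (10, 50, 30), (10, 50, 43)]
ll_neutral = hmm_log_likelihood(samples, 100, 0.0)
ll_selected = hmm_log_likelihood(samples, 100, 0.1)
```

Comparing `ll_selected` with `ll_neutral` over a range of `s` values gives the one-dimensional search over the selection coefficient that the abstract mentions, here without the allele-age component.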


2018 ◽  
Author(s):  
Antonios Kioukis ◽  
Pavlos Pavlidis

The evolution of a population by means of genetic drift and natural selection operating on a gene regulatory network (GRN) of an individual has not been scrutinized in depth. Thus, the relative importance of various evolutionary forces and processes on shaping genetic variability in GRNs is understudied. Furthermore, it is not known if existing tools that identify recent and strong positive selection from genomic sequences, in simple models of evolution, can detect recent positive selection when it operates on GRNs. Here, we propose a simulation framework, called EvoNET, that simulates forward-in-time the evolution of GRNs in a population. Since the population size is finite, random genetic drift is explicitly applied. The fitness of a mutation is not constant, but we evaluate the fitness of each individual by measuring its genetic distance from an optimal genotype. Mutations and recombination may take place from generation to generation, modifying the genotypic composition of the population. Each individual goes through a maturation period, where its GRN reaches equilibrium. At the next step, individuals compete to produce the next generation. As time progresses, the beneficial genotypes push the population higher in the fitness landscape. We examine properties of the GRN evolution such as robustness against the deleterious effect of mutations and the role of genetic drift. We confirm classical results from Andreas Wagner’s work that GRNs show robustness against mutations and we provide new results regarding the interplay between random genetic drift and natural selection.


Author(s):  
Gerard G. Dumancas

Population genetics is the study of the frequency and interaction of alleles and genes in populations and of how the allele frequency distribution changes over time as a result of evolutionary processes such as natural selection, genetic drift, and mutation. This field has become essential to the foundation of the modern evolutionary synthesis. Traditionally regarded as a highly mathematical discipline, its modern practice comprises more than theory, laboratory work, and fieldwork. Supercomputers play a critical role in the success of this field and are discussed in this chapter.


2020 ◽  
Author(s):  
Iain Mathieson

Time series data of allele frequencies are a powerful resource for detecting and classifying natural and artificial selection. Ancient DNA now allows us to observe these trajectories in natural populations of long-lived species such as humans. Here, we develop a hidden Markov model to infer selection coefficients that vary over time. We show through simulations that our approach can accurately estimate both selection coefficients and the timing of changes in selection. Finally, we analyze some of the strongest signals of selection in the human genome using ancient DNA. We show that the European lactase persistence mutation was selected over the past 5,000 years with a selection coefficient of 2-2.5% in Britain, Central Europe and Iberia, but not Italy. In northern East Asia, selection at the ADH1B locus associated with alcohol metabolism intensified around 4,000 years ago, approximately coinciding with the introduction of rice-based agriculture. Finally, a derived allele at the FADS locus was selected in parallel in both Europe and East Asia, as previously hypothesized. Our approach is broadly applicable to both natural and experimental evolution data and shows how time series data can be used to resolve fine-scale details of selection.
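To make the idea of a selection coefficient that varies over time concrete, the sketch below iterates the deterministic (drift-free) frequency recursion under a piecewise-constant schedule of `s`. It is only an illustration of the quantity being inferred, not the hidden Markov model of the paper. The starting frequency, the schedule, and the function name are hypothetical.

```python
def trajectory(x0, s_by_generation):
    """Deterministic allele frequency trajectory under time-varying selection.

    s_by_generation holds one selection coefficient per generation.
    """
    xs = [x0]
    for s in s_by_generation:
        x = xs[-1]
        xs.append(x * (1 + s) / (1 + s * x))
    return xs

# Hypothetical schedule: neutral for 100 generations, then s = 0.02 for 170.
schedule = [0.0] * 100 + [0.02] * 170
freqs = trajectory(0.01, schedule)
```

In this drift-free picture the frequency stays flat while `s` is zero and rises only after the switch; a method such as the one described has to recover both the size of `s` and the timing of the switch from noisy samples.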


1974 ◽  
Vol 6 (1) ◽  
pp. 4-6 ◽  
Author(s):  
Brian Charlesworth

The Hardy-Weinberg law is generally regarded as one of the most important results of population genetics. It was originally proved for the case of populations with distinct generations (Hardy (1908), Weinberg (1908)); a general proof for populations with overlapping generations has apparently not been given before. The case of a single autosomal locus with an arbitrary number of alleles is considered here. Births and deaths are assumed to occur in continuous time. The weak ergodicity property of the birth rate and age structure of such a population, first derived by Norton (1928), is used to establish the fact that allele frequencies tend to constant limits in the absence of mutation, migration, selection and genetic drift.
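The classical discrete-generation case mentioned here can be checked numerically. The sketch below assumes random mating at a single autosomal locus with an arbitrary number of alleles; it does not model overlapping generations, which is the case the paper addresses. The function names and the example frequencies are hypothetical.

```python
def hardy_weinberg_genotypes(allele_freqs):
    """Genotype frequencies under random mating, keyed by allele pair (i, j)."""
    k = len(allele_freqs)
    geno = {}
    for i in range(k):
        for j in range(i, k):
            f = allele_freqs[i] * allele_freqs[j]
            # Heterozygotes arise in two parental orders, hence the factor 2.
            geno[(i, j)] = f if i == j else 2 * f
    return geno

def allele_freqs_from_genotypes(geno, k):
    """Allele frequencies recovered by counting alleles in each genotype."""
    freqs = [0.0] * k
    for (i, j), f in geno.items():
        freqs[i] += f / 2
        freqs[j] += f / 2
    return freqs

p = [0.5, 0.3, 0.2]  # hypothetical allele frequencies
geno = hardy_weinberg_genotypes(p)
p_next = allele_freqs_from_genotypes(geno, len(p))
```

The recovered `p_next` matches `p` up to floating-point error, which is the statement that allele frequencies stay constant in the absence of mutation, migration, selection and genetic drift.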


Genetics ◽  
2020 ◽  
Vol 216 (2) ◽  
pp. 463-480
Author(s):  
Zhangyi He ◽  
Xiaoyang Dai ◽  
Mark Beaumont ◽  
Feng Yu

Temporally spaced genetic data allow for more accurate inference of population genetic parameters and hypothesis testing on the recent action of natural selection. In this work, we develop a novel likelihood-based method for jointly estimating selection coefficient and allele age from time series data of allele frequencies. Our approach is based on a hidden Markov model where the underlying process is a Wright-Fisher diffusion conditioned to survive until the time of the most recent sample. This formulation circumvents the assumption required in existing methods that the allele is created by mutation at a certain low frequency. We calculate the likelihood by numerically solving the resulting Kolmogorov backward equation backward in time while reweighting the solution with the emission probabilities of the observation at each sampling time point. This procedure reduces the two-dimensional numerical search for the maximum of the likelihood surface, for both the selection coefficient and the allele age, to a one-dimensional search over the selection coefficient only. We illustrate through extensive simulations that our method can produce accurate estimates of the selection coefficient and the allele age under both constant and nonconstant demographic histories. We apply our approach to reanalyze ancient DNA data associated with horse base coat colors. We find that ignoring demographic histories or grouping raw samples can significantly bias the inference results.


2019 ◽  
Author(s):  
Aaron J. Stern ◽  
Peter R. Wilton ◽  
Rasmus Nielsen

Most current methods for detecting natural selection from DNA sequence data are limited in that they are either based on summary statistics or a composite likelihood, and as a consequence, do not make full use of the information available in DNA sequence data. We here present a new importance sampling approach for approximating the full likelihood function for the selection coefficient. The method treats the ancestral recombination graph (ARG) as a latent variable that is integrated out using previously published Markov Chain Monte Carlo (MCMC) methods. The method can be used for detecting selection, estimating selection coefficients, testing models of changes in the strength of selection, estimating the time of the start of a selective sweep, and for inferring the allele frequency trajectory of a selected or neutral allele. We perform extensive simulations to evaluate the method and show that it uniformly improves power to detect selection compared to current popular methods such as nSL and SDS, under various demographic models and can provide reliable inferences of allele frequency trajectories under many conditions. We also explore the potential of our method to detect extremely recent changes in the strength of selection. We use the method to infer the past allele frequency trajectory for a lactase persistence SNP (MCM6) in Europeans. We also study a set of 11 pigmentation-associated variants. Several genes show evidence of strong selection particularly within the last 5,000 years, including ASIP, KITLG, and TYR. However, selection on OCA2/HERC2 seems to be much older and, in contrast to previous claims, we find no evidence of selection on TYRP1.

Author summary: Current methods to study natural selection using modern population genomic data are limited in their power and flexibility. Here, we present a new method to infer natural selection that builds on recent methodological advances in estimating genome-wide genealogies. By using importance sampling we are able to efficiently estimate the likelihood function of the selection coefficient. We show our method improves power to test for selection over competing methods across a diverse range of scenarios, and also accurately infers the selection coefficient. We also demonstrate a novel capability of our model, using it to infer the allele’s frequency over time. We validate these results with a study of a lactase persistence SNP in Europeans, and also study a set of 11 pigmentation-associated variants.


2005 ◽  
Vol 85 (3) ◽  
pp. 171-181 ◽  
Author(s):  
ARNAUD LE ROUZIC ◽  
GRÉGORY DECELIERE

Although transposable elements (TEs) have been found in all organisms in which they have been looked for, the ways in which they invade genomes and populations are still a matter of debate. By extending the classical models of population genetics, several approaches have been developed to account for the dynamics of TEs, especially in Drosophila melanogaster. While the formalism of these models is based on simplifications, they enable us to understand better how TEs invade genomes, as a result of multiple evolutionary forces including duplication, deletion, self-regulation, natural selection and genetic drift. The aim of this paper is to review the assumptions and the predictions of these different models by highlighting the importance of the specific characteristics of both the TEs and the hosts, and the host/TE relationships. Then, perspectives in this domain will be discussed.



