Random partitions in population genetics

This paper is concerned with models for the genetic variation of a sample of gametes from a large population. The need for consistency between different sample sizes limits the mathematical possibilities to what are here called ‘partition structures’. Distinctive among them is the structure described by the Ewens sampling formula, which is shown to enjoy a characteristic property of non-interference between the different alleles. This characterization explains the robustness of the Ewens formula when neither selection nor recurrent mutation is significant, although different structures arise from selective and ‘charge-state’ models.

1978 ◽  
Vol 201 (1143) ◽  
pp. 217-217

One of the most striking results of recent theoretical research in population genetics is the sampling formula associated with the name of W. J. Ewens, who enunciated it in 1972, since which time it has been shown to hold for many different population models. This asserts that, if a sample of $n$ gametes is taken from a population, and classified according to the gene at a particular locus, then the probability that there are $a_1$ alleles represented once in the sample, $a_2$ represented twice, $a_3$ thrice, and so on, is given for some positive value of $\theta$ by the expression $P_n(a_1, a_2, \ldots, a_n) = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_{j=1}^{n} \frac{\theta^{a_j}}{j^{a_j}\, a_j!}$. Most of the models for which this has been established share three broad features: (a) the size of the population is large compared with $n$, and the expected total number of mutations per generation is moderate (and in fact differs from $\theta$ by a constant factor depending on the reproductive mechanism), (b) the population is in statistical equilibrium under mutation and genetic drift, with selection at the locus playing a negligible rôle, and (c) mutation is non-recurrent, so that every mutant allele is a completely novel one.
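As an illustration of the sampling formula quoted above, the following minimal sketch evaluates $P_n(a_1, \ldots, a_n)$ exactly using rational arithmetic. The function name `ewens_probability` and the list encoding of the counts (entry `a[j-1]` holds $a_j$) are assumptions made for this example and do not come from the paper.

```python
from fractions import Fraction
from math import factorial


def ewens_probability(a, theta):
    """Evaluate the Ewens sampling formula for allele counts `a`.

    a[j-1] is the number of alleles represented exactly j times in the
    sample (hypothetical encoding chosen for this sketch); theta > 0.
    The sample size n is recovered as sum_j j * a_j.
    """
    theta = Fraction(theta)
    n = sum(j * a_j for j, a_j in enumerate(a, start=1))

    # Rising factorial theta (theta + 1) ... (theta + n - 1).
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i

    prob = Fraction(factorial(n)) / rising
    # Product over j of theta^{a_j} / (j^{a_j} a_j!).
    for j, a_j in enumerate(a, start=1):
        prob *= theta ** a_j / (Fraction(j) ** a_j * factorial(a_j))
    return prob


if __name__ == "__main__":
    # The three possible configurations of a sample of n = 3 gametes.
    for counts in ([3, 0, 0], [1, 1, 0], [0, 0, 1]):
        print(counts, ewens_probability(counts, Fraction(1, 2)))
```

For a sample of three gametes the three possible configurations have probabilities that sum to one for any positive $\theta$, which gives a quick sanity check on the reconstruction of the formula.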


2020 ◽  
Author(s):  
Pavel Khromov ◽  
Alexandre V. Morozov

Abstract: We have developed a computational approach to simultaneous genome-wide inference of key population genetics parameters: selection strengths, mutation rates rescaled by the effective population size, and the fraction of viable genotypes, solely from an alignment of genomic sequences sampled from the same population. Our approach is based on a generalization of the Ewens sampling formula, used to compute steady-state probabilities of allelic counts in a neutrally evolving population, to populations subjected to selective constraints. Patterns of polymorphisms observed in alignments of genomic sequences are used as input to Approximate Bayesian Computation, which employs the generalized Ewens sampling formula to infer the distributions of population genetics parameters. After carrying out extensive validation of our approach on synthetic data, we have applied it to the evolution of the Drosophila melanogaster genome, where an alignment of 197 genomic sequences is available for a single ancestral-range population from Zambia, Africa. We have divided the Drosophila genome into 100-bp windows and assumed that sequences in each window can exist in either a low- or a high-fitness state. Thus, the steady-state population in our model is subject to a constant influx of deleterious mutations, which shape the observed frequencies of allelic counts in each window. Our approach, which focuses on deleterious mutations and accounts for intra-window linkage and epistasis, provides an alternative description of background selection. We find that most of the Drosophila genome evolves under selective constraints imposed by deleterious mutations. These constraints are not confined to known functional regions of the genome such as coding sequences and may reflect global biological processes such as the necessity to maintain chromatin structure. Furthermore, we find that inference of mutation rates in the presence of selection leads to mutation rate estimates that are several-fold higher than neutral estimates widely used in the literature. Our computational pipeline can be used in any organism for which a sample of genomic sequences from the same population is available.


1990 ◽  
Vol 27 (1) ◽  
pp. 28-43 ◽  
Author(s):  
Jennie C. Hansen

For each n > 0, the Ewens sampling formula from population genetics is a measure on the set of all partitions of the integer n. To determine the limiting distributions for the part sizes of a partition with respect to the measures given by this formula, we associate to each partition a step function on [0, 1]. Each jump in the function equals the number of parts in the partition of a certain size. We normalize these functions and show that the induced measures on D[0, 1] converge to Wiener measure. This result complements Kingman's frequency limit theorem [10] for the Ewens partition structure.


1992 ◽  
Vol 29 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Gudrun Trieb

In recent papers by Hoppe and Donnelly it has been shown that a Pólya urn model generating the Ewens sampling formula (population genetics) parallels a construction of Kingman using a Poisson–Dirichlet ‘paintbox’. Even the jump chain of Kingman's n-coalescent can be constructed using the urn. The properties of a certain process based on the coalescent, introduced by Hoppe, are also derived.
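The urn mentioned above can be sketched in a few lines. The version below follows the usual textbook description of Hoppe's urn, which is an assumption here because the abstract does not spell out the scheme: when i gametes have already been drawn, the next one is a novel allele with probability θ/(θ+i), and otherwise copies the type of a uniformly chosen earlier gamete. The names `hoppe_urn` and `allele_counts` are hypothetical helpers for this illustration.

```python
import random
from collections import Counter


def hoppe_urn(n, theta, rng=None):
    """Draw allele labels for n gametes from a Hoppe-style urn (sketch).

    Assumption: with i gametes already drawn, a new allele appears with
    probability theta / (theta + i); otherwise an earlier gamete is
    chosen uniformly at random and its allele is copied.
    """
    rng = rng or random.Random()
    labels = []
    next_label = 0
    for i in range(n):
        # For i == 0 the probability is 1, so the first draw is always new.
        if rng.random() < theta / (theta + i):
            labels.append(next_label)
            next_label += 1
        else:
            labels.append(rng.choice(labels))
    return labels


def allele_counts(labels):
    """Return [a_1, ..., a_n], where a_j alleles appear exactly j times."""
    n = len(labels)
    sizes = Counter(labels)                # allele label -> copies in sample
    multiplicity = Counter(sizes.values()) # block size j -> number of alleles
    return [multiplicity.get(j, 0) for j in range(1, n + 1)]


if __name__ == "__main__":
    sample = hoppe_urn(20, theta=1.5, rng=random.Random(0))
    print(allele_counts(sample))
```

Repeating the draw many times and tabulating `allele_counts` gives empirical frequencies that can be compared with the Ewens sampling formula for the same θ.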


Genetics ◽  
1974 ◽  
Vol 76 (2) ◽  
pp. 367-377
Author(s):  
Takeo Maruyama

Abstract: A Markov process (chain) of gene frequency change is derived for a geographically structured model of a population. The population consists of colonies which are connected by migration. Selection operates in each colony independently. It is shown that there exists a stochastic clock that transforms the originally complicated process of gene frequency change to a random walk which is independent of the geographical structure of the population. The time parameter is a local random time that is dependent on the sample path. In fact, if the alleles are selectively neutral, the time parameter is exactly equal to the sum of the average local genetic variation appearing in the population, and otherwise they are approximately equal. The Kolmogorov forward and backward equations of the process are obtained. As a limit of large population size, a diffusion process is derived. The transition probabilities of the Markov chain and of the diffusion process are obtained explicitly. Certain quantities of biological interest are shown to be independent of the population structure. The quantities are the fixation probability of a mutant, the sum of the average local genetic variation, and the variation summed over the generations in which the gene frequency in the whole population assumes a specified value.

