A Coalescent Estimator of the Population Recombination Rate

Genetics ◽  
1997 ◽  
Vol 145 (3) ◽  
pp. 833-846 ◽  
Author(s):  
Jody Hey ◽  
John Wakeley

Population genetic models often use a population recombination parameter 4Nc, where N is the effective population size and c is the recombination rate per generation. In many ways 4Nc is comparable to 4Nu, the population mutation rate. Both combine genome level and population level processes, and together they describe the rate of production of genetic variation in a population. However, 4Nc is more difficult to estimate. For a population sample of DNA sequences, historical recombination can only be detected if polymorphisms exist, and even then most recombination events are not detectable. This paper describes an estimator of 4Nc, hereafter designated γ (gamma), that was developed using a coalescent model for a sample of four DNA sequences with recombination. The reliability of γ was assessed using multiple coalescent simulations. In general γ has low to moderate bias, and its reliability is comparable to, though lower than, that of a widely used estimator of 4Nu. If there exists an independent estimate of the recombination rate (per generation, per base pair), γ can be used to estimate the effective population size or the neutral mutation rate.
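The last sentence of this abstract amounts to solving γ = 4Nc for N. A minimal sketch of that arithmetic follows, assuming (this is not stated in the abstract) that c for the sampled region equals the per-base-pair rate times the sequence length. The function name and the example numbers are hypothetical.

```python
def effective_size_from_gamma(gamma_hat, rec_rate_per_bp, seq_length_bp):
    """Solve gamma = 4 * N * c for N.

    Assumption: c, the recombination rate for the whole sampled region,
    is rec_rate_per_bp * seq_length_bp (uniform rate along the sequence).
    """
    if rec_rate_per_bp <= 0 or seq_length_bp <= 0:
        raise ValueError("recombination rate and length must be positive")
    c = rec_rate_per_bp * seq_length_bp
    return gamma_hat / (4.0 * c)


# Hypothetical inputs: gamma = 40 for a 1,000 bp region, 1e-8 per bp per generation.
n_hat = effective_size_from_gamma(40.0, 1e-8, 1000)
print(f"estimated effective population size: {n_hat:.3g}")
```

Any bias or sampling error in γ carries straight through to the estimate of N, so the result is only as reliable as γ and the independent rate estimate.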

Genetics ◽  
1994 ◽  
Vol 136 (2) ◽  
pp. 685-692 ◽  
Author(s):  
Y X Fu

A new estimator of the essential parameter θ = 4Neμ from DNA polymorphism data is developed under the neutral Wright-Fisher model without recombination and population subdivision, where Ne is the effective population size and μ is the mutation rate per locus per generation. The new estimator has a variance that is only slightly larger than the minimum variance of all possible unbiased estimators of the parameter and substantially smaller than that of any existing estimator. The high efficiency of the new estimator is achieved by making full use of phylogenetic information in a sample of DNA sequences from a population. An example of estimating θ by the new method is presented using the mitochondrial sequences from an American Indian population.
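For context, the abstract does not say which existing estimators it compares against. One standard baseline is Watterson's estimator, which divides the number of segregating sites S by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n-1). The sketch below shows that baseline only; it is not the new estimator described above, and the function name is hypothetical.

```python
def watterson_theta(num_segregating_sites, sample_size):
    """Watterson's estimator: theta_W = S / a_n, with a_n = sum_{i=1}^{n-1} 1/i.

    Uses only the count of segregating sites, not the genealogical
    (phylogenetic) structure of the sample.
    """
    if sample_size < 2:
        raise ValueError("need at least two sequences")
    if num_segregating_sites < 0:
        raise ValueError("number of segregating sites cannot be negative")
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return num_segregating_sites / a_n


# Hypothetical sample: 11 segregating sites among 4 sequences.
print(watterson_theta(11, 4))
```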


Genetics ◽  
1997 ◽  
Vol 146 (4) ◽  
pp. 1489-1499 ◽  
Author(s):  
Yun-Xin Fu

A coalescent theory for a sample of DNA sequences from a partially selfing diploid population and an algorithm for simulating such samples are developed in this article. Approximate formulas are given for the expectation and the variance of the number of segregating sites in a sample of k sequences from n individuals. Several new estimators of the important parameters θ = 4Nμ and the selfing rate s, where N and μ are, respectively, the effective population size and the mutation rate per sequence per generation, are proposed and their sampling properties are studied.


Author(s):  
Bruce Walsh ◽  
Michael Lynch

This chapter examines the relative strengths of the nonadaptive evolutionary forces (drift, mutation, recombination) acting on genomes. It reviews estimators for effective population size, mutation rate, and recombination rate, and summarizes the known genomic results over a wide range of taxa. The mutation rate tends to be lower in organisms with larger effective population sizes, consistent with the drift-barrier hypothesis wherein selection is ineffective when it is less than the reciprocal of the effective population size.


Genetics ◽  
1995 ◽  
Vol 140 (4) ◽  
pp. 1421-1430 ◽  
Author(s):  
M K Kuhner ◽  
J Yamato ◽  
J Felsenstein

We present a new way to make a maximum likelihood estimate of the parameter 4Nμ (effective population size times mutation rate per site, or θ) based on a population sample of molecular sequences. We use a Metropolis-Hastings Markov chain Monte Carlo method to sample genealogies in proportion to the product of their likelihood with respect to the data and their prior probability with respect to a coalescent distribution. A specific value of θ must be chosen to generate the coalescent distribution, but the resulting trees can be used to evaluate the likelihood at other values of θ, generating a likelihood curve. This procedure concentrates sampling on those genealogies that contribute most of the likelihood, allowing estimation of meaningful likelihood curves based on relatively small samples. The method can potentially be extended to cases involving varying population size, recombination, and migration.
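The reuse of trees sampled at one driving value to evaluate the likelihood at other values can be written as an importance-sampling identity. The notation below is an assumption made for illustration and is not taken from the abstract: D is the sequence data, G a genealogy, θ₀ the driving value, and G₁, …, Gₙ the genealogies sampled by the chain.

```latex
% Likelihood as a sum over genealogies
L(\theta) = \sum_{G} P(D \mid G)\, P(G \mid \theta)

% Dividing by L(\theta_0) turns the sum into an expectation under the
% sampling distribution P(D \mid G)\, P(G \mid \theta_0) / L(\theta_0)
\frac{L(\theta)}{L(\theta_0)}
  = \sum_{G} \frac{P(D \mid G)\, P(G \mid \theta_0)}{L(\theta_0)}
    \cdot \frac{P(G \mid \theta)}{P(G \mid \theta_0)}
  \approx \frac{1}{n} \sum_{i=1}^{n} \frac{P(G_i \mid \theta)}{P(G_i \mid \theta_0)}
```

The data term P(D | G) cancels out of each ratio, so only the coalescent prior needs to be re-evaluated at each new θ. The approximation is expected to degrade as θ moves away from θ₀.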


2010 ◽  
Vol 107 (5) ◽  
pp. 2147-2152 ◽  
Author(s):  
Chad D. Huff ◽  
Jinchuan Xing ◽  
Alan R. Rogers ◽  
David Witherspoon ◽  
Lynn B. Jorde

The genealogies of different genetic loci vary in depth. The deeper the genealogy, the greater the chance that it will include a rare event, such as the insertion of a mobile element. Therefore, the genealogy of a region that contains a mobile element is on average older than that of the rest of the genome. In a simple demographic model, the expected time to most recent common ancestor (TMRCA) is doubled if a rare insertion is present. We test this expectation by examining single nucleotide polymorphisms around polymorphic Alu insertions from two completely sequenced human genomes. The estimated TMRCA for regions containing a polymorphic insertion is two times larger than the genomic average (P ≪ 10^−30), as predicted. Because genealogies that contain polymorphic mobile elements are old, they are shaped largely by the forces of ancient population history and are insensitive to recent demographic events, such as bottlenecks and expansions. Remarkably, the information in just two human DNA sequences provides substantial information about ancient human population size. By comparing the likelihood of various demographic models, we estimate that the effective population size of human ancestors living before 1.2 million years ago was 18,500, and we can reject all models where the ancient effective population size was larger than 26,000. This result implies an unusually small population for a species spread across the entire Old World, particularly in light of the effective population sizes of chimpanzees (21,000) and gorillas (25,000), which each inhabit only one part of a single continent.
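The doubling of the expected TMRCA can be illustrated with a small simulation. This is a sketch under assumptions that the abstract does not spell out: a constant-size neutral population, a sample of two sequences whose coalescence time is exponential with mean 2N generations, and a chance of carrying a rare insertion that is proportional to branch length. The function name and the value of N are hypothetical.

```python
import random


def mean_tmrca_with_and_without_insertion(n_e, num_draws=200_000, seed=1):
    """Return (unconditional mean TMRCA, mean TMRCA given a rare insertion).

    Assumption: pairwise coalescence times are exponential with mean
    2 * n_e generations (constant-size diploid population).
    """
    rng = random.Random(seed)
    mean_t = 2.0 * n_e
    times = [rng.expovariate(1.0 / mean_t) for _ in range(num_draws)]
    unconditional = sum(times) / num_draws
    # A rare insertion falls on a genealogy with probability proportional to
    # its branch length (2t for a pair), so each draw is weighted by t.
    conditional = sum(t * t for t in times) / sum(times)
    return unconditional, conditional


plain, with_insertion = mean_tmrca_with_and_without_insertion(10_000)
print(f"ratio of conditional to unconditional mean: {with_insertion / plain:.2f}")
```

Weighting an exponential density by t gives a gamma density with shape 2, whose mean is twice that of the original exponential. The simulated ratio should therefore be close to 2.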


2018 ◽  
Author(s):  
Bhavin S. Khatri ◽  
Austin Burt

Estimating recent effective population size is of great importance in characterising and predicting the evolution of natural populations. Methods based on nucleotide diversity may underestimate current day effective population sizes due to historical bottlenecks, whilst methods that reconstruct demographic history typically only detect long-term variations. However, soft selective sweeps, which leave a fingerprint of mutational history by recurrent mutations on independent haplotype backgrounds, hold promise of an estimate more representative of recent population history. Here we present a simple and robust method of estimation based only on knowledge of the number of independent recurrent origins and the current frequency of the beneficial allele in a population sample, independent of the strength of selection and age of the mutation. Using a forward time theoretical framework, we show the mean number of origins is a function of θ = 2Nμ and current allele frequency, through a simple equation, and the distribution is approximately Poisson. This estimate is robust to whether mutants pre-existed before selection arose, and is equally accurate for diploid populations with incomplete dominance. For fast (e.g., seasonal) demographic changes compared to the time scale for fixation of the mutant allele, and for moderate peak-to-trough ratios, we show our constant population size estimate can be used to bound the maximum and minimum population size. Applied to the Vgsc gene of Anopheles gambiae, we estimate an effective population size of roughly 6 × 10^7, and including seasonal demographic oscillations, a minimum effective population size greater than 6 × 10^6 and a maximum less than 3 × 10^9.


2021 ◽  
Author(s):  
Irene Novo ◽  
Armando Caballero ◽  
Enrique Santiago

The effective population size (Ne) is a key parameter to quantify the magnitude of genetic drift and inbreeding, with important implications in human evolution. The increasing availability of high-density genetic markers allows the estimation of historical changes in Ne across time using measures of genome diversity or linkage disequilibrium between markers. Selection is expected to reduce diversity and Ne, and this reduction is modulated by the heterogeneity of the genome in terms of recombination rate. Here we investigate by computer simulations the consequences of selection (both positive and negative) and of recombination rate heterogeneity in the estimation of historical Ne. We also investigate the relationship between diversity parameters and Ne across the different regions of the genome using human marker data. We show that the estimates of historical Ne obtained from linkage disequilibrium between markers (NeLD) are virtually unaffected by selection. In contrast, those estimates obtained by coalescence mutation-recombination-based methods can be strongly affected by it, which could have important consequences for the estimation of human demography. The simulation results are supported by the analysis of human data. The estimates of NeLD obtained for particular genomic regions do not correlate with recombination rate, nucleotide diversity, polymorphism, background selection statistic, minor allele frequency of SNPs, loss of function and missense variants and gene density. This suggests that NeLD measures are merely indicative of demographic changes in population size across generations.

