Estimating effective population size from samples of sequences: a bootstrap Monte Carlo integration method

1992 ◽  
Vol 60 (3) ◽  
pp. 209-220 ◽  
Author(s):  
Joseph Felsenstein

Summary: We would like to use maximum likelihood to estimate parameters such as the effective population size Ne, or, if we do not know mutation rates, the product 4Neμ of mutation rate per site and effective population size. To compute the likelihood for a sample of unrecombined nucleotide sequences taken from a random-mating population it is necessary to sum over all genealogies that could have led to the sequences, computing for each one the probability that it would have yielded the sequences, and weighting each one by its prior probability. The genealogies vary in tree topology and in branch lengths. Although the likelihood and the prior are straightforward to compute, the summation over all genealogies seems at first sight hopelessly difficult. This paper reports that it is possible to carry out a Monte Carlo integration to evaluate the likelihoods approximately. The method uses bootstrap sampling of sites to create data sets for each of which a maximum likelihood tree is estimated. The resulting trees are assumed to be sampled from a distribution whose height is proportional to the likelihood surface for the full data. That it will be so is dependent on a theorem which is not proven, but seems likely to be true if the sequences are not short. One can use the resulting estimated likelihood curve to make a maximum likelihood estimate of the parameter of interest, Ne or 4Neμ. The method requires at least 100 times the computational effort required for estimation of a phylogeny by maximum likelihood, but is practical on today's workstations. The method does not at present have any way of dealing with recombination.
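The Monte Carlo step in this summary can be illustrated with a minimal sketch. It is not the paper's implementation and rests on assumptions that are not stated in the abstract: each sampled genealogy is reduced to its coalescent intervals, measured in expected substitutions per site, and the prior on a labelled genealogy is taken to be the standard coalescent density, the product over k of (2/θ) exp(-k(k-1) t_k / θ), with θ = 4Neμ. If the genealogies are sampled in proportion to the likelihood of the data, the relative likelihood of θ is approximated by the average prior over the sample. All function names are hypothetical.

```python
import math

def log_coalescent_prior(intervals, theta):
    """Log density of one labelled genealogy under the coalescent.

    intervals[i] is the time spent with (n - i) lineages, where
    n = len(intervals) + 1, in expected substitutions per site.
    ASSUMPTION: standard coalescent prior with theta = 4*Ne*mu.
    """
    n = len(intervals) + 1
    logp = 0.0
    for i, t in enumerate(intervals):
        k = n - i  # number of lineages during this interval
        logp += math.log(2.0 / theta) - k * (k - 1) * t / theta
    return logp

def log_relative_likelihood(genealogies, theta):
    """Log of the mean prior over sampled genealogies (log-sum-exp for stability)."""
    logs = [log_coalescent_prior(g, theta) for g in genealogies]
    m = max(logs)
    return m + math.log(sum(math.exp(x - m) for x in logs) / len(logs))

def estimate_theta(genealogies, grid):
    """Grid value of theta that maximizes the estimated likelihood curve."""
    return max(grid, key=lambda th: log_relative_likelihood(genealogies, th))

# Made-up interval data for three sampled genealogies of four sequences.
sampled = [[0.01, 0.02, 0.05], [0.02, 0.01, 0.06], [0.015, 0.03, 0.04]]
grid = [0.01 * i for i in range(1, 101)]
theta_hat = estimate_theta(sampled, grid)
```

In practice the intervals would come from the maximum likelihood trees fitted to the bootstrap replicates; the grid search stands in for whatever optimizer is preferred.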

Genetics ◽  
1977 ◽  
Vol 86 (3) ◽  
pp. 697-713
Author(s):  
C Chevalet ◽  
M Gillois ◽  
R F Nassar

ABSTRACT: Properties of the identity relation between genes are discussed, and a derivation of recurrent equations of identity coefficients in a random mating, diploid dioecious population is presented. Computations are run by repeated matrix multiplication. Results show that for effective population size (Ne) larger than 16 and no mutation, a given identity coefficient at any time t can be expressed approximately as a function of (1 - f), (1 - f)^3 and (1 - f)^6, where f is the mean inbreeding coefficient at time t. Tables are presented, for small Ne values and extreme sex ratios, showing the pattern of change in the identity coefficients over time. The pattern of evolution of identity coefficients is also presented and discussed with respect to Ne u, where u is the mutation rate. Applications of these results to the evolution of genetic variability within and between inbred lines are discussed.


Author(s):  
Belete Asefa ◽  
Kefelegn Kebede ◽  
Kefena Effa

The study was undertaken in Bale zone to assess farmers' selective breeding objectives, trait preferences, selection criteria and breeding system from October 2012 to November 2013. A purposive and multistage sampling technique was applied for the selection of 3 districts and 9 kebeles. Then 360 households were selected using simple random sampling after the list of pastoralists having goats was identified. Statistical Analysis System version 9.1 was used for analysis of the data. Indices, effective population size and rate of inbreeding were calculated. On average, each respondent holds about 14 goats. Milk production is the main reason for goat keeping in the study area. Appearance ranks first as a selection criterion for males and females in all study areas. About 47.8% of the respondents have their own buck. The main use of the breeding buck in the study area was for mating purposes (76.2%). The mean estimates of effective population size and rate of inbreeding were 2.43 and 0.21, respectively, when a household flock is herded alone and under random mating. Therefore, any breed improvement strategies that are intended to be implemented in the study area and elsewhere should consider the traditional breeding practices and breeding objectives of the community.
Int. J. Agril. Res. Innov. & Tech. 5 (2): 7-15, December, 2015
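The two figures reported in this abstract are consistent with the textbook relation between effective size and inbreeding rate, ΔF = 1/(2Ne), since 1/(2 × 2.43) ≈ 0.21. The abstract does not say which formulas were used, so the sketch below is an assumption: it uses Wright's sex-ratio formula Ne = 4·Nm·Nf/(Nm + Nf) together with ΔF = 1/(2Ne). The function names and the example flock composition are hypothetical.

```python
def effective_size(n_males, n_females):
    """Wright's sex-ratio formula: Ne = 4*Nm*Nf / (Nm + Nf).

    ASSUMPTION: random mating within a flock that is herded alone.
    """
    return 4.0 * n_males * n_females / (n_males + n_females)

def inbreeding_rate(ne):
    """Expected increase in inbreeding per generation: dF = 1 / (2*Ne)."""
    return 1.0 / (2.0 * ne)

# Consistency check against the values reported in the abstract.
reported_ne = 2.43
delta_f = inbreeding_rate(reported_ne)

# Hypothetical flock with 1 breeding buck and 10 breeding does.
example_ne = effective_size(1, 10)
```

With a single breeding male the formula can never exceed 4 however many females are kept, which is why household flocks herded alone give such small effective sizes.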


2011 ◽  
Vol 93 (2) ◽  
pp. 105-114 ◽  
Author(s):  
LEEYOUNG PARK

Summary: In order to estimate the effective population size (Ne) of the current human population, two new approaches, which were derived from previous methods, were used in this study. One is based on the deviation from linkage equilibrium (LE) between completely unlinked loci in different chromosomes and the other is based on the deviation from the Hardy–Weinberg Equilibrium (HWE). When random mating in a population is assumed, genetic drift in the population naturally induces linkage disequilibrium (LD) between chromosomes and the deviation from HWE. The latter provides information on the Ne of the current population, and the former provides the same when the Ne is constant. If Ne fluctuates, recent Ne changes are reflected in the estimates based on LE, and the comparison between two estimates can provide information regarding recent changes of Ne. Using HapMap Phase III data, the estimates varied from 622 to 10 437, depending on populations and estimates. The Ne appeared to fluctuate as it provided different estimates for each of the two methods. These Ne estimates were found to agree approximately with the overall increment observed in recent human populations.
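The link between LD among unlinked loci and Ne can be illustrated with the classical approximation for random mating, E[r²] ≈ 1/(3Ne) + 1/S, where S is the number of individuals sampled. This is the widely used textbook relation, not the estimator developed in the paper, whose exact form the abstract does not give. The function name and the input values are hypothetical.

```python
def ne_from_unlinked_ld(mean_r2, sample_size):
    """Invert E[r^2] ~ 1/(3*Ne) + 1/S for pairs of unlinked loci.

    ASSUMPTION: random mating, unlinked loci, and a simple 1/S
    correction for sampling noise (no further bias adjustment).
    """
    drift_r2 = mean_r2 - 1.0 / sample_size  # remove the sampling contribution
    if drift_r2 <= 0:
        # LD is no larger than expected from sampling alone,
        # so the data are compatible with an arbitrarily large Ne.
        return float("inf")
    return 1.0 / (3.0 * drift_r2)

# Made-up numbers: mean r^2 of 0.0103 across locus pairs, 100 individuals.
ne_hat = ne_from_unlinked_ld(0.0103, 100)
```

Because the drift signal is a small difference between two similar quantities, the estimate is very sensitive to the sample-size correction, which is one reason estimates of this kind can range as widely as those in the summary.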


2018 ◽  
Author(s):  
Amy Ko ◽  
Rasmus Nielsen

Pedigrees provide a fine resolution of the genealogical relationships among individuals and serve an important function in many areas of genetic studies. One such use of pedigree information is in the estimation of short-term effective population size (Ne), which is of great relevance in fields such as conservation genetics. Despite the usefulness of pedigrees, however, they are often an unknown parameter and must be inferred from genetic data. In this study, we present a Bayesian method to jointly estimate pedigrees and Ne from genetic markers using Markov Chain Monte Carlo. Our method supports analysis of a large number of markers and individuals with the use of composite likelihood, which significantly increases computational efficiency. We show on simulated data that our method is able to jointly estimate relationships up to first cousins and Ne with high accuracy. We also apply the method on a real dataset of house sparrows to reconstruct their previously unreported pedigree.


2010 ◽  
Vol 47 (01) ◽  
pp. 41-57 ◽  
Author(s):  
Edward Pollak

Consider a monoecious diploid population with nonoverlapping generations, whose size varies with time according to an irreducible, aperiodic Markov chain with states x_1 N, …, x_K N, where K ≪ N. It is assumed that all matings except for selfing are possible and equally probable. At time 0 a random sample of n ≪ N genes is taken. Given two successive population sizes x_j N and x_i N, the numbers of gametes that individual parents contribute to offspring can be shown to be exchangeable random variables distributed as G_ij. Under minimal conditions on the first three moments of G_ij for all i and j, a suitable effective population size N_e is derived. Then if time is recorded in a backward direction in units of 2N_e generations, it can be shown that coalescent theory holds.


2009 ◽  
Vol 91 (1) ◽  
pp. 47-60 ◽  
Author(s):  
B. J. HAYES ◽  
P. M. VISSCHER ◽  
M. E. GODDARD

Summary: Dense marker genotypes allow the construction of the realized relationship matrix between individuals, whose elements are the realized proportions of the genome that are identical by descent (IBD) between pairs of individuals. In this paper, we demonstrate that by replacing the average relationship matrix derived from pedigree with the realized relationship matrix in best linear unbiased prediction (BLUP) of breeding values, the accuracy of the breeding values can be substantially increased, especially for individuals with no phenotype of their own. We further demonstrate that this method of predicting breeding values is exactly equivalent to the genomic selection methodology where the effects of quantitative trait loci (QTLs) contributing to variation in the trait are assumed to be normally distributed. The accuracy of breeding values predicted using the realized relationship matrix in the BLUP equations can be deterministically predicted for known family relationships, for example half sibs. The deterministic method uses the effective number of independently segregating loci controlling the phenotype that depends on the type of family relationship and the length of the genome. The accuracy of predicted breeding values depends on this number of effective loci, the family relationship and the number of phenotypic records. The deterministic prediction demonstrates that the accuracy of breeding values can approach unity if enough relatives are genotyped and phenotyped. For example, when 1000 full sibs per family were genotyped and phenotyped, and the heritability of the trait was 0·5, the reliability of predicted genomic breeding values (GEBVs) for individuals in the same full sib family without phenotypes was 0·82. These results were verified by simulation. A deterministic prediction was also derived for random mating populations, where the effective population size is the key parameter determining the effective number of independently segregating loci. If the effective population size is large, a very large number of individuals must be genotyped and phenotyped in order to accurately predict breeding values for unphenotyped individuals from the same population. If the heritability of the trait is 0·3, and Ne=1000, approximately 5750 individuals with genotypes and phenotypes are required in order to predict GEBVs of unphenotyped individuals in the same population with an accuracy of 0·7.


Genetics ◽  
1981 ◽  
Vol 98 (3) ◽  
pp. 625-640
Author(s):  
Masatoshi Nei ◽  
Fumio Tajima

ABSTRACT: The statistical properties of the standardized variance of gene frequency changes (a quantity equivalent to Wright's inbreeding coefficient) in a random mating population are studied, and new formulae for estimating the effective population size are developed. The accuracy of the formulae depends on the ratio of sample size to effective size, the number of generations involved (t), and the number of loci or alleles used. It is shown that the standardized variance approximately follows the χ² distribution unless t is very large, and the confidence interval of the estimate of effective size can be obtained by using this property. Application of the formulae to data from an isolated population of Dacus oleae has shown that the effective size of this population is about one tenth of the minimum census size, though there was a possibility that the procedure of sampling genes was improper.
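The temporal approach in this abstract can be sketched as follows. The abstract does not reproduce the formulae, so the forms used here are the commonly cited ones and should be read as assumptions: the standardized variance Fc = (1/K) Σ (x_i - y_i)² / ((x_i + y_i)/2 - x_i·y_i) over K alleles, and Ne ≈ t / (2[Fc - 1/(2S0) - 1/(2St)]), where S0 and St are the numbers of individuals sampled at the two time points. Function names and data are hypothetical.

```python
def standardized_variance_fc(freqs_0, freqs_t):
    """Fc: standardized variance of allele frequency change between two samples.

    freqs_0 and freqs_t hold the frequencies of the same alleles at
    generation 0 and generation t. ASSUMPTION: commonly cited Fc form.
    """
    terms = []
    for x, y in zip(freqs_0, freqs_t):
        denom = (x + y) / 2.0 - x * y
        if denom > 0:  # skip alleles absent from (or fixed in) both samples
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no informative alleles")
    return sum(terms) / len(terms)

def temporal_ne(fc, generations, s0, st):
    """Ne from Fc after subtracting the expected sampling variance."""
    drift = fc - 1.0 / (2.0 * s0) - 1.0 / (2.0 * st)
    if drift <= 0:
        return float("inf")  # change explained by sampling noise alone
    return generations / (2.0 * drift)

# Made-up example: one biallelic locus, 50 individuals per sample, t = 10.
fc = standardized_variance_fc([0.5, 0.5], [0.6, 0.4])
ne_hat = temporal_ne(fc, generations=10, s0=50, st=50)
```

The subtraction of the two sampling terms is what ties the accuracy to the ratio of sample size to effective size mentioned in the abstract: when samples are small relative to Ne, sampling noise dominates Fc.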


Genetics ◽  
2019 ◽  
Vol 212 (3) ◽  
pp. 855-868 ◽  
Author(s):  
Amy Ko ◽  
Rasmus Nielsen

Pedigrees provide the genealogical relationships among individuals at a fine resolution and serve an important function in many areas of genetic studies. One such use of pedigree information is in the estimation of the short-term effective population size (Ne), which is of great relevance in fields such as conservation genetics. Despite the usefulness of pedigrees, however, they are often an unknown parameter and must be inferred from genetic data. In this study, we present a Bayesian method to jointly estimate pedigrees and Ne from genetic markers using Markov Chain Monte Carlo. Our method supports analysis of a large number of markers and individuals within a single generation with the use of a composite likelihood, which significantly increases computational efficiency. We show, on simulated data, that our method is able to jointly estimate relationships up to first cousins and Ne with high accuracy. We also apply the method on a real dataset of house sparrows to reconstruct their previously unreported pedigree.

