The effect of sample size on estimates of genetic differentiation and effective population size for Schistosoma mansoni populations

2018 ◽  
Vol 48 (14) ◽  
pp. 1149-1154 ◽  
Author(s):  
Lúcio M. Barbosa ◽  
Bruna C. Barros ◽  
Moreno de Souza Rodrigues ◽  
Luciano K. Silva ◽  
Mitermayer G. Reis ◽  
...  
Heredity ◽  
2019 ◽  
Vol 124 (2) ◽  
pp. 299-312 ◽  
Author(s):  
Tetsuya Akita

Abstract In this study, we developed a nearly unbiased estimator of contemporary effective mother size in a population, which is based on a known maternal half-sibling relationship found within the same cohort. Our method allows for variance of the average number of offspring per mother (i.e., parental variation, such as age-specific fecundity) and variance of the number of offspring among mothers with identical reproductive potential (i.e., nonparental variation, such as family-correlated survivorship). We also developed estimators of the variance and coefficient of variation of contemporary effective mother size and qualitatively evaluated the performance of the estimators by running an individual-based model. Our results provide guidance for (i) a sample size to ensure the required accuracy and precision when the order of effective mother size is available and (ii) a degree of uncertainty regarding the estimated effective mother size when information about the size is unavailable. To the best of our knowledge, this is the first report to demonstrate the derivation of a nearly unbiased estimator of effective population size; however, its current application is limited to effective mother size and to situations in which the sample size is not particularly small and maternal half-sibling relationships can be detected without error. The results of this study demonstrate the usefulness of a sibship assignment method for estimating effective population size; in addition, they have the potential to greatly widen the scope of genetic monitoring, especially in the situation of small sample size.


2019 ◽  
Author(s):  
M. Elise Lauterbur

Abstract Population genetics employs two major models for conceptualizing genetic relationships among individuals – outcome-driven (coalescent) and process-driven (forward). These models are complementary, but the basic Kingman coalescent and its extensions make fundamental assumptions to allow analytical approximations: a constant effective population size much larger than the sample size. These make the probability of multiple coalescent events per generation negligible. Although these assumptions are often violated in species of conservation concern, conservation genetics often uses coalescent models of effective population sizes and trajectories in endangered species. Despite this, the effect of very small effective population sizes, and their interaction with bottlenecks and sample sizes, on such analyses of genetic diversity remains unexplored. Here, I use simulations to analyze the influence of small effective population size, population decline, and their relationship with sample size, on coalescent-based estimates of genetic diversity. Compared to forward process-based estimates, coalescent models significantly overestimate genetic diversity in oversampled populations with very small effective sizes. When sampled soon after a decline, coalescent models overestimate genetic diversity in small populations regardless of sample size. Such overestimates artificially inflate estimates of both bottleneck and population split times. For conservation applications with small effective population sizes, forward simulations that do not make population size assumptions are computationally tractable and should be considered instead of coalescent-based models. These findings underscore the importance of the theoretical basis of analytical techniques as applied to conservation questions.
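
The assumption mentioned in the abstract above, that multiple coalescent events per generation are negligible when the effective size far exceeds the sample size, can be checked with a short calculation. The following is a minimal sketch, assuming a haploid Wright-Fisher population of size N in which each of n sampled lineages picks its parent uniformly at random. The function names are illustrative and are not taken from the paper, and the paper's own simulations are not reproduced here.

```python
from math import comb, prod


def falling_factorial(size, k):
    """Return size * (size - 1) * ... * (size - k + 1)."""
    return prod(size - i for i in range(k))


def coalescence_probs(pop_size, sample_size):
    """One-generation probabilities for a haploid Wright-Fisher model (assumed).

    Returns (p_none, p_single, p_multiple):
      p_none     - all lineages have distinct parents
      p_single   - exactly one pair of lineages shares a parent
      p_multiple - anything else (several pairs, or three or more lineages merging)
    """
    total = pop_size ** sample_size
    p_none = falling_factorial(pop_size, sample_size) / total
    p_single = (
        comb(sample_size, 2) * falling_factorial(pop_size, sample_size - 1) / total
    )
    p_multiple = 1.0 - p_none - p_single
    return p_none, p_single, p_multiple


if __name__ == "__main__":
    # Compare a large population with a small, heavily sampled one.
    for pop_size, sample_size in [(10_000, 10), (50, 10), (20, 15)]:
        p_none, p_single, p_multiple = coalescence_probs(pop_size, sample_size)
        print(f"N={pop_size:>6} n={sample_size:>3} "
              f"none={p_none:.4f} single={p_single:.4f} multiple={p_multiple:.4f}")
```

Under these assumptions, the "multiple" probability stays close to zero only while N is much larger than n, which is the regime the Kingman coalescent relies on. For a diploid population the same calculation would use 2N gene copies in place of N.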


2020 ◽  
Vol 10 (4) ◽  
pp. 1929-1937 ◽  
Author(s):  
Florianne Marandel ◽  
Grégory Charrier ◽  
Jean‐Baptiste Lamy ◽  
Sabrina Le Cam ◽  
Pascal Lorance ◽  
...  

2005 ◽  
Vol 6 (4) ◽  
pp. 615-621 ◽  
Author(s):  
Lasse F. Jensen ◽  
Michael M. Hansen ◽  
Jens Carlsson ◽  
Volker Loeschcke ◽  
Karen-Lise D. Mensberg

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9939
Author(s):  
Jessica F. McLaughlin ◽  
Kevin Winker

Sample size is a critical aspect of study design in population genomics research, yet few empirical studies have examined the impacts of small sample sizes. We used datasets from eight diverging bird lineages to make pairwise comparisons at different levels of taxonomic divergence (populations, subspecies, and species). Our data are from loci linked to ultraconserved elements and our analyses used one single nucleotide polymorphism per locus. All individuals were genotyped at all loci, effectively doubling sample size for coalescent analyses. We estimated population demographic parameters (effective population size, migration rate, and time since divergence) in a coalescent framework using Diffusion Approximation for Demographic Inference, an allele frequency spectrum method. Using divergence-with-gene-flow models optimized with full datasets, we subsampled at sequentially smaller sample sizes from full datasets of 6–8 diploid individuals per population (with both alleles called) down to 1:1, and then we compared estimates and their changes in accuracy. Accuracy was strongly affected by sample size, with considerable differences among estimated parameters and among lineages. Effective population size parameters (ν) tended to be underestimated at low sample sizes (fewer than three diploid individuals per population, or 6:6 haplotypes in coalescent terms). Migration (m) was fairly consistently estimated until <2 individuals per population, and no consistent trend of over- or underestimation was found in either time since divergence (T) or theta (Θ = 4Nrefμ). Lineages that were taxonomically recognized above the population level (subspecies and species pairs; that is, deeper divergences) tended to have lower variation in scaled root mean square error of parameter estimation at smaller sample sizes than population-level divergences, and many parameters were estimated accurately down to three diploid individuals per population. 
Shallower divergence levels (i.e., populations) often required at least five individuals per population for reliable demographic inferences using this approach. Although divergence levels might be unknown at the outset of study design, our results provide a framework for planning appropriate sampling and for interpreting results if smaller sample sizes must be used.


1981 ◽  
Vol 38 (3) ◽  
pp. 209-216 ◽  
Author(s):  
William G. Hill

SUMMARY A method is proposed for estimating effective population size (N) from data on linkage disequilibrium among neutral genes at several polymorphic loci or restriction sites. The efficiency of the method increases with larger sample size and more tightly linked genes; but for very tightly linked genes estimates of N are more dependent on long-term than on recent population history. Two sets of data are analysed as examples.


2019 ◽  
Author(s):  
Tetsuya Akita

Abstract In this study, we developed a nearly unbiased estimator of contemporary effective mother size in a population, which is based on a known maternal half-sibling relationship found within the same cohort. Our method allows for variance of the average number of offspring per mother (i.e., parental variation, such as age-specific fecundity) and variance of the number of offspring among mothers with identical reproductive potential (i.e., non-parental variation, such as family-correlated survivorship). We also developed estimators of the variance and coefficient of variation of contemporary effective mother size and qualitatively evaluated the performance of the estimators by running an individual-based model. Our results provide guidance for (i) a sample size to ensure the required accuracy and precision when the order of effective mother size is available and (ii) a degree of uncertainty regarding the estimated effective mother size when information about the size is unavailable. To the best of our knowledge, this is the first report to demonstrate the derivation of a nearly unbiased estimator of effective population size; however, its current application is limited to effective mother size and situations in which the sample size is not particularly small and maternal half-sibling relationships can be detected without error. The results of this study demonstrate the usefulness of a sibship assignment method for estimating effective population size; in addition, they have the potential to greatly widen the scope of genetic monitoring.


Author(s):  
Jessica F. McLaughlin ◽  
Kevin Winker

Abstract Sample size is a critical aspect of study design in population genomics research, yet few empirical studies have examined the impacts of small sample sizes. We used datasets from eight diverging bird lineages to make pairwise comparisons at different levels of taxonomic divergence (populations, subspecies, and species). Our data are from loci linked to ultraconserved elements (UCEs) and our analyses used one SNP per locus. All individuals were genotyped at all loci (McLaughlin et al. 2020). We estimated population demographic parameters (effective population size, migration rate, and time since divergence) in a coalescent framework using Diffusion Approximation for Demographic Inference (δaδi; Gutenkunst et al. 2009), an allele frequency spectrum (AFS) method. Using divergence-with-gene-flow models optimized with full datasets, we subsampled at sequentially smaller sample sizes from full datasets of 6–8 diploid individuals per population (with both alleles called) down to 1:1, and then we compared estimates and their changes in accuracy. Accuracy was strongly affected by sample size, with considerable differences among estimated parameters and among lineages. Effective population size parameters (ν) tended to be underestimated at low sample sizes (fewer than 3 diploid individuals per population, or 6:6 haplotypes in coalescent terms). Migration (m) was fairly consistently estimated until ≤ 2 individuals per population, and no consistent trend of over- or underestimation was found in either time since divergence (T) or Θ (4Nrefμ). Lineages that were taxonomically recognized above the population level (subspecies and species pairs; i.e., deeper divergences) tended to have lower variation in scaled root mean square error (SRMSE) of parameter estimation at smaller sample sizes than population-level divergences, and many parameters were estimated accurately down to 3 diploid individuals per population. 
Shallower divergence levels (i.e., populations) often required at least 5 individuals per population for reliable demographic inferences using this approach. Although divergence levels might be unknown at the outset of study design, our results provide a framework for planning appropriate sampling and for interpreting results if smaller sample sizes must be used.


2015 ◽  
Author(s):  
Lucie Gattepaille ◽  
Mattias Jakobsson

Inferring and understanding changes in effective population size over time is a major challenge for population genetics. Here we investigate some theoretical properties of random mating populations with varying size over time. In particular, we present an exact method to compute the population size as a function of time using the distributions of coalescent-times of samples of any size. This result reduces the problem of population size inference to a problem of estimating coalescent-time distributions. Using tree inference algorithms and genetic data, we can investigate the effects of a range of conditions associated with real data, for instance finite number of loci, sample size, mutation rate and presence of cryptic recombination. We show that our method requires at least a modest number of loci (10,000 or more) and that increasing the sample size from 2 to 10 greatly improves the inference whereas further increase in sample size only results in a modest improvement, even under a scenario of exponential growth. We also show that small amounts of recombination can lead to biased population size reconstruction when unaccounted for. The approach can handle large sample sizes and the computations are fast. We apply our method on human genomes from 4 populations and reconstruct population size profiles that are coherent with previous knowledge, including the Out-of-Africa bottleneck. Additionally, a potential difference in population size between African and non-African populations as early as 400 thousand years ago is uncovered.
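
The idea in the abstract above, that population size can be recovered from coalescent-time distributions, can be illustrated for the simplest case of a sample of size 2. The sketch below is not the authors' method, which covers samples of any size. It assumes a haploid population, continuous time measured in generations, and the standard result that two lineages coalesce at rate 1/N(t), so that N on an interval is approximately the total time lineages spent uncoalesced in that interval divided by the number of coalescences observed there. The function name `estimate_size` and the simulated data are hypothetical.

```python
import random


def estimate_size(times, lo, hi):
    """Estimate N on [lo, hi) as exposure time divided by coalescence count.

    `times` are pairwise coalescence times. Exposure is the time each pair
    spent uncoalesced inside the interval; the ratio is the inverse of the
    empirical coalescence rate, which equals N under the stated assumptions.
    """
    events = sum(1 for t in times if lo <= t < hi)
    exposure = sum(min(t, hi) - lo for t in times if t > lo)
    return exposure / events if events else float("inf")


# Simulated pairwise coalescence times under a constant size (assumed value).
rng = random.Random(0)
true_size = 1000.0
times = [rng.expovariate(1.0 / true_size) for _ in range(20000)]

if __name__ == "__main__":
    for lo, hi in [(0.0, 500.0), (500.0, 1500.0)]:
        print(f"[{lo:.0f}, {hi:.0f}): estimated N = {estimate_size(times, lo, hi):.1f}")
```

With real data the coalescence times are not observed directly and must themselves be inferred from gene trees, which is where the number of loci, the mutation rate, and recombination discussed in the abstract come into play.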

