Dynamic Sampling Bias and Overdispersion Induced by Skewed Offspring Distributions

Genetics ◽  
2021 ◽  
Author(s):  
Takashi Okada ◽  
Oskar Hallatschek

Abstract Natural populations often show enhanced genetic drift consistent with a strong skew in their offspring number distribution. The skew arises because the variability of family sizes is either inherently strong or amplified by population expansions. The resulting allele-frequency fluctuations are large and, therefore, challenge standard models of population genetics, which assume sufficiently narrow offspring distributions. While the neutral dynamics backward in time can be readily analyzed using coalescent approaches, we still know little about the effect of broad offspring distributions on the forward-in-time dynamics, especially with selection. Here, we employ an asymptotic analysis combined with a scaling hypothesis to demonstrate that over-dispersed frequency trajectories emerge from the competition of conventional forces, such as selection or mutations, with an emerging time-dependent sampling bias against the minor allele. The sampling bias arises from the characteristic time-dependence of the largest sampled family size within each allelic type. Using this insight, we establish simple scaling relations for allele-frequency fluctuations, fixation probabilities, extinction times, and the site frequency spectra that arise when offspring numbers are distributed according to a power law $n^{-(1+\alpha)}$. To demonstrate that this coarse-grained model captures a wide variety of evolutionary dynamics, we validate our results in traveling waves, where the phenomenon of ‘gene surfing’ can produce any exponent 1 < α < 2. We argue that the concept of a dynamic sampling bias is useful to develop both intuition and statistical tests for the unusual dynamics of populations with skewed offspring distributions, which can confound commonly used tests for selection or demographic history.
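The abstract's central ingredient, offspring numbers with a power-law tail of exponent α, can be illustrated with a small simulation. The sketch below is not the authors' model or code: it assumes a fixed population of `n` individuals, draws a continuous Pareto-distributed weight for each individual (density proportional to u^-(1+α) for u ≥ 1), and sets the next-generation mutant frequency to the mutants' share of the total weight. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random

def pareto_offspring(alpha, rng):
    """Draw one offspring weight U >= 1 with tail P(U > u) = u**(-alpha)."""
    # Inverse-transform sampling: rng.random() lies in [0, 1), so v lies in (0, 1].
    v = 1.0 - rng.random()
    return v ** (-1.0 / alpha)

def next_frequency(x, n, alpha, rng):
    """One generation of resampling (illustrative assumption, not the paper's scheme).

    The mutant count is rounded from x * n; the new frequency is the mutants'
    share of the summed heavy-tailed offspring weights.
    """
    n_mut = round(x * n)
    mutant = sum(pareto_offspring(alpha, rng) for _ in range(n_mut))
    wild = sum(pareto_offspring(alpha, rng) for _ in range(n - n_mut))
    return mutant / (mutant + wild)

def simulate_trajectory(x0, n, alpha, generations, seed=0):
    """Return the neutral allele-frequency trajectory [x0, x1, ..., x_generations]."""
    rng = random.Random(seed)
    trajectory = [x0]
    for _ in range(generations):
        trajectory.append(next_frequency(trajectory[-1], n, alpha, rng))
    return trajectory

if __name__ == "__main__":
    # Exponents inside the range 1 < alpha < 2 mentioned in the abstract.
    for alpha in (1.2, 1.8):
        traj = simulate_trajectory(0.1, 1000, alpha, 20, seed=1)
        print(alpha, [round(x, 3) for x in traj[:5]])
```

Running the loop for several seeds and comparing exponents is one way to see how occasional very large families can dominate a generation; the quantitative scaling relations described in the abstract are not reproduced by this toy.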

2021 ◽  
Author(s):  
Takashi Okada ◽  
Oskar Hallatschek

Natural populations often show enhanced genetic drift consistent with a strong skew in their offspring number distribution. The skew arises because the variability of family sizes is either inherently strong or amplified by population expansions, leading to so-called ‘jackpot’ events. The resulting allele frequency fluctuations are large and, therefore, challenge standard models of population genetics, which assume sufficiently narrow offspring distributions. While the neutral dynamics backward in time can be readily analyzed using coalescent approaches, we still know little about the effect of broad offspring distributions on the dynamics forward in time, especially with selection. Here, we employ an exact asymptotic analysis combined with a scaling hypothesis to demonstrate that over-dispersed frequency trajectories emerge from the competition of conventional forces, such as selection or mutations, with an emerging time-dependent sampling bias against the minor allele. The sampling bias arises from the characteristic time-dependence of the largest sampled family size within each allelic type. Using this insight, we establish simple scaling relations for allele frequency fluctuations, fixation probabilities, extinction times, and the site frequency spectra that arise when offspring numbers are distributed according to a power law $n^{-(1+\alpha)}$. To demonstrate that this coarse-grained model captures a wide variety of non-equilibrium dynamics, we validate our results in traveling waves, where the phenomenon of ‘gene surfing’ can produce any exponent 1 < α < 2. We argue that the concept of a dynamic sampling bias is useful generally to develop both intuition and statistical tests for the unusual dynamics of populations with skewed offspring distributions, which can confound commonly used tests for selection or demographic history.


2018 ◽  
Author(s):  
Heather E. Machado ◽  
Alan O. Bergland ◽  
Ryan Taylor ◽  
Susanne Tilk ◽  
Emily Behrman ◽  
...  

Abstract To advance our understanding of adaptation to temporally varying selection pressures, we identified signatures of seasonal adaptation occurring in parallel among Drosophila melanogaster populations. To study these evolutionary dynamics, we estimated allele frequencies genome-wide from flies sampled early and late in the growing season from 20 widely dispersed populations. We identify parallel seasonal allele frequency shifts across North America and Europe, demonstrating that seasonal adaptation is a general phenomenon of temperate fly populations. The direction of allele frequency change at seasonally variable polymorphisms can be predicted by weather conditions in the weeks prior to sampling, linking the environment and the genomic response to selection. The extent of allele frequency fluctuations implies that seasonal evolution drives substantial (5-10%) allele frequency fluctuations at >1% of common polymorphisms across the genome. Our results suggest that fluctuating selection is an important evolutionary force affecting the extent and stability of linked and functional variation.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Xiaoming Liu ◽  
Yun-Xin Fu

An amendment to this paper has been published and can be accessed via the original article.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nandika Perera ◽  
Gayani Galhena ◽  
Gaya Ranawaka

Abstract A new 16 X-short tandem repeat (STR) multiplex PCR system has recently been developed for Sri Lankans, though its applicability in evolutionary genetics and forensic investigations has not been thoroughly assessed. In this study, 838 unrelated individuals covering all four major ethnic groups (Sinhalese, Sri Lankan Tamils, Indian Tamils and Moors) in Sri Lanka were successfully genotyped using this new multiplex system. The results indicated a high forensic efficiency for the tested loci in all four ethnicities, confirming its suitability for forensic applications in Sri Lankans. The allele frequency distribution of Indian Tamils showed subtle but statistically significant differences from those of Sinhalese and Moors, in contrast to frequency distributions previously reported for autosomal STR alleles. This suggests a sex-biased demographic history among Sri Lankans, requiring a separate X-STR allele frequency database for Indian Tamils. Substantial differences observed in the patterns of LD among the four groups demand the use of a separate haplotype frequency database for each ethnicity. When analysed together with 14 other world populations, all Sri Lankan ethnicities except Indian Tamils clustered closely with populations from the Indian Bhil tribe, Bangladesh and Europe, reflecting their shared Indo-Aryan ancestry.


2020 ◽  
Author(s):  
Kamaludin Dingle ◽  
Fatme Ghaddar ◽  
Petr Šulc ◽  
Ard A. Louis

The relative prominence of developmental bias versus natural selection is a long-standing controversy in evolutionary biology. Here we demonstrate quantitatively that developmental bias is the primary explanation for the occupation of the morphospace of RNA secondary structure (SS) shapes. By using the RNAshapes method to define coarse-grained SS classes, we can directly measure the frequencies with which non-coding RNA SS shapes appear in nature. Our main findings are, firstly, that only the most frequent structures appear in nature: the vast majority of possible structures in the morphospace have not yet been explored. Secondly, and perhaps more surprisingly, these frequencies are accurately predicted by the likelihood that structures appear upon uniform random sampling of sequences. The ultimate cause of these patterns is not natural selection, but rather strong phenotype bias in the RNA genotype-phenotype (GP) map, a type of developmental bias that tightly constrains evolutionary dynamics to act only within a reduced subset of structures which are easy to “find”.


2011 ◽  
Vol 79 (4) ◽  
pp. 203-219 ◽  
Author(s):  
Sergio Lukić ◽  
Jody Hey ◽  
Kevin Chen

Author(s):  
Rikker Dockum

The study of sound change is foundational to traditional historical linguistics, particularly the linguistic comparative method. It is well established that the phonology of modern languages encodes useful data for studying the history of those languages, and their genetic relationships to one another. However, phonology has typically been the means to the end, enabling the comparative method, and coding of a comparative lexicon for cognacy. Once coded, the particular sounds involved no longer factor into the analysis. This study examines whether the phoneme inventories and phonotactic profiles of a set of languages themselves contain phylogenetic signal detectable using established statistical tests: the D statistic (Fritz & Purvis 2010), K (Blomberg et al. 2003), the NeighborNet delta score (Holland et al. 2002), and the Q-residual (Gray et al. 2010). This study adds to the growing body of work on the use of phonological traits in computational phylogenetics for linguistics. Using data from 20 Tai lects from the Kra-Dai language family, this study confirms and extends previous findings. This includes detection of strong phylogenetic signal in phoneme frequency and biphone transition probabilities, but also relatively strong phylogenetic signal in even coarse-grained phoneme and biphone presence/absence, which previous work was unable to detect.


2016 ◽  
Author(s):  
Champak R. Beeravolu ◽  
Michael J. Hickerson ◽  
Laurent A.F. Frantz ◽  
Konrad Lohse

Abstract We introduce ABLE (Approximate Blockwise Likelihood Estimation), a novel composite likelihood framework based on a recently introduced summary of sequence variation: the blockwise site frequency spectrum (bSFS). This simulation-based framework uses the frequencies of bSFS configurations to jointly model demographic history and recombination and is explicitly designed to make inference using multiple whole genomes or genome-wide multi-locus data (e.g. RADSeq), catering to the needs of researchers studying model or non-model organisms, respectively. The flexible nature of our method further allows for arbitrarily complex population histories using unphased and unpolarized whole genome sequences. In silico experiments demonstrate accurate parameter estimates across a range of divergence models with increasing complexity, and as a proof of principle, we infer the demographic history of the two species of orangutan from multiple genome sequences (over 160 Mbp in length) from each species. Our results indicate that the two orangutan species split approximately 650-950 thousand years ago but experienced a pulse of secondary contact much more recently, most likely during a period of low sea level in South East Asia (∼300,000 years ago). Unlike previous analyses, we can reject a history of continuous gene flow and co-estimate genome-wide recombination. ABLE is available for download at https://github.com/champost/ABLE.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Ekaterina Noskova ◽  
Vladimir Ulyantsev ◽  
Klaus-Peter Koepfli ◽  
Stephen J O’Brien ◽  
Pavel Dobrynin

Abstract Background: The demographic history of any population is imprinted in the genomes of the individuals that make up the population. One of the most popular and convenient representations of genetic information is the allele frequency spectrum (AFS), the distribution of allele frequencies in populations. The joint AFS is commonly used to reconstruct the demographic history of multiple populations, and several methods based on diffusion approximation (e.g., ∂a∂i) and ordinary differential equations (e.g., moments) have been developed and applied for demographic inference. These methods provide an opportunity to simulate AFS under a variety of researcher-specified demographic models and to estimate the best model and associated parameters using likelihood-based local optimizations. However, there are no known algorithms to perform global searches of demographic models with a given AFS. Results: Here, we introduce a new method that implements a global search using a genetic algorithm for the automatic and unsupervised inference of demographic history from joint AFS data. Our method is implemented in the software GADMA (Genetic Algorithm for Demographic Model Analysis, https://github.com/ctlab/GADMA). Conclusions: We demonstrate the performance of GADMA by applying it to sequence data from humans and non-model organisms and show that it is able to automatically infer a demographic model close to or even better than the one that was previously obtained manually. Moreover, GADMA is able to infer multiple demographic models at different local optima close to the global one, providing a larger set of possible scenarios to further explore demographic history.
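As a concrete reading of the AFS definition given in this abstract, the following sketch counts, for a single population, how many polymorphic sites carry the derived allele in exactly k of n sampled haplotypes. It is a minimal illustration under stated assumptions and does not use GADMA, ∂a∂i, or moments: the input format (a list of 0/1 haplotype rows, with 1 marking the derived allele) and the function name are hypothetical.

```python
from collections import Counter

def allele_frequency_spectrum(haplotypes):
    """Return [s_1, ..., s_(n-1)], where s_k is the number of sites at which
    exactly k of the n haplotypes carry the derived allele (coded as 1).

    Sites where the derived allele is absent (k = 0) or fixed (k = n) are
    not polymorphic in the sample and are left out of the spectrum.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Derived-allele count at each site, tallied across all haplotypes.
    counts = Counter(sum(h[j] for h in haplotypes) for j in range(n_sites))
    return [counts.get(k, 0) for k in range(1, n)]

if __name__ == "__main__":
    # Toy sample: 4 haplotypes (rows) by 4 sites (columns).
    sample = [
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    print(allele_frequency_spectrum(sample))  # [2, 0, 1]
```

In the toy sample, two sites are singletons, none are doubletons, one site has three derived copies, and the fourth site is monomorphic and therefore not counted. A joint AFS for several populations extends the same tally to one count per population at each site.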

