Efficiently inferring the demographic history of many populations with allele count data

2018
Author(s): John A. Kamm, Jonathan Terhorst, Richard Durbin, Yun S. Song

Abstract: The sample frequency spectrum (SFS), or histogram of allele counts, is an important summary statistic in evolutionary biology, and is often used to infer the history of population size changes, migrations, and other demographic events affecting a set of populations. The expected multipopulation SFS under a given demographic model can be efficiently computed when the populations in the model are related by a tree, scaling to hundreds of populations. Admixture, back-migration, and introgression are common natural processes that violate the assumption of a tree-like population history, however, and until now the expected SFS could be computed for only a handful of populations when the demographic history is not a tree. In this article, we present a new method for efficiently computing the expected SFS, and linear functionals of it, for demographies described by general directed acyclic graphs. This method can scale to more populations than previously possible for complex demographic histories including admixture. We apply our method to an 8-population SFS to estimate the timing and strength of a proposed “basal Eurasian” admixture event in human history. We implement and release our method in a new open-source software package, momi2.
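To make the definition of the SFS as a "histogram of allele counts" concrete, here is a minimal sketch in Python with NumPy. It is not the momi2 API. The function name `sample_frequency_spectrum`, the 0/1 haplotype-matrix layout (rows are sites, columns are sampled haplotypes), and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sample_frequency_spectrum(haplotypes):
    """Observed SFS of a 0/1 matrix (rows = sites, columns = sampled haplotypes).

    Hypothetical helper for illustration only: entry k of the result is the
    number of sites at which exactly k sampled haplotypes carry the derived allele.
    """
    haplotypes = np.asarray(haplotypes)
    n = haplotypes.shape[1]                    # sample size
    derived_counts = haplotypes.sum(axis=1)    # derived-allele count at each site
    return np.bincount(derived_counts, minlength=n + 1)

# Toy data (assumed): 5 sites observed in 4 haplotypes.
haplotypes = np.array([
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
])
sfs = sample_frequency_spectrum(haplotypes)
print(sfs)
```

Entries 0 and n of the result count sites that are monomorphic in the sample; whether to keep them depends on the analysis.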

2018
Author(s): Adamandia Kapopoulou, Susanne P. Pfeifer, Jeffrey D. Jensen, Stefan Laurent

Abstract: Drosophila melanogaster is one of the most commonly utilized organisms in the study of local adaptation, and an accurate characterization of its demographic history remains an important research question. This owes both to the inherent interest in characterizing the population history of this model organism and to the well-established importance of an accurate null demographic model for increasing power and decreasing false positive rates in genomic scans for positive selection. While considerable attention has been afforded to this issue in non-African populations, less is known about the demographic history of African populations, including those from the ancestral range of the species. While qualitative predictions and hypotheses have previously been put forward, we here present a quantitative model fit of the population history characterizing both the ancestral Zambian population range and the subsequently colonized west African populations, which themselves served as the source of multiple non-African colonization events. These parameter estimates thus represent an important null model for future investigations into African and non-African D. melanogaster populations alike.


2010, Vol 60 (4), pp. 449-465
Author(s): Wen Longying, Zhang Lixun, An Bei, Luo Huaxing, Liu Naifa, ...

Abstract: We have used phylogeographic methods to investigate the genetic structure and population history of the endangered Himalayan snowcock (Tetraogallus himalayensis) in northwestern China. The mitochondrial cytochrome b gene was sequenced in 102 individuals sampled throughout the distribution range. In total, we found 26 different haplotypes defined by 28 polymorphic sites. Phylogenetic analyses indicated that the samples were divided into two major haplogroups corresponding to one western and one eastern clade. The divergence time between these major clades was estimated to be approximately one million years. An analysis of molecular variance showed that 40% of the total genetic variability was found within local populations, 12% among populations within regional groups and 48% among groups. An analysis of the demographic history of the populations suggested that major expansions have occurred in the Himalayan snowcock populations and that these correlate mainly with the first and the second largest glaciations during the Pleistocene. In addition, the data indicate that there was a population expansion of the Tianshan population during the uplift of the Qinghai-Tibet Plateau, approximately 2 million years ago.


2013, Vol 59 (4), pp. 458-474
Author(s): Sen Song, Shijie Bao, Ying Wang, Xinkang Bao, Bei An, ...

Abstract: Pleistocene climate fluctuations have shaped the patterns of genetic diversity observed in extant species. Although the effects of recent glacial cycles on genetic diversity have been well studied in species in Europe and North America, the genetic legacy of the Pleistocene for species in north and northwest China, where glaciations were not synchronous with ice sheet development in the Northern Hemisphere or where there was little or no ice cover during the glacial periods, remains poorly understood. Here we used phylogeographic methods to investigate the genetic structure and population history of the chukar partridge Alectoris chukar in north and northwest China. A 1,152–1,154 bp portion of the mtDNA CR was sequenced for all 279 specimens, and a total of 91 haplotypes were defined by 113 variable sites. High levels of gene flow were found, and gene flow estimates were greater than 1 for most population pairs in our study. The AMOVA analysis showed that 81% and 16% of the total genetic variability was found within populations and among populations within groups, respectively. The demographic history of chukar was examined using neutrality tests and mismatch distribution analyses, and the results indicated Late Pleistocene population expansion. Most populations of chukar experienced population expansion between 0.027 and 0.06 Ma. These results are at odds with the results found in Europe and North America, where population expansions occurred after the Last Glacial Maximum (LGM, 0.023 to 0.018 Ma). Our results are not consistent with the results from avian species of the Tibetan Plateau either, where species experienced population expansion following the retreat of the extensive glaciation period (0.5 to 0.175 Ma).


PeerJ, 2016, Vol 4, pp. e1910
Author(s): Quentin Rougemont, Camille Roux, Samuel Neuenschwander, Jerome Goudet, Sophie Launey, ...

Inferring the history of isolation and gene flow during species divergence is a central question in evolutionary biology. The European river lamprey (Lampetra fluviatilis) and brook lamprey (L. planeri) show low reproductive isolation but have highly distinct life histories, the former being parasitic-anadromous and the latter non-parasitic and freshwater resident. Here we used microsatellite data from six replicated population pairs to reconstruct their history of divergence using an approximate Bayesian computation framework combined with a random forest model. In most population pairs, scenarios of divergence with recent isolation were outcompeted by scenarios proposing ongoing gene flow, namely the Secondary Contact (SC) and Isolation with Migration (IM) models. The estimation of demographic parameters under the SC model indicated a time of secondary contact close to the time of speciation, explaining why SC and IM models could not be discriminated. In the case of an ancient secondary contact, the historical signal of divergence is lost and neutral markers converge to the same equilibrium as under the less parameterized model allowing ongoing gene flow. Our results imply that models of secondary contact should be systematically compared to models of divergence with gene flow; given the difficulty of discriminating among these models, we suggest that genome-wide data are needed to adequately reconstruct divergence history.


GigaScience, 2020, Vol 9 (3)
Author(s): Ekaterina Noskova, Vladimir Ulyantsev, Klaus-Peter Koepfli, Stephen J O’Brien, Pavel Dobrynin

Abstract: Background: The demographic history of any population is imprinted in the genomes of the individuals that make up the population. One of the most popular and convenient representations of genetic information is the allele frequency spectrum (AFS), the distribution of allele frequencies in populations. The joint AFS is commonly used to reconstruct the demographic history of multiple populations, and several methods based on diffusion approximation (e.g., ∂a∂i) and ordinary differential equations (e.g., moments) have been developed and applied for demographic inference. These methods provide an opportunity to simulate AFS under a variety of researcher-specified demographic models and to estimate the best model and associated parameters using likelihood-based local optimizations. However, there are no known algorithms to perform global searches of demographic models with a given AFS. Results: Here, we introduce a new method that implements a global search using a genetic algorithm for the automatic and unsupervised inference of demographic history from joint AFS data. Our method is implemented in the software GADMA (Genetic Algorithm for Demographic Model Analysis, https://github.com/ctlab/GADMA). Conclusions: We demonstrate the performance of GADMA by applying it to sequence data from humans and non-model organisms and show that it is able to automatically infer a demographic model close to or even better than the one that was previously obtained manually. Moreover, GADMA is able to infer multiple demographic models at different local optima close to the global one, providing a larger set of possible scenarios to further explore demographic history.
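As an illustration of what a joint AFS is, the sketch below tabulates a two-population spectrum from per-site derived-allele counts. It is a minimal NumPy example under stated assumptions and does not use the ∂a∂i, moments, or GADMA interfaces. The function name `joint_afs`, its arguments, and the toy counts are hypothetical.

```python
import numpy as np

def joint_afs(counts_a, counts_b, n_a, n_b):
    """Two-population joint AFS from per-site derived-allele counts.

    Hypothetical helper for illustration only: entry [i, j] of the result is
    the number of sites with i derived copies in population A (sample size
    n_a) and j derived copies in population B (sample size n_b).
    """
    counts_a = np.asarray(counts_a)
    counts_b = np.asarray(counts_b)
    afs = np.zeros((n_a + 1, n_b + 1), dtype=int)
    # np.add.at accumulates correctly when the same (i, j) pair occurs repeatedly.
    np.add.at(afs, (counts_a, counts_b), 1)
    return afs

# Toy data (assumed): derived-allele counts at 5 sites in each population.
pop_a = [0, 1, 2, 1, 2]   # sample of 2 chromosomes
pop_b = [1, 1, 0, 1, 3]   # sample of 3 chromosomes
jafs = joint_afs(pop_a, pop_b, n_a=2, n_b=3)
print(jafs)
```

Inference tools of the kind described above compare an observed table like this with the spectrum expected under a candidate demographic model; that comparison step is not shown here.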


2013, Vol 8 (9), pp. 854-875
Author(s): Adrianna Kilikowska, Anna Wysocka, Artur Burzyński, Goce Kostoski, Joanna Rychlińska, ...

Abstract: Ancient lakes, as places of extensive speciation processes, are characterized by a high degree of endemicity and biodiversity. The most outstanding European ancient lake is the oligotrophic and karstic Balkan Lake Ohrid. The lake is inhabited by a number of endemic species, but their evolutionary history is largely unresolved. In the present study, the genetic structure, gene genealogy and demographic history of the representatives of the Ohridian endemic Proasellus species were studied using both biparentally (allozyme loci) and maternally (partial mitochondrial cytochrome oxidase subunit I gene) inherited markers. Both data sets gave similar results and supported discrepancies among genetic differentiation, the current morphology-based taxonomy and bathymetric segregation. The horizontal distribution of endemic Proasellus species (Lake Ohrid vs adjacent feeder springs) presumably promotes parapatric speciation, whereas a major role of vertical barriers in diversification processes was not fully supported. The analyses of demographic history suggested a decline of endemic isopod populations. The radiation of endemic Proasellus populations within the lake could have started from the sublittoral/profundal zone towards the littoral or in the opposite direction, from the littoral to the profundal. Our analyses could not exclude either possibility.


Heredity, 2021
Author(s): Armando Arredondo, Beatriz Mourato, Khoa Nguyen, Simon Boitard, Willy Rodríguez, ...

Abstract: Inferring the demographic history of species is one of the greatest challenges in population genetics. This history is often represented as a history of size changes, ignoring population structure. Alternatively, when structure is assumed, it is defined a priori as a population tree and not inferred. Here we propose a framework based on the IICR (Inverse Instantaneous Coalescence Rate). The IICR can be estimated for a single diploid individual using the PSMC method of Li and Durbin (2011). For an isolated panmictic population, the IICR matches the population size history, and this is how the PSMC outputs are generally interpreted. However, it is increasingly acknowledged that the IICR is a function of the demographic model and sampling scheme with limited connection to population size changes. Our method fits observed IICR curves of diploid individuals with IICR curves obtained under piecewise stationary symmetrical island models. In our models we assume a fixed number of time periods during which gene flow is constant, but gene flow is allowed to change between time periods. We infer the number of islands, their sizes, the periods at which connectivity changes and the corresponding rates of connectivity. Validation with simulated data showed that the method can accurately recover most of the scenario parameters. Our application to a set of five human PSMCs yielded demographic histories that are in agreement with previous studies using similar methods and with recent research suggesting ancient human structure. They are in contrast with the view of human evolution consisting of one ancestral population branching into three large continental and panmictic populations with varying degrees of connectivity and no population structure within each continent.


Heredity, 2021
Author(s): Kristy Mualim, Christoph Theunert, Montgomery Slatkin

Abstract: We present a method called the G(A|B) method for estimating coalescence probabilities within population lineages from genome sequences when one individual is sampled from each population. Population divergence times can be estimated from these coalescence probabilities if additional assumptions about the history of population sizes are made. Our method is based on a method presented by Rasmussen et al. (2014) to test whether an archaic genome is from a population directly ancestral to a present-day population. The G(A|B) method does not require distinguishing ancestral from derived alleles or assumptions about demographic history before population divergence. We discuss the relationship of our method to two similar methods, one introduced by Green et al. (2010) and called the F(A|B) method and the other introduced by Schlebusch et al. (2017) and called the TT method. When our method is applied to individuals from three or more populations, it provides a test of whether the population history is treelike because coalescence probabilities are additive on a tree. We illustrate the use of our method by applying it to three high-coverage archaic genomes, two Neanderthals (Vindija and Altai) and a Denisovan.


2019, Vol 37 (4), pp. 994-1006
Author(s): María C Ávila-Arcos, Kimberly F McManus, Karla Sandoval, Juan Esteban Rodríguez-Rodríguez, Viridiana Villa-Islas, ...

Abstract: Native American genetic variation remains underrepresented in most catalogs of human genome sequencing data. Previous genotyping efforts have revealed that Mexico’s Indigenous population is highly differentiated and substructured, thus potentially harboring higher proportions of private genetic variants of functional and biomedical relevance. Here we have targeted the coding fraction of the genome and characterized its full site frequency spectrum by sequencing 76 exomes from five Indigenous populations across Mexico. Using diffusion approximations, we modeled the demographic history of Indigenous populations from Mexico, with northern and southern ethnic groups splitting 7.2 KYA and subsequently diverging locally 6.5 and 5.7 KYA, respectively. Scans for positive selection revealed the BCL2L13 and KBTBD8 genes as potential candidates for adaptive evolution in Rarámuris and Triquis, respectively. BCL2L13 is highly expressed in skeletal muscle and could be related to physical endurance, a well-known phenotype of the northern Mexico Rarámuri. The KBTBD8 gene has been associated with idiopathic short stature, and we found it to be highly differentiated in Triqui, a southern Indigenous group from Oaxaca whose height is extremely low compared to other Native populations.


2020
Author(s): Armando Arredondo, Beatriz Mourato, Khoa Nguyen, Simon Boitard, Willy Rodríguez, ...

Abstract: Inferring the demographic history of species is one of the greatest challenges in population genetics. This history is often represented as a history of size changes, thus ignoring population structure. Alternatively, structure is defined a priori as a population tree and not inferred. Here we propose a framework based on the IICR (Inverse Instantaneous Coalescence Rate), which can be estimated using the PSMC method of Li and Durbin (2011) for a single diploid individual. For an isolated population, the IICR matches the population size history, which is how the PSMC outputs are generally interpreted. However, it is increasingly acknowledged that the IICR is a function of the demographic model and sampling scheme. Our automated method fits observed IICR curves of diploid individuals with IICR curves obtained under piecewise-stationary symmetrical island models, in which we assume a fixed number of time periods during which gene flow is constant. We infer the number of islands, their sizes, the periods at which connectivity changes and the corresponding rates of connectivity. Validation with simulated data showed that the method can accurately recover most of the scenario parameters. Our application to a set of five human PSMCs yielded demographic histories that are in agreement with previous studies using similar methods and with recent research suggesting ancient human structure. They are in contrast with the widely accepted view of human evolution consisting of one ancestral population branching into three large continental and panmictic populations with varying degrees of connectivity and no population structure within each continent.

