Comparison of single genome and allele frequency data reveals discordant demographic histories

2017 ◽  
Author(s):  
Annabel C. Beichman ◽  
Tanya N. Phung ◽  
Kirk E. Lohmueller

ABSTRACT Inference of demographic history from genetic data is a primary goal of population genetics of model and non-model organisms. Whole genome-based approaches such as the Pairwise/Multiple Sequentially Markovian Coalescent (PSMC/MSMC) methods use genomic data from one to four individuals to infer the demographic history of an entire population, while site frequency spectrum (SFS)-based methods use the distribution of allele frequencies in a sample to reconstruct the same historical events. Although both methods are extensively used in empirical studies and perform well on data simulated under simple models, there have been only limited comparisons of them in more complex and realistic settings. Here we use published demographic models based on data from three human populations (Yoruba (YRI), descendants of northwest-Europeans (CEU), and Han Chinese (CHB)) as an empirical test case to study the behavior of both inference procedures. We find that several of the demographic histories inferred by the whole genome-based methods do not predict the genome-wide distribution of heterozygosity nor do they predict the empirical SFS. However, using simulated data, we also find that the whole genome methods can reconstruct the complex demographic models inferred by SFS-based methods, suggesting that the discordant patterns of genetic variation are not attributable to a lack of statistical power, but may reflect unmodeled complexities in the underlying demography. More generally, our findings indicate that demographic inference from a small number of genomes, routine in genomic studies of non-model organisms, should be interpreted cautiously, as these models cannot recapitulate other summaries of the data.
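
The abstract above describes the SFS as the distribution of allele frequencies in a sample. As a minimal illustration of that summary statistic (not the authors' pipeline), the sketch below tabulates an unfolded SFS from toy data. The function name `site_frequency_spectrum`, the 0/1 allele encoding (0 = ancestral, 1 = derived), and the toy sites are all assumptions made for this example.

```python
from collections import Counter


def site_frequency_spectrum(sites):
    """Return the unfolded SFS of a sample.

    `sites` is a list of sites; each site is a list of 0/1 alleles
    (0 = ancestral, 1 = derived), one entry per sampled chromosome.
    Entry i-1 of the result is the number of sites at which the derived
    allele appears on exactly i chromosomes, for i = 1 .. n-1.
    """
    if not sites:
        return []
    n = len(sites[0])
    counts = Counter()
    for site in sites:
        if len(site) != n:
            raise ValueError("every site must cover the same n chromosomes")
        counts[sum(site)] += 1
    # Monomorphic sites (derived count 0 or n) are left out of the spectrum.
    return [counts[i] for i in range(1, n)]


# Hypothetical toy data: 6 sites observed on n = 4 chromosomes.
toy_sites = [
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],  # monomorphic, not counted
    [0, 0, 1, 0],
]
sfs = site_frequency_spectrum(toy_sites)
print(sfs)  # [3, 1, 1]
```

Three sites are singletons, one is a doubleton, and one is a tripleton, so the spectrum is `[3, 1, 1]`. SFS-based inference compares a spectrum like this with the one expected under a candidate demographic model.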

Genetics ◽  
2020 ◽  
Vol 214 (4) ◽  
pp. 1019-1030 ◽  
Author(s):  
Raul Torres ◽  
Markus G. Stetter ◽  
Ryan D. Hernandez ◽  
Jeffrey Ross-Ibarra

Neutral genetic diversity across the genome is determined by the complex interplay of mutation, demographic history, and natural selection. While the direct action of natural selection is limited to functional loci across the genome, its impact can have effects on nearby neutral loci due to genetic linkage. These effects of selection at linked sites, referred to as genetic hitchhiking and background selection (BGS), are pervasive across natural populations. However, only recently has there been a focus on the joint consequences of demography and selection at linked sites, and some empirical studies have come to apparently contradictory conclusions as to their combined effects. To understand the relationship between demography and selection at linked sites, we conducted an extensive forward simulation study of BGS under a range of demographic models. We found that the relative levels of diversity in BGS and neutral regions vary over time and that the initial dynamics after a population size change are often in the opposite direction of the long-term expected trajectory. Our detailed observations of the temporal dynamics of neutral diversity in the context of selection at linked sites in nonequilibrium populations provide new intuition about why patterns of diversity under BGS vary through time in natural populations and help reconcile previously contradictory observations. Most notably, our results highlight that classical models of BGS are poorly suited for predicting diversity in nonequilibrium populations.


2019 ◽  
Author(s):  
Ke Wang ◽  
Iain Mathieson ◽  
Jared O’Connell ◽  
Stephan Schiffels

Abstract The genetic diversity of humans, like many species, has been shaped by a complex pattern of population separations followed by isolation and subsequent admixture. This pattern, reaching at least as far back as the appearance of our species in the paleontological record, has left its traces in our genomes. Reconstructing a population’s history from these traces is a challenging problem. Here we present a novel approach based on the Multiple Sequentially Markovian Coalescent (MSMC) to analyse the population separation history. Our approach, called MSMC-IM, uses an improved implementation of the MSMC (MSMC2) to estimate coalescence rates within and across pairs of populations, and then fits a continuous Isolation-Migration model to these rates to obtain a time-dependent estimate of gene flow. We show, using simulations, that our method can identify complex demographic scenarios involving post-split admixture or archaic introgression. We apply MSMC-IM to whole genome sequences from 15 worldwide populations, tracking the process of human genetic diversification. We detect traces of extremely deep ancestry between some African populations, with around 1% of ancestry dating to divergences older than a million years ago. Author Summary Human demographic history is reflected in specific patterns of shared mutations between the genomes from different populations. Here we aim to unravel this pattern to infer population structure through time with a new approach, called MSMC-IM. Based on estimates of coalescence rates within and across populations, MSMC-IM fits a time-dependent migration model to the pairwise rate of coalescences. We implemented this approach as an extension to existing software (MSMC2), and tested it with simulations exhibiting different histories of admixture and gene flow. We then applied it to the genomes from 15 worldwide populations to reveal their pairwise separation history ranging from a few thousand up to several million years ago. Among other results, we find evidence for remarkably deep population structure in some African population pairs, suggesting that deep ancestry dating to one million years ago and older is still present in human populations in small amounts today.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Ekaterina Noskova ◽  
Vladimir Ulyantsev ◽  
Klaus-Peter Koepfli ◽  
Stephen J O’Brien ◽  
Pavel Dobrynin

Abstract Background: The demographic history of any population is imprinted in the genomes of the individuals that make up the population. One of the most popular and convenient representations of genetic information is the allele frequency spectrum (AFS), the distribution of allele frequencies in populations. The joint AFS is commonly used to reconstruct the demographic history of multiple populations, and several methods based on diffusion approximation (e.g., ∂a∂i) and ordinary differential equations (e.g., moments) have been developed and applied for demographic inference. These methods provide an opportunity to simulate AFS under a variety of researcher-specified demographic models and to estimate the best model and associated parameters using likelihood-based local optimizations. However, there are no known algorithms to perform global searches of demographic models with a given AFS. Results: Here, we introduce a new method that implements a global search using a genetic algorithm for the automatic and unsupervised inference of demographic history from joint AFS data. Our method is implemented in the software GADMA (Genetic Algorithm for Demographic Model Analysis, https://github.com/ctlab/GADMA). Conclusions: We demonstrate the performance of GADMA by applying it to sequence data from humans and non-model organisms and show that it is able to automatically infer a demographic model close to or even better than the one that was previously obtained manually. Moreover, GADMA is able to infer multiple demographic models at different local optima close to the global one, providing a larger set of possible scenarios to further explore demographic history.
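
To make the joint AFS mentioned above concrete, here is a minimal sketch that tabulates a two-population joint spectrum from toy data. It is not the data structure or API of ∂a∂i, moments, or GADMA; the function name `joint_afs`, the 0/1 allele encoding (1 = derived), and the toy samples are assumptions made for this example.

```python
def joint_afs(pop1_sites, pop2_sites):
    """Return the joint AFS of two populations as a nested list.

    Both arguments list the same sites in the same order; each site is a
    list of 0/1 alleles (1 = derived), one entry per sampled chromosome.
    table[i][j] is the number of sites with derived-allele count i in
    population 1 and j in population 2.
    """
    if len(pop1_sites) != len(pop2_sites):
        raise ValueError("both populations must cover the same sites")
    if not pop1_sites:
        return []
    n1, n2 = len(pop1_sites[0]), len(pop2_sites[0])
    table = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for s1, s2 in zip(pop1_sites, pop2_sites):
        table[sum(s1)][sum(s2)] += 1
    return table


# Hypothetical toy data: 4 sites, 2 chromosomes sampled per population.
pop1 = [[1, 0], [0, 0], [1, 1], [1, 0]]
pop2 = [[0, 0], [1, 1], [1, 0], [0, 0]]
jafs = joint_afs(pop1, pop2)
for row in jafs:
    print(row)
```

Here `jafs[1][0] == 2` because two sites carry a single derived copy in population 1 and none in population 2. Model-fitting tools compare a table like this with the one expected under a candidate demographic model.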


2018 ◽  
Author(s):  
Valentin Thouzeau ◽  
Antonin Affholder ◽  
Philippe Mennecier ◽  
Paul Verdu ◽  
Frédéric Austerlitz

Abstract Historical linguistics has benefited greatly from recent methodological advances inspired by phylogenetics. Nevertheless, no currently available method uses contemporaneous within-population linguistic diversity to reconstruct the history of human populations. Here, we develop an approach inspired by population genetics to perform historical linguistic inferences from linguistic data sampled at the individual scale, within a population. We built four demographic models of linguistic transmission at this scale, each model differing by the number of teachers involved during the language acquisition, and the relative roles of these teachers. We then compared the simulated data obtained with these models with real contemporaneous linguistic data sampled in Tajik speakers in Central Asia, an area known for its high within-population linguistic diversity, using approximate Bayesian computation methods. With these statistical methods, we were able to select the models that best explained the data, and inferred the best-fitting parameters under these selected models, demonstrating the feasibility of using contemporaneous within-population linguistic diversity to infer historical features of human cultural evolution.


2019 ◽  
Author(s):  
Xiaofeng Zhu ◽  
Xumin Ni ◽  
Mengshi Zhou ◽  
Heming Wang ◽  
Karen He ◽  
...  

Abstract Background: Fitness epistasis, the interaction effect of genes at different loci on fitness, has an important contribution for adaptive evolution. Although fitness interaction evidence has been observed in model organisms, it is less detectable and remains poorly understood in human populations owing to the limited statistical power and experimental constraints. Fitness epistasis is inferred from non-independence between unlinked loci. We previously observed ancestral block correlation between chromosomes 4 and 6 in African Americans. The same approach fails when examining ancestral blocks on the same chromosome due to strong confounding effect in a recently admixed population. Results: We developed a novel approach to eliminate the bias caused by admixture linkage disequilibrium when searching for fitness epistasis on the same chromosome. We applied this approach in 16,252 unrelated African Americans and identified significant ancestral correlations in two pairs of genomic regions (P-value < 8.11×10⁻⁷) on chromosomes 1 and 10. The ancestral correlations were not explained by population admixture. Historical African-European crossover events are reduced between pair of epistatic regions. We observed multiple pairs of co-expressed genes between the two regions on each chromosome, including ADAR being co-expressed with IFI44 in almost all tissues and DARC being co-expressed with VCAM1, S1PR1 and ELTD1 in multiple tissues in GTEx. Moreover, the co-expressed gene pairs are associated with the same diseases/traits in the GWAS Catalog, such as white blood cell count, blood pressure, lung function, inflammatory bowel disease and educational attainment. Conclusions: Our analyses revealed two instances of fitness epistasis on chromosomes 1 and 10, and the findings suggest a potential approach to better understand adaptive evolution.


2020 ◽  
Author(s):  
Xumin Ni ◽  
Mengshi Zhou ◽  
Heming Wang ◽  
Karen He ◽  
Uli Broeckel ◽  
...  

Abstract Background: Fitness epistasis, the interaction effect of genes at different loci on fitness, has an important contribution for adaptive evolution. Although fitness interaction evidence has been observed in model organisms, it is less detectable and remains poorly understood in human populations owing to the limited statistical power and experimental constraints. Fitness epistasis is inferred from non-independence between unlinked loci. We previously observed ancestral block correlation between chromosomes 4 and 6 in African Americans. The same approach fails when examining ancestral blocks on the same chromosome due to strong confounding effect in a recently admixed population. Results: We developed a novel approach to eliminate the bias caused by admixture linkage disequilibrium when searching for fitness epistasis on the same chromosome. We applied this approach in 16,252 unrelated African Americans and identified significant ancestral correlations in two pairs of genomic regions (P-value < 8.11×10⁻⁷) on chromosomes 1 and 10. The ancestral correlations were not explained by population admixture. Historical African-European crossover events are reduced between pair of epistatic regions. We observed multiple pairs of co-expressed genes between the two regions on each chromosome, including ADAR being co-expressed with IFI44 in almost all tissues and DARC being co-expressed with VCAM1, S1PR1 and ELTD1 in multiple tissues in The Genotype-Tissue Expression (GTEx) data. Moreover, the co-expressed gene pairs are associated with the same diseases/traits in the GWAS Catalog, such as white blood cell count, blood pressure, lung function, inflammatory bowel disease and educational attainment. Conclusions: Our analyses revealed two instances of fitness epistasis on chromosomes 1 and 10, and the findings suggest a potential approach to better understand adaptive evolution.


2019 ◽  
Author(s):  
Raul Torres ◽  
Markus G Stetter ◽  
Ryan D Hernandez ◽  
Jeffrey Ross-Ibarra

ABSTRACT Neutral genetic diversity across the genome is determined by the complex interplay of mutation, demographic history, and natural selection. While the direct action of natural selection is limited to functional loci across the genome, its impact can have effects on nearby neutral loci due to genetic linkage. These effects of selection at linked sites, referred to as genetic hitchhiking and background selection (BGS), are pervasive across natural populations. However, only recently has there been a focus on the joint consequences of demography and selection at linked sites, and empirical studies have sometimes come to apparently contradictory conclusions as to their combined effects. In order to understand the relationship between demography and selection at linked sites, we conducted an extensive forward simulation study of BGS under a range of demographic models. We found that the relative levels of diversity in BGS and neutral regions vary over time and that the initial dynamics after a population size change are often in the opposite direction of the long-term expected trajectory. Our detailed observations of the temporal dynamics of neutral diversity in the context of selection at linked sites in non-equilibrium populations provide new intuition about why patterns of diversity under BGS vary through time in natural populations and help reconcile previously contradictory observations. Most notably, our results highlight that classical models of BGS are poorly suited for predicting diversity in non-equilibrium populations.


2020 ◽  
Author(s):  
Xumin Ni ◽  
Mengshi Zhou ◽  
Heming Wang ◽  
Karen He ◽  
Uli Broeckel ◽  
...  

Abstract Background: Fitness epistasis, the interaction effect of genes at different loci on fitness, has an important contribution for adaptive evolution. Although fitness interaction evidence has been observed in model organisms, it is less detectable and remains poorly understood in human populations owing to the limited statistical power and experimental constraints. Fitness epistasis is inferred from non-independence between unlinked loci. We previously observed ancestral block correlation between chromosomes 4 and 6 in African Americans. The same approach fails when examining ancestral blocks on the same chromosome due to strong confounding effect in a recently admixed population. Results: We developed a novel approach to eliminate the bias caused by admixture linkage disequilibrium when searching for fitness epistasis on the same chromosome. We applied this approach in 16,252 unrelated African Americans and identified significant ancestral correlations in two pairs of genomic regions (P-value < 8.11×10⁻⁷) on chromosomes 1 and 10. The ancestral correlations were not explained by population admixture. Historical African-European crossover events are reduced between pair of epistatic regions. We observed multiple pairs of co-expressed genes between the two regions on each chromosome, including ADAR being co-expressed with IFI44 in almost all tissues and DARC being co-expressed with VCAM1, S1PR1 and ELTD1 in multiple tissues in GTEx. Moreover, the co-expressed gene pairs are associated with the same diseases/traits in the GWAS Catalog, such as white blood cell count, blood pressure, lung function, inflammatory bowel disease and educational attainment. Conclusions: Our analyses revealed two instances of fitness epistasis on chromosomes 1 and 10, and the findings suggest a potential approach to better understand adaptive evolution.

