Properties and unbiased estimation of F- and D-statistics in samples containing related and inbred individuals

2020 ◽  
Author(s):  
Mehreen R. Mughal ◽  
Michael DeGiorgio

Abstract The Patterson F- and D-statistics are commonly used measures for quantifying population relationships and for testing hypotheses about demographic history. These statistics make use of allele frequency information across populations to infer different aspects of population history, such as population structure and introgression events. Inclusion of related or inbred individuals can bias such statistics, which often leads to the filtering of such individuals. Here we derive statistical properties of the F- and D-statistics, including their biases due to finite sample size or the inclusion of related or inbred individuals, their variances, and their corresponding mean squared errors. Moreover, for those statistics that are biased, we develop unbiased estimators and evaluate the variances of these new quantities. Comparisons of the new unbiased statistics to the originals demonstrate that our newly derived statistics often have lower error across a wide population parameter space. Furthermore, we apply these unbiased estimators using several global human populations with the inclusion of related individuals to highlight their application on an empirical dataset. Finally, we implement these unbiased estimators in the open-source software package funbiased for easy application by the scientific community.
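
To make the finite-sample bias mentioned in the abstract concrete, the following is a minimal sketch of the widely used sampling correction for the F2 statistic, assuming unrelated, non-inbred samples. It is not the estimator derived in the paper and not the funbiased API; the function names and the convention that `n1` and `n2` count sampled allele copies (constant across sites) are assumptions made for illustration.

```python
from typing import Sequence


def f2_naive(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Plug-in F2: mean squared difference of sample allele frequencies."""
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1)


def f2_corrected(p1: Sequence[float], n1: int,
                 p2: Sequence[float], n2: int) -> float:
    """F2 with the standard finite-sample correction.

    Assumes n1 and n2 are the numbers of sampled allele copies
    (hypothetical convention: twice the number of diploid individuals),
    both greater than 1, and that the samples are unrelated and not inbred.
    """
    total = 0.0
    for a, b in zip(p1, p2):
        # Subtract an unbiased estimate of each population's sampling variance.
        total += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
    return total / len(p1)
```

The plug-in value is inflated by the sampling variance of each frequency estimate, so the corrected value can be slightly negative when the two populations have identical frequencies. Relatedness and inbreeding change that sampling variance, which is why a different correction is needed in the setting the paper addresses.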

2016 ◽  
Author(s):  
Yuval B. Simons ◽  
Guy Sella

Abstract Over the past decade, there has been both great interest and confusion about whether recent demographic events, notably the Out-of-Africa bottleneck and recent population growth, have led to differences in mutation load among human populations. The confusion can be traced to the use of different summary statistics to measure load, which lead to apparently conflicting results. We argue, however, that when statistics more directly related to load are used, the results of different studies and data sets consistently reveal little or no difference in the load of non-synonymous mutations among human populations. Theory helps to understand why no such differences are seen, as well as to predict in what settings they are to be expected. In particular, as predicted by modeling, there is evidence for changes in the load of recessive loss-of-function mutations in founder and inbred human populations. Also as predicted, eastern subspecies of gorilla, Neanderthals and Denisovans, who are thought to have undergone reductions in population sizes that exceed the human Out-of-Africa bottleneck in duration and severity, show evidence for increased load of non-synonymous mutations (relative to western subspecies of gorillas and modern humans, respectively). A coherent picture is thus starting to emerge about the effects of demographic history on the mutation load in populations of humans and close evolutionary relatives.


2018 ◽  
Author(s):  
Leonardo Arias ◽  
Roland Schröder ◽  
Alexander Hübner ◽  
Guillermo Barreto ◽  
Mark Stoneking ◽  
...  

Abstract Human populations often exhibit contrasting patterns of genetic diversity in the mtDNA and the non-recombining portion of the Y-chromosome (NRY), which reflect sex-specific cultural behaviors and population histories. Here, we sequenced 2.3 Mb of the NRY from 284 individuals representing more than 30 Native-American groups from Northwestern Amazonia (NWA) and compared these data to previously generated mtDNA genomes from the same groups, to investigate the impact of cultural practices on genetic diversity and gain new insights about NWA population history. Relevant cultural practices in NWA include postmarital residential rules and linguistic exogamy, a marital practice in which men are required to marry women speaking a different language. We identified 2,969 SNPs in the NRY sequences; only 925 SNPs were previously described. The NRY and mtDNA data showed that males and females experienced different demographic histories: the female effective population size has been larger than that of males through time, and both markers show an increase in lineage diversification beginning ~5,000 years ago, with a male-specific expansion occurring ~3,500 years ago. These dates are too recent to be associated with agriculture; therefore, we propose that they reflect technological innovations and the expansion of regional trade networks documented in the archaeological evidence. Furthermore, our study provides evidence of the impact of postmarital residence rules and linguistic exogamy on genetic diversity patterns. Finally, we highlight the importance of analyzing high-resolution mtDNA and NRY sequences to reconstruct demographic history, since this can differ considerably between males and females.


1992 ◽  
Vol 74 (3) ◽  
pp. 867-873 ◽  
Author(s):  
Frank O'Brien

The author's three-parameter square-root model for the measurement of discrete spatial density in human populations was previously derived under the assumption that exact coordinate locations of the density points were available. The model, called the population density index (PDI) model, has been expanded to include a set of routines for calculating two-dimensional spatial density measures based upon in situ geometric approximations of the interobject Euclidean distance measure for any finite sample size. The derivation and specification of the algorithm for the abbreviated calculation routines are presented and exemplified. The author has been able to apply the methods of the PDI model to submarine environments at the U.S. Naval Underwater Systems Center, resulting in several U.S. Patent applications.


2018 ◽  
Author(s):  
Clare Bycroft ◽  
Ceres Fernandez-Rozadilla ◽  
Clara Ruiz-Ponte ◽  
Inés Quintela-García ◽  
Ángel Carracedo ◽  
...  

Genetic differences within or between human populations (population structure) have been studied using a variety of approaches over many years. Recently there has been an increasing focus on studying genetic differentiation at fine geographic scales, such as within countries. Identifying such structure allows the study of recent population history, and identifies the potential for confounding in association studies, particularly when testing rare, often recently arisen variants. The Iberian Peninsula is linguistically diverse, has a complex demographic history, and is unique among European regions in having a centuries-long period of Muslim rule. Previous genetic studies of Spain have examined either a small fraction of the genome or only a few Spanish regions. Thus, the overall pattern of fine-scale population structure within Spain remains uncharacterised. Here we analyse genome-wide genotyping array data for 1,413 Spanish individuals sampled from all regions of Spain. We identify extensive fine-scale structure, down to unprecedented scales, smaller than 10 km in some places. We observe a major axis of genetic differentiation that runs from east to west of the peninsula. In contrast, we observe remarkable genetic similarity in the north-south direction, and evidence of historical north-south population movement. Finally, without making particular prior assumptions about source populations, we show that modern Spanish people have regionally varying fractions of ancestry from a group most similar to modern north Moroccans. The north African ancestry results from an admixture event, which we date to 860–1120 CE, corresponding to the early half of Muslim rule. Our results indicate that it is possible to discern clear genetic impacts of the Muslim conquest and population movements associated with the subsequent Reconquista.


2018 ◽  
Author(s):  
John A. Kamm ◽  
Jonathan Terhorst ◽  
Richard Durbin ◽  
Yun S. Song

Abstract The sample frequency spectrum (SFS), or histogram of allele counts, is an important summary statistic in evolutionary biology, and is often used to infer the history of population size changes, migrations, and other demographic events affecting a set of populations. The expected multipopulation SFS under a given demographic model can be efficiently computed when the populations in the model are related by a tree, scaling to hundreds of populations. Admixture, back-migration, and introgression are common natural processes that violate the assumption of a tree-like population history, however, and until now the expected SFS could be computed for only a handful of populations when the demographic history is not a tree. In this article, we present a new method for efficiently computing the expected SFS and linear functionals of it, for demographies described by general directed acyclic graphs. This method can scale to more populations than previously possible for complex demographic histories including admixture. We apply our method to an 8-population SFS to estimate the timing and strength of a proposed “basal Eurasian” admixture event in human history. We implement and release our method in a new open-source software package, momi2.
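
As an illustration of what "histogram of allele counts" means, here is a minimal sketch that tabulates the observed single-population SFS from a small matrix of haplotypes. It does not use or imitate the momi2 API, and it computes an observed spectrum rather than the expected spectrum the article is concerned with; the function name and the 0/1 derived-allele encoding are assumptions made for this example.

```python
from typing import List, Sequence


def sample_frequency_spectrum(sites: Sequence[Sequence[int]]) -> List[int]:
    """Return the observed (unfolded) SFS for one population.

    `sites` holds one row per variant site; each row has one entry per
    sampled haplotype, coded 0 (ancestral) or 1 (derived). This encoding
    is an assumption of the sketch. Entry k of the result is the number
    of sites at which exactly k haplotypes carry the derived allele.
    """
    n = len(sites[0])          # number of sampled haplotypes
    sfs = [0] * (n + 1)        # derived-allele counts range from 0 to n
    for row in sites:
        if len(row) != n:
            raise ValueError("all sites must have the same number of haplotypes")
        sfs[sum(row)] += 1     # derived-allele count at this site
    return sfs


haplotype_matrix = [
    [0, 1, 0, 0],  # singleton
    [1, 1, 0, 0],  # doubleton
    [0, 0, 0, 1],  # singleton
    [1, 1, 1, 1],  # fixed derived
]
spectrum = sample_frequency_spectrum(haplotype_matrix)
```

For several populations, the same idea extends to a multidimensional array indexed by the derived-allele count in each population, which is the multipopulation SFS referred to above.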


2018 ◽  
Author(s):  
Aaron P. Ragsdale ◽  
Simon Gravel

Abstract We learn about population history and underlying evolutionary biology through patterns of genetic polymorphism. Many approaches to reconstruct evolutionary histories focus on a limited number of informative statistics describing distributions of allele frequencies or patterns of linkage disequilibrium. We show that many commonly used statistics are part of a broad family of two-locus moments whose expectation can be computed jointly and rapidly under a wide range of scenarios, including complex multi-population demographies with continuous migration and admixture events. A full inspection of these statistics reveals that widely used models of human history fail to predict simple patterns of linkage disequilibrium. To jointly capture the information contained in classical and novel statistics, we implemented a tractable likelihood-based inference framework for demographic history. Using this approach, we show that human evolutionary models that include archaic admixture in Africa, Asia, and Europe provide a much better description of patterns of genetic diversity across the human genome. We estimate that an unidentified, deeply diverged population admixed with modern humans within Africa both before and after the split of African and Eurasian populations, contributing 4–8% genetic ancestry to individuals in worldwide populations.

Author Summary Throughout human history, populations have expanded and contracted, split and merged, and exchanged migrants. Because these events affected genetic diversity, we can learn about human history by comparing predictions from evolutionary models to genetic data. Here, we show how to rapidly compute such predictions for a wide range of diversity measures within and across populations under complex demographic scenarios.
While widely used models of human history accurately predict common measures of diversity, we show that they strongly underestimate the co-occurrence of low-frequency mutations within human populations in Asia, Europe, and Africa. Models allowing for archaic admixture, the relatively recent mixing of human populations with deeply diverged human lineages, resolve this discrepancy. We use such models to infer demographic models that include both recent and ancient features of human history. We recover the well-characterized admixture of Neanderthals in Eurasian populations, as well as admixture from an as-yet unknown diverged human population within Africa, further suggesting that admixture with deeply diverged lineages occurred multiple times in human history. By simultaneously testing model predictions for a broad range of diversity statistics, we can assess the robustness of common evolutionary models, identify missing historical events, and build more informed models of human demography.


2020 ◽  
Vol 26 (2) ◽  
pp. 113-129
Author(s):  
Hamza M. Ruzayqat ◽  
Ajay Jasra

Abstract In the following article, we consider the non-linear filtering problem in continuous time and in particular the solution to Zakai’s equation or the normalizing constant. We develop a methodology to produce finite variance, almost surely unbiased estimators of the solution to Zakai’s equation. That is, given access to only a first-order discretization of the solution to the Zakai equation, we present a method which can remove this discretization bias. The approach, under assumptions, is proved to have finite variance and is numerically compared to using a particular multilevel Monte Carlo method.


Genetics ◽  
2000 ◽  
Vol 155 (3) ◽  
pp. 1429-1437
Author(s):  
Oliver G Pybus ◽  
Andrew Rambaut ◽  
Paul H Harvey

Abstract We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.


2010 ◽  
Vol 60 (4) ◽  
pp. 449-465
Author(s):  
Wen Longying ◽  
Zhang Lixun ◽  
An Bei ◽  
Luo Huaxing ◽  
Liu Naifa ◽  
...  

Abstract We have used phylogeographic methods to investigate the genetic structure and population history of the endangered Himalayan snowcock (Tetraogallus himalayensis) in northwestern China. The mitochondrial cytochrome b gene was sequenced in 102 individuals sampled throughout the distribution range. In total, we found 26 different haplotypes defined by 28 polymorphic sites. Phylogenetic analyses indicated that the samples were divided into two major haplogroups corresponding to one western and one eastern clade. The divergence time between these major clades was estimated to be approximately one million years. An analysis of molecular variance showed that 40% of the total genetic variability was found within local populations, 12% among populations within regional groups and 48% among groups. An analysis of the demographic history of the populations suggested that major expansions have occurred in the Himalayan snowcock populations and these correlate mainly with the first and the second largest glaciations during the Pleistocene. In addition, the data indicate that there was a population expansion of the Tianshan population during the uplift of the Qinghai-Tibet Plateau, approximately 2 million years ago.


2020 ◽  
Vol 12 (4) ◽  
pp. 407-412 ◽  
Author(s):  
Iain Mathieson ◽  
Federico Abascal ◽  
Lasse Vinner ◽  
Pontus Skoglund ◽  
Cristina Pomilla ◽  
...  

Abstract Baboons are one of the most abundant large nonhuman primates and are widely studied in biomedical, behavioral, and anthropological research. Despite this, our knowledge of their evolutionary and demographic history remains incomplete. Here, we report a 0.9-fold coverage genome sequence from a 5800-year-old baboon from the site of Ha Makotoko in Lesotho. The ancient baboon is closely related to present-day Papio ursinus individuals from southern Africa—indicating a high degree of continuity in the southern African baboon population. This level of population continuity is rare in recent human populations but may provide a good model for the evolution of Homo and other large primates over similar timespans in structured populations throughout Africa.

