New kinship and FST estimates reveal higher levels of differentiation in the global human population

2019 ◽  
Author(s):  
Alejandro Ochoa ◽  
John D. Storey

Kinship coefficients and FST, which measure genetic relatedness and the overall population structure, respectively, have important biomedical applications. However, existing estimators are only accurate under restrictive conditions that most natural population structures do not satisfy. We recently derived new kinship and FST estimators for arbitrary population structures [1, 2]. Our estimates on human datasets reveal a complex population structure driven by founder effects due to dispersal from Africa and admixture. Notably, our new approach estimates larger FST values of 26% for native worldwide human populations and 23% for admixed Hispanic individuals, whereas the existing approach estimates 9.8% and 2.6%, respectively. While previous work correctly measured FST between subpopulation pairs, our generalized FST measures genetic distances among all individuals and their most recent common ancestor (MRCA) population, revealing that genetic differentiation is greater than previously appreciated. This analysis demonstrates that estimating kinship and FST under more realistic assumptions is important for modern population genetic analysis.


2016 ◽  
Author(s):  
Kimberly F. McManus ◽  
Angela Taravella ◽  
Brenna Henn ◽  
Carlos D. Bustamante ◽  
Martin Sikora ◽  
...  

Abstract: The human DARC (Duffy antigen receptor for chemokines) gene encodes a membrane-bound chemokine receptor crucial for the infection of red blood cells by Plasmodium vivax, a major causative agent of malaria. Of the three major allelic classes segregating in human populations, the FY*O allele has been shown to protect against P. vivax infection and is near fixation in sub-Saharan Africa, while FY*B and FY*A are common in Europe and Asia, respectively. Due to the combination of its strong geographic differentiation and association with malaria resistance, DARC is considered a canonical example of a locus under positive selection in humans. Here, we use sequencing data from over 1,000 individuals in twenty-one human populations, as well as ancient human and great ape genomes, to analyze the fine scale population structure of DARC. We estimate the time to most recent common ancestor (TMRCA) of the FY*O mutation to be 42 kya (95% CI: 34–49 kya). We infer the FY*O null mutation swept to fixation in Africa from standing variation with very low initial frequency (0.1%) and a selection coefficient of 0.043 (95% CI: 0.011–0.18), which is among the strongest estimated in the genome. We estimate the TMRCA of the FY*A mutation to be 57 kya (95% CI: 48–65 kya) and infer that, prior to the sweep of FY*O, all three alleles were segregating in Africa, as highly diverged populations from Asia and ≠Khomani San hunter-gatherers share the same FY*A haplotypes. We test multiple models of admixture that may account for this observation and reject recent Asian or European admixture as the cause.
Author Summary: Infectious diseases have undoubtedly played an important role in ancient and modern human history. Yet, there are relatively few regions of the genome involved in resistance to pathogens that have shown a strong selection signal. We revisit the evolutionary history of a gene associated with resistance to the most common malaria-causing parasite, Plasmodium vivax, and show that it is one of the regions of the human genome that has been under strongest selective pressure in our evolutionary history (selection coefficient: 5%). Our results are consistent with a complex evolutionary history of the locus involving selection on a mutation that was at a very low frequency in the ancestral African population (standing variation) and a large differentiation between European, Asian and African populations.



Animals ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 1136
Author(s):  
Cai Chen ◽  
Xiaoyan Wang ◽  
Wencheng Zong ◽  
Enrico D’Alessandro ◽  
Domenico Giosa ◽  
...  

RIPs have been developed as effective genetic markers and popularly applied for genetic analysis in plants, but few reports are available for domestic animals. Here, we established 30 new molecular markers based on the SINE RIPs, and applied them for population genetic analysis in seven Chinese miniature pigs. The data revealed that the closed herd (BM-clo) and inbreeding herd (BM-inb) of Bama miniature pigs were distinctly different from the BM-cov herds in the conservation farm, and from other miniature pigs (Wuzhishan, Congjiang Xiang, Tibetan, and Mingguang small ear). These latter five miniature pig breeds can further be classified into two clades based on a phylogenetic tree: one included BM-cov and Wuzhishan, the other included Congjiang Xiang, Tibetan, and Mingguang small ear, which was well-supported by structure analysis. The polymorphic information contents estimated by using SINE RIPs are lower than the predictions based on microsatellites. Overall, the genetic distances and breed relationships between these populations revealed by 30 SINE RIPs generally agree with their evolutionary histories and geographic distributions. We demonstrated the potential of SINE RIPs as new genetic markers for genetic monitoring and population structure analysis in pigs, which can even be extended to other livestock animals.



2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Guang Yao Fan ◽  
Yi Ye ◽  
Yi Ping Hou

Abstract: Detecting population structure and estimating individual biogeographical ancestry are very important in population genetics studies, biomedical research and forensics. Single-nucleotide polymorphism (SNP) has long been considered to be a primary ancestry-informative marker (AIM), but it is constrained by complex and time-consuming genotyping protocols. Following up on our previous study, we propose that a multi-insertion-deletion polymorphism (Multi-InDel) with multiple haplotypes can be useful in ancestry inference and hierarchical genetic population structures. A validation study for the X chromosome Multi-InDel marker (X-Multi-InDel) as a novel AIM was conducted. Genetic polymorphisms and genetic distances among three Chinese populations and 14 worldwide populations obtained from the 1000 Genomes database were analyzed. A Bayesian clustering method (STRUCTURE) was used to discern the continental origins of Europe, East Asia, and Africa. A minimal panel of ten X-Multi-InDels was verified to be sufficient to distinguish human ancestries from three major continental regions with nearly the same efficiency as the earlier panel with 21 insertion-deletion AIMs. Along with the development of more X-Multi-InDels, an approach using this novel marker has the potential for broad applicability as a cost-effective tool toward more accurate determinations of individual biogeographical ancestry and population stratification.



2008 ◽  
Vol 89 (12) ◽  
pp. 2933-2942 ◽  
Author(s):  
Miranda de Graaf ◽  
Albert D. M. E. Osterhaus ◽  
Ron A. M. Fouchier ◽  
Edward C. Holmes

Human (HMPV) and avian (AMPV) metapneumoviruses are closely related viruses that cause respiratory tract illnesses in humans and birds, respectively. Although HMPV was first discovered in 2001, retrospective studies have shown that HMPV has been circulating in humans for at least 50 years. AMPV was first isolated in the 1970s, and can be classified into four subgroups, A–D. AMPV subgroup C is more closely related to HMPV than to any other AMPV subgroup, suggesting that HMPV has emerged from AMPV-C upon zoonosis. Presently, at least four genetic lineages of HMPV circulate in human populations – A1, A2, B1 and B2 – of which lineages A and B are antigenically distinct. We used a Bayesian Markov Chain Monte Carlo (MCMC) framework to determine the evolutionary and epidemiological dynamics of HMPV and AMPV-C. The rates of nucleotide substitution, relative genetic diversity and time to the most recent common ancestor (TMRCA) were estimated using large sets of sequences of the nucleoprotein, the fusion protein and attachment protein genes. The sampled genetic diversity of HMPV was found to have arisen within the past 119–133 years, with consistent results across all three genes, while the TMRCA for HMPV and AMPV-C was estimated to have existed around 200 years ago. The relative genetic diversity observed in the four HMPV lineages was low, most likely reflecting continual population bottlenecks, with only limited evidence for positive selection.



2014 ◽  
Author(s):  
Prem Gopalan ◽  
Wei Hao ◽  
David M. Blei ◽  
John D. Storey

One of the major goals of population genetics is to quantitatively understand variation of genetic polymorphisms among individuals. To this end, researchers have developed sophisticated statistical methods to capture the complex population structure that underlies observed genotypes in humans, and such methods have been effective for analyzing modestly sized genomic data sets. However, the number of genotyped humans has grown significantly in recent years, and it is accelerating. In aggregate, about 1M individuals have been genotyped to date. Analyzing these data will bring us closer to a nearly complete picture of human genetic variation; but existing methods for population genetics analysis do not scale to data of this size. To solve this problem we developed TeraStructure. TeraStructure is a new algorithm to fit Bayesian models of genetic variation in human populations on tera-sample-sized data sets (10^12 observed genotypes, e.g., 1M individuals at 1M SNPs). It is a principled approach to Bayesian inference that iterates between subsampling locations of the genome and updating an estimate of the latent population structure of the individuals. On data sets of up to 2K individuals, TeraStructure matches the existing state of the art in terms of both speed and accuracy. On simulated data sets of up to 10K individuals, TeraStructure is twice as fast as existing methods and has higher accuracy in recovering the latent population structure. On genomic data simulated at the tera-sample-size scales, TeraStructure continues to be accurate and is the only method that can complete its analysis.



Genes ◽  
2020 ◽  
Vol 11 (9) ◽  
pp. 1063
Author(s):  
Vincent G. Martinson

While the majority of symbiosis research is focused on bacteria, microbial eukaryotes play important roles in the microbiota and as pathogens, especially the incredibly diverse Fungi kingdom. The recent emergence of widespread pathogens in wildlife (bats, amphibians, snakes) and multidrug-resistant opportunists in human populations (Candida auris) has highlighted the importance of better understanding animal–fungus interactions. Regardless of their prominence there are few animal–fungus symbiosis models, but modern technological advances are allowing researchers to utilize novel organisms and systems. Here, I review a forgotten system of animal–fungus interactions: the beetle–fungus symbioses of Drugstore and Cigarette beetles with their symbiont Symbiotaphrina. As pioneering systems for the study of mutualistic symbioses, they were heavily researched between 1920 and 1970, but have received only sporadic attention in the past 40 years. Several features make them unique research organisms, including (1) the symbiont is both extracellular and intracellular during the life cycle of the host, and (2) both beetle and fungus can be cultured in isolation. Specifically, fungal symbionts intracellularly infect cells in the larval and adult beetle gut, while accessory glands in adult females harbor extracellular fungi. In this way, research on the microbiota, pathogenesis/infection, and mutualism can be performed. Furthermore, these beetles are economically important stored-product pests found worldwide. In addition to providing a historical perspective of the research undertaken and an overview of beetle biology and their symbiosis with Symbiotaphrina, I performed two analyses on publicly available genomic data. First, in a preliminary comparative genomic analysis of the fungal symbionts, I found striking differences in the pathways for the biosynthesis of two B vitamins important for the host beetle, thiamine and biotin. 
Second, I estimated the most recent common ancestor for Drugstore and Cigarette beetles at 8.8–13.5 Mya using sequence divergence (CO1 gene). Together, these analyses demonstrate that modern methods and data (genomics, transcriptomes, etc.) have great potential to transform these beetle–fungus systems into model systems again.
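As a rough illustration of how a divergence date can be derived from sequence divergence, the sketch below applies the textbook strict-clock relation T = d / (2r), where d is the pairwise divergence between two lineages and r is the substitution rate per lineage. The function name and the numeric inputs are hypothetical and are not taken from the article, which does not report its rate in this abstract.

```python
def divergence_time_my(pairwise_divergence, rate_per_lineage_per_my):
    """Time since two lineages split, in millions of years (strict clock)."""
    if rate_per_lineage_per_my <= 0:
        raise ValueError("substitution rate must be positive")
    # Differences accumulate along both lineages, hence the factor of 2.
    return pairwise_divergence / (2.0 * rate_per_lineage_per_my)


# Hypothetical inputs: 0.20 substitutions/site between the two species,
# 0.01 substitutions/site per million years along each lineage.
t = divergence_time_my(0.20, 0.01)
print(f"Estimated split: {t:.1f} Mya")
```

A range of plausible rates yields a range of dates, which is one way an interval estimate could arise.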



2013 ◽  
Vol 9 (2) ◽  
pp. 20121098 ◽  
Author(s):  
Sebastian Klaus ◽  
José C. E. Mendoza ◽  
Jia Huan Liew ◽  
Martin Plath ◽  
Rudolf Meier ◽  
...  

This study asked whether reductive traits in cave organisms evolve at a slower pace (suggesting neutral evolution under relaxed selection) than constructive changes, which are likely to evolve under directional selection. We investigated 11 subterranean and seven surface populations of Sundathelphusa freshwater crabs on Bohol Island, Philippines, and examined constructive traits associated with improved food finding in darkness (increased leg and setae length) and reductive traits (reduced cornea size and eyestalk length). All changes occurred rapidly, given that the age of the most recent common ancestor was estimated to be 722–271 ka based on three mitochondrial markers. In order to quantify the speed of character change, we correlated the degree of morphological change with genetic distances between surface and subterranean individuals. The temporal pattern of character change following the transition to subterranean life was indistinguishable for constructive and reductive traits, characterized by an immediate onset and rapid evolutionary change. We propose that the evolution of these reductive traits—just like constructive traits—is most likely driven by strong directional selection.



PLoS Genetics ◽  
2021 ◽  
Vol 17 (1) ◽  
pp. e1009241
Author(s):  
Alejandro Ochoa ◽  
John D. Storey

FST and kinship are key parameters often estimated in modern population genetics studies in order to quantitatively characterize structure and relatedness. Kinship matrices have also become a fundamental quantity used in genome-wide association studies and heritability estimation. The most frequently-used estimators of FST and kinship are method-of-moments estimators whose accuracies depend strongly on the existence of simple underlying forms of structure, such as the independent subpopulations model of non-overlapping, independently evolving subpopulations. However, modern data sets have revealed that these simple models of structure likely do not hold in many populations, including humans. In this work, we analyze the behavior of these estimators in the presence of arbitrarily-complex population structures, which results in an improved estimation framework specifically designed for arbitrary population structures. After generalizing the definition of FST to arbitrary population structures and establishing a framework for assessing bias and consistency of genome-wide estimators, we calculate the accuracy of existing FST and kinship estimators under arbitrary population structures, characterizing biases and estimation challenges unobserved under their originally-assumed models of structure. We then present our new approach, which consistently estimates kinship and FST when the minimum kinship value in the dataset is estimated consistently. We illustrate our results using simulated genotypes from an admixture model, constructing a one-dimensional geographic scenario that departs nontrivially from the independent subpopulations model. Our simulations reveal the potential for severe biases in estimates of existing approaches that are overcome by our new framework. This work may significantly improve future analyses that rely on accurate kinship and FST estimates.
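For context on what a method-of-moments FST estimator under the independent subpopulations model can look like, the sketch below implements a widely used two-subpopulation estimator in the Hudson style, combined across loci as a ratio of averages. It is offered as an assumption-laden illustration of the existing class of estimators, not as the new approach of this article and not necessarily one of the specific estimators the authors evaluated. Here `n1` and `n2` are assumed to be the numbers of sampled allele copies in each subpopulation.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson-style FST for two subpopulations, ratio of averages over loci.

    p1, p2: per-locus sample allele frequencies in each subpopulation.
    n1, n2: number of sampled allele copies in each subpopulation (assumed
            constant across loci for simplicity).
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two sampled alleles per subpopulation")
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite-sample noise.
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different subpopulations differ.
        den += a * (1 - b) + b * (1 - a)
    if den == 0:
        raise ValueError("no between-subpopulation variation at any locus")
    return num / den


# Hypothetical frequencies at three loci, 20 allele copies per subpopulation.
print(hudson_fst([0.1, 0.5, 0.9], [0.3, 0.5, 0.6], 20, 20))
```

Estimators of this kind compare a pair of subpopulations that are assumed to have evolved independently, which is the assumption the article argues often fails in real data.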



2018 ◽  
Author(s):  
Abayomi S Olabode ◽  
Mariano Avino ◽  
Tammy Ng ◽  
Faisal Abu-Sardanah ◽  
David W Dick ◽  
...  

Abstract: Reconstructing the early dynamics of the HIV-1 pandemic can provide crucial insights into the socioeconomic drivers of emerging infectious diseases in human populations, including the roles of urbanization and transportation networks. Current evidence indicates that the global pandemic, consisting almost entirely of HIV-1/M, originated around the 1920s in central Africa. However, these estimates are based on molecular clock estimates that are assumed to apply uniformly across the virus genome. There is growing evidence that recombination has played a significant role in the early history of the HIV-1 pandemic, such that different regions of the HIV-1 genome have different evolutionary histories. In this study, we have conducted a dated-tip analysis of all near full-length HIV-1/M genome sequences that were published in the GenBank database. We used a sliding window approach similar to the ‘bootscanning’ method for detecting breakpoints in intersubtype recombinant sequences. We found evidence of substantial variation in estimated root dates among windows, with an estimated mean time to the most recent common ancestor (tMRCA) of 1922. Estimates were significantly autocorrelated, which was more consistent with an early recombination event than with stochastic error variation in phylogenetic reconstruction and dating analyses. A piecewise regression analysis supported the existence of at least one recombination breakpoint in the HIV-1/M genome with interval-specific means around 1929 and 1913, respectively. This analysis demonstrates that a sliding window approach can accommodate early recombination events outside the established nomenclature of HIV-1/M subtypes, although it is difficult to incorporate the earliest available samples due to their limited genome coverage.
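The sliding window idea mentioned above can be sketched generically as follows. This is only a minimal illustration of how overlapping windows over an alignment might be enumerated; the function name, window width, and step size are hypothetical, and the study's actual settings are not given in this abstract. In a real analysis, each window would be passed to a separate phylogenetic dating step.

```python
def sliding_windows(length, window, step):
    """Yield half-open (start, end) intervals of fixed width along a sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    start = 0
    # Stop once a full-width window no longer fits.
    while start + window <= length:
        yield start, start + window
        start += step


# Hypothetical alignment of 10 columns, windows of 4 columns moved 3 at a time.
for start, end in sliding_windows(10, 4, 3):
    print(start, end)
```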



2017 ◽  
Vol 19 (1) ◽  
pp. 1 ◽  
Author(s):  
Basengere Ayagirwe ◽  
Felix Meutchieye ◽  
Appolinaire Djikeng ◽  
Robert Skilton ◽  
Sarah Osama ◽  
...  

Although domestic cavies are widely used in sub-Saharan Africa as a source of meat and income, there are only a few studies of their population structure and genetic relatedness. This seminal study was designed with the main objective to assess the genetic diversity and determine the population structure of cavy populations from Cameroon to guide the development of a cavy improvement program. Sixteen microsatellite markers were used to genotype 109 individuals from five cavy populations (Wouri, Moungo and Nkongsamba in the Littoral region, and Mémé and Fako in the Southwest region of Cameroon). Twelve markers worked in the five populations with a total of 17 alleles identified, with a range of 2.9 to 4.0 alleles per locus. Observed heterozygosity (from 0.022 to 0.277) among populations was lower than expected heterozygosity (from 0.42 to 0.54). Inbreeding rates between individuals of the populations and between individuals in each population were 59.3% and 57.2%, respectively, against a moderate differentiation rate of 4.9%. All the tested loci deviated from Hardy-Weinberg equilibrium, except for locus 3. Genetic distances between populations were small (from 0.008 to 0.277), with a high rate of variability among individuals within each population (54.4%). Three distinct genetic groups were structured. This study has shown that microsatellites are useful for the genetic characterization of cavy populations in Cameroon and that the populations investigated have sufficient genetic diversity to be deployed as a basis for weight, prolificacy and disease resistance improvement. The genetic diversity in Southern Cameroon is wide and constitutes an opportunity for a cavy breeding program.
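The three percentages reported above can be checked against one another, assuming (this is an interpretation, not something the abstract states) that they correspond to Wright's F-statistics: 59.3% as FIT, 57.2% as FIS, and 4.9% as FST. Under that reading, the standard identity (1 - FIT) = (1 - FIS)(1 - FST) should hold approximately. The function name below is hypothetical.

```python
def fit_from_fis_fst(fis, fst):
    """Total inbreeding FIT implied by Wright's identity.

    (1 - FIT) = (1 - FIS) * (1 - FST)
    """
    return 1.0 - (1.0 - fis) * (1.0 - fst)


# Values from the abstract, read as FIS = 0.572 and FST = 0.049 (assumption).
fit = fit_from_fis_fst(0.572, 0.049)
print(round(fit, 3))  # 0.593, in line with the reported 59.3%
```

The agreement suggests the reported figures are internally consistent under this interpretation.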


