Signatures of archaic adaptive introgression in present-day human populations

2016 ◽  
Author(s):  
Fernando Racimo ◽  
Davide Marnetto ◽  
Emilia Huerta-Sánchez

Abstract: Comparisons of DNA from archaic and modern humans show that these groups interbred, and in some cases received an evolutionary advantage from doing so. This process, adaptive introgression, may lead to a faster rate of adaptation than is predicted from models with mutation and selection alone. Within the last couple of years, a series of studies have identified regions of the genome that are likely examples of adaptive introgression. In many cases, once a region was ascertained as being introgressed, commonly used statistics based on both haplotype and allele frequency information were employed to test for positive selection. Introgression by itself, however, changes both the haplotype structure and the distribution of allele frequencies, thus confounding traditional tests for detecting positive selection. Therefore, patterns generated by introgression alone may lead to false inferences of positive selection. Here we explore models involving both introgression and positive selection to investigate the behavior of various statistics under adaptive introgression. In particular, we find that the number and allelic frequencies of sites that are uniquely shared between archaic humans and specific present-day populations are particularly useful for detecting adaptive introgression. We then examine the 1000 Genomes dataset to characterize the landscape of uniquely shared archaic alleles in human populations. Finally, we identify regions that were likely subject to adaptive introgression and discuss some of the most promising candidate genes located in these regions.
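The idea of counting sites "uniquely shared" between an archaic genome and one present-day population can be sketched as follows. This is only an illustration under stated assumptions: the `Site` record, the function name, and both frequency thresholds are hypothetical and are not taken from the abstract, which does not give cutoffs.

```python
from typing import Iterable, NamedTuple


class Site(NamedTuple):
    """One variable site in a genomic window (hypothetical record layout)."""
    outgroup_freq: float   # allele frequency in a comparison population
    target_freq: float     # allele frequency in the population of interest
    in_archaic: bool       # True if the archaic genome carries the allele


def count_uniquely_shared(
    sites: Iterable[Site],
    max_outgroup_freq: float = 0.01,  # assumed cutoff, not from the source
    min_target_freq: float = 0.2,     # assumed cutoff, not from the source
) -> int:
    """Count alleles that are rare in the outgroup, common in the target,
    and present in the archaic genome."""
    return sum(
        1
        for site in sites
        if site.in_archaic
        and site.outgroup_freq < max_outgroup_freq
        and site.target_freq > min_target_freq
    )
```

A window with many such sites, at high frequency in the target population, is the kind of pattern the abstract describes as informative; the thresholds would need to be calibrated by simulation.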


2020 ◽  
Author(s):  
Nathan S. Harris ◽  
Alan R. Rogers

Abstract: Selection in humans often leaves subtle signatures at individual loci. Few studies have measured the extent to which these signals are shared among human populations. Here a new method is developed to compare weak signals of selection in aggregate across the genome using the 1000 Genomes Phase 3 data. Results presented here show that selection producing weak signals serves to increase population differences around coding areas of the genome.



2020 ◽  
Vol 10 (10) ◽  
pp. 3663-3673 ◽  
Author(s):  
Vladimir Shchur ◽  
Jesper Svedberg ◽  
Paloma Medina ◽  
Russell Corbett-Detig ◽  
Rasmus Nielsen

Admixture is increasingly being recognized as an important factor in evolutionary genetics. The distribution of genomic admixture tracts, and the resulting effects on admixture linkage disequilibrium, can be used to date the timing of admixture between species or populations. However, the theory used for such prediction assumes selective neutrality despite the fact that many famous examples of admixture involve natural selection acting for or against admixture. In this paper, we investigate the effects of positive selection on the distribution of tract lengths. We develop a theoretical framework that relies on approximating the trajectory of the selected allele using a logistic function. By numerically calculating the expected allele trajectory, we also show that the approach can be extended to cases where the logistic approximation is poor due to the effects of genetic drift. Using simulations, we show that the model is highly accurate under most scenarios. We use the model to show that positive selection on average will tend to increase the admixture tract length. However, perhaps counter-intuitively, conditional on the allele frequency at the time of sampling, positive selection will actually produce shorter expected tract lengths. We discuss the consequences of our results in interpreting the timing of the introgression of EPAS1 from Denisovans into the ancestors of Tibetans.
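The logistic approximation mentioned above can be illustrated with the standard deterministic trajectory of a selected allele. This is a minimal sketch, assuming additive (genic) selection with coefficient `s` per generation and ignoring drift; the function name and parameterization are assumptions, not the authors' exact model.

```python
import math


def logistic_frequency(p0: float, s: float, t: float) -> float:
    """Deterministic frequency of a selected allele after t generations.

    Assumes p(t) = p0 * e^(s t) / (1 - p0 + p0 * e^(s t)), the logistic
    solution for genic selection starting from frequency p0.
    """
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must lie in [0, 1]")
    growth = math.exp(s * t)
    return p0 * growth / (1.0 - p0 + p0 * growth)


if __name__ == "__main__":
    # Trajectory of an allele introduced at 1% with s = 0.01 (illustrative values).
    for generation in (0, 100, 500, 1000):
        print(generation, logistic_frequency(0.01, 0.01, generation))
```

When drift is strong, for example at very low starting frequencies, this curve is a poor description of the expected trajectory, which is the case the abstract says is handled numerically instead.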



2019 ◽  
Author(s):  
Vladimir Shchur ◽  
Jesper Svedberg ◽  
Paloma Medina ◽  
Russ Corbett-Detig ◽  
Rasmus Nielsen

Abstract: Admixture is increasingly being recognized as an important factor in evolutionary genetics. The distribution of genomic admixture tracts, and the resulting effects on admixture linkage disequilibrium, can be used to date the timing of admixture between species or populations. However, the theory used for such prediction assumes selective neutrality despite the fact that many famous examples of admixture involve natural selection acting for or against admixture. In this paper, we investigate the effects of positive selection on the distribution of tract lengths. We develop a theoretical framework that relies on approximating the trajectory of the selected allele using a logistic function. By numerically calculating the expected allele trajectory, we also show that the approach can be extended to cases where the logistic approximation is poor due to the effects of genetic drift. Using simulations, we show that the model is highly accurate under most scenarios. We use the model to show that positive selection on average will tend to increase the admixture tract length. However, perhaps counter-intuitively, conditional on the allele frequency at the time of sampling, positive selection will actually produce shorter expected tract lengths. We discuss the consequences of our results in interpreting the timing of the introgression of EPAS1 from Denisovans into the ancestors of Tibetans.



2017 ◽  
Author(s):  
Kelsey Elizabeth Johnson ◽  
Benjamin F. Voight

Abstract: Scans for positive selection in human populations have identified hundreds of sites across the genome with evidence of recent adaptation. These signatures often overlap across populations, but the question of how often these overlaps represent a single ancestral event remains unresolved. If a single positive selection event spread across many populations, the same sweeping haplotype should appear in each population and the selective pressure could be common across diverse populations and environments. Identifying such shared selective events would be of fundamental interest, pointing to genomic loci and human traits important in recent history across the globe. Additionally, genomic annotations that recently became available could help attach these signatures to a potential gene and molecular phenotype that may have been selected across multiple populations. We performed a scan for positive selection using the integrated haplotype score on 20 populations, and compared sweeping haplotypes using the haplotype-clustering capability of fastPHASE to create a catalog of shared and unshared overlapping selective sweeps in these populations. Using additional genomic annotations, we connect these multi-population sweep overlaps with potential biological mechanisms at several loci, including potential new sites of adaptive introgression, the glycophorin locus associated with malarial resistance, and the alcohol dehydrogenase cluster associated with alcohol dependency.



2020 ◽  
Author(s):  
Minhui Chen ◽  
Charleston W. K. Chiang

Abstract: Polygenic adaptation is thought to be an important mechanism of phenotypic evolution in humans, although recent evidence of confounding due to residual stratification in consortium GWAS has made studies of polygenic adaptation more difficult to interpret. Using FST as a measure of allele frequency differentiation, a previous study has shown that the mean FST among African, East Asian, and European populations is significantly higher at height-associated SNPs than that found at matched non-associated SNPs, suggesting that polygenic adaptation is one of the reasons for differences in human height among these continental populations. However, we showed here that even though the height-associated SNPs were identified using only European-ancestry individuals, the estimated effect sizes are significantly associated with structure across continental populations, potentially explaining the elevated level of differentiation previously reported. To alleviate concerns of biased ascertainment of SNPs, we re-examined the distribution of FST at height-associated alleles ascertained from two biobank-level GWAS (UK Biobank, UKB, and Biobank Japan, BBJ). We showed that when compared to non-associated SNPs, height-associated SNPs remain significantly differentiated among African, East Asian, and European populations from both 1000 Genomes (p = 0.0012 and p = 0.0265 when height SNPs were ascertained from UKB and BBJ, respectively) and the Human Genome Diversity Panel (p = 0.0225 for UKB and p = 0.0032 for BBJ analyses). In contrast to FST-based analyses, we found no significant difference or consistent ranked order among continental populations in polygenic height scores constructed from SNPs ascertained from UKB and BBJ. In summary, our results suggest that, consistent with previous reports, height-associated SNPs are significantly differentiated in frequencies among continental populations after removing concerns of confounding by uncorrected stratification.
Polygenic score-based analysis in this context appears to be susceptible to the choice of SNPs and, as our simulations comparing it with FST-based statistics showed, would lose power to detect polygenic adaptation if there is independent convergent selection in more than one population.
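For readers unfamiliar with FST as a measure of allele frequency differentiation, a single-SNP version can be sketched as below. It uses the textbook heterozygosity-based definition, FST = (H_T - H_S) / H_T, with equally weighted populations; this is an assumption for illustration and is not necessarily the estimator used in the study.

```python
from typing import Sequence


def fst_biallelic(freqs: Sequence[float]) -> float:
    """Wright's FST for one biallelic SNP from per-population allele frequencies.

    H_T is the expected heterozygosity at the mean frequency; H_S is the mean
    within-population expected heterozygosity (populations weighted equally).
    """
    if not freqs:
        raise ValueError("need at least one population frequency")
    mean_freq = sum(freqs) / len(freqs)
    h_total = 2.0 * mean_freq * (1.0 - mean_freq)
    if h_total == 0.0:
        return 0.0  # monomorphic site: no differentiation to measure
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total
```

Comparing the mean of this quantity at trait-associated SNPs with the mean at matched non-associated SNPs is the kind of contrast the abstract describes.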



2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
R Morales Sabater ◽  
B Lledo ◽  
J A Ortiz ◽  
F Lozano ◽  
A Bernabeu ◽  
...  

Abstract

Study question: Is it possible to identify a genetic cause of familial premature ovarian failure (POF) with whole-exome sequencing (WES)?

Summary answer: Whole-exome sequencing is the most efficient strategy to identify probably pathogenic mutations in different genes in pathologies of polygenic etiology such as premature ovarian failure.

What is known already: Premature ovarian failure is the loss of ovarian function before the age of 40, and it is a common cause of infertility in women. This pathology has a heterogeneous etiology. Some chromosomal and genetic alterations have been described, and could explain approximately 20% of cases. However, in most patients the origin remains unknown. Recent studies with next-generation sequencing (NGS) have identified new variants in candidate genes related to premature ovarian insufficiency (POI) or premature ovarian failure (POF). These genes are involved not only in processes such as folliculogenesis, but also in DNA damage repair, homologous recombination, and meiosis.

Study design, size, duration: Fourteen women, from 7 families, affected by idiopathic POF were included in the study from October 2019 to September 2020. Seven POF patients were recruited when they came to our clinic to undergo assisted reproductive treatment. In the anamnesis, it was found that they had relatives with a diagnosis of POF, who were also recruited for the study. The inclusion criteria were amenorrhea before 38 years old and analytical and ultrasound signs of ovarian failure.

Participants/materials, setting, methods: WES was performed using TrusightOne (Illumina®). Sequenced data were aligned with the BWA tool, and the GATK algorithm was used for SNV/InDel identification. VCF files were annotated using Variant Interpreter software. Only the variants shared within each family were extracted for analysis, and these criteria were followed: (1) exonic/splicing variants in genes related to POF or involved in biological ovarian functions; (2) variants with minor allele frequency (MAF) ≤0.05; and (3) variants having potentially moderate/strong functional effects.

Main results and the role of chance: Seventy-nine variants possibly related to the POF phenotype were identified in the seven families. All these variants had a minor allele frequency (MAF) ≤0.05 in the gnomAD database and the 1000 Genomes Project. Among these candidate variants, two were nonsense, six splice region, one frameshift, two inframe deletion and 68 missense. Thirty-two of the missense variants were predicted to have deleterious effects by at least two of the four in silico algorithms used (SIFT, PolyPhen-2, MutationTaster and PROVEAN). All variants were heterozygous, and all the families carried three or more candidate variants. Altogether, 43 probably damaging genetic variants were identified in 39 genes expressed in the ovary and related to POF/POI or linked to ovarian physiology. We have described genes that have never been associated with POF pathology; however, they may be involved in key biological processes for ovarian function. Moreover, some of these genes were found in two families, for example DDX11, VWF, PIWIL3 and HSD3B1. DDX11 may function at the interface of replication-coupled DNA repair and sister chromatid cohesion. The VWF gene has been suggested in previous studies to be associated with follicular atresia. PIWIL3 functions in development and maintenance of germline stem cells, and HSD3B1 is implicated in ovarian steroidogenesis.

Limitations, reasons for caution: Whole-exome sequencing has some limitations: it does not cover noncoding regions of the genome, and it cannot detect large rearrangements, copy-number variants (large deletions/duplications), mosaic mutations, mutations in repetitive or highly GC-rich regions, or mutations in genes with corresponding pseudogenes or other highly homologous sequences.

Wider implications of the findings: WES has previously been shown to be an efficient tool to identify genes that cause POF, and has demonstrated its polygenic etiology. Although some studies have focused on it, and many genes have been identified, this study proposes new candidate genes and variants, having potentially moderate/strong functional effects, associated with POF.

Trial registration number: Not applicable
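The family-based filtering described in the methods (variants shared within a family, then gene, MAF, and functional-effect criteria) can be sketched as a simple set operation. This is a hedged illustration only: the `Variant` record, the consequence and impact labels, and the function names are hypothetical and do not reproduce the Variant Interpreter workflow used in the study.

```python
from dataclasses import dataclass
from typing import Iterable, Set

# Consequence classes named in the abstract; the string labels are assumptions.
CODING_OR_SPLICING = {
    "nonsense", "splice_region", "frameshift", "inframe_deletion", "missense",
}


@dataclass(frozen=True)
class Variant:
    variant_id: str
    gene: str
    consequence: str
    maf: float      # minor allele frequency from a population database
    impact: str     # hypothetical label: "LOW", "MODERATE", or "HIGH"


def shared_by_family(calls_per_member: Iterable[Iterable[Variant]]) -> Set[Variant]:
    """Return the variants carried by every sequenced member of one family."""
    members = [set(calls) for calls in calls_per_member]
    if not members:
        return set()
    return set.intersection(*members)


def passes_filters(v: Variant, candidate_genes: Set[str], max_maf: float = 0.05) -> bool:
    """Apply criteria (1) to (3): candidate gene and coding/splicing
    consequence, MAF <= max_maf, and moderate or strong predicted effect."""
    return (
        v.gene in candidate_genes
        and v.consequence in CODING_OR_SPLICING
        and v.maf <= max_maf
        and v.impact in {"MODERATE", "HIGH"}
    )


def candidate_variants(calls_per_member, candidate_genes, max_maf=0.05):
    """Shared variants of one family that pass all three criteria."""
    shared = shared_by_family(calls_per_member)
    return {v for v in shared if passes_filters(v, candidate_genes, max_maf)}
```

Intersecting within a family first keeps the later filters cheap and mirrors the order given in the abstract.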



2021 ◽  
Author(s):  
Victoria Oberreiter ◽  
Tobias Goellner ◽  
David L. Morris ◽  
Helmut Schaschl

Abstract

Background: Systemic lupus erythematosus (SLE) shows marked population-specific disparities in disease prevalence, including substantial variation in manifestations and complications according to genetic ancestry. Several recent studies suggest that a substantial proportion of the variation in gene expression shows genetic ancestry-associated differences in the regulation of immune responses. Positive selection may act in a population-specific manner on expression quantitative trait loci (eQTLs) and thereby contribute to the differences in SLE prevalence and manifestation among human populations. We tested the hypothesis that some of the identified SLE risk polymorphisms display pleiotropic effects or polygenicity driven by positive selection. We performed a genome-wide scan for recent positive selection by using integrated Haplotype Score (iHS) statistics in different human populations. In addition, we estimated the timing of beneficial mutations to understand what possible selective pressures drive positive selection at SLE-associated loci.

Results: We identified several SLE risk loci that are under positive selection in a population-specific manner. Almost all SNPs that are under positive selection function as cis-eQTLs in different tissue types. We determined that adaptive eQTLs affect the expression of fewer genes than non-adaptive eQTLs, suggesting a limited range of effect of an eQTL at SLE risk sites that show signatures of positive selection. Furthermore, some positively selected SNPs are located in transcription factor binding sequences. The timing of positive selection for the studied loci suggests that both environmental and recent lifestyle changes during as well as after the Neolithic Transition may have become selectively effective. We propose a novel link between positively selected eQTLs at a certain SLE risk locus in Europeans and a physiological pathway not previously considered in SLE.

Conclusions: We conclude that population-specific adaptive eQTLs contribute to the observed variation in specific manifestations and complications of SLE in different ethnicities. Our results also suggest that human populations adapt more rapidly to environmental and lifestyle stimuli via modification of gene expression without having to alter the genetic code.



2020 ◽  
Author(s):  
Xinjun Zhang ◽  
Bernard Kim ◽  
Kirk E. Lohmueller ◽  
Emilia Huerta-Sánchez

Abstract: Admixture with archaic hominins has altered the landscape of genomic variation in modern human populations. Several gene regions have been previously identified as candidates of adaptive introgression (AI) that facilitated human adaptation to specific environments. However, simulation-based studies have suggested that population genetics processes other than adaptive mutations, such as heterosis from recessive deleterious variants private to populations before admixture, can also lead to patterns in genomic data that resemble adaptive introgression. The extent to which the presence of deleterious variants affects the false-positive rate and the power of current methods to detect AI has not been fully assessed. Here, we used extensive simulations to show that recessive deleterious mutations can increase the false-positive rates of tests for AI compared to models without deleterious variants. We further examined candidates of AI in modern humans identified from previous studies and show that, although deleterious variants may hinder the performance of AI detection in modern humans, most signals remained robust when deleterious variants are included in the null model. While deleterious variants may have a limited impact on detecting signals of adaptive introgression in humans, we found that at least two AI candidate genes, HYAL2 and HLA, are particularly susceptible to high false-positive rates due to recessive deleterious mutations. By quantifying parameters that affect heterosis, we show that the high false-positive rates are largely attributable to high exon densities together with low recombination rates in these genomic regions, which can be further exaggerated by the population growth in recent human evolution. Although the combination of such parameters is rare in the human genome, caution is still warranted in other species with different genomic composition and demographic histories.


