Detecting ancient positive selection in humans using extended lineage sorting

2016 ◽  
Author(s):  
Stéphane Peyrégne ◽  
Michael James Boyle ◽  
Michael Dannemann ◽  
Kay Prüfer

Abstract Natural selection that affected modern humans early in their evolution has likely shaped some of the traits that set present-day humans apart from their closest extinct and living relatives. The ability to detect ancient natural selection in the human genome could provide insights into the molecular basis for these human-specific traits. Here, we introduce a method for detecting ancient selective sweeps by scanning for extended genomic regions where our closest extinct relatives, Neandertals and Denisovans, fall outside of the present-day human variation. Regions that are unusually long indicate the presence of lineages that reached fixation in the human population faster than expected under neutral evolution. Using simulations, we show that the method is able to detect ancient events of positive selection and that it can differentiate those from background selection. Applying our method to the 1000 Genomes dataset, we find evidence for ancient selective sweeps favoring regulatory changes and present a list of genomic regions that are predicted to underlie positively selected human-specific traits.
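The intuition of scanning for unusually long regions where the archaic lineages fall outside present-day human variation can be illustrated with a toy run-length scan. This is only a minimal sketch under stated assumptions, not the method described in the abstract: the per-site labels (`is_external`), the coordinate list, the function name `external_regions`, and the length threshold are all hypothetical, and no neutral expectation or recombination map is modelled.

```python
def external_regions(positions, is_external, min_length):
    """Return (start, end) spans of consecutive 'external' sites.

    positions   -- sorted site coordinates in base pairs (hypothetical input)
    is_external -- one boolean per site: True if the archaic lineage falls
                   outside present-day human variation at that site
    min_length  -- minimum span in base pairs for a run to be reported
    """
    regions = []
    start = prev = None
    for pos, ext in zip(positions, is_external):
        if ext:
            if start is None:
                start = pos          # open a new run
            prev = pos               # extend the current run
        else:
            if start is not None and prev - start >= min_length:
                regions.append((start, prev))
            start = None             # close the run at an internal site
    # Report a run that is still open at the end of the sequence.
    if start is not None and prev - start >= min_length:
        regions.append((start, prev))
    return regions


positions = [100, 200, 300, 400, 500, 600, 700]
labels = [True, True, False, True, True, True, True]
regions = external_regions(positions, labels, min_length=250)
```

In this toy input, the first run (100 to 200) is shorter than the threshold and is dropped, while the second run (400 to 700) is kept. A real scan would need to judge "unusually long" against lengths expected under neutrality, which this sketch does not attempt.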

2018 ◽  
Author(s):  
Alba Refoyo-Martínez ◽  
Rute R. da Fonseca ◽  
Katrín Halldórsdóttir ◽  
Einar Árnason ◽  
Thomas Mailund ◽  
...  

Abstract Detailed modeling of a species’ history is of prime importance for understanding how natural selection operates over time. Most methods designed to detect positive selection along sequenced genomes, however, use simplified representations of past histories as null models of genetic drift. Here, we present the first method that can detect signatures of strong local adaptation across the genome using arbitrarily complex admixture graphs, which are typically used to describe the history of past divergence and admixture events among any number of populations. The method, called Graph-aware Retrieval of Selective Sweeps (GRoSS), has good power to detect loci in the genome with strong evidence for past selective sweeps and can also identify which branch of the graph was most affected by the sweep. As evidence of its utility, we apply the method to bovine, codfish and human population genomic data containing multiple population panels related in complex ways. We find new candidate genes for important adaptive functions, including immunity and metabolism in under-studied human populations, as well as muscle mass, milk production and tameness in specific bovine breeds. We are also able to pinpoint the emergence of large regions of differentiation due to inversions in the history of Atlantic codfish.


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Paula S. Ramos ◽  
Stephanie R. Shaftman ◽  
Ralph C. Ward ◽  
Carl D. Langefeld

The reasons for the ethnic disparities in the prevalence of systemic lupus erythematosus (SLE) and the relatively high frequency of SLE risk alleles in the population are not fully understood. Population genetic factors such as natural selection alter allele frequencies over generations and may help explain the persistence of such common risk variants in the population and the differential risk of SLE. In order to better understand the genetic basis of SLE that might be due to natural selection, a total of 74 genomic regions with compelling evidence for association with SLE were tested for evidence of recent positive selection in the HapMap and HGDP populations, using population differentiation, allele frequency, and haplotype-based tests. Consistent signs of positive selection across different studies and statistical methods were observed at several SLE-associated loci, including the PTPN22, TNFSF4, TET3-DGUOK, TNIP1, UHRF1BP1, BLK, and ITGAM genes. This study is the first to evaluate and report that several SLE-associated regions show signs of positive natural selection. These results provide corroborating evidence in support of recent positive selection as one mechanism underlying the elevated population frequency of SLE risk loci and support future research that integrates signals of natural selection to help identify functional SLE risk alleles.


2017 ◽  
Author(s):  
Jiyun M. Moon ◽  
David M. Aronoff ◽  
John A. Capra ◽  
Patrick Abbot ◽  
Antonis Rokas

Abstract Sialic acids are nine-carbon sugars ubiquitously found on the surfaces of vertebrate cells and are involved in various immune response-related processes. In humans, at least 58 genes spanning diverse functions, from biosynthesis and activation to recycling and degradation, are involved in sialic acid biology. Because of their role in immunity, sialic acid biology genes have been hypothesized to exhibit elevated rates of evolutionary change. Consistent with this hypothesis, several genes involved in sialic acid biology have experienced higher rates of non-synonymous substitutions in the human lineage than their counterparts in other great apes, perhaps in response to ancient pathogens that infected hominins millions of years ago (paleopathogens). To test whether sialic acid biology genes have also experienced more recent positive selection during the evolution of the modern human lineage, reflecting adaptation to contemporary cosmopolitan or geographically-restricted pathogens, we examined whether their protein-coding regions showed evidence of recent hard and soft selective sweeps. This examination involved the calculation of four measures that quantify changes in allele frequency spectra, extent of population differentiation, and haplotype homozygosity caused by recent hard and soft selective sweeps for 55 sialic acid biology genes using publicly available whole genome sequencing data from 1,668 humans from three ethnic groups. To disentangle evidence for selection from confounding demographic effects, we compared the observed patterns in sialic acid biology genes to simulated sequences of the same length under a model of neutral evolution that takes into account human demographic history. We found that the patterns of genetic variation of most sialic acid biology genes did not significantly deviate from neutral expectations and were not significantly different among genes belonging to different functional categories. Those few sialic acid biology genes that significantly deviated from neutrality either experienced soft sweeps or population-specific hard sweeps. Interestingly, while most hard sweeps occurred on genes involved in sialic acid recognition, most soft sweeps involved genes associated with recycling, degradation and activation, transport, and transfer functions. We propose that the lack of signatures of recent positive selection for the majority of the sialic acid biology genes is consistent with the view that these genes regulate immune responses against ancient rather than contemporary cosmopolitan or geographically restricted pathogens.


2020 ◽  
Author(s):  
Xi Wang ◽  
Pär K Ingvarsson

Abstract Detecting natural selection is one of the major goals of evolutionary genomics. Here, we sequence whole genomes of 34 Picea abies individuals and quantify the amount of selection across the genome. Using an estimate of the distribution of fitness effects, we show that negative selection is very limited in coding regions, while positive selection is rare in coding regions but very strong in non-coding regions, suggesting the great importance of regulatory changes in the evolution of Norway spruce. Additionally, we found a positive correlation between adaptive rate and recombination rate and a negative correlation between adaptive rate and gene density, suggesting a widespread influence of Hill-Robertson interference on the efficiency of protein adaptation in P. abies. Finally, the distinct population statistics of genomic regions under either positive or balancing selection, compared with those of neutral regions, indicated an impact of selection on the genomic architecture of Norway spruce. Further gene ontology enrichment analysis for genes located in regions identified as undergoing either positive or long-term balancing selection also highlighted specific molecular functions and biological processes that appear to be targets of selection in Norway spruce.


2021 ◽  
Author(s):  
Helmut E Simon ◽  
Gavin A Huttley

We present a new statistic for testing for neutral evolution from allele frequency data summarised as a site frequency spectrum, which we call the relative likelihood neutrality test or ρ. Classical methods of testing for natural selection, such as Tajima's D and its relatives, require the null model to have constant population size over time and therefore can confound demographic change with natural selection. ρ can directly incorporate a null hypothesis reflecting general demographic histories. It has a natural Bayesian interpretation as an approximation to the log-probability of the null model, given the data. We use simulations to show that ρ has greater power than Tajima's D to detect departure from neutrality for a range of scenarios of positive and negative selection. We also show how ρ can be adapted to account for sequencing error. Application to the ACKR1 (FYO) gene in humans supported previous studies inferring positive selection in sub-Saharan populations which were based on inter-population comparisons. However, we did not find the signal of selection to be maximal in the region of the FY*O or Duffy-null allele in these populations. We also applied ρ to investigate in greater detail a region on the 2q11.1 band of the human genome that has previously been identified as showing evidence of selection. This was done for a range of populations: for the European populations we incorporated a demographic history with a bottleneck corresponding to the putative out of Africa event. We were able to localise signals of selection to some specific regions and genes. Overall, we suggest that ρ will be a useful tool for identifying genomic regions that may be subject to natural selection.
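For reference, Tajima's D, the classical statistic that the abstract contrasts with ρ, can be sketched as below. This follows the standard textbook definition (the normalised difference between mean pairwise diversity and Watterson's estimator) and is not an implementation of ρ. The function name `tajimas_d` and the input format (equal-length haplotype strings) are assumptions made for illustration.

```python
import math
from itertools import combinations


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length haplotype strings.

    Returns NaN when there are no segregating sites, where D is undefined.
    """
    n = len(haplotypes)
    if n < 4:
        # The variance term is zero for n = 2 and n = 3.
        raise ValueError("need at least 4 haplotypes")
    length = len(haplotypes[0])

    # S: number of segregating (polymorphic) sites.
    s = sum(1 for j in range(length) if len({h[j] for h in haplotypes}) > 1)
    if s == 0:
        return math.nan

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    ) / n_pairs

    # Constants from the standard variance formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D = (pi - S/a1) / sqrt(Var)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


d_rare = tajimas_d(["1000", "0100", "0010", "0001"])    # only singletons
d_common = tajimas_d(["1100", "1100", "0011", "0011"])  # intermediate frequencies
```

An excess of rare variants, as in the first toy sample, gives a negative D, and an excess of intermediate-frequency variants gives a positive D. Population growth or contraction can produce the same shifts as selection, which is the confounding the abstract describes.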


2016 ◽  
Author(s):  
Daniel R. Schrider ◽  
Alexander G. Shanku ◽  
Andrew D. Kern

Abstract The availability of large-scale population genomic sequence data has resulted in an explosion in efforts to infer the demographic histories of natural populations across a broad range of organisms. As demographic events alter coalescent genealogies, they leave detectable signatures in patterns of genetic variation within and between populations. Accordingly, a variety of approaches have been designed to leverage population genetic data to uncover the footprints of demographic change in the genome. The vast majority of these methods make the simplifying assumption that the measures of genetic variation used as their input are unaffected by natural selection. However, natural selection can dramatically skew patterns of variation not only at selected sites, but at linked, neutral loci as well. Here we assess the impact of recent positive selection on demographic inference by characterizing the performance of three popular methods through extensive simulation of datasets with varying numbers of linked selective sweeps. In particular, we examined three different demographic models relevant to a number of species, finding that positive selection can bias parameter estimates of each of these models, often severely. Moreover, we find that selection can lead to incorrect inferences of population size changes when none have occurred. We argue that the amount of recent positive selection required to skew inferences may often be acting in natural populations. These results suggest that demographic studies conducted in many species to date may have exaggerated the extent and frequency of population size changes.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Elena N. Judd ◽  
Alison R. Gilchrist ◽  
Nicholas R. Meyerson ◽  
Sara L. Sawyer

Abstract Background: The Type I interferon response is an important first-line defense against viruses. In turn, viruses antagonize (i.e., degrade, mis-localize, etc.) many proteins in interferon pathways. Thus, hosts and viruses are locked in an evolutionary arms race for dominance of the Type I interferon pathway. As a result, many genes in interferon pathways have experienced positive natural selection in favor of new allelic forms that can better recognize viruses or escape viral antagonists. Here, we performed a holistic analysis of selective pressures acting on genes in the Type I interferon family. We initially hypothesized that the genes responsible for inducing the production of interferon would be antagonized more heavily by viruses than genes that are turned on as a result of interferon. Our logic was that viruses would have greater effect if they worked upstream of the production of interferon molecules because, once interferon is produced, hundreds of interferon-stimulated proteins would activate and the virus would need to counteract them one-by-one. Results: We curated multiple sequence alignments of primate orthologs for 131 genes active in interferon production and signaling (herein, “induction” genes), 100 interferon-stimulated genes, and 100 randomly chosen genes. We analyzed each multiple sequence alignment for the signatures of recurrent positive selection. Counter to our hypothesis, we found the interferon-stimulated genes, and not interferon induction genes, are evolving significantly more rapidly than a random set of genes. Interferon induction genes evolve in a way that is indistinguishable from a matched set of random genes (22% and 18% of genes bear signatures of positive selection, respectively). In contrast, interferon-stimulated genes evolve differently, with 33% of genes evolving under positive selection and containing a significantly higher fraction of codons that have experienced selection for recurrent replacement of the encoded amino acid. Conclusion: Viruses may antagonize individual products of the interferon response more often than trying to neutralize the system altogether.


Genetics ◽  
1996 ◽  
Vol 144 (2) ◽  
pp. 689-703 ◽  
Author(s):  
Michael J Ford ◽  
Charles F Aquadro

Abstract We present the results of a restriction site survey of variation at five loci in Drosophila athabasca, complementing a previous study of the period locus. There is considerably greater differentiation between the three semispecies of D. athabasca at the period locus and two other X-linked genes (no-on-transient-A and E74A) than at three autosomal genes (Xdh, Adh and RC98). Using a modification of the HKA test, which uses fixed differences between the semispecies and a test based on differences in Fst among loci, we show that the greater differentiation of the X-linked loci compared with the autosomal loci is inconsistent with a neutral model of molecular evolution. We explore several evolutionary scenarios by computer simulation, including differential migration of X and autosomal genes, very low levels of migration among the semispecies, selective sweeps, and background selection, and conclude that X-linked selective sweeps in at least two of the semispecies are the best explanation for the data. This evidence that natural selection acted on the X-chromosome suggests that another X-linked trait, mating song differences among the semispecies, may have been the target of selection.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Melina Campos ◽  
Luisa D. P. Rona ◽  
Katie Willis ◽  
George K. Christophides ◽  
Robert M. MacCallum

Abstract Background: Whole genome re-sequencing provides powerful data for population genomic studies, allowing robust inferences of population structure, gene flow and evolutionary history. For the major malaria vector in Africa, Anopheles gambiae, other genetic aspects such as selection and adaptation are also important. In the present study, we explore population genetic variation from genome-wide sequencing of 765 An. gambiae and An. coluzzii specimens collected from across Africa. We used t-SNE, a recently popularized dimensionality reduction method, to create a 2D-map of An. gambiae and An. coluzzii genes that reflect their population structure similarities. Results: The map allows intuitive navigation among genes distributed throughout the so-called “mainland” and numerous surrounding “island-like” gene clusters. These gene clusters of various sizes correspond predominantly to low recombination genomic regions such as inversions and centromeres, and also to recent selective sweeps. Because this mosquito species complex has been studied extensively, we were able to support our interpretations with previously published findings. Several novel observations and hypotheses are also made, including selective sweeps and a multi-locus selection event in Guinea-Bissau, a known intense hybridization zone between An. gambiae and An. coluzzii. Conclusions: Our results present a rich dataset that could be utilized in functional investigations aiming to shed light on An. gambiae s.l. genome evolution and eventual speciation. In addition, the methodology presented here can be used to further characterize other species not as well studied as An. gambiae, shortening the time required to progress from field sampling to the identification of genes and genomic regions under unique evolutionary processes.


Author(s):  
Gaotian Zhang ◽  
Jake D Mostad ◽  
Erik C Andersen

Abstract Life history traits underlie the fitness of organisms and are under strong natural selection. A new mutation that positively impacts a life history trait will likely increase in frequency and become fixed in a population (e.g. a selective sweep). The identification of the beneficial alleles that underlie selective sweeps provides insights into the mechanisms that occurred during the evolution of a species. In the global population of Caenorhabditis elegans, we previously identified selective sweeps that have drastically reduced chromosomal-scale genetic diversity in the species. Here, we measured the fecundity of 121 wild C. elegans strains, including many recently isolated divergent strains from the Hawaiian islands and found that strains with larger swept genomic regions have significantly higher fecundity than strains without evidence of the recent selective sweeps. We used genome-wide association (GWA) mapping to identify three quantitative trait loci (QTL) underlying the fecundity variation. Additionally, we mapped previous fecundity data from wild C. elegans strains and C. elegans recombinant inbred advanced intercross lines that were grown in various conditions and detected eight QTL using GWA and linkage mappings. These QTL show the genetic complexity of fecundity across this species. Moreover, the haplotype structure in each GWA QTL region revealed correlations with recent selective sweeps in the C. elegans population. North American and European strains had significantly higher fecundity than most strains from Hawaii, a hypothesized origin of the C. elegans species, suggesting that beneficial alleles that caused increased fecundity could underlie the selective sweeps during the worldwide expansion of C. elegans.

