Identifying loci under positive selection in complex population histories

2018 ◽  
Author(s):  
Alba Refoyo-Martínez ◽  
Rute R. da Fonseca ◽  
Katrín Halldórsdóttir ◽  
Einar Árnason ◽  
Thomas Mailund ◽  
...  

Abstract Detailed modeling of a species’ history is of prime importance for understanding how natural selection operates over time. Most methods designed to detect positive selection along sequenced genomes, however, use simplified representations of past histories as null models of genetic drift. Here, we present the first method that can detect signatures of strong local adaptation across the genome using arbitrarily complex admixture graphs, which are typically used to describe the history of past divergence and admixture events among any number of populations. The method, called Graph-aware Retrieval of Selective Sweeps (GRoSS), has good power to detect loci in the genome with strong evidence for past selective sweeps and can also identify which branch of the graph was most affected by the sweep. As evidence of its utility, we apply the method to bovine, codfish and human population genomic data containing multiple population panels related in complex ways. We find new candidate genes for important adaptive functions, including immunity and metabolism in under-studied human populations, as well as muscle mass, milk production and tameness in specific bovine breeds. We are also able to pinpoint the emergence of large regions of differentiation due to inversions in the history of Atlantic codfish.

2019 ◽  
Vol 20 (S9) ◽  
Author(s):  
Luis Torada ◽  
Lucrezia Lorenzon ◽  
Alice Beddis ◽  
Ulas Isildak ◽  
Linda Pattini ◽  
...  

Abstract Background The genetic bases of many complex phenotypes are still largely unknown, mostly due to the polygenic nature of the traits and the small effect of each associated mutation. An alternative approach to classic association studies to determining such genetic bases is an evolutionary framework. As sites targeted by natural selection are likely to harbor important functionalities for the carrier, the identification of selection signatures in the genome has the potential to unveil the genetic mechanisms underpinning human phenotypes. Popular methods of detecting such signals rely on compressing genomic information into summary statistics, resulting in the loss of information. Furthermore, few methods are able to quantify the strength of selection. Here we explored the use of deep learning in evolutionary biology and implemented a program, called , to apply convolutional neural networks on population genomic data for the detection and quantification of natural selection. Results enables genomic information from multiple individuals to be represented as abstract images. Each image is created by stacking aligned genomic data and encoding distinct alleles into separate colors. To detect and quantify signatures of positive selection, implements a convolutional neural network which is trained using simulations. We show how the method implemented in can be affected by data manipulation and learning strategies. In particular, we show how sorting images by row and column leads to accurate predictions. We also demonstrate how the misspecification of the correct demographic model for producing training data can influence the quantification of positive selection. We finally illustrate an approach to estimate the selection coefficient, a continuous variable, using multiclass classification techniques. 
Conclusions While the use of deep learning in evolutionary genomics is in its infancy, here we demonstrated its potential to detect informative patterns from large-scale genomic data. We implemented methods to process genomic data for deep learning in a user-friendly program called . The joint inference of the evolutionary history of mutations and their functional impact will facilitate mapping studies and provide novel insights into the molecular mechanisms associated with human phenotypes.
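The image representation described in this abstract (stacking aligned genomic data, encoding alleles as colors, then sorting rows and columns) can be sketched roughly as follows. This is an illustrative sketch only, not the program's own code: the function names, the 0/1 allele coding, the black/white color choice, and the use of row and column sums as sort keys are all assumptions made here for the example.

```python
# Hypothetical sketch: turn a binary haplotype alignment into a sorted "image".
# Rows are sampled haplotypes, columns are aligned sites.
# Allele coding (0 = ancestral, 1 = derived) and the sum-based sort keys
# are assumptions for illustration; the abstract does not specify them.

def to_image(haplotypes):
    """Encode each allele as a pixel intensity: 0 -> 0 (black), 1 -> 255 (white)."""
    return [[255 * allele for allele in row] for row in haplotypes]


def sort_rows_and_columns(image):
    """Reorder columns, then rows, by decreasing total intensity.

    Python's sort is stable, so ties keep their original relative order.
    """
    n_cols = len(image[0])
    col_order = sorted(range(n_cols), key=lambda j: -sum(row[j] for row in image))
    by_column = [[row[j] for j in col_order] for row in image]
    return sorted(by_column, key=lambda row: -sum(row))


if __name__ == "__main__":
    toy_haplotypes = [
        [0, 1, 0],
        [1, 1, 0],
        [0, 1, 1],
    ]
    for pixel_row in sort_rows_and_columns(to_image(toy_haplotypes)):
        print(pixel_row)
```

A matrix sorted this way could then be passed to a convolutional network as a single-channel image; how the sorting criterion is chosen is exactly the kind of data manipulation the abstract reports as affecting prediction accuracy.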


Bionomina ◽  
2013 ◽  
Vol 6 (1) ◽  
pp. 26-48 ◽  
Author(s):  
Taizo KIJIMA ◽  
Thierry HOQUET

This paper focuses on terminological issues related to the translation of Darwin’s concept of “natural selection” into Japanese. We analyze the historical fate of the different phrases used as translations, from the first attempts in the late 1870s until recent times. Our first finding is that the first part of the Japanese translations never changed during the period considered: “natural” was constantly rendered by “shizen”. By contrast, the Japanese terms for “selection” have dramatically changed over time. We identify some major breaks in the history of Japanese translations for “natural selection”. From the end of the 1870s to the early 1880s, several translations were suggested in books and periodicals: “shizen kanbatsu”, “shizen tōta”, “tensen”. Katō Hiroyuki adopted “shizen tōta” in 1882 and he undeniably played an important role in spreading this phrase as the standard translation for “natural selection”. The most common Japanese translation of the Origin during the first half of the 20th century (by Oka Asajirō in 1905) also used “shizen tōta”. A dramatic shift occurred after WWII, from “tōta” to “sentaku”. While a linear interpretation could suggest a move from a “bad” translation to a better one, a closer analysis leads to more challenging insights. In particular, we stress the role of the kanji restriction policy, which specified which kanji should be taught in schools and thus should be used in textbooks: “tōta” was not included in the list, which may have led to the good fortune of “sentaku” in the 1950–1960s. We think the hypothesis of the influence of Chinese translations is not a plausible one. As to conceptual differences between “shizen tōta” and “sentaku”, they remain unconvincing as both terms could be interpreted as a positive or negative process: there is no clear reason to prefer one term over the other from the strict point of view of their meanings or etymology.
Then, turning to the way terms are used, we compare translations of natural selection with translations of artificial or sexual selection. First we turn to the field of thremmatology (breeders): there, “tōta” (sometimes spelled in hiragana instead of kanji) often bore the meaning of culling; since 1917, breeders often used “sentaku” as a translation for “selection”. However, quite surprisingly, breeders used two different terms for selection as a practice (“senbatsu”), and “selection” as in “natural selection” (“shizen sentaku”). Finally, we compare possible translations for “sexual selection” and “mate choice”: here again, there are some good reasons to favour “tōta” over “sentaku” to avoid lexical confusion.


2014 ◽  
Author(s):  
Prem Gopalan ◽  
Wei Hao ◽  
David M. Blei ◽  
John D. Storey

One of the major goals of population genetics is to quantitatively understand variation of genetic polymorphisms among individuals. To this end, researchers have developed sophisticated statistical methods to capture the complex population structure that underlies observed genotypes in humans, and such methods have been effective for analyzing modestly sized genomic data sets. However, the number of genotyped humans has grown significantly in recent years, and it is accelerating. In aggregate about 1M individuals have been genotyped to date. Analyzing these data will bring us closer to a nearly complete picture of human genetic variation; but existing methods for population genetics analysis do not scale to data of this size. To solve this problem we developed TeraStructure. TeraStructure is a new algorithm to fit Bayesian models of genetic variation in human populations on tera-sample-sized data sets (10^12 observed genotypes, e.g., 1M individuals at 1M SNPs). It is a principled approach to Bayesian inference that iterates between subsampling locations of the genome and updating an estimate of the latent population structure of the individuals. On data sets of up to 2K individuals, TeraStructure matches the existing state of the art in terms of both speed and accuracy. On simulated data sets of up to 10K individuals, TeraStructure is twice as fast as existing methods and has higher accuracy in recovering the latent population structure. On genomic data simulated at the tera-sample-size scales, TeraStructure continues to be accurate and is the only method that can complete its analysis.


2020 ◽  
Vol 12 (3) ◽  
pp. 77-87 ◽  
Author(s):  
Muthukrishnan Eaaswarkhanth ◽  
Andre Luiz Campelo dos Santos ◽  
Omer Gokcumen ◽  
Fahd Al-Mulla ◽  
Thangavel Alphonse Thanaraj

Abstract Despite the extreme and varying environmental conditions prevalent in the Arabian Peninsula, it has experienced several waves of human migrations following the out-of-Africa diaspora. Eventually, the inhabitants of the peninsula region adapted to the hot and dry environment. The adaptation and natural selection that shaped the extant human populations of the Arabian Peninsula region have been scarcely studied. In an attempt to explore natural selection in the region, we analyzed 662,750 variants in 583 Kuwaiti individuals. We searched for regions in the genome that display signatures of positive selection in the Kuwaiti population using an integrative approach in a conservative manner. We highlight a haplotype overlapping TNKS that showed strong signals of positive selection based on the results of the multiple selection tests conducted (integrated Haplotype Score, Cross Population Extended Haplotype Homozygosity, Population Branch Statistics, and log-likelihood ratio scores). Notably, the TNKS haplotype under selection potentially conferred a fitness advantage to the Kuwaiti ancestors for surviving in the harsh environment while posing a major health risk to present-day Kuwaitis.


Genetics ◽  
2002 ◽  
Vol 160 (2) ◽  
pp. 753-763 ◽  
Author(s):  
Christian Schlötterer

Abstract With the availability of completely sequenced genomes, multilocus scans of natural variability have become a feasible approach for the identification of genomic regions subjected to natural and artificial selection. Here, I introduce a new multilocus test statistic, ln RV, which is based on the ratio of observed variances in repeat number at a set of microsatellite loci in two groups of populations. The distribution of ln RV values captures demographic history of the populations as well as variation in microsatellite mutation among loci. Given that microsatellite loci associated with a recent selective sweep differ from the remainder of the genome, they are expected to fall outside of the distribution of neutral ln RV values. The ln RV test statistic is applied to a data set of 94 loci typed in eight non-African and two African human populations.
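As a rough illustration of the statistic described above, the sketch below computes ln RV for a single locus as the natural log of the ratio of repeat-number variances in two groups, then standardizes a set of per-locus values so that extreme loci stand out. The function names, the use of the sample variance, the z-score standardization, and the toy repeat counts are assumptions made for this example; the abstract itself specifies only the variance ratio.

```python
# Hypothetical sketch of a per-locus ln RV calculation.
# Assumptions (not stated in the abstract): sample variance (n - 1 denominator),
# z-score standardization across loci, and the toy data below.
import math
from statistics import mean, stdev, variance


def ln_rv(repeats_group_a, repeats_group_b):
    """Natural log of the ratio of repeat-number variances at one locus."""
    var_a = variance(repeats_group_a)
    var_b = variance(repeats_group_b)
    if var_a <= 0 or var_b <= 0:
        raise ValueError("both groups need non-zero variance in repeat number")
    return math.log(var_a / var_b)


def standardize(values):
    """Convert per-locus ln RV values to z-scores across loci."""
    centre = mean(values)
    spread = stdev(values)
    return [(value - centre) / spread for value in values]


if __name__ == "__main__":
    # Toy repeat counts for one microsatellite locus in two population groups.
    group_a = [10, 12, 14]
    group_b = [10, 11, 12]
    print(ln_rv(group_a, group_b))
```

Under this sketch, a locus whose standardized value lies far in either tail would be a candidate for closer inspection; the threshold used to call an outlier is a separate choice that the abstract does not state.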


2019 ◽  
Author(s):  
Muthukrishnan Eaaswarkhanth ◽  
Andre Luiz Campelo dos Santos ◽  
Omer Gokcumen ◽  
Fahd Al-Mulla ◽  
Thangavel Alphonse Thanaraj

Abstract Despite the extreme and varying environmental conditions prevalent in the Arabian Peninsula, it has experienced several waves of human migrations following the out-of-Africa diaspora. Eventually, the inhabitants of the peninsula region adapted to the hot and dry environment. The adaptation and natural selection that shaped the extant human populations of the Arabian Peninsula region have been scarcely studied. In an attempt to explore natural selection in the region, we analyzed 662,750 variants in 583 Kuwaiti individuals. We searched for regions in the genome that display signatures of positive selection in the Kuwaiti population using an integrative approach in a conservative manner. We highlight a haplotype overlapping TNKS that showed strong signals of positive selection based on the results of the multiple selection tests conducted (integrated Haplotype Score, Cross Population Extended Haplotype Homozygosity, Population Branch Statistics, and log-likelihood ratio scores). Notably, the TNKS haplotype under selection potentially conferred a fitness advantage to the Kuwaiti ancestors for surviving in the harsh environment while posing a major health risk to present-day Kuwaitis.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Joao M. Alves ◽  
Sonia Prado-López ◽  
José Manuel Cameselle-Teijeiro ◽  
David Posada

Abstract How and when tumoral clones start spreading to surrounding and distant tissues is currently unclear. Here we leveraged a model-based evolutionary framework to investigate the demographic and biogeographic history of a colorectal cancer. Our analyses strongly support an early monoclonal metastatic colonization, followed by a rapid population expansion at both primary and secondary sites. Moreover, we infer a hematogenous metastatic spread under positive selection, plus the return of some tumoral cells from the liver back to the colon lymph nodes. This study illustrates how sophisticated techniques typical of organismal evolution can provide a detailed, quantitative picture of the complex tumoral dynamics over time and space.


2016 ◽  
Author(s):  
Sean G. Byars ◽  
Qin Qin Huang ◽  
Lesley-Ann Gray ◽  
Samuli Ripatti ◽  
Gad Abraham ◽  
...  

Abstract Traditional genome-wide scans for positive selection have mainly uncovered selective sweeps associated with monogenic traits. While selection on quantitative traits is much more common, very few signals have been detected because of their polygenic nature. We searched for positive selection signals underlying coronary artery disease (CAD) in worldwide populations, using novel approaches to quantify relationships between polygenic selection signals and CAD genetic risk. We identified new candidate adaptive loci that appear to have been directly modified by disease pressures given their significant associations with CAD genetic risk. These candidates were all uniquely and consistently associated with many different male and female reproductive traits, suggesting selection may have also targeted these because of their direct effects on fitness. This suggests the presence of widespread antagonistic-pleiotropic tradeoffs on CAD loci, which provides a novel explanation for the maintenance and high prevalence of CAD in modern humans. Lastly, we found that positive selection more often targeted CAD gene regulatory variants using HapMap3 lymphoblastoid cell lines, which further highlights the unique biological significance of candidate adaptive loci underlying CAD. Our study provides a novel approach for detecting selection on polygenic traits and evidence that modern human genomes have evolved in response to CAD-induced selection pressures and other early-life traits sharing pleiotropic links with CAD.
Author Summary How genetic variation contributes to disease is complex, especially for those such as coronary artery disease (CAD) that develop over the lifetime of individuals. One of the fundamental questions about CAD, whose progression begins in young adults with arterial plaque accumulation leading to life-threatening outcomes later in life, is why natural selection has not removed or reduced this costly disease. It is the leading cause of death worldwide and has been present in human populations for thousands of years, implying considerable pressures that natural selection should have operated on. Our study provides new evidence that genes underlying CAD have recently been modified by natural selection and that these same genes uniquely and extensively contribute to human reproduction, which suggests that natural selection may have maintained genetic variation contributing to CAD because of its beneficial effects on fitness. This study provides novel evidence that CAD has been maintained in modern humans as a byproduct of the fitness advantages those genes provide early in human lifecycles.


2013 ◽  
Author(s):  
Yaniv Brandvain ◽  
Amanda M Kenney ◽  
Lex Fagel ◽  
Graham Coop ◽  
Andrea L Sweigart

Mimulus guttatus and M. nasutus are an evolutionary and ecological model sister species pair differentiated by ecology, mating system, and partial reproductive isolation. Despite extensive research on this system, the history of divergence and differentiation in this sister pair is unclear. We present and analyze a novel population genomic data set which shows that M. nasutus "budded" off of a central Californian M. guttatus population within the last 200 to 500 thousand years. In this time, the M. nasutus genome has accrued numerous genomic signatures of the transition to predominant selfing. Despite clear biological differentiation, we document ongoing, bidirectional introgression. We observe a negative relationship between the recombination rate and divergence between M. nasutus and sympatric M. guttatus samples, suggesting that selection acts against M. nasutus ancestry in M. guttatus.

