An evolutionary compass for detecting signals of polygenic selection and mutational bias

2017 ◽  
Author(s):  
Lawrence H. Uricchio ◽  
Hugo C. Kitano ◽  
Alexander Gusev ◽  
Noah A. Zaitlen

Selection and mutation shape genetic variation underlying human traits, but the specific evolutionary mechanisms driving complex trait variation are largely unknown. We developed a statistical method that uses polarized GWAS summary statistics from a single population to detect signals of mutational bias and selection. We found evidence for non-neutral signals on variation underlying several traits (BMI, schizophrenia, Crohn’s disease, educational attainment, and height). We then used simulations that incorporate simultaneous negative and positive selection to show that these signals are consistent with mutational bias and shifts in the fitness-phenotype relationship, but not stabilizing selection or mutational bias alone. We additionally replicate two of our top three signals (BMI and educational attainment) in an external cohort, and show that population stratification may have confounded GWAS summary statistics for height in the GIANT cohort. Our results provide a flexible and powerful framework for evolutionary analysis of complex phenotypes in humans and other species, and offer insights into the evolutionary mechanisms driving variation in human polygenic traits.
Impact summary
Many traits are variable within human populations and are likely to have a substantial and complex genetic component. This implies that mutations that have a functional impact on complex human traits have arisen throughout our species’ evolutionary history. However, it remains unclear how processes such as natural selection may have acted to shape trait variation at the genetic and phenotypic level. Better understanding of the mechanisms driving trait variation could provide insights into our evolutionary past and help clarify why it has been so difficult to map the preponderance of causal variation for common heritable diseases.
In this study, we developed and applied methods for detecting signatures of mutation bias (i.e., the propensity of a new variant to be either trait-increasing or trait-decreasing) and natural selection acting on trait variation. We applied our approach to several heritable traits, and found evidence for both natural selection and mutation bias, including selection for decreased BMI and decreased risk for Crohn’s disease and schizophrenia.
While our results are consistent with plausible evolutionary scenarios shaping a range of traits, it should be noted that the field of polygenic selection detection is still new, and current methods (including ours) rely on data from genome-wide association studies (GWAS). The data produced by these studies may be vulnerable to certain cryptic biases, especially population stratification, which could induce false selection signals. We therefore repeated our analyses for the top three hits in a cohort that should be less susceptible to this problem: two of our top three signals replicated (BMI and educational attainment), while height did not. Our results highlight both the promise and pitfalls of polygenic selection detection approaches, and suggest a need for further work disentangling stratification from selection.

2021 ◽  
Author(s):  
Alexander L Cope ◽  
Premal Shah

Patterns of non-uniform usage of synonymous codons (codon bias) vary across genes in an organism and across species from all domains of life. The bias in codon usage is due to a combination of both non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most population genetics models quantify the effects of mutation bias and selection on shaping codon usage patterns assuming a uniform mutation bias across the genome. However, mutation biases can vary both along and across chromosomes due to processes such as biased gene conversion, potentially obfuscating signals of translational selection. Moreover, estimates of variation in genomic mutation biases are often lacking for non-model organisms. Here, we combine an unsupervised learning method with a population genetics model of synonymous codon bias evolution to assess the impact of intragenomic variation in mutation bias on the strength and direction of natural selection on synonymous codon usage across 49 Saccharomycotina budding yeasts. We find that in the absence of a priori information, unsupervised learning approaches can be used to identify regions evolving under different mutation biases. We find that the impact of intragenomic variation in mutation bias varies widely, even among closely related species. We show that the overall strength and direction of selection on codon usage can be underestimated by failing to account for intragenomic variation in mutation biases. Interestingly, genes falling into clusters identified by machine learning are also often physically clustered across chromosomes, consistent with processes such as biased gene conversion. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable mutation biases on codon frequencies.


2017 ◽  
Author(s):  
Ronald de Vlaming ◽  
Magnus Johannesson ◽  
Patrik K.E. Magnusson ◽  
M. Arfan Ikram ◽  
Peter M. Visscher

LD-score (LDSC) regression disentangles the contribution of polygenic signal, in terms of SNP-based heritability, and population stratification, in terms of a so-called intercept, to GWAS test statistics. Whereas LDSC regression uses summary statistics, methods like Haseman-Elston (HE) regression and genomic-relatedness-matrix (GRM) restricted maximum likelihood infer parameters such as SNP-based heritability from individual-level data directly. Therefore, these two types of methods are typically considered to be profoundly different. Nevertheless, recent work has revealed that LDSC and HE regression yield near-identical SNP-based heritability estimates when confounding stratification is absent. We now extend the equivalence; under the stratification assumed by LDSC regression, we show that the intercept can be estimated from individual-level data by transforming the coefficients of a regression of the phenotype on the leading principal components from the GRM. Using simulations, considering various degrees and forms of population stratification, we find that intercept estimates obtained from individual-level data are nearly equivalent to estimates from LDSC regression (R² > 99%). An empirical application corroborates these findings. Hence, LDSC regression is not profoundly different from methods using individual-level data; parameters that are identified by LDSC regression are also identified by methods using individual-level data. In addition, our results indicate that, under strong stratification, there is misattribution of stratification to the slope of LDSC regression, inflating estimates of SNP-based heritability from LDSC regression ceteris paribus. Hence, the intercept is not a panacea for population stratification. Consequently, LDSC-regression estimates should be interpreted with caution, especially when the intercept estimate is significantly greater than one.
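
To make the slope/intercept decomposition described above concrete, here is a minimal sketch in Python. It assumes the standard LDSC mean model, E[chi2_j] = intercept + (N * h2 / M) * l_j, where l_j is the LD score of SNP j, N the sample size and M the number of SNPs. The function names (`simulate_chi2`, `fit_ldsc_ols`), the gamma-distributed LD scores and the plain unweighted least-squares fit are illustrative assumptions, not the procedure used in the paper; real LDSC software uses a weighted regression with jackknife standard errors.

```python
import numpy as np


def simulate_chi2(ld_scores, n_samples, n_snps, h2, intercept, rng):
    """Draw toy chi-square statistics whose mean follows the LDSC model.

    E[chi2_j] = intercept + (n_samples * h2 / n_snps) * ld_scores[j]
    """
    expected = intercept + n_samples * h2 / n_snps * ld_scores
    # A 1-df chi-square has mean 1, so scaling it by `expected` yields the
    # desired mean. This is a simplification of real GWAS noise.
    return expected * rng.chisquare(1, size=ld_scores.shape[0])


def fit_ldsc_ols(chi2, ld_scores, n_samples, n_snps):
    """Unweighted OLS of chi2 on LD score (illustrative only).

    Returns (intercept, h2): the intercept absorbs inflation that does not
    scale with LD score, and the slope is rescaled to SNP-based heritability.
    """
    design = np.column_stack([np.ones_like(ld_scores), ld_scores])
    coef, *_ = np.linalg.lstsq(design, chi2, rcond=None)
    intercept, slope = coef
    h2 = slope * n_snps / n_samples
    return float(intercept), float(h2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_snps, n_samples = 500_000, 50_000
    # Hypothetical LD scores; real ones come from a reference panel.
    ld = rng.gamma(2.0, 50.0, size=n_snps)
    chi2 = simulate_chi2(ld, n_samples, n_snps, h2=0.5, intercept=1.2, rng=rng)
    est_intercept, est_h2 = fit_ldsc_ols(chi2, ld, n_samples, n_snps)
    print(f"intercept ~ {est_intercept:.2f}, h2 ~ {est_h2:.2f}")
```

In this toy setup the stratification term is a constant added to every SNP, which is exactly the case the intercept is designed to capture. The abstract's caution concerns stronger stratification, where part of the confounding leaks into the slope and inflates the heritability estimate.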


Psych ◽  
2019 ◽  
Vol 1 (1) ◽  
pp. 55-75 ◽  
Author(s):  
Davide Piffer

Genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment (EA) were used to test a polygenic selection model. Weighted and unweighted polygenic scores (PGS) were calculated and compared across populations using data from the 1000 Genomes (n = 26), HGDP-CEPH (n = 52) and gnomAD (n = 8) datasets. The PGS from the largest EA GWAS was highly correlated to two previously published PGSs (r = 0.96–0.97, N = 26). These factors are both highly predictive of average population IQ (r = 0.9, N = 23) and Learning index (r = 0.8, N = 22) and are robust to tests of spatial autocorrelation. Monte Carlo simulations yielded highly significant p values. In the gnomAD samples, the correlation between PGS and IQ was almost perfect (r = 0.98, N = 8), and ANOVA showed significant population differences in allele frequencies with positive effect. Socioeconomic variables slightly improved the prediction accuracy of the model (from 78–80% to 85–89%), but the PGS explained twice as much of the variance in IQ compared to socioeconomic variables. In both 1000 Genomes and gnomAD, there was a weak trend for lower GWAS significance SNPs to be less predictive of population IQ. Additionally, a subset of SNPs were found in the HGDP-CEPH sample (N = 127). The analysis of this sample yielded a positive correlation with latitude and a low negative correlation with distance from East Africa. This study provides robust results after accounting for spatial autocorrelation with Fst distances and random noise via an empirical Monte Carlo simulation using null SNPs.


1999 ◽  
Vol 5 (4) ◽  
pp. 291-318 ◽  
Author(s):  
Keith Downing ◽  
Peter Zvirinsky

Gaia theory, which states that organisms both affect and regulate their environment, poses an interesting problem to Neo-Darwinian evolutionary biologists and provides an exciting set of phenomena for artificial-life investigation. The key challenge is to explain the emergence of biotic communities that are capable, via their implicit coordination, of regulating large-scale biogeochemical factors such as the temperature and chemical composition of the biosphere, but to assume no evolutionary mechanisms beyond contemporary natural selection. Along with providing an introduction to Gaia theory, this article presents simulations of Gaian emergence based on an artificial-life model involving genetic algorithms and guilds of simple metabolizing agents. In these simulations, resource competition leads to guild diversity; the ensemble of guilds then manifests life-sustaining nutrient recycling and exerts distributed control over environmental nutrient ratios. These results illustrate that standard individual-based natural selection is sufficient to explain Gaian self-organization, and they help clarify the relationships between two key metrics of Gaian activity: recycling and regulation.


2020 ◽  
Author(s):  
Davide Piffer

Using the latest methods to detect divergent evolution and polygenic selection, I test the hypothesis that race differences (European-African) in IQ are due to genetic differences. The genetic variants identified by the largest GWAS of education showed clear signatures of differentiation between Africans and Europeans. Across different phenotypes (educational attainment, cognitive performance, math ability), GWAS SNPs had significantly higher average Fst than control SNPs. Contrary to a previous report, the same effect was also found for a GWAS based on a within-family design, which used differences in educational attainment between siblings to partial out shared environmental effects. Polygenic scores for all phenotypes and GWAS types (including within-family design) were higher for Europeans than for Africans.


Author(s):  
Marieke Woensdregt ◽  
Kenny Smith

Pragmatics is the branch of linguistics that deals with language use in context. It looks at the meaning linguistic utterances can have beyond their literal meaning (implicature), and also at presupposition and turn taking in conversation. Thus, pragmatics lies on the interface between language and social cognition. From the point of view of both speaker and listener, doing pragmatics requires reasoning about the minds of others. For instance, a speaker has to think about what knowledge they share with the listener to choose what information to explicitly encode in their utterance and what to leave implicit. A listener has to make inferences about what the speaker meant based on the context, their knowledge about the speaker, and their knowledge of general conventions in language use. This ability to reason about the minds of others (usually referred to as “mindreading” or “theory of mind”) is a cognitive capacity that is uniquely developed in humans compared to other animals. What do we know about how pragmatics (and the underlying ability to make inferences about the minds of others) has evolved? Biological evolution and cultural evolution are the two main processes that can lead to the development of a complex behavior over generations, and we can explore to what extent they account for what we know about pragmatics. In biological evolution, changes happen as a result of natural selection on genetically transmitted traits. In cultural evolution, on the other hand, selection happens on skills that are transmitted through social learning. Many hypotheses have been put forward about the role that natural selection may have played in the evolution of social and communicative skills in humans (for example, as a result of changes in food sources, foraging strategy, or group size). The role of social learning and cumulative culture, however, has often been overlooked. This omission is particularly striking in the case of pragmatics, as language itself is a prime example of a culturally transmitted skill, and there is solid evidence that the pragmatic capacities that are so central to language use may themselves be partially shaped by social learning. In light of empirical findings from comparative, developmental, and experimental research, we can consider the potential contributions of both biological and cultural evolutionary mechanisms to the evolution of pragmatics. The dynamics of both types of evolutionary processes can also be explored using experiments and computational models.


2016 ◽  
Author(s):  
Gaurav Bhatia ◽  
Nicholas A. Furlotte ◽  
Po-Ru Loh ◽  
Xuanyao Liu ◽  
Hilary K. Finucane ◽  
...  

Population stratification is a well-documented confounder in GWASes, and is often addressed by including principal component (PC) covariates computed from common SNPs (SNP-PCs). In our analyses of summary statistics from 36 GWASes (mean n=88k), including 20 GWASes using 23andMe data that included SNP-PC covariates, we observed a significantly inflated LD score regression (LDSC) intercept for several traits, suggesting that residual stratification remains a concern even when SNP-PC covariates are included.
Here we propose a new method, PC loading regression, to correct for stratification in summary statistics by leveraging SNP loadings for PCs computed in a large reference panel. In addition to SNP-PCs, the method can be applied to haploSNP-PCs, i.e. PCs computed from a larger number of rare haplotype variants that better capture subtle structure. Using simulations based on real genotypes from 54,000 individuals of diverse European ancestry from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort, we show that PC loading regression effectively corrects for stratification along top PCs.
We applied PC loading regression to several traits with inflated LDSC intercepts. Correcting for the top four SNP-PCs in GERA data, we observe a significant reduction in the LDSC intercept for height summary statistics from the Genetic Investigation of ANthropometric Traits (GIANT) consortium, but not for 23andMe summary statistics, which already included SNP-PC covariates. However, when correcting for additional haploSNP-PCs in 23andMe GWASes, inflation in the LDSC intercept was eliminated for eye color, hair color, and skin color and substantially reduced for height (1.41 to 1.16; n=430k). Correcting for haploSNP-PCs in GIANT height summary statistics eliminated inflation in the LDSC intercept (from 1.35 to 1.00; n=250k), eliminating 27 significant association signals including one at the LCT locus, which is highly differentiated among European populations and widely known to produce spurious signals. Overall, our results suggest that uncorrected population stratification is a concern in GWASes of large sample size and that PC loading regression can correct for this stratification.
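
The abstract does not spell out the estimator behind PC loading regression, so the following Python sketch shows only one plausible reading of the idea: regress GWAS z-scores on per-SNP PC loadings from a reference panel and keep the residuals. The function name `residualize_on_loadings`, the use of z-scores as the response and the ordinary least-squares fit are all assumptions made for illustration, and should not be taken as the authors' method.

```python
import numpy as np


def residualize_on_loadings(z_scores, loadings):
    """Remove the part of the z-scores that is linear in the PC loadings.

    z_scores : array of shape (n_snps,), GWAS summary statistics
    loadings : array of shape (n_snps, n_pcs), SNP loadings for each PC
               (hypothetical input; in practice taken from a reference panel)

    Returns (corrected_z, coef), where coef[k] measures how strongly the
    summary statistics align with PC k.
    """
    coef, *_ = np.linalg.lstsq(loadings, z_scores, rcond=None)
    corrected = z_scores - loadings @ coef
    return corrected, coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_snps, n_pcs = 5_000, 2
    loadings = rng.normal(size=(n_snps, n_pcs))
    # Toy z-scores: null signal plus a component aligned with the PCs,
    # standing in for residual stratification.
    z = rng.normal(size=n_snps) + loadings @ np.array([3.0, -2.0])
    corrected, coef = residualize_on_loadings(z, loadings)
    print("estimated alignment with PCs:", np.round(coef, 2))
    print("variance before/after:", z.var().round(2), corrected.var().round(2))
```

By construction the corrected statistics are uncorrelated with the supplied loadings. Whether that removes stratification in practice depends on how well the reference-panel PCs capture the relevant population structure, which is the question the haploSNP-PC analyses in the abstract address.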

