Population differentiation of polygenic score predictions under stabilizing selection

2021 ◽  
Author(s):  
Sivan Yair ◽  
Graham Coop

Abstract: Given the many loci uncovered by genome-wide association studies (GWAS), polygenic scores have become central to the drive for genomic medicine and have spread into various areas including evolutionary studies of adaptation. While promising, these scores are fraught with issues of portability across populations, due to the mis-estimation of effect sizes and missing causal loci across populations not represented in large-scale GWAS. The poor portability of polygenic scores at first seems at odds with the view that much of common genetic variation is shared among populations (Lewontin, 1972). Here we investigate one potential cause of this discrepancy: phenotypic stabilizing selection drives the turnover of genetic variation shared between populations at causal loci. Somewhat counter-intuitively, while stabilizing selection to the same optimum phenotype leads to lower phenotypic differentiation among populations, it increases genetic differentiation at GWAS loci and reduces the portability of polygenic scores constructed for unrepresented populations. We also find that stabilizing selection can lead to potentially misleading signals of the differentiation of average polygenic scores among populations. We extend our baseline model to investigate the impact of pleiotropy, gene-by-environment interactions, and directional selection on polygenic score predictions. Our work emphasizes stabilizing selection as a null evolutionary model to understand patterns of allele frequency differentiation and its impact on polygenic score portability and differentiation.

2018 ◽  
Author(s):  
LE Duncan ◽  
H Shen ◽  
B Gelaye ◽  
KJ Ressler ◽  
MW Feldman ◽  
...  

Abstract: Studies examining relationships between genotypic and phenotypic variation have historically been carried out on people of European ancestry. Efforts are underway to address this limitation, but until they succeed, the legacy of a Euro-centric bias will continue to hinder research, including the use of polygenic scores, which are individual-level metrics of genetic risk. Ongoing debate surrounds the generalizability of polygenic scores based on genome-wide association studies (GWAS) conducted in European ancestry samples to non-European ancestry samples. We analyzed the first decade of polygenic scoring studies (2008-2017, inclusive), and found that 67% of studies included exclusively European ancestry participants and another 19% included only East Asian ancestry participants. Only 3.8% of studies were carried out on samples of African, Hispanic, or Indigenous peoples. We find that effect sizes for European ancestry-derived polygenic scores are only 36% as large in African ancestry samples as in European ancestry samples (t=−10.056, df=22, p=5.5×10⁻¹⁰). Analyzing global populations, we show that relationships between height polygenic scores and height are highly dependent on methodological choices in polygenic score construction, highlighting the need for caution in interpreting population level differences in distributions of polygenic scores, as currently calculated. These findings bolster the rationale for large-scale GWAS in diverse human populations and highlight the need for better handling of linkage disequilibrium and variant frequencies when applying scores to non-European samples.


PLoS Genetics ◽  
2021 ◽  
Vol 17 (1) ◽  
pp. e1008748
Author(s):  
Benedict Wieters ◽  
Kim A. Steige ◽  
Fei He ◽  
Evan M. Koch ◽  
Sebastián E. Ramos-Onsins ◽  
...  

The rate at which plants grow is a major functional trait in plant ecology. However, little is known about its evolution in natural populations. Here, we investigate evolutionary and environmental factors shaping variation in the growth rate of Arabidopsis thaliana. We used plant diameter as a proxy to monitor plant growth over time in environments that mimicked latitudinal differences in the intensity of natural light radiation, across a set of 278 genotypes sampled within four broad regions, including an outgroup set of genotypes from China. A field experiment conducted under natural conditions confirmed the ecological relevance of the observed variation. All genotypes markedly expanded their rosette diameter when the light supply was decreased, demonstrating that environmental plasticity is a predominant source of variation to adapt plant size to prevailing light conditions. Yet, we detected significant levels of genetic variation both in growth rate and growth plasticity. Genome-wide association studies revealed that only 2 single nucleotide polymorphisms associate with genetic variation for growth above Bonferroni confidence levels. However, marginally associated variants were significantly enriched among genes with an annotated role in growth and stress reactions. Polygenic scores computed from marginally associated variants confirmed the polygenic basis of growth variation. For both light regimes, phenotypic divergence between the most distantly related population (China) and the various regions in Europe is smaller than the variation observed within Europe, indicating that the evolution of growth rate is likely to be constrained by stabilizing selection. We observed that Spanish genotypes, however, reach a significantly larger size than Northern European genotypes. Tests of adaptive divergence and analysis of the individual burden of deleterious mutations reveal that adaptive processes have played a more important role in shaping regional differences in rosette growth than maladaptive evolution.


Author(s):  
Benedict Wieters ◽  
Kim A. Steige ◽  
Fei He ◽  
Evan M. Koch ◽  
Sebastián E. Ramos-Onsins ◽  
...  

Abstract: The rate at which plants grow is a major functional trait in plant ecology. However, little is known about its evolution in natural populations. Here, we investigate evolutionary and environmental factors shaping variation in the growth rate of Arabidopsis thaliana. We used plant diameter as a proxy to monitor plant growth over time in environments that mimicked latitudinal differences in the intensity of natural light radiation, across a set of 278 genotypes sampled within four broad regions, including an outgroup set of genotypes from China. A field experiment conducted under natural conditions confirmed the ecological relevance of the observed variation. All genotypes markedly expanded their rosette diameter when the light supply was decreased, demonstrating that environmental plasticity is a predominant source of variation to adapt plant size to prevailing light conditions. Yet, we detected significant levels of genetic variation both in growth rate and growth plasticity. Genome-wide association studies revealed that only 2 single nucleotide polymorphisms associate with genetic variation for growth above Bonferroni confidence levels. However, marginally associated variants were significantly enriched among genes with an annotated role in growth and stress reactions. Polygenic scores computed from marginally associated variants confirmed the polygenic basis of growth variation. For both light regimes, phenotypic divergence between the most distantly related population (China) and the various regions in Europe is smaller than the variation observed within Europe, indicating that some level of stabilizing selection constrains the evolution of growth rate. We observed that Spanish genotypes, however, reach a significantly larger size than Northern European genotypes. Tests of adaptive divergence and analysis of the individual burden of deleterious mutations reveal that adaptive processes have played a more important role in shaping regional differences in rosette growth than maladaptive evolution.


2014 ◽  
Vol 96 (5) ◽  
pp. e38-1-10 ◽  
Author(s):  
Nandina Paria ◽  
Lawson A Copley ◽  
John A Herring ◽  
Harry KW Kim ◽  
B Stephens Richards ◽  
...  

2019 ◽  
Author(s):  
Eriko Sasaki ◽  
Taiji Kawakatsu ◽  
Joseph Ecker ◽  
Magnus Nordborg

Abstract: DNA cytosine methylation is an epigenetic mark associated with silencing of transposable elements (TEs) and heterochromatin formation. In plants, it occurs in three sequence contexts: CG, CHG, and CHH (where H is A, T, or C). The latter does not allow direct inheritance of methylation during DNA replication due to lack of symmetry, and methylation must therefore be re-established every cell generation. Genome-wide association studies (GWAS) have previously shown that CMT2 and NRPE1 are major determinants of genome-wide patterns of TE CHH-methylation. Here we instead focus on CHH-methylation of individual TEs and TE-families, allowing us to identify the pathways involved in CHH-methylation simply from natural variation and confirm the associations by comparing them with mutant phenotypes. Methylation at TEs targeted by the RNA-directed DNA methylation (RdDM) pathway is unaffected by CMT2 variation, but is strongly affected by variation at NRPE1, which is largely responsible for the longitudinal cline in this phenotype. In contrast, CMT2-targeted TEs are affected by both loci, which jointly explain 7.3% of the phenotypic variation (13.2% of total genetic effects). There is no longitudinal pattern for this phenotype, however, because the geographic patterns appear to compensate for each other in a pattern suggestive of stabilizing selection.

Author Summary: DNA methylation is a major component of transposon silencing, and essential for genomic integrity. Recent studies revealed large-scale geographic variation as well as the existence of major trans-acting polymorphisms that partly explained this variation. In this study, we re-analyze previously published data (The 1001 Epigenomes), focusing on de novo DNA methylation patterns of individual TEs and TE families rather than on genome-wide averages (as was done in previous studies). GWAS of the patterns revealed the underlying regulatory networks and allowed us to comprehensively characterize trans-regulation of de novo DNA methylation and its role in the striking geographic pattern for this phenotype.


Author(s):  
Davide Piffer

Background: The genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment were used to test a polygenic selection model. Methods: Average frequencies of alleles with positive effect (polygenic scores or PS) were compared across populations (N=26) using data from 1000 Genomes. A null model was created using frequencies of random SNPs. Results: Polygenic selection signal of educational attainment GWAS hits is high among a handful of SNPs within genomic regions replicated across GWAS publications. A polygenic score comprising 9 SNPs predicts population IQ (r=0.88), outperforming 99% of the polygenic scores obtained from sets of random SNPs (Monte Carlo p= 0.011). Its predictive power remains unaffected after controlling for spatial autocorrelation (Beta= 0.83). The largest polygenic score (161 SNPs) exhibits similar predictive power (Beta=0.8). Random polygenic scores are moderate predictors of population IQ (thanks to spatial autocorrelation), and their predictive power increases logarithmically with the number of SNPs, indicating an exponential reduction in noise. Conclusion: This study provides guidance for using GWAS hits together with random SNPs for testing polygenic selection using Monte Carlo simulations.
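The Monte Carlo comparison described in the Methods above (an observed statistic ranked against the same statistic computed from sets of random SNPs) can be illustrated with a generic empirical p-value. This is a minimal sketch, not the study's code: the function name, the one-sided direction of the test, and the "+1" correction are all assumptions, since the abstract does not state the exact formula used.

```python
from typing import Sequence


def empirical_p_value(observed: float, null_values: Sequence[float]) -> float:
    """One-sided Monte Carlo p-value (hypothetical helper).

    Counts how many null statistics are at least as large as the observed
    one. The +1 in numerator and denominator is an assumed convention that
    keeps the estimate strictly above zero for a finite number of draws.
    """
    if len(null_values) == 0:
        raise ValueError("null_values must contain at least one draw")
    at_least_as_large = sum(1 for value in null_values if value >= observed)
    return (at_least_as_large + 1) / (len(null_values) + 1)


# Made-up numbers for illustration only: four null draws, one of which
# (0.4) is at least as large as the observed statistic.
example_null = [0.1, 0.2, 0.3, 0.4]
example_p = empirical_p_value(0.35, example_null)
```

In practice the null would hold many more draws than this toy list, and the smallest attainable p-value is bounded by 1 / (number of draws + 1).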


Author(s):  
Sebastian Ocklenburg ◽  
Dorothea Metzen ◽  
Caroline Schlüter ◽  
Christoph Fraenz ◽  
Larissa Arning ◽  
...  

Abstract: Handedness is the most widely investigated motor preference in humans. The genetics of handedness and especially the link between genetic variation, brain structure, and right-left preference have not been investigated in detail. Recently, several well-powered genome-wide association studies (GWAS) on handedness have been published, significantly advancing the understanding of the genetic determinants of left- and right-handedness. In the present study, we estimated polygenic scores (PGS) of handedness based on the GWAS by de Kovel and Francks (Sci Rep 9: 5986, 2019) in an independent validation cohort (n = 296). PGS reflect the sum effect of trait-associated alleles across many genetic loci. For the first time, we could show that these GWAS-based PGS are significantly associated with individual handedness lateralization quotients in an independent validation cohort. Additionally, we investigated whether handedness-derived polygenic scores are associated with asymmetries in gray matter macrostructure across the whole brain determined using magnetic resonance imaging. None of these associations reached significance after correction for multiple comparisons. Our results imply that PGS obtained from large-scale handedness GWAS are significantly associated with individual handedness in smaller validation samples with more detailed phenotypic assessment.
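The description of a polygenic score as the sum effect of trait-associated alleles across many loci can be made concrete with a small sketch. It assumes the common additive form, in which each locus contributes its effect-allele dosage (0, 1, or 2 copies) multiplied by a GWAS effect size. The function name and all numbers are hypothetical, and the study's actual pipeline (SNP selection, thresholds, software) is not described in the abstract.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], effect_sizes: Sequence[float]) -> float:
    """Additive polygenic score: sum over loci of dosage times effect size.

    Hypothetical helper for illustration; both sequences must list the
    same loci in the same order.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Made-up individual genotyped at four loci, with made-up effect sizes.
example_dosages = [0, 1, 2, 1]
example_effects = [0.5, -0.25, 0.125, 1.0]
example_score = polygenic_score(example_dosages, example_effects)
```

Because the score is a plain weighted sum, any error in the estimated effect sizes or any causal locus missing from the list carries directly into the prediction.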


Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. SCI-12-SCI-12
Author(s):  
Stuart H. Orkin

Abstract Expression of fetal hemoglobin (HbF, α2γ2) greatly ameliorates the severity of the major hemoglobin disorders, sickle cell disease (SCD), and the β-thalassemias. Efforts to reactivate HbF in adults with these disorders have relied on empirical observations or therapeutic modalities that are indirect. A major goal for the field is the development of targeted reactivation of HbF through relief of γ-globin gene silencing. The regulatory factors that participate in the switch from HbF to HbA in ontogeny and in γ-gene silencing in the adult have been elusive, therefore precluding mechanism-based reactivation of HbF. Recent findings, largely derived from genome-wide association studies (GWAS), have transformed the current understanding of globin switching. This presentation will review recent evidence supporting direct involvement of the zinc-finger repressor protein BCL11A in both developmental switching of globins and HbF silencing in the adult. These studies include the impact of naturally occurring genetic variation at the BCL11A locus on HbF levels, proof-of-principle experiments in genetically engineered mice suggesting that interference with BCL11A action alone may be sufficient to provide therapeutic elevation of HbF, and the nature of protein partners of BCL11A that may mediate some aspects of BCL11A function. Recent findings on the manner in which genetic variation within the BCL11A locus influences BCL11A expression provide special insight into quantitative aspects of HbF regulation and raise the possibility of new strategies to cripple BCL11A. The opportunities and challenges for the development of mechanism-based reactivation of HbF will be discussed in the context of ongoing efforts to exploit small molecule and genetic approaches. The tools are in hand to translate an improved understanding of globin gene regulation for the benefit of patients with the major hemoglobin disorders. Disclosures: No relevant conflicts of interest to declare.


Author(s):  
Nitao Cheng ◽  
Xinran Cui ◽  
Chen Chen ◽  
Changsheng Li ◽  
Jingyu Huang

Lung carcinoma is one of the deadliest malignant tumors in humans. With the rising incidence of lung cancer, the search for highly effective cures has become increasingly imperative. There is sufficient research evidence that living habits and conditions such as smoking and air pollution are associated with an increased risk of lung cancer. At the same time, the influence of individual genetic susceptibility on lung carcinoma morbidity has been confirmed, and a growing body of evidence has accumulated on the relationship between various risk factors and the risk of different pathological types of lung cancer. Additionally, analyses of many large-scale cancer registries have shown a degree of familial aggregation of lung cancer. To explore lung cancer-related genetic factors, genome-wide association studies (GWAS) have been used to identify several lung cancer susceptibility sites, which have been widely validated. However, the biological mechanism by which mutations at these sites affect lung cancer remains unclear. Therefore, this study applied the Summary data-based Mendelian Randomization (SMR) model, integrating two GWAS datasets and four expression quantitative trait loci (eQTL) datasets, to identify susceptibility genes. Using this strategy, we found ten single nucleotide polymorphism (SNP) sites that affect the occurrence and development of lung tumors by regulating the expression of seven genes. Further analysis of the signaling pathways involving these genes not only provides important clues to explain the pathogenesis of lung cancer but also has critical significance for its diagnosis and treatment.


Cells ◽  
2019 ◽  
Vol 8 (10) ◽  
pp. 1180 ◽  
Author(s):  
Kwon ◽  
Chun ◽  
Kim ◽  
Mak

Systemic lupus erythematosus (SLE) is a chronic autoimmune disease of complex etiology that primarily affects women of childbearing age. The development of SLE is attributed to the breach of immunological tolerance and the interaction between SLE-susceptibility genes and various environmental factors, resulting in the production of pathogenic autoantibodies. Working in concert with the innate and adaptive arms of the immune system, lupus-related autoantibodies mediate immune-complex deposition in various tissues and organs, leading to acute and chronic inflammation and consequent end-organ damage. Over the past two decades or so, the impact of genetic susceptibility on the development of SLE has been well demonstrated in a number of large-scale genetic association studies which have uncovered a large fraction of genetic heritability of SLE by recognizing about a hundred SLE-susceptibility loci. Integration of genetic variant data with various omics data such as transcriptomic and epigenomic data potentially provides a unique opportunity to further understand the roles of SLE risk variants in regulating the molecular phenotypes by various disease-relevant cell types and in shaping the immune systems with high inter-individual variances in disease susceptibility. In this review, the catalogue of SLE susceptibility loci will be updated, and biological signatures implicated by the SLE-risk variants will be critically discussed. It is optimistically hoped that identification of SLE risk variants will enable the prognostic and therapeutic biomarker armamentarium of SLE to be strengthened, a major leap towards precision medicine in the management of the condition.

