Controlling for Background Genetic Effects Using Polygenic Scores Improves the Power of Genome-wide Association Studies

Abstract Ongoing increases in the size of human genotype and phenotype collections offer the promise of improved understanding of the genetics of complex diseases. In addition to the biological insights that can be gained from the nature of the variants that contribute to the genetic component of complex trait variability, these data bring forward the prospect of predicting complex traits and the risk of complex genetic diseases from genotype data. Here we show that advances in phenotype prediction can be applied to improve the power of genome-wide association studies. We demonstrate a simple and efficient method to model genetic background effects using polygenic scores derived from SNPs that are not on the same chromosome as the target SNP. Using simulated and real data we found that this can result in a substantial increase in the number of variants passing genome-wide significance thresholds. This increase in power to detect trait-associated variants also translates into an increase in the accuracy with which the resulting polygenic score predicts the phenotype from genotype data. Our results suggest that advances in methods for phenotype prediction can be exploited to improve the control of background genetic effects, leading to more accurate GWAS results and further improvements in phenotype prediction.
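
The idea in the abstract, adjusting the test of a target SNP for a polygenic score built only from SNPs on other chromosomes, can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the ridge-regression score, the cross-fitting across folds, the plain least-squares test, the simulated genotypes, and every function and parameter name (`loco_pgs`, `ols_t_stat`, `ridge`, `folds`) are hypothetical choices made for illustration.

```python
import numpy as np

def ols_t_stat(X, y):
    """t statistic for the last column of X in an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[-1] / np.sqrt(cov[-1, -1])

def loco_pgs(Z, y, chrom, target_chrom, ridge=50.0, folds=5):
    """Cross-fitted ridge score using only SNPs that are not on target_chrom."""
    Zb = Z[:, chrom != target_chrom]
    n, m = Zb.shape
    pgs = np.zeros(n)
    fold_id = np.arange(n) % folds
    for k in range(folds):
        train, test = fold_id != k, fold_id == k
        # Weights are estimated on the training folds only, so the score of a
        # held-out individual is never fitted to that individual's phenotype.
        A = Zb[train].T @ Zb[train] + ridge * np.eye(m)
        w = np.linalg.solve(A, Zb[train].T @ (y[train] - y[train].mean()))
        pgs[test] = Zb[test] @ w
    return pgs

# Toy data (assumed): 10 "chromosomes" of 20 independent SNPs each.
rng = np.random.default_rng(0)
n, n_chrom, per_chrom = 1000, 10, 20
m = n_chrom * per_chrom
chrom = np.repeat(np.arange(n_chrom), per_chrom)
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
Z = (G - G.mean(axis=0)) / G.std(axis=0)          # standardised genotypes
beta = rng.normal(0.0, np.sqrt(0.8 / m), size=m)  # polygenic background
target = 0
beta[target] = 0.3                                # effect of the tested SNP
y = Z @ beta + rng.normal(0.0, np.sqrt(0.2), size=n)

ones = np.ones(n)
snp = Z[:, target]
pgs = loco_pgs(Z, y, chrom, chrom[target])
t_without = ols_t_stat(np.column_stack([ones, snp]), y)
t_with = ols_t_stat(np.column_stack([ones, pgs, snp]), y)
print(f"t without PGS covariate:   {t_without:.2f}")
print(f"t with LOCO PGS covariate: {t_with:.2f}")
```

Because the score excludes the target SNP's own chromosome, it should absorb background genetic variance without absorbing the signal being tested, which is why the residual variance, and hence the standard error of the SNP effect, is expected to shrink in this toy setting.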

Controlling for background genetic effects using polygenic scores improves the power of genome-wide association studies

Scientific Reports ◽

10.1038/s41598-021-99031-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Declan Bennett ◽

Donal O’Shea ◽

John Ferguson ◽

Derek Morris ◽

Cathal Seoighe

Keyword(s):

Complex Traits ◽

Association Studies ◽

Real Data ◽

Genetic Effects ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genotype Data ◽

Phenotype Prediction ◽

Genome Wide ◽

Polygenic Scores

Controlling for off-target genetic effects using polygenic scores improves the power of genome-wide association studies

10.1101/2020.05.21.097691 ◽

2020 ◽

Cited By ~ 1

Author(s):

Declan Bennett ◽

Derek Morris ◽

Cathal Seoighe

Keyword(s):

Complex Traits ◽

Association Studies ◽

Genetic Diseases ◽

Real Data ◽

Genetic Effects ◽

Genome Wide Association Studies ◽

Genotype Data ◽

Phenotype Prediction ◽

Genome Wide ◽

Polygenic Scores

Abstract Ongoing increases in the size of human genotype and phenotype collections offer the promise of improved understanding of the genetics of complex diseases. In addition to the biological insights that can be gained from the nature of the variants that contribute to the genetic component of complex trait variability, these data have brought forward the prospect of predicting complex traits and the risk of complex genetic diseases from genotype data. Optimal realization of these objectives requires ongoing methodological developments, designed to identify true trait-associated variants and accurately predict phenotype from genotype. These methods must be computationally efficient, in order to remain tractable in the context of high variant densities and very large sample sizes. Here we show that the power of linear mixed models that are in widespread use for GWAS can be increased significantly by modeling off-target genetic effects using polygenic scores derived from SNPs that are not on the same chromosome as the target SNP. Using simulated and real data we found that this can result in a substantial increase in the number of variants passing genome-wide significance thresholds. This increase in power to detect trait-associated variants also translates into an increase in the accuracy with which the resulting polygenic score predicts the phenotype from genotype data. Our results suggest that advances in methods for phenotype prediction can be exploited to improve the control of off-target genetic effects, leading to more accurate GWAS results and further improvements in phenotype prediction.

Exploring the predictive power of polygenic scores derived from genome-wide association studies: a study of 10 complex traits

Bioinformatics ◽

10.1093/bioinformatics/btw745 ◽

2017 ◽

pp. btw745 ◽

Cited By ~ 8

Author(s):

Hon-Cheong So ◽

Pak C. Sham

Keyword(s):

Complex Traits ◽

Predictive Power ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Polygenic Scores

Invited review: Genome-wide association analysis for quantitative traits in livestock – a selective review of statistical models and experimental designs

Archives Animal Breeding ◽

10.5194/aab-60-335-2017 ◽

2017 ◽

Vol 60 (3) ◽

pp. 335-346 ◽

Cited By ~ 17

Author(s):

Markus Schmid ◽

Jörn Bennewitz

Keyword(s):

Statistical Models ◽

Complex Traits ◽

Quantitative Traits ◽

Association Studies ◽

Real Data ◽

Genome Wide Association ◽

Future Research ◽

Genome Wide Association Studies ◽

Livestock Breeding ◽

Genome Wide

Abstract. Quantitative or complex traits are controlled by many genes and environmental factors. Most traits in livestock breeding are quantitative traits. Mapping genes and causative mutations generating the genetic variance of these traits is still a very active area of research in livestock genetics. Since genome-wide and dense SNP panels are available for most livestock species, genome-wide association studies (GWASs) have become the method of choice in mapping experiments. Different statistical models are used for GWASs. We will review the frequently used single-marker models and additionally describe Bayesian multi-marker models. The importance of nonadditive genetic and genotype-by-environment effects along with GWAS methods to detect them will be briefly discussed. Different mapping populations are used and will also be reviewed. Whenever possible, our own real-data examples are included to illustrate the reviewed methods and designs. Future research directions including post-GWAS strategies are outlined.

Multi-ethnic genome-wide association study of decomposed cardioelectric phenotypes illustrates strategies to identify and characterize evidence of shared genetic effects for complex traits

10.1101/654012 ◽

2019 ◽

Cited By ~ 1

Author(s):

Antoine R. Baldassari ◽

Colleen M. Sitlani ◽

Heather M. Highland ◽

Dan E. Arking ◽

Steve Buyske ◽

...

Keyword(s):

Complex Traits ◽

Genome Wide Association Study ◽

Association Studies ◽

Genetic Effects ◽

Genome Wide Association ◽

P Value ◽

Genome Wide Association Studies ◽

Genetic Loci ◽

Genome Wide ◽

Trait Loci

Abstract Background: Published genome-wide association studies (GWAS) are mainly European-centric, examine a narrow view of phenotypic variation, and infrequently interrogate genetic effects shared across traits. We therefore examined the extent to which a multi-ethnic, combined trait GWAS of phenotypes that map to well-defined biology can enable detection and characterization of complex trait loci. Methods: With 1000 Genomes Phase 3 imputed data in 34,668 participants (15% African American; 3% Chinese American; 51% European American; 30% Hispanic/Latino), we performed covariate-adjusted univariate GWAS of six contiguous electrocardiogram (ECG) traits that decomposed an average heartbeat and two commonly reported composite ECG traits that summed contiguous traits. Combined phenotype testing was performed using the adaptive sum of powered scores test (aSPU). Results: We identified six novel and 87 known ECG trait loci (aSPU p-value < 5E-9). Lead SNP rs3211938 at novel locus CD36 was common in African Americans (minor allele frequency = 10%) and near-monomorphic in European Americans, with effect sizes for the composite trait, QT interval, among the largest reported. Only one novel locus was detected for the composite traits, due to opposite directions of effects across contiguous traits that summed to near-zero. Combined phenotype testing did not detect novel loci unapparent by univariate testing. However, this approach aided locus characterization, particularly when loci harbored multiple independent signals that differed by trait. Conclusions: Despite including one-third as few participants as the largest published GWAS of ECG traits, our study identifies multiple novel ECG genetic loci, emphasizing the importance of ancestral diversity and phenotype measurement in this era of ever-growing GWAS.

Author summary: We leveraged a multiethnic cohort with precise measures of cardioelectric function to identify novel genetic loci affecting this complex, multifaceted phenotype. The success of our approach stresses the importance of phenotypic precision and participant diversity for future locus discovery and characterization efforts, and cautions against compromises made in genome-wide association studies to pursue ever-growing sample sizes.

Joint Genotype- and Ancestry-based Genome-wide Association Studies in Admixed Populations

10.1101/062554 ◽

2016 ◽

Cited By ~ 2

Author(s):

Piotr Szulc ◽

Malgorzata Bogdan ◽

Florian Frommlet ◽

Hua Tang

Keyword(s):

Linkage Disequilibrium ◽

Complex Traits ◽

Association Studies ◽

Real Data ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Single Marker Analysis ◽

Marker Analysis ◽

Genome Wide ◽

Single Marker

Abstract In Genome-Wide Association Studies (GWAS) genetic loci that influence complex traits are localized by inspecting associations between genotypes of genetic markers and the values of the trait of interest. On the other hand, Admixture Mapping, which is performed in the case of populations consisting of a recent mix of two ancestral groups, relies on the ancestry information at each locus (locus-specific ancestry). Recently it has been proposed to jointly model genotype and locus-specific ancestry within the framework of single marker tests. Here we extend this approach for population-based GWAS in the direction of multi-marker models. A modified version of the Bayesian Information Criterion is developed for building a multi-locus model, which accounts for the differential correlation structure due to linkage disequilibrium and admixture linkage disequilibrium. Simulation studies and a real data example illustrate the advantages of this new approach compared to single-marker analysis and modern model selection strategies based on separately analyzing genotype and ancestry data, as well as to single-marker analysis combining genotypic and ancestry information. Depending on the signal strength, our procedure automatically chooses whether genotypic or locus-specific ancestry markers are added to the model. This results in a good compromise between the power to detect causal mutations and the precision of their localization. The proposed method has been implemented in R and is available at http://www.math.uni.wroc.pl/~mbogdan/admixtures/.

Faculty Opinions recommendation of Exploring the predictive power of polygenic scores derived from genome-wide association studies: a study of 10 complex traits.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.727188351.793562988 ◽

2019 ◽

Author(s):

John Nurnberger

Keyword(s):

Complex Traits ◽

Predictive Power ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Polygenic Scores

Unraveling the polygenic architecture of complex traits using blood eQTL metaanalysis

10.1101/447367 ◽

2018 ◽

Cited By ~ 99

Author(s):

Urmo Võsa ◽

Annique Claringbould ◽

Harm-Jan Westra ◽

Marc Jan Bonder ◽

Patrick Deelen ◽

...

Keyword(s):

Complex Traits ◽

Genome Wide Association Study ◽

Association Studies ◽

Complex Trait ◽

Genome Wide Association ◽

eQTL Analysis ◽

Genome Wide Association Studies ◽

Disease Etiology ◽

Genome Wide ◽

Polygenic Scores

Summary While many disease-associated variants have been identified through genome-wide association studies, their downstream molecular consequences remain unclear. To identify these effects, we performed cis- and trans-expression quantitative trait locus (eQTL) analysis in blood from 31,684 individuals through the eQTLGen Consortium. We observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a different genetic architecture compared to disease-associated variants, limiting our ability to use cis-eQTLs to pinpoint causal genes within susceptibility loci. In contrast, trans-eQTLs (detected for 37% of 10,317 studied trait-associated variants) were more informative. Multiple unlinked variants, associated to the same complex trait, often converged on trans-genes that are known to play central roles in disease etiology. We observed the same when ascertaining the effect of polygenic scores calculated for 1,263 genome-wide association study (GWAS) traits. Expression levels of 13% of the studied genes correlated with polygenic scores, and many resulting genes are known to drive these traits.

Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations

10.1101/2020.01.14.905927 ◽

2020 ◽

Cited By ~ 2

Author(s):

Ying Wang ◽

Jing Guo ◽

Guiyan Ni ◽

Jian Yang ◽

Peter M. Visscher ◽

...

Keyword(s):

Complex Traits ◽

Association Studies ◽

African Ancestry ◽

Real Data ◽

European Ancestry ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Polygenic Scores ◽

Causal Variants ◽

The UK

Abstract Polygenic scores (PGS) have been widely used to predict complex traits and risk of diseases using variants identified from genome-wide association studies (GWASs). To date, most GWASs have been conducted in populations of European ancestry, which limits the use of GWAS-derived PGS in non-European populations. Here, we develop a new theory to predict the relative accuracy (RA, relative to the accuracy in populations of the same ancestry as the discovery population) of PGS across ancestries. We used simulations and real data from the UK Biobank to evaluate our results. We found across various simulation scenarios that the RA of PGS based on trait-associated SNPs can be predicted accurately from modelling linkage disequilibrium (LD), minor allele frequencies (MAF), cross-population correlations of SNP effect sizes and heritability. Altogether, we find that LD and MAF differences between ancestries alone explain up to ~70% of the loss of RA using European-based PGS in African ancestry for traits like body mass index and height. Our results suggest that causal variants underlying common genetic variation identified in European ancestry GWASs are mostly shared across continents.

Quick approximation of threshold values for genome-wide association studies

Briefings in Bioinformatics ◽

10.1093/bib/bby082 ◽

2018 ◽

Vol 20 (6) ◽

pp. 2217-2223

Author(s):

Zhiyu Hao ◽

Li Jiang ◽

Jin Gao ◽

Jinhua Ye ◽

Jingli Zhao ◽

...

Keyword(s):

Association Studies ◽

Nuisance Parameter ◽

Real Data ◽

Genetic Effects ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Critical Thresholds ◽

Genome Wide ◽

Chi Squared

Abstract Standard normal statistics, chi-squared statistics, Student’s t statistics and F statistics are used to map quantitative trait nucleotides for both small and large sample sizes. In genome-wide association studies (GWASs) of single-nucleotide polymorphisms (SNPs), the statistical distributions depend on both genetic effects and SNPs but are independent of SNPs under the null hypothesis of no genetic effects. Therefore, hypothesis testing when a nuisance parameter is present only under the alternative was introduced to quickly approximate the critical thresholds of these test statistics for GWASs. When only the statistical probabilities are available for high-throughput SNPs, the approximate critical thresholds can be estimated with chi-squared statistics, formulated by statistical probabilities with a degree of freedom of two. High similarities in the critical thresholds between the accurate and approximate estimations were demonstrated by extensive simulations and real data analysis.
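
The abstract does not spell out how a chi-squared statistic with two degrees of freedom is formed from a probability. One standard identity that fits the description is that, for a p-value that is uniform under the null, -2 ln(p) follows a chi-squared distribution with two degrees of freedom. The sketch below assumes that reading; the function names and the 5e-8 significance level are illustrative choices, not taken from the paper.

```python
import math

def p_to_chi2_2df(p):
    """Chi-squared (2 df) value whose upper-tail probability equals p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -2.0 * math.log(p)

def chi2_2df_to_p(x):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    if x < 0.0:
        raise ValueError("x must be non-negative")
    # For 2 df the survival function has the closed form exp(-x / 2).
    return math.exp(-x / 2.0)

# Illustrative level only: 5e-8 is a commonly used genome-wide threshold.
threshold = p_to_chi2_2df(5e-8)
print(f"chi-squared (2 df) value matching p = 5e-8: {threshold:.3f}")
```

The closed-form tail probability for two degrees of freedom means no statistical library is needed for the conversion in either direction.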
