Controlling for off-target genetic effects using polygenic scores improves the power of genome-wide association studies

ABSTRACT Ongoing increases in the size of human genotype and phenotype collections offer the promise of improved understanding of the genetics of complex diseases. In addition to the biological insights that can be gained from the nature of the variants that contribute to the genetic component of complex trait variability, these data have brought forward the prospect of predicting complex traits and the risk of complex genetic diseases from genotype data. Optimal realization of these objectives requires ongoing methodological developments, designed to identify true trait-associated variants and accurately predict phenotype from genotype. These methods must be computationally efficient, in order to remain tractable in the context of high variant densities and very large sample sizes. Here we show that the power of linear mixed models that are in widespread use for GWAS can be increased significantly by modeling off-target genetic effects using polygenic scores derived from SNPs that are not on the same chromosome as the target SNP. Using simulated and real data we found that this can result in a substantial increase in the number of variants passing genome-wide significance thresholds. This increase in power to detect trait-associated variants also translates into an increase in the accuracy with which the resulting polygenic score predicts the phenotype from genotype data. Our results suggest that advances in methods for phenotype prediction can be exploited to improve the control of off-target genetic effects, leading to more accurate GWAS results and further improvements in phenotype prediction.
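As a rough illustration of the idea in this abstract, the sketch below tests each SNP while adjusting for a polygenic score built only from SNPs on other chromosomes. It is not the authors' implementation. The ridge-regression score, the out-of-fold fitting, the function names (`out_of_fold_pgs`, `loco_pgs_scan`), the parameter values and the simulated data are all assumptions made for this example, and an ordinary linear model stands in for the linear mixed model.

```python
import numpy as np


def out_of_fold_pgs(X, y, ridge, n_folds, rng):
    """Hypothetical helper: ridge-regression polygenic score, predicted out of fold
    so that each individual's score is not fitted to that individual's phenotype."""
    n, m = X.shape
    folds = rng.permutation(n) % n_folds
    pgs = np.zeros(n)
    for f in range(n_folds):
        test = folds == f
        X_train, y_train = X[~test], y[~test]
        # Ridge weights: (X'X + lambda * I)^-1 X'y
        weights = np.linalg.solve(X_train.T @ X_train + ridge * np.eye(m),
                                  X_train.T @ y_train)
        pgs[test] = X[test] @ weights
    return pgs


def loco_pgs_scan(G, chrom, y, ridge=10.0, n_folds=5, seed=0):
    """Return one t-statistic per SNP from the model y ~ intercept + PGS + SNP,
    where the PGS uses only SNPs that are not on the target SNP's chromosome."""
    rng = np.random.default_rng(seed)
    G = (G - G.mean(axis=0)) / G.std(axis=0)  # assumes no monomorphic SNPs
    y = y - y.mean()
    n, m = G.shape
    t_stats = np.empty(m)
    for c in np.unique(chrom):
        on = chrom == c
        pgs = out_of_fold_pgs(G[:, ~on], y, ridge, n_folds, rng)
        pgs = pgs - pgs.mean()
        denom = pgs @ pgs
        # Project the PGS out of the phenotype and out of every target SNP
        y_res = y - pgs * (pgs @ y) / denom
        X_res = G[:, on] - np.outer(pgs, pgs @ G[:, on]) / denom
        sxx = (X_res ** 2).sum(axis=0)
        beta = (X_res.T @ y_res) / sxx
        rss = ((y_res[:, None] - X_res * beta) ** 2).sum(axis=0)
        # n - 3 residual degrees of freedom: intercept, PGS and the SNP itself
        t_stats[on] = beta / np.sqrt(rss / (n - 3) / sxx)
    return t_stats


# Simulated example (all values are made up for illustration)
sim = np.random.default_rng(1)
genotypes = sim.binomial(2, 0.3, size=(300, 60)).astype(float)
chromosomes = np.repeat([1, 2, 3], 20)
background = 0.2 * genotypes[:, 20:].sum(axis=1)  # small effects on chromosomes 2 and 3
phenotype = genotypes[:, 0] + background + sim.normal(size=300)
t_stats = loco_pgs_scan(genotypes, chromosomes, phenotype)
```

Excluding the target SNP's chromosome keeps the score from absorbing the signal of the SNP under test (or of SNPs in linkage disequilibrium with it), while the score still accounts for phenotypic variance explained elsewhere in the genome.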

Controlling for background genetic effects using polygenic scores improves the power of genome-wide association studies

Scientific Reports ◽

10.1038/s41598-021-99031-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Declan Bennett ◽

Donal O’Shea ◽

John Ferguson ◽

Derek Morris ◽

Cathal Seoighe

Keyword(s):

Complex Traits ◽

Association Studies ◽

Real Data ◽

Genetic Effects ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genotype Data ◽

Phenotype Prediction ◽

Genome Wide ◽

Polygenic Scores

Abstract Ongoing increases in the size of human genotype and phenotype collections offer the promise of improved understanding of the genetics of complex diseases. In addition to the biological insights that can be gained from the nature of the variants that contribute to the genetic component of complex trait variability, these data bring forward the prospect of predicting complex traits and the risk of complex genetic diseases from genotype data. Here we show that advances in phenotype prediction can be applied to improve the power of genome-wide association studies. We demonstrate a simple and efficient method to model genetic background effects using polygenic scores derived from SNPs that are not on the same chromosome as the target SNP. Using simulated and real data we found that this can result in a substantial increase in the number of variants passing genome-wide significance thresholds. This increase in power to detect trait-associated variants also translates into an increase in the accuracy with which the resulting polygenic score predicts the phenotype from genotype data. Our results suggest that advances in methods for phenotype prediction can be exploited to improve the control of background genetic effects, leading to more accurate GWAS results and further improvements in phenotype prediction.

Controlling for Background Genetic Effects Using Polygenic Scores Improves the Power of Genome-wide Association Studies

10.21203/rs.3.rs-873301/v1 ◽

2021 ◽

Author(s):

Declan Bennett ◽

Dónal O'Shea ◽

John Ferguson ◽

Derek Morris ◽

Cathal Seoighe

Keyword(s):

Complex Traits ◽

Association Studies ◽

Real Data ◽

Genetic Effects ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genotype Data ◽

Phenotype Prediction ◽

Genome Wide ◽

Polygenic Scores

Abstract Ongoing increases in the size of human genotype and phenotype collections offer the promise of improved understanding of the genetics of complex diseases. In addition to the biological insights that can be gained from the nature of the variants that contribute to the genetic component of complex trait variability, these data bring forward the prospect of predicting complex traits and the risk of complex genetic diseases from genotype data. Here we show that advances in phenotype prediction can be applied to improve the power of genome-wide association studies. We demonstrate a simple and efficient method to model genetic background effects using polygenic scores derived from SNPs that are not on the same chromosome as the target SNP. Using simulated and real data we found that this can result in a substantial increase in the number of variants passing genome-wide significance thresholds. This increase in power to detect trait-associated variants also translates into an increase in the accuracy with which the resulting polygenic score predicts the phenotype from genotype data. Our results suggest that advances in methods for phenotype prediction can be exploited to improve the control of background genetic effects, leading to more accurate GWAS results and further improvements in phenotype prediction.

Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations

10.1101/2020.01.14.905927 ◽

2020 ◽

Cited By ~ 2

Author(s):

Ying Wang ◽

Jing Guo ◽

Guiyan Ni ◽

Jian Yang ◽

Peter M. Visscher ◽

...

Keyword(s):

Complex Traits ◽

Association Studies ◽

African Ancestry ◽

Real Data ◽

European Ancestry ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Polygenic Scores ◽

Causal Variants ◽

The UK

Abstract Polygenic scores (PGS) have been widely used to predict complex traits and risk of diseases using variants identified from genome-wide association studies (GWASs). To date, most GWASs have been conducted in populations of European ancestry, which limits the use of GWAS-derived PGS in non-European populations. Here, we develop a new theory to predict the relative accuracy (RA, relative to the accuracy in populations of the same ancestry as the discovery population) of PGS across ancestries. We used simulations and real data from the UK Biobank to evaluate our results. We found across various simulation scenarios that the RA of PGS based on trait-associated SNPs can be predicted accurately by modelling linkage disequilibrium (LD), minor allele frequencies (MAF), cross-population correlations of SNP effect sizes and heritability. Altogether, we find that LD and MAF differences between ancestries alone explain up to ~70% of the loss of RA when European-based PGS are used in individuals of African ancestry, for traits like body mass index and height. Our results suggest that causal variants underlying common genetic variation identified in European ancestry GWASs are mostly shared across continents.
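The relative accuracy defined in this abstract can be computed empirically as a ratio of prediction accuracies in two populations. The sketch below is only one way to do this and assumes that accuracy is measured as the squared correlation between the score and the phenotype, which the abstract does not specify. The function names and the toy numbers are hypothetical.

```python
import numpy as np


def prediction_r2(pgs, phenotype):
    """Squared Pearson correlation between a polygenic score and a phenotype
    (assumed accuracy measure for this sketch)."""
    return np.corrcoef(pgs, phenotype)[0, 1] ** 2


def relative_accuracy(pgs_target, y_target, pgs_reference, y_reference):
    """Accuracy in the target ancestry divided by the accuracy in a sample of the
    same ancestry as the discovery population."""
    return prediction_r2(pgs_target, y_target) / prediction_r2(pgs_reference, y_reference)


# Toy numbers: perfect prediction in the reference sample, r = 0.8 in the target sample
ra = relative_accuracy(
    np.array([1.0, 2.0, 3.0, 4.0]), np.array([1.0, 3.0, 2.0, 4.0]),
    np.array([1.0, 2.0, 3.0, 4.0]), np.array([1.0, 2.0, 3.0, 4.0]),
)
```

In this toy case the reference accuracy is 1 and the target accuracy is 0.8 squared, so the ratio is 0.64, meaning 36% of the accuracy is lost in the target sample.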

Exploring the predictive power of polygenic scores derived from genome-wide association studies: a study of 10 complex traits

Bioinformatics ◽

10.1093/bioinformatics/btw745 ◽

2017 ◽

pp. btw745 ◽

Cited By ~ 8

Author(s):

Hon-Cheong So ◽

Pak C. Sham

Keyword(s):

Complex Traits ◽

Predictive Power ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Polygenic Scores

Polygenic Prediction of Complex Traits with Iterative Screen Regression Models

10.1101/2020.11.29.402180 ◽

2020 ◽

Author(s):

Meng Luo ◽

Shiliang Gu

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Regression Models ◽

Association Studies ◽

Prediction Methods ◽

Genome Wide Association Studies ◽

Genotype Data ◽

Genetic Prediction ◽

Genome Wide ◽

Genome Prediction

Abstract Although genome-wide association studies have successfully identified thousands of markers associated with various complex traits and diseases, our ability to predict such phenotypes remains limited. A perhaps overlooked explanation lies in the limitations of the genetic models and statistical techniques commonly used in association studies. However, using individual genotype data to perform accurate genetic prediction of complex traits can promote genomic selection in animal and plant breeding and can lead to the development of personalized medicine in humans. Because most complex traits have a polygenic architecture, accurate genetic prediction often requires modeling genetic variants together via polygenic methods. Here, we apply our proposed polygenic method, which we refer to as the iterative screen regression (ISR) model, to genome prediction. We compared ISR with several commonly used prediction methods in simulations. We further applied ISR to predicting 15 traits across five species: cattle, rice, wheat, maize, and mice. The results of the study indicate that the ISR method performs better than several commonly used polygenic methods and is more stable.

Estimating genetic nurture with summary statistics of multigenerational genome-wide association studies

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2023184118 ◽

2021 ◽

Vol 118 (25) ◽

pp. e2023184118

Author(s):

Yuchang Wu ◽

Xiaoyuan Zhong ◽

Yunong Lin ◽

Zijie Zhao ◽

Jiawen Chen ◽

...

Keyword(s):

Complex Traits ◽

Association Studies ◽

Genetic Correlations ◽

Genetic Effects ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Phenotypic Data ◽

Individual Level ◽

Indirect Genetic Effects ◽

Genome Wide

Marginal effect estimates in genome-wide association studies (GWAS) are mixtures of direct and indirect genetic effects. Existing methods to dissect these effects require family-based, individual-level genetic, and phenotypic data with large samples, which is difficult to obtain in practice. Here, we propose a statistical framework to estimate direct and indirect genetic effects using summary statistics from GWAS conducted on own and offspring phenotypes. Applied to birth weight, our method showed nearly identical results with those obtained using individual-level data. We also decomposed direct and indirect genetic effects of educational attainment (EA), which showed distinct patterns of genetic correlations with 45 complex traits. The known genetic correlations between EA and higher height, lower body mass index, less-active smoking behavior, and better health outcomes were mostly explained by the indirect genetic component of EA. In contrast, the consistently identified genetic correlation of autism spectrum disorder (ASD) with higher EA resides in the direct genetic component. A polygenic transmission disequilibrium test showed a significant overtransmission of the direct component of EA from healthy parents to ASD probands. Taken together, we demonstrate that traditional GWAS approaches, in conjunction with offspring phenotypic data collection in existing cohorts, could greatly benefit studies on genetic nurture and shed important light on the interpretation of genetic associations for human complex traits.

Invited review: Genome-wide association analysis for quantitative traits in livestock – a selective review of statistical models and experimental designs

Archives Animal Breeding ◽

10.5194/aab-60-335-2017 ◽

2017 ◽

Vol 60 (3) ◽

pp. 335-346 ◽

Cited By ~ 17

Author(s):

Markus Schmid ◽

Jörn Bennewitz

Keyword(s):

Statistical Models ◽

Complex Traits ◽

Quantitative Traits ◽

Association Studies ◽

Real Data ◽

Genome Wide Association ◽

Future Research ◽

Genome Wide Association Studies ◽

Livestock Breeding ◽

Genome Wide

Abstract. Quantitative or complex traits are controlled by many genes and environmental factors. Most traits in livestock breeding are quantitative traits. Mapping genes and causative mutations generating the genetic variance of these traits is still a very active area of research in livestock genetics. Since genome-wide and dense SNP panels are available for most livestock species, genome-wide association studies (GWASs) have become the method of choice in mapping experiments. Different statistical models are used for GWASs. We will review the frequently used single-marker models and additionally describe Bayesian multi-marker models. The importance of nonadditive genetic and genotype-by-environment effects along with GWAS methods to detect them will be briefly discussed. Different mapping populations are used and will also be reviewed. Whenever possible, our own real-data examples are included to illustrate the reviewed methods and designs. Future research directions including post-GWAS strategies are outlined.

Multi-ethnic genome-wide association study of decomposed cardioelectric phenotypes illustrates strategies to identify and characterize evidence of shared genetic effects for complex traits

10.1101/654012 ◽

2019 ◽

Cited By ~ 1

Author(s):

Antoine R. Baldassari ◽

Colleen M. Sitlani ◽

Heather M. Highland ◽

Dan E. Arking ◽

Steve Buyske ◽

...

Keyword(s):

Complex Traits ◽

Genome Wide Association Study ◽

Association Studies ◽

Genetic Effects ◽

Genome Wide Association ◽

P Value ◽

Genome Wide Association Studies ◽

Genetic Loci ◽

Genome Wide ◽

Trait Loci

ABSTRACT

Background: Published genome-wide association studies (GWAS) are mainly European-centric, examine a narrow view of phenotypic variation, and infrequently interrogate genetic effects shared across traits. We therefore examined the extent to which a multi-ethnic, combined trait GWAS of phenotypes that map to well-defined biology can enable detection and characterization of complex trait loci.

Methods: With 1000 Genomes Phase 3 imputed data in 34,668 participants (15% African American; 3% Chinese American; 51% European American; 30% Hispanic/Latino), we performed covariate-adjusted univariate GWAS of six contiguous electrocardiogram (ECG) traits that decomposed an average heartbeat and two commonly reported composite ECG traits that summed contiguous traits. Combined phenotype testing was performed using the adaptive sum of powered scores test (aSPU).

Results: We identified six novel and 87 known ECG trait loci (aSPU p-value < 5E-9). Lead SNP rs3211938 at novel locus CD36 was common in African Americans (minor allele frequency = 10%) and near-monomorphic in European Americans, with effect sizes for the composite trait, QT interval, among the largest reported. Only one novel locus was detected for the composite traits, due to opposite directions of effects across contiguous traits that summed to near-zero. Combined phenotype testing did not detect novel loci unapparent by univariate testing. However, this approach aided locus characterization, particularly when loci harbored multiple independent signals that differed by trait.

Conclusions: Despite including only one-third as many participants as the largest published GWAS of ECG traits, our study identifies multiple novel ECG genetic loci, emphasizing the importance of ancestral diversity and phenotype measurement in this era of ever-growing GWAS.

AUTHOR SUMMARY We leveraged a multiethnic cohort with precise measures of cardioelectric function to identify novel genetic loci affecting this complex, multifaceted phenotype. The success of our approach stresses the importance of phenotypic precision and participant diversity for future locus discovery and characterization efforts, and cautions against compromises made in genome-wide association studies to pursue ever-growing sample sizes.

Joint Genotype- and Ancestry-based Genome-wide Association Studies in Admixed Populations

10.1101/062554 ◽

2016 ◽

Cited By ~ 2

Author(s):

Piotr Szulc ◽

Malgorzata Bogdan ◽

Florian Frommlet ◽

Hua Tang

Keyword(s):

Linkage Disequilibrium ◽

Complex Traits ◽

Association Studies ◽

Real Data ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Single Marker Analysis ◽

Marker Analysis ◽

Genome Wide ◽

Single Marker

Abstract In Genome-Wide Association Studies (GWAS), genetic loci that influence complex traits are localized by inspecting associations between genotypes of genetic markers and the values of the trait of interest. In contrast, Admixture Mapping, which is applied to populations formed by a recent mix of two ancestral groups, relies on the ancestry information at each locus (locus-specific ancestry). Recently it has been proposed to jointly model genotype and locus-specific ancestry within the framework of single marker tests. Here we extend this approach for population-based GWAS in the direction of multi-marker models. A modified version of the Bayesian Information Criterion is developed for building a multi-locus model, which accounts for the differential correlation structure due to linkage disequilibrium and admixture linkage disequilibrium. Simulation studies and a real data example illustrate the advantages of this new approach compared to single-marker analysis and modern model selection strategies based on separately analyzing genotype and ancestry data, as well as to single-marker analysis combining genotypic and ancestry information. Depending on the signal strength, our procedure automatically chooses whether genotypic or locus-specific ancestry markers are added to the model. This results in a good compromise between the power to detect causal mutations and the precision of their localization. The proposed method has been implemented in R and is available at http://www.math.uni.wroc.pl/~mbogdan/admixtures/.

Faculty Opinions recommendation of Exploring the predictive power of polygenic scores derived from genome-wide association studies: a study of 10 complex traits.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.727188351.793562988 ◽

2019 ◽

Author(s):

John Nurnberger

Keyword(s):

Complex Traits ◽

Predictive Power ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Polygenic Scores
