A model and test for coordinated polygenic epistasis in complex traits

Interactions between genetic variants (epistasis) are pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work, we develop a model for structured polygenic epistasis, called coordinated epistasis (CE), and prove that several recent theories of genetic architecture fall under the formal umbrella of CE. Unlike standard epistasis models that assume epistasis and main effects are independent, CE captures systematic correlations between epistasis and main effects that result from pathway-level epistasis, on balance skewing the penetrance of genetic effects. To test for the existence of CE, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CE in 18 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue–trait pairs. Overall, CE is a dimension of genetic architecture that can capture structured, systemic forms of epistasis in complex human traits.
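The abstract does not spell out the mechanics of the EO test. One reading of the name, assumed here and not confirmed by the text above, is that polygenic scores are built separately from even and odd chromosomes and their product is tested as an interaction term in a regression on the phenotype. The Python sketch below illustrates only that assumed regression; the function names and the noise-free toy data are hypothetical, and a real test would also need a standard error and p-value for the interaction coefficient.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_even_odd_interaction(prs_even, prs_odd, phenotype):
    """Least-squares fit of y ~ 1 + PRS_even + PRS_odd + PRS_even * PRS_odd.

    Returns [intercept, even effect, odd effect, interaction effect].
    ASSUMPTION: this regression form is an illustration, not the paper's code.
    """
    rows = [[1.0, e, o, e * o] for e, o in zip(prs_even, prs_odd)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, phenotype)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free toy data on a small grid of score values.
grid = [(e, o) for e in (-1.0, 0.0, 1.0, 2.0) for o in (-2.0, 0.0, 1.0, 3.0)]
prs_even = [e for e, _ in grid]
prs_odd = [o for _, o in grid]
phenotype = [1.0 + 2.0 * e + 3.0 * o + 0.5 * e * o for e, o in grid]

coefficients = fit_even_odd_interaction(prs_even, prs_odd, phenotype)
interaction = coefficients[3]  # a nonzero value indicates score-by-score interaction
```

Under this reading, an interaction coefficient near zero is what independent epistasis and main effects would produce, while a consistently signed coefficient is the kind of signal the abstract attributes to CE.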

Coordinated Interaction: A model and test for globally signed epistasis in complex traits

10.1101/2020.02.14.949883 ◽

2020 ◽

Cited By ~ 1

Author(s):

Brooke Sheppard ◽

Nadav Rappoport ◽

Po-Ru Loh ◽

Stephan J. Sanders ◽

Andy Dahl ◽

...

Keyword(s):

Complex Traits ◽

Genetic Architecture ◽

Model Systems ◽

Disease Dynamics ◽

UK Biobank ◽

Biological Models ◽

Model Sets ◽

The UK ◽

Complex Human Traits ◽

Main Effects

Abstract: Interactions between genetic variants (epistasis) are pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work we develop a model for structured polygenic epistasis, called Coordinated Interaction (CI), and prove that several recent theories of genetic architecture fall under the formal umbrella of CI. Unlike standard polygenic epistasis models that assume interaction and main effects are independent, in the CI model, sets of SNPs broadly interact positively or negatively, on balance skewing the penetrance of main genetic effects. To test for the existence of CI, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CI in 14 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue-trait pairs. Overall, CI is a new dimension of genetic architecture that can capture structured, systemic interactions in complex human traits.

Detection and quantification of inbreeding depression for complex traits from SNP data

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1621096114 ◽

2017 ◽

Vol 114 (32) ◽

pp. 8602-8607 ◽

Cited By ~ 15

Author(s):

Loic Yengo ◽

Zhihong Zhu ◽

Naomi R. Wray ◽

Bruce S. Weir ◽

Jian Yang ◽

...

Keyword(s):

Inbreeding Depression ◽

Complex Traits ◽

Genetic Architecture ◽

Handgrip Strength ◽

UK Biobank ◽

SNP Data ◽

Causal Variants ◽

Auditory Acuity ◽

The UK ◽

Detection And Quantification

Quantifying the effects of inbreeding is critical to characterizing the genetic architecture of complex traits. This study highlights through theory and simulations the strengths and shortcomings of three SNP-based inbreeding measures commonly used to estimate inbreeding depression (ID). We demonstrate that heterogeneity in linkage disequilibrium (LD) between causal variants and SNPs biases ID estimates, and we develop an approach to correct this bias using LD and minor allele frequency stratified inference (LDMS). We quantified ID in 25 traits measured in ∼140,000 participants of the UK Biobank, using LDMS, and confirmed previously published ID for 4 traits. We find unique evidence of ID for handgrip strength, waist/hip ratio, and visual and auditory acuity (ID between −2.3 and −5.2 phenotypic SDs for complete inbreeding; P<0.001). Our results illustrate that a careful choice of the measure of inbreeding combined with LDMS stratification improves both detection and quantification of ID using SNP data.

Genetic Architecture of Complex Traits and Disease Risk Predictors

10.1101/2020.02.12.946608 ◽

2020 ◽

Cited By ~ 3

Author(s):

Soke Yuen Yong ◽

Timothy G. Raben ◽

Louis Lello ◽

Stephen D.H. Hsu

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Genetic Architecture ◽

Disease Risk ◽

Sequencing Data ◽

Exome Data ◽

Coding Regions ◽

Disease Risks ◽

Risk Predictors ◽

Complex Human Traits

Abstract: Genomic prediction of complex human traits (e.g., height, cognitive ability, bone density) and disease risks (e.g., breast cancer, diabetes, heart disease, atrial fibrillation) has advanced considerably in recent years. Predictors have been constructed using penalized algorithms that favor sparsity, i.e., that use as few genetic variants as possible. We analyze the specific genetic variants (SNPs) utilized in these predictors, whose number can vary from dozens to as many as thirty thousand. We find that the fraction of SNPs in or near genic regions varies widely by phenotype. For the majority of disease conditions studied, a large amount of the variance is accounted for by SNPs outside of coding regions. The state of these SNPs cannot be determined from exome-sequencing data. This suggests that exome data alone will miss much of the heritability for these traits; that is, existing PRS cannot be computed from exome data alone. We also study the fraction of SNPs and of variance that is in common between pairs of predictors. The DNA regions used in disease risk predictors so far constructed seem to be largely disjoint (with a few interesting exceptions), suggesting that individual genetic disease risks are largely uncorrelated. It seems possible in theory for an individual to be a low-risk outlier in all conditions simultaneously.

Common genetic variants and health outcomes appear geographically structured in the UK Biobank sample: Old concerns returning and their implications

10.1101/294876 ◽

2018 ◽

Cited By ~ 12

Author(s):

Simon Haworth ◽

Ruth Mitchell ◽

Laura Corbin ◽

Kaitlin H Wade ◽

Tom Dudding ◽

...

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Genetic Data ◽

Population Based ◽

Risk Scores ◽

Phenotypic Variance ◽

UK Biobank ◽

Common Genetic Variants ◽

The UK

Introductory paragraph: The inclusion of genetic data in large studies has enabled the discovery of genetic contributions to complex traits and their use in applied analyses, including those using genetic risk scores (GRS) for the prediction of phenotypic variance. If genotypes show structure by location and coincident structure exists for the trait of interest, analyses can be biased. Having illustrated structure in an apparently homogeneous collection, we aimed to a) test for geographical stratification of genotypes in UK Biobank and b) assess whether stratification might induce bias in genetic association analysis. We found that single genetic variants are associated with birth location within UK Biobank and that geographic structure in genetic data could not be accounted for using routine adjustment for study centre and principal components (PCs) derived from genotype data. We found that GRS for complex traits do appear geographically structured and analysis using GRS can yield biased associations. We discuss the likely origins of these observations and potential implications for analysis within large-scale population-based genetic studies.

Accurate estimation of SNP-heritability from biobank-scale data irrespective of genetic architecture

10.1101/526855 ◽

2019 ◽

Cited By ~ 3

Author(s):

Kangcheng Hou ◽

Kathryn S. Burch ◽

Arunabha Majumdar ◽

Huwenbo Shi ◽

Nicholas Mancuso ◽

...

Keyword(s):

Complex Traits ◽

Genetic Architecture ◽

Accurate Estimation ◽

Phenotypic Variance ◽

UK Biobank ◽

Genome Wide ◽

Wide Range ◽

Fundamental Quantity ◽

The UK ◽

Scale Data

Abstract: The proportion of phenotypic variance attributable to the additive effects of a given set of genotyped SNPs (i.e. SNP-heritability) is a fundamental quantity in the study of complex traits. Recent works have shown that existing methods to estimate genome-wide SNP-heritability often yield biases when their assumptions are violated. While various approaches have been proposed to account for frequency- and LD-dependent genetic architectures, it remains unclear which estimates of SNP-heritability reported in the literature are reliable. Here we show that genome-wide SNP-heritability can be accurately estimated from biobank-scale data irrespective of the underlying genetic architecture of the trait, without specifying a heritability model or partitioning SNPs by minor allele frequency and/or LD. We use theoretical justifications coupled with extensive simulations starting from real genotypes from the UK Biobank (N=337K) to show that, unlike existing methods, our closed-form estimator for SNP-heritability is highly accurate across a wide range of architectures. We provide estimates of SNP-heritability for 22 complex traits and diseases in the UK Biobank and show that, consistent with our results in simulations, existing biobank-scale methods yield estimates up to 30% different from our theoretically-justified approach.

Probabilistic inference of the genetic architecture of functional enrichment of complex traits

10.1101/2020.09.04.20188433 ◽

2020 ◽

Author(s):

Marion Patxot ◽

Daniel Trejo Banos ◽

Athanasios Kousathanas ◽

Etienne J Orliac ◽

Sven E Ojavee ◽

...

Keyword(s):

Effect Size ◽

Complex Traits ◽

Genetic Architecture ◽

Penalized Regression ◽

Functional Enrichment ◽

Effect Sizes ◽

UK Biobank ◽

Regulatory Regions ◽

Coding Regions ◽

The UK

Due to the complexity of linkage disequilibrium (LD) and gene regulation, understanding the genetic basis of common complex traits remains a major challenge. We develop a Bayesian model (BayesRR-RC) implemented in a hybrid-parallel algorithm that scales to whole-genome sequence data on many hundreds of thousands of individuals, taking 22 seconds per iteration to estimate the inclusion probabilities and effect sizes of 8.4 million markers and 78 SNP-heritability parameters in the UK Biobank. Unlike naive penalized regression or mixed-linear model approaches, BayesRR-RC accurately estimates annotation-specific genetic architecture, determines the underlying joint effect size distribution and provides a probabilistic determination of association within marker groups in a single step. Of the genetic variation captured for height, body mass index, cardiovascular disease, and type-2 diabetes in the UK Biobank, only ≤ 10% is attributable to proximal regulatory regions within 10kb upstream of genes, while 12-25% is attributed to coding regions, up to 40% to intronic regions, and 22-28% to distal 10-500kb upstream regions. ≥60% of the variance contributed by these exonic, intronic and distal 10-500kb regions is underlain by many thousands of common variants, each with larger average effect sizes compared to the rest of the genome. We also find differences in the relationship between effect size and heterozygosity across annotation groups and across traits. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance for just these four traits. In the Estonian Biobank, we show improved prediction accuracy over other approaches and generate a posterior predictive distribution for each individual.

Coffee Consumption and Cardiovascular Diseases: A Mendelian Randomization Study

Nutrients ◽

10.3390/nu13072218 ◽

2021 ◽

Vol 13 (7) ◽

pp. 2218

Author(s):

Shuai Yuan ◽

Paul Carter ◽

Amy M. Mason ◽

Stephen Burgess ◽

Susanna C. Larsson

Keyword(s):

Cardiovascular Disease ◽

Cardiovascular Diseases ◽

Intracerebral Hemorrhage ◽

Genetic Variants ◽

Odds Ratio ◽

Observational Studies ◽

Mendelian Randomization ◽

Coffee Consumption ◽

UK Biobank ◽

The UK

Coffee consumption has been linked to a lower risk of cardiovascular disease in observational studies, but whether the associations are causal is not known. We conducted a Mendelian randomization investigation to assess the potential causal role of coffee consumption in cardiovascular disease. Twelve independent genetic variants were used to proxy coffee consumption. Summary-level data for the relations between the 12 genetic variants and cardiovascular diseases were taken from the UK Biobank with up to 35,979 cases and the FinnGen consortium with up to 17,325 cases. Genetic predisposition to higher coffee consumption was not associated with any of the 15 studied cardiovascular outcomes in univariable MR analysis. The odds ratio per 50% increase in genetically predicted coffee consumption ranged from 0.97 (95% confidence interval (CI), 0.63, 1.50) for intracerebral hemorrhage to 1.26 (95% CI, 1.00, 1.58) for deep vein thrombosis in the UK Biobank and from 0.86 (95% CI, 0.50, 1.49) for subarachnoid hemorrhage to 1.34 (95% CI, 0.81, 2.22) for intracerebral hemorrhage in FinnGen. The null findings remained in multivariable Mendelian randomization analyses adjusted for genetically predicted body mass index and smoking initiation, except for a suggestive positive association for intracerebral hemorrhage (odds ratio 1.91; 95% CI, 1.03, 3.54) in FinnGen. This Mendelian randomization study showed limited evidence that coffee consumption affects the risk of developing cardiovascular disease, suggesting that previous observational studies may have been confounded.
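The abstract above does not say which estimator produced its odds ratios. A common choice for summary-level Mendelian randomization is the fixed-effect inverse-variance weighted (IVW) estimate, sketched below in Python as an assumption. The function name and every number are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted Mendelian randomization estimate.

    beta_exposure: per-variant associations with the exposure
    beta_outcome:  per-variant associations with the outcome (log odds for disease)
    se_outcome:    standard errors of the outcome associations
    Returns (causal estimate, standard error).
    """
    numerator = sum(
        bx * by / (se * se)
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three variants (illustration only).
beta_x = [0.1, 0.2, 0.4]
beta_y = [0.05, 0.1, 0.2]
se_y = [0.1, 0.1, 0.1]

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)

# On the log-odds scale, exponentiating gives an odds ratio per unit of exposure
# together with a 95% confidence interval.
odds_ratio = math.exp(estimate)
ci_low = math.exp(estimate - 1.96 * std_error)
ci_high = math.exp(estimate + 1.96 * std_error)
```

The estimate is a weighted average of the per-variant ratios beta_outcome / beta_exposure, so variants with stronger exposure associations and more precise outcome associations contribute more.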

Predicting the effect of statins on cancer risk using genetic variants from a Mendelian randomization study in the UK Biobank

eLife ◽

10.7554/elife.57191 ◽

2020 ◽

Vol 9 ◽

Author(s):

Paul Carter ◽

Mathew Vithayathil ◽

Siddhartha Kar ◽

Rahul Potluri ◽

Amy M Mason ◽

...

Keyword(s):

Standard Deviation ◽

Cancer Risk ◽

Genetic Variants ◽

LDL Cholesterol ◽

Human Genetics ◽

Density Lipoprotein ◽

Lipid Lowering ◽

UK Biobank ◽

Standard Deviation Increase ◽

The UK

Laboratory studies have suggested oncogenic roles of lipids, as well as anticarcinogenic effects of statins. Here we assess the potential effect of statin therapy on cancer risk using evidence from human genetics. We obtained associations of lipid-related genetic variants with the risk of overall and 22 site-specific cancers for 367,703 individuals in the UK Biobank. In total, 75,037 individuals had a cancer event. Variants in the HMGCR gene region, which represent proxies for statin treatment, were associated with overall cancer risk (odds ratio [OR] per one standard deviation decrease in low-density lipoprotein [LDL] cholesterol 0.76, 95% confidence interval [CI] 0.65–0.88, p=0.0003) but variants in gene regions representing alternative lipid-lowering treatment targets (PCSK9, LDLR, NPC1L1, APOC3, LPL) were not. Genetically predicted LDL-cholesterol was not associated with overall cancer risk (OR per standard deviation increase 1.01, 95% CI 0.98–1.05, p=0.50). Our results predict that statins reduce cancer risk but other lipid-lowering treatments do not. This suggests that statins reduce cancer risk through a cholesterol independent pathway.

Analysis of genetic dominance in the UK Biobank

10.1101/2021.08.15.456387 ◽

2021 ◽

Author(s):

Duncan S Palmer ◽

Wei Zhou ◽

Liam Abbott ◽

Nik Baya ◽

Claire Churchhouse ◽

...

Keyword(s):

Complex Traits ◽

Multiple Testing ◽

Model Organisms ◽

Systematic Evaluation ◽

Hair Color ◽

Phenotypic Variance ◽

Additive Effects ◽

UK Biobank ◽

Genome Wide ◽

The UK

In classical statistical genetic theory, a dominance effect is defined as the deviation from a purely additive genetic effect for a biallelic variant. Dominance effects are well documented in model organisms. However, evidence in humans is limited to a handful of traits, particularly those with strong single locus effects such as hair color. We carried out the largest systematic evaluation of dominance effects on phenotypic variance in the UK Biobank. We curated and tested over 1,000 phenotypes for dominance effects through GWAS scans, identifying 175 loci at genome-wide significance correcting for multiple testing (P < 4.7 × 10⁻¹¹). Power to detect non-additive loci is much lower than power to detect additive effects for complex traits: based on the relative effect sizes at genome-wide significant additive loci, we estimate a factor of 20-30 increase in sample size will be necessary to capture clear evidence of dominance similar to those currently observed for additive effects. However, these localised dominance hits do not extend to a significant aggregate contribution to phenotypic variance genome-wide. By deriving a version of LD-score regression to detect dominance effects tagged by common variation genome-wide (minor allele frequency > 0.05), we found no strong evidence of a contribution to phenotypic variance when accounting for multiple testing. Across the 267 continuous and 793 binary traits the median contribution was 5.73 × 10⁻⁴, with unbiased point estimates ranging from -0.261 to 0.131. Finally, we introduce dominance fine-mapping to explore whether the more rapid decay of dominance LD can be leveraged to find causal variants. These results provide the most comprehensive assessment of dominance trait variation in humans to date.
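The definition in the first sentence of this abstract can be made concrete with the textbook genotypic-value parameterization, in which the dominance deviation is the gap between the heterozygote mean and the midpoint of the two homozygote means. The Python sketch below uses that parameterization as an assumption; it may differ from the encoding used in the paper's GWAS scans, and the function name and trait values are hypothetical.

```python
def additive_and_dominance(mean_hom_ref, mean_het, mean_hom_alt):
    """Split three genotype means into an additive effect and a dominance deviation.

    mean_hom_ref: mean trait value for genotype with 0 copies of the alternate allele
    mean_het:     mean trait value for genotype with 1 copy
    mean_hom_alt: mean trait value for genotype with 2 copies
    """
    midpoint = (mean_hom_ref + mean_hom_alt) / 2.0
    additive = (mean_hom_alt - mean_hom_ref) / 2.0  # per-allele effect
    dominance = mean_het - midpoint  # zero under pure additivity
    return additive, dominance


# Hypothetical genotype means (illustration only).
purely_additive = additive_and_dominance(10.0, 11.0, 12.0)
complete_dominance = additive_and_dominance(10.0, 12.0, 12.0)
```

In the first call the heterozygote sits exactly at the midpoint, so the dominance deviation is zero; in the second the heterozygote matches the alternate homozygote, so the deviation equals the additive effect.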

Pairwise genetic interactions modulate lipid plasma levels and cellular uptake

10.1101/2020.10.29.360818 ◽

2020 ◽

Author(s):

Magdalena Zimon ◽

Yunfeng Huang ◽

Anthi Trasta ◽

Jimmy Z. Liu ◽

Chia-Yen Chen ◽

...

Keyword(s):

Complex Traits ◽

Drug Target ◽

Human Genetics ◽

Large Population ◽

Genetic Interactions ◽

Model Systems ◽

Lipid Lowering ◽

Gene Pairs ◽

Population Sizes ◽

The UK

Summary: Genetic interactions (GIs), the joint impact of different genes or variants on a phenotype, are foundational to the genetic architecture of complex traits. However, identifying GIs through human genetics is challenging since it necessitates very large population sizes, while findings from model systems do not always translate to humans. Here, we combined exome-sequencing and genotyping in the UK Biobank with combinatorial RNA-interference (coRNAi) screening to systematically test for pairwise GIs between 30 lipid GWAS genes. Gene-based protein-truncating variant (PTV) burden analyses from 240,970 exomes revealed additive GIs for APOB with PCSK9 and LPL, respectively. Both genetics and coRNAi identified additive GIs for 12 additional gene pairs. Overlapping non-additive GIs were detected only for TOMM40 at the APOE locus with SORT1 and NCAN. Our study identifies distinct gene pairs that modulate both plasma and cellular lipid levels via additive and non-additive effects and nominates drug target pairs for improved lipid-lowering combination therapies.