SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits

Abstract: Local genetic correlation quantifies the genetic similarity of complex traits in specific genomic regions. However, accurate estimation of local genetic correlation remains challenging, due to linkage disequilibrium in local genomic regions and sample overlap across studies. We introduce SUPERGNOVA, a statistical framework to estimate local genetic correlations using summary statistics from genome-wide association studies. We demonstrate that SUPERGNOVA outperforms existing methods through simulations and analyses of 30 complex traits. In particular, we show that the positive yet paradoxical genetic correlation between autism spectrum disorder and cognitive performance could be explained by two etiologically distinct genetic signatures with bidirectional local genetic correlations.


Local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits

10.1101/2020.05.08.084475 ◽

2020 ◽

Cited By ~ 5

Author(s):

Yiliang Zhang ◽

Qiongshi Lu ◽

Yixuan Ye ◽

Kunling Huang ◽

Wei Liu ◽

...

Keyword(s):

Correlation Analysis ◽

Genetic Correlation ◽

Complex Traits ◽

Association Studies ◽

Genetic Correlations ◽

Autism Spectrum ◽

Accurate Estimation ◽

Genome Wide Association Studies ◽

Unified Framework ◽

Genomic Regions

Abstract: Local genetic correlation quantifies the genetic similarity of complex traits in specific genomic regions, which could shed unique light on etiologic sharing and provide additional mechanistic insights into the genetic basis of complex traits compared to global genetic correlation. However, accurate estimation of local genetic correlation remains challenging, in part due to extensive linkage disequilibrium in local genomic regions and pervasive sample overlap across studies. We introduce SUPERGNOVA, a unified framework to estimate both global and local genetic correlations using summary statistics from genome-wide association studies. Through extensive simulations and analyses of 30 complex traits, we demonstrate that SUPERGNOVA substantially outperforms existing methods and identifies 150 trait pairs with significant local genetic correlations. In particular, we show that the positive, consistently identified, yet paradoxical genetic correlation between autism spectrum disorder and cognitive performance could be explained by two etiologically distinct genetic signatures with bidirectional local genetic correlations. We believe that statistically rigorous local genetic correlation analysis could accelerate progress in complex trait genetics research.
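To make the idea of bidirectional local genetic correlations concrete, here is a minimal toy sketch. It is not the SUPERGNOVA estimator: it ignores linkage disequilibrium and sample overlap and works directly on simulated per-SNP effect sizes. The region sizes, correlation values, and function names are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_region(n_snps, rho, rng):
    """Draw per-SNP effects on two traits with correlation rho (toy model)."""
    b1, b2 = [], []
    for _ in range(n_snps):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        b1.append(u)
        # Mixing u and v gives unit variance and correlation rho with u.
        b2.append(rho * u + math.sqrt(1 - rho ** 2) * v)
    return b1, b2


rng = random.Random(0)
pos1, pos2 = simulate_region(5000, 0.8, rng)    # region with positive sharing
neg1, neg2 = simulate_region(5000, -0.6, rng)   # region with negative sharing

local_pos = pearson(pos1, pos2)
local_neg = pearson(neg1, neg2)
global_r = pearson(pos1 + neg1, pos2 + neg2)    # both regions pooled

print(f"local (+): {local_pos:.2f}  local (-): {local_neg:.2f}  global: {global_r:.2f}")
```

In this toy setting the two regional correlations have opposite signs, while the pooled correlation is small and positive, which is how a modest global estimate can mask stronger local sharing.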


Comparison of methods for estimating genetic correlation between complex traits using GWAS summary statistics

10.1101/2020.10.12.336867 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yiliang Zhang ◽

Youshu Cheng ◽

Wei Jiang ◽

Yixuan Ye ◽

Qiongshi Lu ◽

...

Keyword(s):

Genetic Correlation ◽

Complex Traits ◽

Association Studies ◽

Genetic Correlations ◽

Real Data ◽

Estimation Methods ◽

Easy Access ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Correlation Estimation

Abstract: Genetic correlation is the correlation of additive genetic effects on two phenotypes. It is an informative metric to quantify the overall genetic similarity between complex traits, which provides insights into their polygenic genetic architecture. Several methods have been proposed to estimate genetic correlations based on data collected from genome-wide association studies (GWAS). Because GWAS summary statistics are easy to access and computationally efficient to work with, methods that require only GWAS summary statistics as input have become more popular than methods utilizing individual-level genotype data. Here, we present a benchmark study for different summary-statistics-based genetic correlation estimation methods through simulation and real data applications. We focus on two major technical challenges in estimating genetic correlation: marker dependency caused by linkage disequilibrium (LD) and sample overlap between different studies. To assess the performance of different methods in the presence of these two challenges, we first conducted comprehensive simulations with diverse LD patterns and sample overlaps. Then we applied these methods to real GWAS summary statistics for a wide spectrum of complex traits. Based on these experiments, we conclude that methods relying on accurate LD estimation are less robust in real data applications compared to other methods due to the imprecision of LD obtained from reference panels. Our findings offer guidance on how to appropriately choose the method for genetic correlation estimation in post-GWAS analysis and interpretation.
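The opening definition can be written as a formula. The notation below is an assumption introduced here for illustration, not taken from the paper: each phenotype $y_k$ is split into an additive genetic component $g_k$ and a residual $e_k$.

```latex
y_k = g_k + e_k, \quad k \in \{1, 2\}
\qquad
r_g = \frac{\operatorname{Cov}(g_1, g_2)}
           {\sqrt{\operatorname{Var}(g_1)\,\operatorname{Var}(g_2)}}
```

Under this notation $r_g$ lies between $-1$ and $1$, and the estimation methods being compared differ in how they recover the numerator and denominator from summary statistics in the presence of LD and sample overlap.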


Investigating genetic correlations and causal effects between caffeine consumption and sleep behaviours

10.1101/199828 ◽

2017 ◽

Author(s):

Jorien L. Treur ◽

Mark Gibson ◽

Amy E Taylor ◽

Peter J Rogers ◽

Marcus R Munafò

Keyword(s):

Genetic Correlation ◽

Sleep Duration ◽

Genetic Variants ◽

Mendelian Randomization ◽

Association Studies ◽

Genetic Correlations ◽

Causal Effects ◽

Genome Wide Association Studies ◽

Caffeine Consumption ◽

Insomnia Complaints

Abstract: Study Objectives: Higher caffeine consumption has been linked to poorer sleep and insomnia complaints. We investigated whether these observational associations are the result of genetic risk factors influencing both caffeine consumption and poorer sleep, and/or whether they reflect (possibly bidirectional) causal effects. Methods: Summary-level data were available from genome-wide association studies (GWAS) on caffeine consumption (n = 91,462), sleep duration, and chronotype (i.e., being a ‘morning’ versus an ‘evening’ person) (both n = 128,266), and insomnia complaints (n = 113,006). Linkage disequilibrium (LD) score regression was used to calculate genetic correlations, reflecting the extent to which genetic variants influencing caffeine consumption and sleep behaviours overlap. Causal effects were tested with bidirectional, two-sample Mendelian randomization (MR), an instrumental variable approach that utilizes genetic variants robustly associated with an exposure variable as an instrument to test causal effects. Estimates from individual genetic variants were combined using inverse-variance weighted meta-analysis, weighted median regression and MR Egger regression methods. Results: There was no clear evidence for genetic correlation between caffeine consumption and sleep duration (rg = 0.000, p = 0.998), chronotype (rg = 0.086, p = 0.192) or insomnia (rg = -0.034, p = 0.700). Two-sample Mendelian randomization analyses did not support causal effects from caffeine consumption to sleep behaviours, or the other way around. Conclusions: We found no evidence in support of genetic correlation or causal effects between caffeine consumption and sleep. While caffeine may have acute effects on sleep when taken shortly before habitual bedtime, our findings suggest that a more sustained pattern of high caffeine consumption is likely associated with poorer sleep through shared environmental factors.
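The step of combining per-variant estimates by inverse-variance weighted (IVW) meta-analysis can be sketched as follows. This is the standard fixed-effect IVW formula with first-order weights, not the authors' analysis code; the function name and the input numbers are hypothetical.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each variant contributes a ratio estimate beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2 (first-order weights).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    std_error = (1 / total) ** 0.5
    return estimate, std_error


# Hypothetical summary statistics for three variants (illustration only).
bx = [0.10, 0.20, 0.15]      # variant effects on the exposure
by = [0.05, 0.10, 0.075]     # variant effects on the outcome
se = [0.02, 0.02, 0.02]      # standard errors of the outcome effects

estimate, std_error = ivw_estimate(bx, by, se)
print(f"IVW estimate: {estimate:.3f} (SE {std_error:.3f})")
```

Weighted median and MR Egger regression relax different assumptions about invalid instruments, which is why studies such as this one report them alongside the IVW estimate.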


A systematic analysis of genetically regulated differences in gene expression and the role of co-expression networks across 16 psychiatric disorders and substance use phenotypes

10.1101/2021.01.28.428688 ◽

2021 ◽

Author(s):

Zachary F Gerring ◽

Jackson G Thorp ◽

Eric R Gamazon ◽

Eske M Derks

Keyword(s):

Gene Expression ◽

Mental Health ◽

Substance Use ◽

Prefrontal Cortex ◽

Psychiatric Disorders ◽

Developmental Disorders ◽

Association Studies ◽

Genetic Correlations ◽

Autism Spectrum ◽

Genome Wide Association Studies

Abstract: Genome-wide association studies (GWASs) have identified thousands of risk loci for many psychiatric and substance use phenotypes; however, the biological consequences of these loci remain largely unknown. We performed a transcriptome-wide association study of 10 psychiatric disorders and 6 substance use phenotypes (collectively termed “mental health phenotypes”) using expression quantitative trait loci data from 532 prefrontal cortex samples. We estimated the correlation due to predicted genetically regulated expression between pairs of mental health phenotypes, and compared the results with the genetic correlations. We identified 1,645 genes with at least one significant trait association, comprising 2,176 significant associations across the 16 mental health phenotypes, of which 572 (26%) are novel. Overall, the transcriptomic correlations for phenotype pairs were significantly higher than the respective genetic correlations. For example, attention deficit hyperactivity disorder and autism spectrum disorder, both childhood developmental disorders, showed a much higher transcriptomic correlation (r = 0.84) than genetic correlation (r = 0.35). Finally, we tested the enrichment of phenotype-associated genes in gene co-expression networks built from prefrontal cortex. Phenotype-associated genes were enriched in multiple gene co-expression modules and the implicated modules contained genes involved in mRNA splicing and glutamatergic receptors, among others. Together, our results highlight the utility of gene expression data in the understanding of functional gene mechanisms underlying psychiatric disorders and substance use phenotypes.


Gamete simulation improves polygenic transmission disequilibrium analysis

10.1101/2020.10.26.355602 ◽

2020 ◽

Author(s):

Jiawen Chen ◽

Jing You ◽

Zijie Zhao ◽

Zheng Ni ◽

Kunling Huang ◽

...

Keyword(s):

Complex Traits ◽

Statistical Power ◽

Association Studies ◽

Autism Spectrum ◽

Genetic Maps ◽

Risk Scores ◽

Parental Genotype ◽

Genome Wide Association Studies ◽

Transmission Disequilibrium ◽

Polygenic Risk

Abstract: Polygenic risk scores (PRS) derived from summary statistics of genome-wide association studies (GWAS) have enjoyed great popularity in human genetics research. Applied to population cohorts, PRS can effectively stratify individuals by risk group and have promising applications in early diagnosis and clinical intervention. However, our understanding of within-family polygenic risk is incomplete, in part because the small sample size per family significantly limits power. Here, to address this challenge, we introduce ORIGAMI, a computational framework that uses parental genotype data to simulate offspring genomes. ORIGAMI uses state-of-the-art genetic maps to simulate realistic recombination events on phased parental genomes and allows quantifying the prospective PRS variability within each family. We quantify and showcase the substantially reduced yet highly heterogeneous PRS variation within families for numerous complex traits. Further, we incorporate within-family PRS variability to improve the polygenic transmission disequilibrium test (pTDT). Through simulations, we demonstrate that modeling within-family risk substantially improves the statistical power of pTDT. Applied to 7,805 trios of autism spectrum disorder (ASD) probands and healthy parents, we successfully replicated previously reported over-transmission of ASD, educational attainment, and schizophrenia risk, and identified multiple novel traits with significant transmission disequilibrium. These results provided novel etiologic insights into the shared genetic basis of various complex traits and ASD.
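As background for the pTDT mentioned above, the sketch below computes a weighted-sum PRS and a basic pTDT statistic in its commonly described form: the child's deviation from the mid-parent PRS, scaled by the standard deviation of the mid-parent PRS, then tested against zero with a one-sample t statistic. This formulation is an assumption made for illustration. It is not ORIGAMI and does not model within-family PRS variability; all names and numbers are hypothetical.

```python
import math
import statistics


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2) times per-SNP effect sizes."""
    return sum(d * w for d, w in zip(dosages, weights))


def ptdt_t_statistic(child_prs, father_prs, mother_prs):
    """Mean standardized child-vs-mid-parent deviation and its t statistic."""
    mid_parent = [(f + m) / 2 for f, m in zip(father_prs, mother_prs)]
    sd_mid = statistics.stdev(mid_parent)
    deviations = [(c - mp) / sd_mid for c, mp in zip(child_prs, mid_parent)]
    mean_dev = statistics.mean(deviations)
    std_err = statistics.stdev(deviations) / math.sqrt(len(deviations))
    return mean_dev, mean_dev / std_err


# Toy PRS for one individual with three SNPs.
example_prs = polygenic_score([0, 1, 2], [0.1, -0.2, 0.3])

# Toy PRS values for four trios (illustration only).
father = [1.0, 2.0, 3.0, 4.0]
mother = [1.0, 2.0, 3.0, 4.0]
child = [1.5, 2.0, 3.5, 4.0]

mean_dev, t_stat = ptdt_t_statistic(child, father, mother)
print(f"PRS: {example_prs:.2f}  mean deviation: {mean_dev:.3f}  t: {t_stat:.3f}")
```

A positive mean deviation indicates that probands carry more polygenic risk than expected from their parents, which is the over-transmission the abstract refers to.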


Comparison of GWAS models to identify non-additive genetic control of flowering time in sunflower hybrids

10.1101/188235 ◽

2017 ◽

Cited By ~ 1

Author(s):

Fanny Bonnafous ◽

Ghislain Fievet ◽

Nicolas Blanchet ◽

Marie-Claude Boniface ◽

Sébastien Carrère ◽

...

Keyword(s):

Quantitative Trait Loci ◽

Flowering Time ◽

Genetic Control ◽

Quantitative Trait ◽

Complex Traits ◽

Association Studies ◽

Additive Models ◽

Genome Wide Association Studies ◽

Trait Loci ◽

Genomic Regions

Abstract: Genome-wide association studies are a powerful and widely used tool to decipher the genetic control of complex traits. One of the main challenges for hybrid crops, such as maize or sunflower, is to model the hybrid vigor in the linear mixed models, considering the relatedness between individuals. Here, we compared two additive and three non-additive association models for their ability to identify genomic regions associated with flowering time in sunflower hybrids. A panel of 452 sunflower hybrids, corresponding to incomplete crossing between 36 male lines and 36 female lines, was phenotyped in five environments and genotyped for 2,204,423 SNPs. Intra-locus effects were estimated in multi-locus models to detect genomic regions associated with flowering time using the different models. Thirteen quantitative trait loci were identified in total, two with both model categories and one with only non-additive models. A quantitative trait locus on LG09, detected by both the additive and non-additive models, is located near a GAI homolog and is presented in detail. Overall, this study shows the added value of non-additive modeling of allelic effects for identifying genomic regions that control traits of interest and that could participate in the heterosis observed in hybrids.


Detecting Local Genetic Correlations with Scan Statistics

10.1101/808519 ◽

2019 ◽

Cited By ~ 3

Author(s):

Hanmin Guo ◽

James J. Li ◽

Qiongshi Lu ◽

Lin Hou

Keyword(s):

Genetic Correlation ◽

Association Studies ◽

Genetic Correlations ◽

Scan Statistics ◽

Genome Wide Association Studies ◽

Scan Statistic ◽

Multiple Phenotypes ◽

Genome Wide ◽

Statistic Approach ◽

Shared Genetic

Abstract: Genetic correlation analysis has quickly gained popularity in the past few years and provided insights into the genetic etiology of numerous complex diseases. However, existing approaches oversimplify the shared genetic architecture between different phenotypes and cannot effectively identify precise genetic regions contributing to the genetic correlation. In this work, we introduce LOGODetect, a powerful and efficient statistical method to identify small genome segments harboring local genetic correlation signals. LOGODetect automatically identifies genetic regions showing consistent associations with multiple phenotypes through a scan statistic approach. It uses summary association statistics from genome-wide association studies (GWAS) as input and is robust to sample overlap between studies. Applied to five phenotypically distinct but genetically correlated psychiatric disorders, we identified 49 non-overlapping genome regions associated with multiple disorders, including multiple hub regions showing concordant effects on more than two disorders. Our method addresses critical limitations in existing analytic strategies and may have wide applications in post-GWAS analysis.
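The general idea of a scan statistic over summary association statistics can be illustrated with a toy sketch. This is not LOGODetect's actual statistic or significance procedure. It assumes one z-score per SNP for each of two traits, slides windows of varying length along the genome, and scores each window by the absolute sum of z-score products divided by the square root of the window length. The function name, window bounds, and toy data are all hypothetical.

```python
import math


def best_window(z1, z2, min_len=5, max_len=50):
    """Return (score, start, end) for the highest-scoring window [start, end)."""
    products = [a * b for a, b in zip(z1, z2)]
    # Prefix sums make each window sum an O(1) lookup.
    prefix = [0.0]
    for p in products:
        prefix.append(prefix[-1] + p)
    best = (0.0, 0, 0)
    n = len(products)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > n:
                break
            score = abs(prefix[end] - prefix[start]) / math.sqrt(length)
            if score > best[0]:
                best = (score, start, end)
    return best


# Toy data: weak deterministic background plus a shared signal at SNPs 100-129.
n_snps = 300
z1 = [0.5 * (-1) ** i for i in range(n_snps)]
z2 = [0.5 * (-1) ** (i // 2) for i in range(n_snps)]
for i in range(100, 130):
    z1[i] += 3.0
    z2[i] += 3.0

score, start, end = best_window(z1, z2)
print(f"best window: [{start}, {end}) with score {score:.1f}")
```

A real method additionally has to account for LD between neighboring SNPs and calibrate significance, for example by simulation under the null. This sketch omits both.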


Joint Genome-Wide Association Study Identifies Twenty-One Novel Loci for Age at Menarche and Highlights Its Causal Association with Other Complex Diseases

10.21203/rs.3.rs-955340/v1 ◽

2021 ◽

Author(s):

Gui-Juan Feng ◽

Qian Xu ◽

Jing-Jing Ni ◽

Shan-Shan Yang ◽

Bai-Xue Han ◽

...

Keyword(s):

Association Study ◽

Complex Traits ◽

Causal Effect ◽

Association Studies ◽

Genetic Correlations ◽

Genome Wide Association ◽

Age At Menarche ◽

Luteinizing Hormone Level ◽

Genome Wide Association Studies ◽

Genome Wide

Abstract: Age at menarche (AAM) is a marker of puberty in females. It is a heritable trait associated with various adult diseases. However, the genetic mechanism that determines AAM and links it to disease risk is poorly understood. Aiming to uncover the genetic basis for AAM, we conducted a joint association study in up to 438,089 participants from 3 genome-wide association studies of European and East Asian ancestries. Twenty-one novel genomic loci were identified at the genome-wide significance level. In addition, we observed significant genetic correlations between AAM and 67 complex traits, and the strongest genetic correlation was observed between AAM and body mass index (rg = -0.19, P = 6.11×10⁻³¹). Latent causal variable analyses demonstrate that there is a genetically causal effect of AAM on several traits, including high blood pressure (GCP = 0.47, P = 0.02), forced vital capacity (GCP = 0.63, P = 0.02), age at first live birth (GCP = 0.51, P = 0.03), impedance of right arm (GCP = 0.41, P < 1×10⁻⁷) and right leg fat percentage (GCP = -0.10, P = 0.02). Enrichment analysis identified 5 enriched tissues and 51 enriched gene sets. Four of the five enriched tissues were related to the nervous system, including the hypothalamus middle, hypothalamo hypophyseal system, neurosecretory systems and hypothalamus. The fifth tissue was the retina in the sensory organ. The most significant gene set was the ‘decreased circulating luteinizing hormone level’ (P = 2.45×10⁻⁶). Our findings may provide useful insights that elucidate the mechanisms determining AAM and the genetic interplay between AAM and some traits of women.


Geographic Confounding in Genome-Wide Association Studies

10.21203/rs.3.rs-362358/v1 ◽

2021 ◽

Author(s):

Abdel Abdellaoui ◽

Karin Verweij ◽

Michel G Nivard

Keyword(s):

Educational Attainment ◽

Social Stratification ◽

Complex Traits ◽

Association Studies ◽

Genetic Correlations ◽

Geographic Region ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Gene Environment ◽

Genome Wide

Abstract: Gene-environment correlations can bias associations between genetic variants and complex traits in genome-wide association studies (GWASs). Here, we control for geographic sources of gene-environment correlation in GWASs on 56 complex traits (N = 69,772–271,457). Controlling for geographic region significantly decreases heritability signals for SES-related traits, most strongly for educational attainment and income, indicating that socio-economic differences between regions induce gene-environment correlations that become part of the polygenic signal. For most other complex traits investigated, genetic correlations with educational attainment and income are significantly reduced, most significantly for traits related to BMI, sedentary behavior, and substance use. Controlling for current address has greater impact on the polygenic signal than birth place, suggesting both active and passive sources of gene-environment correlations. Our results show that societal sources of social stratification that extend beyond families introduce regional-level gene-environment correlations that affect GWAS results.


Estimating Heritability and Genetic Correlation in Case Control Studies Directly and with Summary Statistics

10.1101/256388 ◽

2018 ◽

Author(s):

Omer Weissbrod ◽

Jonathan Flint ◽

Saharon Rosset

Keyword(s):

Genetic Correlation ◽

Association Studies ◽

Genetic Correlations ◽

Large Data ◽

Case Control ◽

Data Sets ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Case Control Studies ◽

Individual Level

Abstract: Methods that estimate heritability and genetic correlations from genome-wide association studies have proven to be powerful tools for investigating the genetic architecture of common diseases and exposing unexpected relationships between disorders. Many relevant studies employ a case-control design, yet most methods are primarily geared towards analyzing quantitative traits. Here we investigate the validity of three common methods for estimating genetic heritability and genetic correlation. We find that the Phenotype-Correlation-Genotype-Correlation (PCGC) approach is the only method that can estimate both quantities accurately in the presence of important non-genetic risk factors, such as age and sex. We extend PCGC to work with summary statistics that take the case-control sampling into account, and demonstrate that our new method, PCGC-s, accurately estimates both heritability and genetic correlations and can be applied to large data sets without requiring individual-level genotypic or phenotypic information. Finally, we use PCGC-s to estimate the genetic correlation between schizophrenia and bipolar disorder, and demonstrate that previous estimates are biased due to incorrect handling of sex as a strong risk factor. PCGC-s is available at https://github.com/omerwe/PCGCs.
