Fast estimation of genetic correlation for Biobank-scale data

2019 ◽  
Author(s):  
Yue Wu ◽  
Kathryn S. Burch ◽  
Andrea Ganna ◽  
Päivi Pajukanta ◽  
Bogdan Pasaniuc ◽  
...  

Abstract Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that analyze individual genotype data for estimating genetic correlation are challenging to scale to large datasets. Methods that analyze summary data, while computationally efficient, tend to yield estimates of genetic correlation with reduced precision. We propose SCORE, a randomized method-of-moments estimator of genetic correlation that is both scalable and accurate. SCORE obtains more precise estimates of genetic correlations than summary-statistic methods that can be applied at scale, achieving a 50% reduction in standard error relative to LD-score regression (LDSC) and a 26% reduction relative to high-definition likelihood (HDL) (averaged over all simulations). The efficiency of SCORE enables computation of genetic correlations on the UK Biobank dataset consisting of ≈ 300K individuals and ≈ 500K SNPs in a few hours (orders of magnitude faster than methods that analyze individual data, such as GCTA). Across 780 pairs of traits in 291,273 unrelated white British individuals in the UK Biobank, SCORE identifies significant genetic correlation between 200 additional pairs of traits over LDSC (beyond the 245 pairs identified by both).
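As a rough illustration of what a randomized method-of-moments estimator of genetic correlation can look like, the sketch below matches second moments of two traits to a genetic relationship matrix K = XXᵀ/M and estimates tr(K²) with random probe vectors (Hutchinson's estimator), so K is never formed explicitly. This is not the SCORE implementation: the function names, the toy simulation, and the simple two-component model (no covariates, fully overlapping samples, centered phenotypes) are all assumptions made for illustration.

```python
import numpy as np


def simulate_traits(n=2000, m=500, h2=0.5, rg=0.8, seed=1):
    """Toy data (hypothetical): standardized genotypes and two traits
    whose per-SNP effects are correlated at rg."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, size=m)
    G = rng.binomial(2, maf, size=(n, m)).astype(float)
    X = (G - G.mean(axis=0)) / G.std(axis=0)  # column-standardize genotypes
    cov = (h2 / m) * np.array([[1.0, rg], [rg, 1.0]])
    B = rng.multivariate_normal(np.zeros(2), cov, size=m)  # effects, shape (m, 2)
    E = rng.standard_normal((n, 2)) * np.sqrt(1.0 - h2)    # independent noise
    Y = X @ B + E
    return X, Y[:, 0], Y[:, 1]


def mom_genetic_correlation(X, y1, y2, n_probes=10, seed=0):
    """Method-of-moments variance/covariance components with a randomized
    trace estimate. Assumes centered phenotypes and standardized X."""
    rng = np.random.default_rng(seed)
    n, m = X.shape

    def K_mul(v):
        # Multiply by K = X X^T / m without building the n x n matrix.
        return X @ (X.T @ v) / m

    # Hutchinson estimator: tr(K^2) = E[||K z||^2] for z ~ N(0, I).
    Z = rng.standard_normal((n, n_probes))
    tr_K2 = np.sum(K_mul(Z) ** 2) / n_probes
    tr_K = np.sum(X ** 2) / m  # exact trace of K

    # Normal equations from matching E[a b^T] = s_g * K + s_e * I.
    A = np.array([[tr_K2, tr_K], [tr_K, float(n)]])

    def genetic_component(a, b):
        rhs = np.array([a @ K_mul(b), a @ b])
        return np.linalg.solve(A, rhs)[0]

    vg1 = genetic_component(y1, y1)
    vg2 = genetic_component(y2, y2)
    cov_g = genetic_component(y1, y2)
    return {"vg1": vg1, "vg2": vg2, "cov_g": cov_g,
            "rg": cov_g / np.sqrt(vg1 * vg2)}
```

Calling `mom_genetic_correlation(*simulate_traits())` returns the two genetic variances, the genetic covariance, and their ratio `rg`. In this simplified version the randomized trace estimate affects the variance and covariance components but cancels in `rg`, because all three share the same linear system; a fuller model with covariates or partially overlapping samples would not have that property.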

2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Ronald de Vlaming ◽  
Eric A. W. Slob ◽  
Philip R. Jansen ◽  
Alain Dagher ◽  
Philipp D. Koellinger ◽  
...  

Abstract Human variation in brain morphology and behavior are related and highly heritable. Yet, it is largely unknown to what extent specific features of brain morphology and behavior are genetically related. Here, we introduce a computationally efficient approach for multivariate genomic-relatedness-based restricted maximum likelihood (MGREML) to estimate the genetic correlation between a large number of phenotypes simultaneously. Using individual-level data (N = 20,190) from the UK Biobank, we provide estimates of the heritability of gray-matter volume in 74 regions of interest (ROIs) in the brain and we map genetic correlations between these ROIs and health-relevant behavioral outcomes, including intelligence. We find four genetically distinct clusters in the brain that are aligned with standard anatomical subdivision in neuroscience. Behavioral traits have distinct genetic correlations with brain morphology, which suggests trait-specific relevance of ROIs. These empirical results illustrate how MGREML can be used to estimate internally consistent and high-dimensional genetic correlation matrices in large datasets.


2016 ◽  
Author(s):  
Huwenbo Shi ◽  
Nicholas Mancuso ◽  
Sarah Spendlove ◽  
Bogdan Pasaniuc

Abstract Although genetic correlations between complex traits provide valuable insights into epidemiological and etiological studies, a precise quantification of which genomic regions contribute to the genome-wide genetic correlation is currently lacking. Here, we introduce ρ-HESS, a technique to quantify the correlation between pairs of traits due to genetic variation at a small region in the genome. Our approach only requires GWAS summary data and makes no distributional assumption on the causal variant effect sizes while accounting for linkage disequilibrium (LD) and overlapping GWAS samples. We analyzed large-scale GWAS summary data across 35 complex traits, and identified 27 genomic regions that contribute significantly to the genetic correlation among these traits. Notably, we find 7 genomic regions that contribute to the genetic correlation of 12 pairs of traits that show negligible genome-wide correlation, further showcasing the power of local genetic correlation analyses. Finally, we leverage the distribution of local genetic correlations across the genome to assign putative direction of causality for 15 pairs of traits.


Neurology ◽  
2021 ◽  
pp. 10.1212/WNL.0000000000012919
Author(s):  
Yanjun Guo ◽  
Iyas Daghlas ◽  
Padhraig Gormley ◽  
Franco Giulianini ◽  
Paul M Ridker ◽  
...  

Background and Objective: To evaluate phenotypic and genetic relationships between migraine and lipoprotein subfractions. Methods: We evaluated phenotypic associations between migraine and 19 lipoprotein subfraction measures in the Women’s Genome Health Study (WGHS, N=22,788). We then investigated genetic relationships between these traits using summary statistics from the International Headache Genetics Consortium (IHGC) for migraine (Ncase=54,552, Ncontrol=297,970) and combined summary data for lipoprotein subfractions (N up to 47,713). Results: There was a significant phenotypic association (odds ratio=1.27 [95% confidence interval: 1.12-1.44]) and a significant genetic correlation at 0.18 (P=0.001) between migraine and triglyceride-rich lipoproteins (TRLP) concentration but not for LDL or HDL subfractions. Mendelian randomization (MR) estimates were largely null, implying that pleiotropy rather than causality underlies the genetic correlation between migraine and lipoprotein subfractions. Pleiotropy was further supported in cross-trait meta-analysis revealing significant shared signals at four loci (chr2p21 harboring THADA, chr5q13.3 harboring HMGCR, chr6q22.31 harboring HEY2, and chr7q11.23 harboring MLXIPL) between migraine and lipoprotein subfractions. Three of these loci were replicated for migraine (P<0.05) in a smaller sample from the UK Biobank. The shared signal at chr5q13.3 colocalized with expression of HMGCR, ANKDD1B, and COL4A3BP in multiple tissues. Conclusions: The current study supports the association between certain lipoprotein subfractions, especially for TRLP, and migraine in populations of European ancestry. The corresponding shared genetic components may help identify potential targets for future migraine therapeutics. Classification of Evidence: This study provides Class I evidence that migraine is significantly associated with some lipoprotein subfractions.


2021 ◽  
Author(s):  
Duncan S Palmer ◽  
Wei Zhou ◽  
Liam Abbott ◽  
Nik Baya ◽  
Claire Churchhouse ◽  
...  

In classical statistical genetic theory, a dominance effect is defined as the deviation from a purely additive genetic effect for a biallelic variant. Dominance effects are well documented in model organisms. However, evidence in humans is limited to a handful of traits, particularly those with strong single locus effects such as hair color. We carried out the largest systematic evaluation of dominance effects on phenotypic variance in the UK Biobank. We curated and tested over 1,000 phenotypes for dominance effects through GWAS scans, identifying 175 loci at genome-wide significance correcting for multiple testing (P < 4.7 × 10⁻¹¹). Power to detect non-additive loci is much lower than power to detect additive effects for complex traits: based on the relative effect sizes at genome-wide significant additive loci, we estimate a factor of 20-30 increase in sample size will be necessary to capture clear evidence of dominance similar to those currently observed for additive effects. However, these localised dominance hits do not extend to a significant aggregate contribution to phenotypic variance genome-wide. By deriving a version of LD-score regression to detect dominance effects tagged by common variation genome-wide (minor allele frequency > 0.05), we found no strong evidence of a contribution to phenotypic variance when accounting for multiple testing. Across the 267 continuous and 793 binary traits the median contribution was 5.73 × 10⁻⁴, with unbiased point estimates ranging from -0.261 to 0.131. Finally, we introduce dominance fine-mapping to explore whether the more rapid decay of dominance LD can be leveraged to find causal variants. These results provide the most comprehensive assessment of dominance trait variation in humans to date.
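The definition of a dominance effect as a deviation from additivity can be made concrete with a genotype coding in which the dominance term is uncorrelated with the additive term. The sketch below uses one commonly used orthogonal coding under Hardy-Weinberg equilibrium; the abstract does not state which coding the authors used, so the specific values and the function names here are assumptions for illustration only.

```python
import numpy as np


def additive_dominance_codes(p):
    """For a biallelic variant with counted-allele frequency p, return
    (HWE genotype probabilities, additive codes, dominance codes),
    each indexed by genotype 0, 1, 2 (copies of the counted allele)."""
    q = 1.0 - p
    probs = np.array([q * q, 2.0 * p * q, p * p])       # Hardy-Weinberg proportions
    additive = np.array([0.0, 1.0, 2.0]) - 2.0 * p      # mean-centered allele count
    dominance = np.array([-2.0 * p * p, 2.0 * p * q, -2.0 * q * q])
    return probs, additive, dominance


def encode(genotypes, p):
    """Map an array of 0/1/2 genotypes to additive and dominance covariates."""
    _, additive, dominance = additive_dominance_codes(p)
    g = np.asarray(genotypes, dtype=int)
    return additive[g], dominance[g]
```

Under Hardy-Weinberg proportions both codes have mean zero and zero covariance with each other, so a regression coefficient on the dominance covariate captures only the departure from the additive fit. For example, `encode([0, 1, 2, 1], p=0.3)` yields the two covariate vectors for four individuals.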


2021 ◽  
Author(s):  
Jennifer Monereo Sánchez ◽  
Miranda T. Schram ◽  
Oleksandr Frei ◽  
Kevin O’Connell ◽  
Alexey A. Shadrin ◽  
...  

ABSTRACT Background: Alzheimer’s disease (AD) and depression are debilitating brain disorders that are often comorbid. Shared brain mechanisms have been implicated, yet findings are inconsistent, reflecting the complexity of the underlying pathophysiology. As both disorders are (partly) heritable, characterizing their genetic overlap may provide etiological clues. While previous studies have indicated negligible genetic correlations, this study aims to expose the genetic overlap that may remain hidden due to mixed directions of effects. Methods: We applied Gaussian mixture modelling, through MiXeR, and conjunctional false discovery rate (cFDR) analysis, through pleioFDR, to genome-wide association study (GWAS) summary statistics of AD (n=79,145) and depression (n=450,619). The effects of identified overlapping loci on AD and depression were tested in 403,029 participants of the UK Biobank (mean age 57.21; 52.0% female), and mapped onto brain morphology in 30,699 individuals with brain MRI data. Results: MiXeR estimated 98 causal genetic variants overlapping between the two disorders, with 0.44 concordant directions of effects. Through pleioFDR, we identified a SNP in the TMEM106B gene, which was significantly associated with AD (B=-0.002, p=9.1×10⁻⁴) and depression (B=0.007, p=3.2×10⁻⁹) in the UK Biobank. This SNP was also associated with several regions of the corpus callosum volume anterior (B>0.024, p<8.6×10⁻⁴), third ventricle volume ventricle (B=-0.025, p=5.0×10⁻⁶), and inferior temporal gyrus surface area (B=0.017, p=5.3×10⁻⁴). Discussion: Our results indicate there is substantial genetic overlap, with mixed directions of effects, between AD and depression. These findings illustrate the value of biostatistical tools that capture such overlap, providing insight into the genetic architectures of these disorders.


Circulation ◽  
2020 ◽  
Vol 141 (Suppl_1) ◽  
Author(s):  
Yanjun Guo ◽  
Wonil Chung ◽  
Zhilei Shan ◽  
Liming Liang

Background: Patients with rheumatoid arthritis (RA) have a 2- to 10-fold increased risk of cardiovascular diseases (CVD), and CVD accounts for almost 50% of the excess mortality in patients with RA when compared with the general population, but the mechanisms underlying such associations are largely unknown. Methods: We examined the genetic correlation, causality, and shared genetic variants between RA (Ncase=6,756, Ncontrol=452,476) and CVD (Ncase=44,246, Ncontrol=414,986) using LD Score regression (LDSC), generalized summary-data-based Mendelian Randomization (GSMR), and cross-trait meta-analysis in the UK Biobank data. Results: In the present study, RA was significantly genetically correlated with MI, angina, CHD, and CVD after correcting for multiple testing (Rg ranges from 0.40 to 0.43, P<0.05/5). Interestingly, when stratified by frequent usage of aspirin and paracetamol, we observed increased genetic correlation between RA and CVD for participants without aspirin usage (Rg increased to 0.54 [95% CI: 0.54, 0.78] for angina; P value = 6.69 × 10⁻⁶), and for participants with usage of paracetamol (Rg increased to 0.75 [95% CI: 0.20, 1.29] for MI; P value = 8.90 × 10⁻³). Cross-trait meta-analysis identified 9 independent loci that were shared between RA and at least one of the genetically correlated CVD traits, including PTPN22 at chr1p13.2, BCL2L11 at chr2q13, and CCR3 at chr3p21.31 (P single trait < 1 × 10⁻³ and P meta < 5 × 10⁻⁸), highlighting potential shared etiology between them, which includes accelerating atherosclerosis and upregulating oxidative stress and vascular permeability. Finally, Mendelian randomization analyses observed inconsistent instrumental effects and were unable to conclude the causality and directionality between RA and CVD.
Conclusion: Our results supported positive genetic correlation between RA and multiple cardiovascular traits, and frequent usage of aspirin and paracetamol may modify their associations, but instrumental analyses were unable to conclude the causality and directionality between them.


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Katerina Trajanoska ◽  
Lotta J. Seppala ◽  
Carolina Medina-Gomez ◽  
Yi-Hsiang Hsu ◽  
Sirui Zhou ◽  
...  

Abstract Both extrinsic and intrinsic factors predispose older people to fall. We performed a genome-wide association analysis to investigate how much of an individual’s fall susceptibility can be attributed to genetics in 89,076 cases and 362,103 controls from the UK Biobank Study. The analysis revealed a small but significant SNP-based heritability (2.7%) and identified three novel fall-associated loci (Pcombined ≤ 5 × 10⁻⁸). Polygenic risk scores in two independent settings showed patterns of polygenic inheritance. Risk of falling had positive genetic correlations with fractures, identifying for the first time a pathway independent of bone mineral density. There were also positive genetic correlations with insomnia, neuroticism, depressive symptoms, and different medications. Negative genetic correlations were identified with muscle strength, intelligence and subjective well-being. Brain, and in particular cerebellum tissue, showed the highest gene expression enrichment for fall-associated variants. Overall, despite the highly heterogeneous nature underlying fall risk, a proportion of the susceptibility can be attributed to genetics.


2018 ◽  
Author(s):  
Carla Márquez-Luna ◽  
Steven Gazal ◽  
Po-Ru Loh ◽  
Samuel S. Kim ◽  
Nicholas Furlotte ◽  
...  

Abstract Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N=373K) and samples of other European ancestries as validation data (avg N=22K), to minimize confounding. LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R²=0.144; highest R²=0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N=1107K; higher heritability in UK Biobank cohort) increased prediction R² to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.


2019 ◽  
Vol 25 (10) ◽  
pp. 2422-2430 ◽  
Author(s):  
Douglas M. Ruderfer ◽  
Colin G. Walsh ◽  
Matthew W. Aguirre ◽  
Yosuke Tanigawa ◽  
Jessica D. Ribeiro ◽  
...  

Abstract Suicide accounts for nearly 800,000 deaths per year worldwide with rates of both deaths and attempts rising. Family studies have estimated substantial heritability of suicidal behavior; however, collecting the sample sizes necessary for successful genetic studies has remained a challenge. We utilized two different approaches in independent datasets to characterize the contribution of common genetic variation to suicide attempt. The first is a patient-reported suicide attempt phenotype asked as part of an online mental health survey taken by a subset of participants (n = 157,366) in the UK Biobank. After quality control, we leveraged a genotyped set of unrelated, white British ancestry participants including 2433 cases and 334,766 controls that included those that did not participate in the survey or were not explicitly asked about attempting suicide. The second leveraged electronic health record (EHR) data from the Vanderbilt University Medical Center (VUMC, 2.8 million patients, 3250 cases) and machine learning to derive probabilities of attempting suicide in 24,546 genotyped patients. We identified significant and comparable heritability estimates of suicide attempt from both the patient-reported phenotype in the UK Biobank (h²SNP = 0.035, p = 7.12 × 10⁻⁴) and the clinically predicted phenotype from VUMC (h²SNP = 0.046, p = 1.51 × 10⁻²). A significant genetic overlap was demonstrated between the two measures of suicide attempt in these independent samples through polygenic risk score analysis (t = 4.02, p = 5.75 × 10⁻⁵) and genetic correlation (rg = 1.073, SE = 0.36, p = 0.003). Finally, we show significant but incomplete genetic correlation of suicide attempt with insomnia (rg = 0.34–0.81) as well as several psychiatric disorders (rg = 0.26–0.79). This work demonstrates the contribution of common genetic variation to suicide attempt.
It points to a genetic underpinning to clinically predicted risk of attempting suicide that is similar to the genetic profile from a patient-reported outcome. Lastly, it presents an approach for using EHR data and clinical prediction to generate quantitative measures from binary phenotypes that can improve power for genetic studies.
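The reported genetic correlation between the two suicide attempt measures (rg = 1.073, SE = 0.36, p = 0.003) can be sanity-checked with a few lines of arithmetic. The sketch below assumes the p-value comes from a two-sided Wald z-test of rg = 0, which the abstract does not state explicitly; the function name is hypothetical.

```python
import math


def wald_two_sided_p(estimate, se):
    """Return (z, p) for a two-sided Wald test of estimate = 0,
    using the standard normal tail: p = erfc(|z| / sqrt(2))."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Values taken from the abstract: rg = 1.073, SE = 0.36.
z, p = wald_two_sided_p(1.073, 0.36)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Under that assumption the computed p-value rounds to the reported 0.003. A point estimate above 1 is possible when the estimator is not constrained to the interval [-1, 1], and with a standard error this large it is statistically compatible with a correlation of 1.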

