E-MAGMA: an eQTL-informed method to identify risk genes using genome-wide association study summary statistics

Author(s):  
Zachary F Gerring ◽  
Angela Mina-Vargas ◽  
Eric R Gamazon ◽  
Eske M Derks

Abstract Motivation Genome-wide association studies have successfully identified multiple independent genetic loci that harbour variants associated with human traits and diseases, but the exact causal genes are largely unknown. Common genetic risk variants are enriched in non-protein-coding regions of the genome and often affect gene expression (expression quantitative trait loci, eQTL) in a tissue-specific manner. To address this challenge, we developed a methodological framework, E-MAGMA, which converts genome-wide association summary statistics into gene-level statistics by assigning risk variants to their putative genes based on tissue-specific eQTL information. Results We compared E-MAGMA to three eQTL-informed gene-based approaches using simulated phenotype data. Phenotypes were simulated based on eQTL reference data using GCTA for all genes with at least one eQTL on chromosome 1. We performed 10 simulations per gene. The eQTL-h2 (i.e., the proportion of variation explained by the eQTLs) was set at 1%, 2%, and 5%. We found that E-MAGMA outperforms other gene-based approaches across a range of simulated parameters (e.g., the number of identified causal genes). When applied to genome-wide association summary statistics for five neuropsychiatric disorders, E-MAGMA identified more putative candidate causal genes than other eQTL-based approaches. These results show that, by integrating tissue-specific eQTL information, E-MAGMA will help to identify novel candidate causal genes from genome-wide association summary statistics and thereby improve the understanding of the biological basis of complex disorders. Availability A tutorial and input files are made available in a GitHub repository: https://github.com/eskederks/eMAGMA-tutorial. Supplementary information Supplementary data are available at Bioinformatics online.
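The variant-to-gene assignment described in this abstract can be illustrated with a minimal Python sketch. It is not the E-MAGMA implementation: the function names, SNP and gene identifiers, tissue labels, and p-values are all hypothetical, and the gene-level statistic is a deliberately naive stand-in (a Bonferroni-corrected minimum p-value that ignores linkage disequilibrium), not the gene-level model that E-MAGMA uses.

```python
from collections import defaultdict


def assign_snps_to_genes(eqtl_pairs, tissue):
    """Map each gene to the SNPs that are eQTLs for it in the given tissue."""
    gene_to_snps = defaultdict(set)
    for snp, gene, pair_tissue in eqtl_pairs:
        if pair_tissue == tissue:
            gene_to_snps[gene].add(snp)
    return dict(gene_to_snps)


def naive_gene_pvalue(snp_ids, gwas_pvalues):
    """Bonferroni-corrected minimum SNP p-value for one gene.

    Assumption: a simple stand-in that ignores LD between SNPs;
    it is NOT the statistic computed by E-MAGMA.
    """
    observed = [gwas_pvalues[s] for s in snp_ids if s in gwas_pvalues]
    if not observed:
        return None  # no GWAS evidence available for this gene
    return min(1.0, min(observed) * len(observed))


# Hypothetical toy inputs: (SNP, gene, tissue) eQTL pairs and GWAS p-values.
eqtl_pairs = [
    ("rs1", "GENE_A", "brain_cortex"),
    ("rs2", "GENE_A", "brain_cortex"),
    ("rs2", "GENE_B", "whole_blood"),
    ("rs3", "GENE_B", "brain_cortex"),
]
gwas_pvalues = {"rs1": 0.04, "rs2": 0.001, "rs3": 0.5}

annotation = assign_snps_to_genes(eqtl_pairs, "brain_cortex")
gene_pvalues = {
    gene: naive_gene_pvalue(snps, gwas_pvalues)
    for gene, snps in annotation.items()
}
print(gene_pvalues)
```

In this toy example the same SNP (rs2) is linked to different genes in different tissues, which is why the assignment is computed per tissue.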

2019 ◽  
Author(s):  
Zachary F Gerring ◽  
Angela Mina-Vargas ◽  
Eske M Derks

Abstract Identifying genes underlying genetic associations of complex disease is challenging because most common risk variants reside in non-protein-coding regions of the genome and likely alter the expression of target genes by disrupting tissue- and cell-type-specific regulatory elements. To address this challenge, we developed a methodological framework, eQTL-MAGMA (eMAGMA), that converts SNP-level summary statistics into gene-level association statistics by assigning non-coding SNPs to their putative genes based on tissue-specific eQTL information. We compared eMAGMA to three eQTL-informed gene-based approaches (S-PrediXcan, FUSION, and SMR) using simulated phenotype data. Phenotypes were simulated based on eQTL reference data using GCTA for all genes with at least one eQTL on chromosome 1 (651 genes). We performed 10 simulations per gene. The eQTL-h2 (i.e., the proportion of variation explained by the eQTLs) was set at 1%, 2%, and 5%. We found that eMAGMA outperforms other gene-based approaches across a range of simulated parameters (e.g., the number of identified causal genes). When applied to genome-wide association summary statistics for major depression, eMAGMA identified substantially more putative candidate causal genes than other eQTL-based approaches. These results show that, by integrating tissue-specific eQTL information, eMAGMA will help to identify novel candidate causal genes from genome-wide association summary statistics and thereby improve the understanding of the biological basis of complex disorders.


Author(s):  
Jianhua Wang ◽  
Dandan Huang ◽  
Yao Zhou ◽  
Hongcheng Yao ◽  
Huanhuan Liu ◽  
...  

Abstract Genome-wide association studies (GWASs) have revolutionized the field of complex trait genetics over the past decade, yet for most of the significant genotype-phenotype associations the true causal variants remain unknown. Identifying and interpreting how causal genetic variants confer disease susceptibility is still a big challenge. Herein we introduce a new database, CAUSALdb, to integrate the most comprehensive GWAS summary statistics to date and identify credible sets of potential causal variants using uniformly processed fine-mapping. The database has six major features: it (i) curates 3052 high-quality, fine-mappable GWAS summary statistics across five human super-populations and 2629 unique traits; (ii) estimates causal probabilities of all genetic variants in GWAS significant loci using three state-of-the-art fine-mapping tools; (iii) maps the reported traits to a powerful ontology MeSH, making it simple for users to browse studies on the trait tree; (iv) incorporates highly interactive Manhattan and LocusZoom-like plots to allow visualization of credible sets in a single web page more efficiently; (v) enables online comparison of causal relations on variant-, gene- and trait-levels among studies with different sample sizes or populations and (vi) offers comprehensive variant annotations by integrating massive base-wise and allele-specific functional annotations. CAUSALdb is freely available at http://mulinlab.org/causaldb.
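As a rough illustration of what a credible set is, the Python sketch below builds one from per-variant causal probabilities. It assumes a single causal variant per locus and probabilities that sum to 1 across the locus; the function name, variant identifiers, and probabilities are hypothetical, and this is not the CAUSALdb pipeline or any of the fine-mapping tools it runs.

```python
def credible_set(causal_probabilities, coverage=0.95):
    """Return the smallest set of top-ranked variants whose summed
    causal probability reaches the requested coverage."""
    ranked = sorted(
        causal_probabilities.items(), key=lambda item: item[1], reverse=True
    )
    chosen, cumulative = [], 0.0
    for variant, probability in ranked:
        chosen.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical per-variant causal probabilities for one locus.
locus = {"rs10": 0.60, "rs11": 0.30, "rs12": 0.06, "rs13": 0.04}
print(credible_set(locus))  # variants needed to reach 95% coverage
```

A smaller credible set means the fine-mapping has narrowed the locus to fewer candidate causal variants.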


2018 ◽  
Vol 35 (14) ◽  
pp. 2512-2514 ◽  
Author(s):  
Bongsong Kim ◽  
Xinbin Dai ◽  
Wenchao Zhang ◽  
Zhaohong Zhuang ◽  
Darlene L Sanchez ◽  
...  

Abstract Summary We present GWASpro, a high-performance web server for the analyses of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data, coupled with complex replicated experimental designs such as found in plant science investigations and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times, can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. Availability and implementation GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (15) ◽  
pp. 4374-4376
Author(s):  
Ninon Mounier ◽  
Zoltán Kutalik

Abstract Summary Increasing sample size is not the only strategy to improve discovery in genome-wide association studies (GWASs), and we propose here an approach that leverages published studies of related traits to improve inference. Our Bayesian GWAS method derives informative prior effects by leveraging GWASs of related risk factors and their causal effect estimates on the focal trait using multivariable Mendelian randomization. These prior effects are combined with the observed effects to yield Bayes factors and posterior and direct effects. The approach not only increases power, but also has the potential to dissect direct and indirect biological mechanisms. Availability and implementation The bGWAS package is freely available under a GPL-2 License, and can be accessed, along with user guides and tutorials, from https://github.com/n-mounier/bGWAS. Supplementary information Supplementary data are available at Bioinformatics online.
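How a prior effect and an observed effect can be combined is sketched below in Python using a textbook normal-normal model. This is an assumption made for illustration, not the formula implemented in the bGWAS package: the observed effect is treated as normally distributed around the true effect with known standard error, the prior on the true effect is normal, and the Bayes factor compares that prior against a null of exactly zero effect. All function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(
        2 * math.pi * variance
    )


def bayes_factor(beta_hat, se, prior_mean, prior_var):
    """Marginal likelihood under the informative prior divided by the
    likelihood under a null effect of zero (assumed model, see lead-in)."""
    under_prior = normal_pdf(beta_hat, prior_mean, prior_var + se**2)
    under_null = normal_pdf(beta_hat, 0.0, se**2)
    return under_prior / under_null


def posterior_effect(beta_hat, se, prior_mean, prior_var):
    """Precision-weighted combination of prior and observed effects."""
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / se**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (
        prior_precision * prior_mean + data_precision * beta_hat
    )
    return post_mean, post_var


# Hypothetical numbers: observed effect 0.10 (SE 0.10), prior N(0.08, 0.01).
print(bayes_factor(0.10, 0.10, 0.08, 0.01))
print(posterior_effect(0.10, 0.10, 0.08, 0.01))
```

Under this model, a prior centred near the observed effect raises the Bayes factor, which is one way informative priors can increase power.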


Diabetologia ◽  
2018 ◽  
Vol 61 (5) ◽  
pp. 1098-1111 ◽  
Author(s):  
Delnaz Roshandel ◽  
Rose Gubitosi-Klug ◽  
Shelley B. Bull ◽  
Angelo J. Canty ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Lianne M. Reus ◽  
Iris E. Jansen ◽  
Merel O. Mol ◽  
Fred van Ruissen ◽  
Jeroen van Rooij ◽  
...  

Abstract Genetic factors play a major role in frontotemporal dementia (FTD). The majority of FTD cannot be genetically explained yet and it is likely that there are still FTD risk loci to be discovered. Common variants have been identified with genome-wide association studies (GWAS), but these studies have not systematically searched for rare variants. To identify rare and new common variant FTD risk loci and provide more insight into the heritability of C9ORF72-related FTD, we performed a GWAS consisting of 354 FTD patients (including and excluding N = 28 pathological repeat carriers) and 4209 control subjects. The Haplotype Reference Consortium was used as reference panel, allowing for the imputation of rare genetic variants. Two rare genetic variants near C9ORF72 were strongly associated with FTD in the discovery (rs147211831: OR = 4.8, P = 9.2 × 10⁻⁹; rs117204439: OR = 4.9, P = 6.0 × 10⁻⁹) and replication analysis (P < 1.1 × 10⁻³). These variants were also significantly associated with amyotrophic lateral sclerosis in a publicly available dataset. Using haplotype analyses in 1200 individuals, we showed that these variants tag a sub-haplotype of the founder haplotype of the repeat expansion that was previously found to be present in virtually all pathological C9ORF72 G4C2 repeat lengths. This new risk haplotype was 10 times more likely to contain a C9ORF72 pathological repeat length compared to founder haplotypes without one of the two risk variants (~22% versus ~2%; P = 7.70 × 10⁻⁵⁸). In haplotypes without a pathologic expansion, the founder risk haplotype had a higher number of repeats (median = 12 repeats) compared to the founder haplotype without the risk variants (median = 8 repeats) (P = 2.05 × 10⁻²⁶⁰). In conclusion, the identified risk haplotype, which is carried by ~4% of all individuals, is a major risk factor for pathological repeat lengths of C9ORF72 G4C2. These findings strongly indicate that longer C9ORF72 repeats are unstable and more likely to convert to germline pathological C9ORF72 repeat expansions.


Author(s):  
Tim B Bigdeli ◽  
Ayman H Fanous ◽  
Yuli Li ◽  
Nallakkandi Rajeevan ◽  
Frederick Sayward ◽  
...  

Abstract Background Schizophrenia (SCZ) and bipolar disorder (BIP) are debilitating neuropsychiatric disorders, collectively affecting 2% of the world’s population. Recognizing the major impact of these psychiatric disorders on the psychosocial function of more than 200 000 US Veterans, the Department of Veterans Affairs (VA) recently completed genotyping of more than 8000 veterans with SCZ and BIP in the Cooperative Studies Program (CSP) #572. Methods We performed genome-wide association studies (GWAS) in CSP #572 and benchmarked the predictive value of polygenic risk scores (PRS) constructed from published findings. We combined our results with available summary statistics from several recent GWAS, realizing the largest and most diverse studies of these disorders to date. Results Our primary GWAS uncovered new associations between CHD7 variants and SCZ, and novel BIP associations with variants in Sortilin Related VPS10 Domain Containing Receptor 3 (SORCS3) and downstream of PCDH11X. Combining our results with published summary statistics for SCZ yielded 39 novel susceptibility loci including CRHR1, and we identified 10 additional findings for BIP (28 326 cases and 90 570 controls). PRS trained on published GWAS were significantly associated with case-control status among European American (P < 10⁻³⁰) and African American (P < .0005) participants in CSP #572. Conclusions We have demonstrated that published findings for SCZ and BIP are robustly generalizable to a diverse cohort of US veterans. Leveraging available summary statistics from GWAS of global populations, we report 52 new susceptibility loci and improved fine-mapping resolution for dozens of previously reported associations.
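A polygenic risk score of the kind benchmarked here is, in its basic form, a weighted sum of risk-allele counts, with weights taken from a published GWAS. The Python sketch below shows only that basic form; the function name, variant identifiers, weights, and dosages are hypothetical, and it omits steps a real analysis would need (variant selection, LD handling, ancestry adjustment), so it should not be read as the procedure used in CSP #572.

```python
def polygenic_risk_score(dosages, weights):
    """Sum of effect-size weights multiplied by risk-allele dosages.

    Variants without a published weight are skipped.
    """
    return sum(
        weights[variant] * dosage
        for variant, dosage in dosages.items()
        if variant in weights
    )


# Hypothetical per-allele effect sizes (e.g. log odds ratios) from a GWAS.
weights = {"rs1": 0.20, "rs2": -0.10, "rs3": 0.05}
# Hypothetical risk-allele counts (0, 1 or 2) for one individual.
dosages = {"rs1": 2, "rs2": 1, "rs4": 1}

score = polygenic_risk_score(dosages, weights)
print(score)
```

Scores computed this way for cases and controls can then be compared, for example with logistic regression, to test association with case-control status.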

