nMAGMA: a network-enhanced method for inferring risk genes from GWAS summary statistics and its application to schizophrenia

Author(s):  
Anyi Yang ◽  
Jingqi Chen ◽  
Xing-Ming Zhao

Abstract Motivation: Annotating genetic variants from summary statistics of genome-wide association studies (GWAS) is crucial for predicting risk genes of various disorders. The multimarker analysis of genomic annotation (MAGMA) is one of the most popular tools for this purpose; it aggregates signals of single nucleotide polymorphisms (SNPs) to their nearby genes. However, SNPs may also affect genes that are far away in the genome, and such genes are missed by MAGMA. Although different upgrades of MAGMA have been proposed to extend gene-wise variant annotations with more information (e.g. Hi-C or eQTL), the regulatory relationships among genes and the tissue specificity of signals have not been taken into account. Results: We propose a new approach, namely network-enhanced MAGMA (nMAGMA), for gene-wise annotation of variants from GWAS summary statistics. Compared with MAGMA and H-MAGMA, nMAGMA significantly extends the lists of genes that can be annotated to SNPs by integrating local signals, long-range regulation signals (i.e. interactions between distal DNA elements), and tissue-specific gene networks. When applied to schizophrenia (SCZ), nMAGMA detects more SCZ risk genes than either method (217% more than MAGMA and 57% more than H-MAGMA), and more of the nMAGMA results can be validated with known SCZ risk genes. Some disease-related functions (e.g. the ATPase pathway in cortex) are also uncovered by nMAGMA but not by MAGMA or H-MAGMA. Moreover, nMAGMA provides tissue-specific risk signals, which are useful for understanding disorders with multitissue origins.
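The three annotation sources named in the abstract (local signals, long-range regulation, and tissue-specific gene networks) can be illustrated with a small sketch. This is not the nMAGMA implementation: the function name `annotate_snps`, the 10 kb default window, the `distal_links` tuple format, and the one-hop propagation over network edges are all assumptions made for illustration.

```python
from collections import defaultdict


def annotate_snps(snps, genes, window=10_000, distal_links=(), network=None):
    """Return a dict mapping gene -> set of SNP ids annotated to it.

    snps:         dict snp_id -> (chrom, position)
    genes:        dict gene -> (chrom, start, end)
    distal_links: iterable of (chrom, region_start, region_end, gene);
                  hypothetical long-range contacts, e.g. derived from Hi-C
    network:      dict gene -> iterable of neighbour genes in a
                  (hypothetical) tissue-specific gene network
    """
    gene_to_snps = defaultdict(set)

    # Step 1: local signals. A SNP is assigned to every gene whose body,
    # padded by `window` base pairs on each side, contains it.
    for snp, (chrom, pos) in snps.items():
        for gene, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start - window <= pos <= end + window:
                gene_to_snps[gene].add(snp)

    # Step 2: long-range signals. A SNP inside a distal region is also
    # assigned to the gene that region is linked to.
    for snp, (chrom, pos) in snps.items():
        for l_chrom, r_start, r_end, gene in distal_links:
            if chrom == l_chrom and r_start <= pos <= r_end:
                gene_to_snps[gene].add(snp)

    # Step 3: network signals. SNPs are propagated one hop along network
    # edges; a snapshot is used so propagation does not cascade further.
    if network:
        snapshot = {g: set(s) for g, s in gene_to_snps.items()}
        for gene, neighbours in network.items():
            for neighbour in neighbours:
                gene_to_snps[neighbour] |= snapshot.get(gene, set())

    # Drop genes that ended up with no SNPs.
    return {g: s for g, s in gene_to_snps.items() if s}
```

With only the first step, the result corresponds to a purely window-based annotation; each further step can only add genes or SNPs, which matches the abstract's claim that the extended annotation covers more genes.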

2020 ◽  
Author(s):  
Anyi Yang ◽  
Jingqi Chen ◽  
Xing-Ming Zhao

Abstract Motivation: Annotating genetic variants from summary statistics of genome-wide association studies (GWAS) is crucial for predicting risk genes of various disorders. The multi-marker analysis of genomic annotation (MAGMA) is one of the most popular tools for this purpose; it aggregates signals of single nucleotide polymorphisms (SNPs) to their nearby genes. However, SNPs may also affect genes at a distance, and such genes are missed by MAGMA. Although different upgrades of MAGMA have been proposed to extend gene-wise variant annotations with more information (e.g. Hi-C or eQTL), the regulatory relationships among genes and the tissue specificity of signals have not been taken into account. Results: We propose a new approach, namely network-enhanced MAGMA (nMAGMA), for gene-wise annotation of variants from GWAS summary statistics. Compared with MAGMA and H-MAGMA, nMAGMA significantly extends the lists of genes that can be annotated to SNPs by integrating local signals, long-range regulation signals, and tissue-specific gene networks. When applied to schizophrenia, nMAGMA detects more risk genes that are plausibly involved in schizophrenia than either method (217% more than MAGMA and 57% more than H-MAGMA). Some disease-related functions (e.g. the ATPase pathway in cortex tissues) are also uncovered by nMAGMA but not by MAGMA or H-MAGMA. Moreover, nMAGMA provides tissue-specific risk signals, which are useful for understanding disorders with multi-tissue origins.


2018 ◽  
Author(s):  
Yang Luo ◽  
Xinyi Li ◽  
Xin Wang ◽  
Steven Gazal ◽  
Josep Maria Mercader ◽  
...  

Abstract The increasing size and diversity of genome-wide association studies provide an exciting opportunity to study how the genetics of complex traits vary among diverse populations. Here, we introduce covariate-adjusted LD score regression (cov-LDSC), a method to accurately estimate genetic heritability and its enrichment in both homogeneous and admixed populations with summary statistics and in-sample LD estimates. In-sample LD can be estimated from a subset of the GWAS samples, allowing our method to be applied efficiently to very large cohorts. In simulations, we show that unadjusted LDSC underestimates heritability by 10%–60% in admixed populations; in contrast, cov-LDSC is robust to all simulation parameters. We apply cov-LDSC to genotyping data from approximately 170,000 Latino, 47,000 African American and 135,000 European individuals. We estimate heritability and detect heritability enrichment in three quantitative and five dichotomous phenotypes, making this, to our knowledge, the most comprehensive heritability-based analysis of admixed individuals. Our results show that most traits have highly concordant heritability and consistent tissue-specific heritability enrichment among different populations. However, for age at menarche, we observe population-specific heritability estimates. We observe consistent patterns of tissue-specific heritability enrichment across populations; for example, in the limbic system for BMI, the per-standardized-annotation effect size τ* is 0.16 ± 0.04, 0.28 ± 0.11 and 0.18 ± 0.03 in Latino, African American and European populations respectively. Our results demonstrate that our approach is a powerful way to analyze genetic data for complex traits from underrepresented populations.

Author summary: Admixed populations such as African Americans and Hispanic Americans bear a disproportionately high burden of disease but remain underrepresented in current genetic studies. It is important to extend current methodological advancements for understanding the genetic basis of complex traits in homogeneous populations to individuals with admixed genetic backgrounds. Here, we develop a computationally efficient method to answer two specific questions. First, does genetic variation contribute the same amount of phenotypic variation (heritability) across diverse populations? Second, are the genetic mechanisms shared among different populations? To answer these questions, we use our novel method to conduct the first comprehensive heritability-based analysis of a large number of admixed individuals. We show that there is a high degree of concordance in total heritability and tissue-specific enrichment between different ancestral groups. However, traits such as age at menarche show noticeable differences among populations. Our work provides a powerful way to analyze genetic data in admixed populations and may contribute to the applicability of genomic medicine to admixed population groups.
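The regression underlying LD score regression can be sketched in a few lines. The sketch assumes the standard LDSC relation E[χ²_j] = intercept + (N·h²/M)·ℓ_j, where ℓ_j is the LD score of SNP j, N the sample size and M the number of SNPs. It is a plain unweighted least-squares fit, not cov-LDSC itself: the covariate adjustment described in the abstract concerns how the LD scores are estimated, and here they are simply taken as given. The function name `ldsc_heritability` is hypothetical, and the regression weights and block-jackknife standard errors used by real implementations are omitted.

```python
def ldsc_heritability(chi2, ld_scores, n_samples, n_snps):
    """Estimate (h2, intercept) by regressing chi-square statistics on LD scores.

    Assumes E[chi2_j] = intercept + (n_samples * h2 / n_snps) * ld_score_j,
    so h2 is recovered from the fitted slope.
    """
    n = len(chi2)
    if n != len(ld_scores) or n < 2:
        raise ValueError("need at least two SNPs with matching LD scores")

    mean_l = sum(ld_scores) / n
    mean_c = sum(chi2) / n

    # Ordinary least squares for a single predictor.
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    if sxx == 0:
        raise ValueError("LD scores must not all be equal")
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))

    slope = sxy / sxx
    intercept = mean_c - slope * mean_l

    # slope = N * h2 / M, hence h2 = slope * M / N.
    h2 = slope * n_snps / n_samples
    return h2, intercept
```

If the LD scores passed in are biased, for example because long-range admixture LD was not accounted for, the fitted slope and therefore the heritability estimate will be biased as well; that is the kind of problem the abstract reports for unadjusted LDSC in admixed populations.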


2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. 986-986
Author(s):  
Yury Loika ◽  
Elena Loiko ◽  
Irina Culminskaya ◽  
Alexander Kulminski

Abstract Epidemiological studies report beneficial associations of higher educational attainment (EDU) with Alzheimer’s disease (AD). Prior genome-wide association studies (GWAS) also reported variants associated with AD and EDU separately. The analysis of pleiotropic predisposition to these phenotypes may shed light on EDU-related protection against AD. We examined pleiotropic predisposition to AD and EDU using Fisher’s method and omnibus test applied to summary statistics for single nucleotide polymorphisms (SNPs) associated with AD and EDU in large-scale univariate GWAS at suggestive-effect (5×10-8
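Fisher's method, named in the abstract above, combines k p-values into the statistic −2·Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis when the tests are independent. The sketch below is a generic illustration rather than the authors' pipeline; the function name `fisher_combined_p` is hypothetical, and the independence assumption may not hold if the two GWAS share samples.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) is chi-square distributed with 2k
    degrees of freedom under the null. For even degrees of freedom the
    survival function has the closed form
        P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    so no statistics library is needed.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # equals X / 2

    # Accumulate the finite series term by term.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term

    return math.exp(-half) * total
```

For a SNP with p-values for two traits, `fisher_combined_p([p_trait1, p_trait2])` returns the combined p-value; with a single input the function returns that p-value unchanged.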


2017 ◽  
Author(s):  
Max Lam ◽  
Joey W. Trampush ◽  
Jin Yu ◽  
Emma Knowles ◽  
Gail Davies ◽  
...  

Abstract Neurocognitive ability is a fundamental readout of brain function, and cognitive deficits are a critical component of neuropsychiatric disorders, yet neurocognition is poorly understood at the molecular level. Here, we present the largest genome-wide association study (GWAS) of cognitive ability to date (N=107,207), and further enhance signal by combining results with a large-scale GWAS of educational attainment. We identified 70 independent genomic loci associated with cognitive ability, 34 of which were novel. A total of 350 genes were implicated, and this list showed significant enrichment for genes associated with Mendelian disorders with an intellectual disability phenotype. Competitive pathway analysis of gene results implicated the biological process of neurogenesis, as well as the gene targets of two pharmacologic agents: cinnarizine, a T-type calcium channel blocker; and LY97241, a potassium channel inhibitor. Transcriptome-wide analysis revealed that the implicated genes were strongly expressed in neurons, but not astrocytes or oligodendrocytes, and were more strongly associated with fetal brain expression than adult brain expression. Several tissue-specific gene expression relationships to cognitive ability were observed (for example, DAG1 levels in the hippocampus). Finally, we report novel genetic correlations between cognitive ability and disparate phenotypes such as maternal age at first birth and number of children, as well as several autoimmune disorders.


BMC Medicine ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Haojie Lu ◽  
Jiahao Qiao ◽  
Zhonghe Shao ◽  
Ting Wang ◽  
Shuiping Huang ◽  
...  

Abstract Background: Recent genome-wide association studies (GWASs) have revealed the polygenic nature of psychiatric disorders and discovered a few single-nucleotide polymorphisms (SNPs) associated with multiple psychiatric disorders. However, the extent and pattern of pleiotropy among distinct psychiatric disorders remain incompletely understood. Methods: We analyzed 14 psychiatric disorders using summary statistics available from the largest GWASs to date. We first applied cross-trait linkage disequilibrium score regression (LDSC) to estimate genetic correlation between disorders. Then, we performed a gene-based pleiotropy analysis by first aggregating a set of SNP-level associations into a single gene-level association signal using MAGMA. From a methodological perspective, we viewed the identification of pleiotropic associations across the entire genome as a high-dimensional problem of composite null hypothesis testing and utilized a novel method called PLACO for pleiotropy mapping. We ultimately implemented functional analysis for identified pleiotropic genes and used Mendelian randomization for detecting causal association between these disorders. Results: We confirmed extensive genetic correlation among psychiatric disorders, based on which these disorders can be grouped into three diverse categories. We detected a large number of pleiotropic genes, including 5884 associations and 2424 unique genes, and found that differentially expressed pleiotropic genes were significantly enriched in pancreas, liver, heart, and brain, and that the biological processes of these genes were remarkably enriched in regulating neurodevelopment, neurogenesis, and neuron differentiation, offering substantial evidence supporting the validity of the identified pleiotropic loci. We further demonstrated that among all the identified pleiotropic genes there were 342 unique ones linked with 6353 drugs through drug-gene interactions, which can be classified into distinct types including inhibitor, agonist, blocker, antagonist, and modulator. We also revealed causal associations among psychiatric disorders, indicating that genetic overlap and causality commonly drove the observed co-existence of these disorders. Conclusions: Our study is among the first large-scale efforts to characterize gene-level pleiotropy among a greatly expanded set of psychiatric disorders and provides important insight into the shared genetic etiology underlying these disorders. The findings would inform psychiatric nosology, identify potential neurobiological mechanisms predisposing to specific clinical presentations, and pave the way to effective drug targets for clinical treatment.


2020 ◽  
Author(s):  
Diptavo Dutta ◽  
Yuan He ◽  
Ashis Saha ◽  
Marios Arvanitis ◽  
Alexis Battle ◽  
...  

Abstract Large-scale genetic association studies have identified many trait-associated variants, and understanding the role of these variants in the downstream regulation of gene expression can uncover important mediating biological mechanisms. In this study, we propose Aggregative tRans assoCiation to detect pHenotype specIfic gEne-sets (ARCHIE), a method to establish links between sets of known genetic variants associated with a trait and sets of co-regulated gene expressions through trans associations. ARCHIE employs sparse canonical correlation analysis based on summary statistics from trans-eQTL mapping and genotype and expression correlation matrices constructed from external data sources. We propose a resampling-based procedure to test for significant trait-specific trans-association patterns in the background of highly polygenic regulation of gene expression. By applying ARCHIE to available trans-eQTL summary statistics reported by the eQTLGen consortium, we identify 71 gene networks which have significant evidence of trans-association with groups of known genetic variants across 29 complex traits. A majority (50.7%) of the genes do not have any strong trans-associations and could not have been detected by standard trans-eQTL mapping. We provide further evidence for a causal basis of the target genes through a series of follow-up analyses. These results show ARCHIE is a powerful tool for identifying sets of genes whose trans regulation may be related to specific complex traits.


2016 ◽  
Author(s):  
Yue Li ◽  
Manolis Kellis

Genome-wide association studies (GWAS) provide a powerful approach for uncovering disease-associated variants in humans, but fine-mapping the causal variants remains a challenge. This is partly remedied by prioritization of disease-associated variants that overlap GWAS-enriched epigenomic annotations. Here, we introduce a new Bayesian model, RiVIERA-beta (Risk Variant Inference using Epigenomic Reference Annotations), for inference of driver variants by modelling summary-statistic p-values with a Beta density function across multiple traits using hundreds of epigenomic annotations. In simulations, RiVIERA-beta showed promising power in detecting causal variants and causal annotations, and multi-trait joint inference further improved the detection power. We applied RiVIERA-beta to model the existing GWAS summary statistics of 9 autoimmune diseases and schizophrenia by jointly harnessing the potential causal enrichments among 848 tissue-specific epigenomic annotations from the ENCODE/Roadmap consortium covering 127 cell/tissue types and 8 major epigenomic marks. RiVIERA-beta identified meaningful tissue-specific enrichments for enhancer regions defined by H3K4me1 and H3K27ac for Blood T-Cell specifically in the 9 autoimmune diseases and brain-specific enhancer activities exclusively in schizophrenia. Moreover, the variants from the 95% credible sets exhibited high conservation and enrichments for GTEx whole-blood eQTLs located within transcription-factor-binding sites and DNA-hypersensitive sites. Furthermore, jointly modeling the nine immune traits by simultaneously inferring and exploiting the underlying epigenomic correlation between traits further improved the functional enrichments compared to single-trait models.


2017 ◽  
Vol 121 (suppl_1) ◽  
Author(s):  
Le Shu ◽  
Yuqi Zhao ◽  
Aldons J Lusis ◽  
Ke Hao ◽  
Thomas Quertermous ◽  
...  

Insulin resistance (IR) is a critical pathogenic factor for highly prevalent modern cardiometabolic diseases, including coronary artery disease (CAD) and type 2 diabetes (T2D). However, the molecular circuitries underlying IR remain to be elucidated. The GENEticS of Insulin Sensitivity Consortium (GENESIS) conducted genome-wide association studies (GWAS) for direct measures of IR using the euglycemic clamp or insulin suppression test. We sought to identify gene networks and their key intervening drivers for IR by performing a comprehensive integrative analysis leveraging GWAS data from seven GENESIS cohorts representing three ethnic groups (Europeans, Asians and Hispanics), along with expression quantitative trait loci, ENCODE, and tissue-specific gene network models (both co-expression and graphical models) from IR-relevant tissues. Integration of the multi-ethnic GWAS with diverse functional genomics information captured shared IR pathways and networks across ethnicities that are independent of body mass index, including GLUT4 translocation regulation, insulin signaling, MAPK signaling, interleukin signaling, extracellular matrix, branched-chain amino acid metabolism, cell cycle, and oxidative phosphorylation. Further integration of these GWAS-informed IR processes with graphical gene networks uncovered potential key regulators including HADH, COX5A, VCAN and TOP2A, whose network neighbors are consistently enriched for the genetic association signals of IR across ethnicities, and show significant correlation with IR, fasting glucose and insulin levels in the transcriptome-wide association data from a Hybrid Mouse Diversity Panel comprised of >100 strains fed with a high-fat diet. Findings from this in-depth assessment of genetic and functional data from multiple human cohorts provide new understanding of the pathways, gene networks and potential regulators contributing to IR. These results will also facilitate future functional investigations to unveil how DNA variations translate into IR.


2015 ◽  
Author(s):  
James Liley ◽  
Chris Wallace

Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations which do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm, and establishes robust limits on the expected FDR. This approach can improve SNP detection in GWAS, and give insight into shared aetiology between phenotypically related conditions.
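The basic empirical cFDR estimate can be sketched as follows. It assumes the commonly used estimator cFDR(p_i | p_j) ≈ p_i / P̂(P_i ≤ p_i | P_j ≤ p_j), with the conditional probability replaced by its empirical frequency across SNPs. The sketch does not include the shared-control adjustment or the bound over unions of SNP sets that the abstract describes, and the function name `conditional_fdr` is hypothetical.

```python
def conditional_fdr(p_principal, p_conditional):
    """Empirical cFDR estimate for every SNP.

    p_principal[k]   : p-value of SNP k for the trait of interest
    p_conditional[k] : p-value of SNP k for the related trait

    For SNP k with p-values (pi, pj) the estimate is
        pi * #{SNPs with P_j <= pj} / #{SNPs with P_i <= pi and P_j <= pj}
    capped at 1. The loop is quadratic in the number of SNPs, which is
    fine for illustration but too slow for genome-wide data.
    """
    if len(p_principal) != len(p_conditional):
        raise ValueError("both p-value lists must have the same length")

    pairs = list(zip(p_principal, p_conditional))
    estimates = []
    for pi, pj in pairs:
        n_cond = sum(1 for _, qj in pairs if qj <= pj)
        # n_both is at least 1 because the SNP itself always qualifies.
        n_both = sum(1 for qi, qj in pairs if qi <= pi and qj <= pj)
        estimates.append(min(1.0, pi * n_cond / n_both))
    return estimates
```

SNPs whose estimate falls below a chosen threshold would be declared associated with the principal trait; as the abstract notes, the expected FDR over the union of several such SNP sets may exceed the threshold, so this raw estimate should not be read as a guaranteed error rate.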


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Bo He ◽  
Chao Zhang ◽  
Xiaoxue Zhang ◽  
Yu Fan ◽  
Hu Zeng ◽  
...  

Abstract 5-Hydroxymethylcytosine (5hmC) is an important epigenetic mark that regulates gene expression. Charting the landscape of 5hmC in human tissues is fundamental to understanding its regulatory functions. Here, we systematically profiled the whole-genome 5hmC landscape at single-base resolution for 19 types of human tissues. We found that 5hmC preferentially decorates gene bodies and outperforms gene body 5mC in reflecting gene expression. Approximately one-third of 5hmC peaks are tissue-specific differentially-hydroxymethylated regions (tsDhMRs), which are deposited in regions that potentially regulate the expression of nearby tissue-specific functional genes. In addition, tsDhMRs are enriched with tissue-specific transcription factors and may rewire tissue-specific gene expression networks. Moreover, tsDhMRs are associated with single-nucleotide polymorphisms identified by genome-wide association studies and are linked to tissue-specific phenotypes and diseases. Collectively, our results show the tissue-specific 5hmC landscape of the human genome and demonstrate that 5hmC serves as a fundamental regulatory element affecting tissue-specific gene expression programs and functions.

