Challenges and progress in interpretation of non-coding genetic variants associated with human disease

2017 ◽  
Vol 242 (13) ◽  
pp. 1325-1334 ◽  
Author(s):  
Yizhou Zhu ◽  
Cagdas Tazearslan ◽  
Yousin Suh

Genome-wide association studies have shown that the vast majority of disease-associated variants reside in the non-coding regions of the genome, suggesting that gene regulatory changes contribute to disease risk. Identifying truly causal non-coding variants and their affected target genes remains challenging but is a critical step in translating genetic associations into molecular mechanisms and ultimately clinical applications. Here we review genomic/epigenomic resources and in silico tools that can be used to identify causal non-coding variants, and experimental strategies to validate their functionality.

Impact statement: Most signals from genome-wide association studies (GWASs) map to the non-coding genome, and functional interpretation of these associations remains challenging. We review recent progress in methodologies for studying the non-coding genome and argue that no single approach allows one to effectively identify the causal regulatory variants from GWAS results. By illustrating the advantages and limitations of each method, our review potentially provides a guideline for taking a combinatorial approach to accurately predict, prioritize, and eventually experimentally validate the causal variants.

2021 ◽  
Author(s):  
Steven Gazal ◽  
Omer Weissbrod ◽  
Farhad Hormozdiari ◽  
Kushal Dey ◽  
Joseph Nasser ◽  
...  

Although genome-wide association studies (GWAS) have identified thousands of disease-associated common SNPs, these SNPs generally do not implicate the underlying target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis, but it is unclear how these strategies should be applied in the context of interpreting common disease risk variants. We developed a framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk, leveraging polygenic analyses of disease heritability to define and estimate their precision and recall. We applied our framework to GWAS summary statistics for 63 diseases and complex traits (average N=314K), evaluating 50 S2G strategies. Our optimal combined S2G strategy (cS2G) included 7 constituent S2G strategies (Exon, Promoter, 2 fine-mapped cis-eQTL strategies, EpiMap enhancer-gene linking, Activity-By-Contact (ABC), and Cicero), and achieved a precision of 0.75 and a recall of 0.33, more than doubling the precision and/or recall of any individual strategy; this implies that 33% of SNP-heritability can be linked to causal genes with 75% confidence. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 7,111 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. Finally, we applied cS2G to genome-wide fine-mapping results for these traits (not restricted to GWAS loci) to rank genes by the heritability linked to each gene, providing an empirical assessment of disease omnigenicity; averaging across traits, we determined that the top 200 (1%) of ranked genes explained roughly half of the heritability linked to all genes. 
Our results highlight the benefits of our cS2G strategy in providing functional interpretation of GWAS findings; we anticipate that precision and recall will increase further under our framework as improved functional assays lead to improved S2G strategies. 


Author(s):  
V. E. Golimbet ◽  
A. K. Golov ◽  
N. V. Kondratyev

Genome-wide association studies (GWASs) discovered multiple genetic variants associated with schizophrenia. The next step (post-GWAS analysis) is aimed at identifying the causal genetic variants and biological mechanisms underlying the associations with disease risk. The following strategies are considered: the study of transcriptional regulation in neuronal human cells and the use of epigenomic information for searching for regulatory elements involved in the pathogenesis of schizophrenia. The first strategy includes identification of neuronal enhancers, mapping of potential target genes and functional confirmation of enhancer-promoter interactions. The second approach is focused on the identification of transcriptional factors, which appear to be master regulators of expression.


2020 ◽  
Author(s):  
Jingshu Wang ◽  
Qingyuan Zhao ◽  
Jack Bowden ◽  
Gilbran Hemani ◽  
George Davey Smith ◽  
...  

Over a decade of genome-wide association studies have led to the finding that significant genetic associations tend to spread across the genome for complex traits. The extreme polygenicity where "all genes affect every complex trait" complicates Mendelian Randomization studies, where natural genetic variations are used as instruments to infer the causal effect of heritable risk factors. We reexamine the assumptions of existing Mendelian Randomization methods and show how they need to be clarified to allow for pervasive horizontal pleiotropy and heterogeneous effect sizes. We propose a comprehensive framework GRAPPLE (Genome-wide mR Analysis under Pervasive PLEiotropy) to analyze the causal effect of target risk factors with heterogeneous genetic instruments and identify possible pleiotropic patterns from data. By using summary statistics from genome-wide association studies, GRAPPLE can efficiently use both strong and weak genetic instruments, detect the existence of multiple pleiotropic pathways, adjust for confounding risk factors, and determine the causal direction. With GRAPPLE, we analyze the effect of blood lipids, body mass index, and systolic blood pressure on 25 disease outcomes, gaining new information on their causal relationships and the potential pleiotropic pathways.
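For readers unfamiliar with how summary statistics feed a Mendelian Randomization estimate, the following minimal sketch shows the standard fixed-effect inverse-variance-weighted (IVW) estimator. It is not GRAPPLE and does not handle pleiotropy or weak instruments, which are exactly the limitations the abstract addresses. The function and argument names are hypothetical, and the inputs are assumed to be per-SNP effect sizes on the exposure and outcome plus the outcome standard errors.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-SNP summary statistics.

    Assumes every SNP is a valid instrument (no horizontal pleiotropy)
    and ignores uncertainty in the SNP-exposure effects.
    """
    if not beta_exposure or not (
        len(beta_exposure) == len(beta_outcome) == len(se_outcome)
    ):
        raise ValueError("inputs must be non-empty and of equal length")

    numerator = 0.0
    denominator = 0.0
    for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / sy**2          # inverse variance of the outcome effect
        numerator += weight * bx * by
        denominator += weight * bx * bx

    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Illustrative (made-up) numbers: outcome effects are exactly half the
# exposure effects, so the estimated causal effect is 0.5.
estimate, std_error = ivw_estimate(
    [0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01]
)
```

Under pervasive pleiotropy this simple estimator can be biased, which is the motivation the abstract gives for a more general framework.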


2020 ◽  
Author(s):  
Dylan M. Glubb ◽  
Deborah J. Thompson ◽  
Katja K.H. Aben ◽  
Ahmad Alsulimani ◽  
Frederic Amant ◽  
...  

Accumulating evidence suggests a relationship between endometrial cancer and epithelial ovarian cancer. For example, endometrial cancer and epithelial ovarian cancer share epidemiological risk factors, and molecular features observed across histotypes (e.g. serous, endometrioid and clear cell) are held in common. Independent genome-wide association studies (GWAS) for endometrial cancer and epithelial ovarian cancer have identified 16 and 27 risk regions, respectively, four of which overlap between the two cancers. Using GWAS summary statistics, we explored the shared genetic etiology between endometrial cancer and epithelial ovarian cancer. Genetic correlation analysis using LD Score regression revealed significant genetic correlation between the two cancers (rG = 0.43, P = 2.66 × 10⁻⁵). To identify loci associated with the risk of both cancers, we implemented a pipeline of statistical genetic analyses (i.e. inverse-variance meta-analysis, co-localization, and M-values), and performed analyses stratified by subtype. We found seven loci associated with risk for both cancers (P_Bonferroni < 2.4 × 10⁻⁹). In addition, four novel regions at 7p22.2, 7q22.1, 9p12 and 11q13.3 were identified at a sub-genome-wide threshold (P < 5 × 10⁻⁷). Integration with promoter-associated HiChIP chromatin loops from immortalized endometrium and epithelial ovarian cell lines, and expression quantitative trait loci (eQTL) data highlighted candidate target genes for further investigation.
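The inverse-variance meta-analysis step named in the abstract can be illustrated with a minimal sketch of the standard fixed-effect form for a single variant. The abstract does not say whether a fixed-effect or random-effects model was used, so the fixed-effect choice here is an assumption, and the function and variable names are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis for one variant.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns the pooled effect, its standard error, the z-score,
    and the two-sided p-value under a normal approximation.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("inputs must be non-empty and of equal length")

    weights = [1.0 / se**2 for se in ses]   # more precise studies count more
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled_beta, pooled_se, z, p_value


# Illustrative (made-up) effects of one variant in two cancer GWASs.
pooled_beta, pooled_se, z, p_value = inverse_variance_meta(
    [0.10, 0.20], [0.05, 0.05]
)
```

With equal standard errors the pooled effect is the simple average of the two study effects; with unequal standard errors the more precise study dominates.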

