Addressing the Challenges of Detecting Epistasis in Genome-Wide Association Studies of Common Human Diseases Using Biological Expert Knowledge

2013 ◽  
pp. 725-744
Author(s):  
Kristine A. Pattin ◽  
Jason H. Moore

Recent technological developments in the field of genetics have given rise to an abundance of research tools, such as genome-wide genotyping, that allow researchers to conduct genome-wide association studies (GWAS) for detecting genetic variants that confer increased or decreased susceptibility to disease. However, discovering epistatic, or gene-gene, interactions in high dimensional datasets is a problem due to the computational complexity that results from the analysis of all possible combinations of single-nucleotide polymorphisms (SNPs). A recently explored approach to this problem employs biological expert knowledge, such as pathway or protein-protein interaction information, to guide an analysis by the selection or weighting of SNPs based on this knowledge. Narrowing the evaluation to gene combinations that have been shown to interact experimentally provides a biologically concise reason why those two genes may be detected together statistically. This chapter discusses the challenges of discovering epistatic interactions in GWAS and how biological expert knowledge can be used to facilitate genome-wide genetic studies.
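To make the scale of the problem concrete, the following is a minimal sketch, not the chapter's own method. It counts the exhaustive number of SNP combinations and then restricts the candidates to SNP pairs whose genes appear in a list of known interactions. The function names, the `snp_to_gene` mapping, and the toy interaction list are all hypothetical and serve only as illustration.

```python
from itertools import combinations
from math import comb


def count_snp_combinations(n_snps: int, order: int = 2) -> int:
    """Number of distinct SNP combinations of the given order (n choose k)."""
    return comb(n_snps, order)


def knowledge_guided_pairs(snp_to_gene, interacting_gene_pairs):
    """Return SNP pairs whose genes form a known interacting pair.

    snp_to_gene: dict mapping a SNP id to a gene name (hypothetical annotation).
    interacting_gene_pairs: iterable of (gene, gene) tuples, e.g. taken from a
    pathway or protein-protein interaction resource (assumed input).
    """
    known = {frozenset(pair) for pair in interacting_gene_pairs}
    selected = []
    for snp_a, snp_b in combinations(sorted(snp_to_gene), 2):
        genes = frozenset((snp_to_gene[snp_a], snp_to_gene[snp_b]))
        if genes in known:
            selected.append((snp_a, snp_b))
    return selected


if __name__ == "__main__":
    # Exhaustive pairwise search over 500,000 SNPs
    print(count_snp_combinations(500_000, 2))
    # Toy example: only SNP pairs spanning the G1-G2 interaction are kept
    toy_map = {"rs1": "G1", "rs2": "G2", "rs3": "G3", "rs4": "G2"}
    print(knowledge_guided_pairs(toy_map, [("G1", "G2")]))
```

In this toy case the filter keeps two of the six possible pairs. A weighting scheme, as opposed to hard selection, would instead assign each pair a score derived from the same knowledge source.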


Author(s):  
Ting-Hao Chen ◽  
Chen-Cheng Yang ◽  
Kuei-Hau Luo ◽  
Chia-Yen Dai ◽  
Yao-Chung Chuang ◽  
...  

Aluminum (Al) toxicity is related to renal failure and the failure of other systems. Although genome-wide association studies (GWAS) have been conducted in Australia and England, to our knowledge none has been conducted in Han Chinese populations. This research therefore used whole-genome genotypes from the Taiwan Biobank to explore the association between plasma Al concentrations and renal function. Participants, who underwent questionnaire interviews, biomarker measurement, and genotyping, were drawn from the Taiwan Biobank database. We then measured their plasma Al concentrations by ICP-MS in the laboratory at Kaohsiung Medical University. We used these data in genome-wide association (GWA) tests to look for candidate genes and to relate plasma Al concentration to renal function. Furthermore, we examined the path relationship between single nucleotide polymorphisms (SNPs), Al concentrations, and estimated glomerular filtration rate (eGFR) through mediation analysis with 3000 bootstrap replications. Following the principles of GWAS, we focused on three SNPs within the dipeptidyl peptidase-like protein 6 (DPP6) gene on chromosome 7: rs10224371, rs2316242, and rs10268004. The mediation analysis showed that all of the selected SNPs indirectly affected eGFR through the mediation of Al concentrations. Our analysis revealed an association between DPP6 SNPs, plasma Al concentrations, and eGFR. However, further longitudinal studies and mechanistic research are needed. Nevertheless, this is the first study to explore how DPP6 SNPs and plasma Al relate to eGFR.
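The bootstrap mediation step can be illustrated with a minimal sketch. This is not the authors' code. It assumes a simple product-of-coefficients model with no covariates, where `x` is a SNP genotype, `m` the mediator (plasma Al), and `y` the outcome (eGFR). The function names and the percentile confidence interval are illustrative choices; only the default of 3000 replications comes from the abstract.

```python
import random
from statistics import fmean


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Product-of-coefficients indirect effect a*b.

    a: slope of m regressed on x.
    b: coefficient of m when y is regressed on x and m together.
    """
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=3000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        try:
            estimates.append(indirect_effect(xs, ms, ys))
        except ZeroDivisionError:
            # Skip degenerate resamples (e.g. every genotype identical)
            continue
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper
```

Under this sketch, an interval that excludes zero would be read as evidence of an indirect effect of the SNP on the outcome through the mediator. A real analysis would normally also adjust for covariates such as age and sex.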


2020 ◽  
Author(s):  
Yanjiao Jin ◽  
Jie Yang ◽  
Shuyue Zhang ◽  
Jin Li ◽  
Songlin Wang

Abstract Background: Oral diseases impact the majority of the world’s population. The following traits are common in oral inflammatory diseases: mouth ulcers, painful gums, bleeding gums, loose teeth, and toothache. Despite the prevalence of genome-wide association studies, the associations between these traits and common genomic variants, and whether pleiotropic loci are shared by some of these traits, remain poorly understood. Methods: In this work, we conducted multi-trait joint analyses based on the summary statistics of genome-wide association studies of these five oral inflammatory traits from the UK Biobank, each of which comprises over 10,000 cases and over 300,000 controls. We estimated the genetic correlations between the five traits. We conducted fine-mapping and functional annotation based on multi-omics data to better understand the biological functions of the potential causal variants at each locus. To identify the pathways in which the candidate genes were mainly involved, we applied gene-set enrichment analysis, and further performed protein-protein interaction (PPI) analyses. Results: We identified 39 association signals that surpassed genome-wide significance, including three that were shared between two or more oral inflammatory traits, consistent with a strong correlation. Among these genome-wide significant loci, two were novel for both painful gums and toothache. We performed fine-mapping and identified causal variants at each novel locus. Further functional annotation based on multi-omics data suggested IL10 and IL12A/TRIM59 as potential candidate genes at the novel pleiotropic loci, respectively. Subsequent analyses of pathway enrichment and protein-protein interaction networks suggested the involvement of candidate genes at genome-wide significant loci in immune regulation. Conclusions: Our results highlighted the importance of immune regulation in the pathogenesis of oral inflammatory diseases. Some common immune-related pleiotropic loci or genetic variants are shared by multiple oral inflammatory traits. These findings will be beneficial for risk prediction, prevention, and therapy of oral inflammatory diseases.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Guomin Zhang ◽  
Rongsheng Wang ◽  
Juntao Ma ◽  
Hongru Gao ◽  
Lingwei Deng ◽  
...  

Abstract Background: Heilongjiang Province is a high-quality japonica rice cultivation area in China. One in ten bowls of Chinese rice is produced here. Increasing yield is one of the main aims of rice production in this area. However, yield is a complex quantitative trait composed of many factors. The purpose of this study was to determine how many genetic loci are associated with yield-related traits. Genome-wide association studies (GWAS) were performed on 450 accessions collected from northeast Asia, including Russia, Korea, Japan and Heilongjiang Province of China. These accessions consist of elite varieties and landraces introduced into Heilongjiang Province decades ago. Results: After resequencing of the 450 accessions, 189,019 single nucleotide polymorphisms (SNPs) were used for association studies with two different models, a general linear model (GLM) and a mixed linear model (MLM), examining four traits: days to heading (DH), plant height (PH), panicle weight (PW) and tiller number (TI). Over 25 SNPs were found to be associated with each trait. Among them, 22 SNPs were selected to identify candidate genes, and 2, 8, 1 and 11 SNPs were found to be located in 3′ UTR, intron, coding and intergenic regions, respectively. Conclusions: All SNPs detected in this research may become candidates for further fine mapping and may be used in the molecular breeding of high-latitude rice.


2020 ◽  
Vol 117 (21) ◽  
pp. 11608-11613 ◽  
Author(s):  
Marcelo Blatt ◽  
Alexander Gusev ◽  
Yuriy Polyakov ◽  
Shafi Goldwasser

Genome-wide association studies (GWASs) seek to identify genetic variants associated with a trait, and have been a powerful approach for understanding complex diseases. A critical challenge for GWASs has been the dependence on individual-level data that typically have strict privacy requirements, creating an urgent need for methods that preserve the individual-level privacy of participants. Here, we present a privacy-preserving framework based on several advances in homomorphic encryption and demonstrate that it can perform an accurate GWAS analysis for a real dataset of more than 25,000 individuals, keeping all individual data encrypted and requiring no user interactions. Our extrapolations show that it can evaluate GWASs of 100,000 individuals and 500,000 single-nucleotide polymorphisms (SNPs) in 5.6 h on a single server node (or in 11 min on 31 server nodes running in parallel). Our performance results are more than one order of magnitude faster than prior state-of-the-art results using secure multiparty computation, which requires continuous user interactions, with the accuracy of both solutions being similar. Our homomorphic encryption advances can also be applied to other domains where large-scale statistical analyses over encrypted data are needed.
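The two runtimes quoted above agree with each other under near-linear scaling across nodes. A quick check, assuming ideal parallel speedup (this is an illustrative assumption; the abstract does not state the scaling model):

```python
def ideal_parallel_minutes(single_node_hours: float, n_nodes: int) -> float:
    """Wall-clock minutes if the work divides evenly across n_nodes."""
    return single_node_hours * 60.0 / n_nodes


if __name__ == "__main__":
    # 5.6 h on one node spread over 31 nodes, in minutes
    print(round(ideal_parallel_minutes(5.6, 31), 1))
```

At 5.6 h on one node, an even split over 31 nodes comes to just under 11 minutes, matching the reported figure.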


2020 ◽  
Vol 116 (9) ◽  
pp. 1620-1634
Author(s):  
Charlotte Glinge ◽  
Najim Lahrouchi ◽  
Reza Jabbari ◽  
Jacob Tfelt-Hansen ◽  
Connie R Bezzina

Abstract The genetic basis of cardiac electrical phenotypes has in the last 25 years been the subject of intense investigation. While in the first years, such efforts were dominated by the study of familial arrhythmia syndromes, in recent years, large consortia of investigators have successfully pursued genome-wide association studies (GWAS) for the identification of single-nucleotide polymorphisms that govern inter-individual variability in electrocardiographic parameters in the general population. We here provide a review of GWAS conducted on cardiac electrical phenotypes in the last 14 years and discuss the implications of these discoveries for our understanding of the genetic basis of disease susceptibility and variability in disease severity. Furthermore, we review functional follow-up studies that have been conducted on GWAS loci associated with cardiac electrical phenotypes and highlight the challenges and opportunities offered by such studies.


2015 ◽  
Vol 75 (4) ◽  
pp. 652-659 ◽  
Author(s):  
Hirotaka Matsuo ◽  
Ken Yamamoto ◽  
Hirofumi Nakaoka ◽  
Akiyoshi Nakayama ◽  
Masayuki Sakiyama ◽  
...  

Objective: Gout, caused by hyperuricaemia, is a multifactorial disease. Although genome-wide association studies (GWASs) of gout have been reported, they included self-reported gout cases in which clinical information was insufficient. Therefore, the relationship between genetic variation and clinical subtypes of gout remains unclear. Here, we first performed a GWAS of clinically defined gout cases only. Methods: A GWAS was conducted with 945 patients with clinically defined gout and 1213 controls in a Japanese male population, followed by a replication study of 1048 clinically defined cases and 1334 controls. Results: Five gout susceptibility loci were identified at the genome-wide significance level (p<5.0×10⁻⁸), which contained well-known urate transporter genes (ABCG2 and SLC2A9) and additional genes: rs1260326 (p=1.9×10⁻¹²; OR=1.36) of GCKR (a gene for glucose and lipid metabolism), rs2188380 (p=1.6×10⁻²³; OR=1.75) of MYL2-CUX2 (genes associated with cholesterol and diabetes mellitus) and rs4073582 (p=6.4×10⁻⁹; OR=1.66) of CNIH-2 (a gene for regulation of glutamate signalling). The latter two are identified as novel gout loci. Furthermore, among the identified single-nucleotide polymorphisms (SNPs), we demonstrated that the SNPs of ABCG2 and SLC2A9 were differentially associated with types of gout and clinical parameters underlying specific subtypes (renal underexcretion type and renal overload type). The effect of the risk allele of each SNP on clinical parameters showed significant linear relationships with the ratio of the case–control ORs for two distinct types of gout (r=0.96 [p=4.8×10⁻⁴] for urate clearance and r=0.96 [p=5.0×10⁻⁴] for urinary urate excretion). Conclusions: Our findings provide clues to better understand the pathogenesis of gout and will be useful for development of companion diagnostics.


Author(s):  
Yingjie Guo ◽  
Chenxi Wu ◽  
Zhian Yuan ◽  
Yansu Wang ◽  
Zhen Liang ◽  
...  

Among the many statistical methods that identify gene–gene interactions in genome-wide association studies of qualitative traits, gene-based methods are not only statistically powerful but also biologically interpretable. However, their detection power is limited by the assumptions they make about the association between traits and single nucleotide polymorphisms. This article therefore proposes a gene-based method derived from XGBoost (GGInt-XGBoost). Assuming that the log odds ratio of a disease trait is additive when a pair of genes does not interact, the difference in error between XGBoost models fitted with and without an additive constraint can indicate a gene–gene interaction; we then used a permutation-based statistical test to assess this difference and to provide a p-value for the significance of the interaction. Experimental results on both simulated and real data showed that our approach outperformed previous methods in detecting gene–gene interactions.

