DeepWAS: Multivariate genotype-phenotype associations by directly integrating regulatory information using deep learning

2016
Author(s):
Janine Arloth
Gökcen Eraslan
Till F.M. Andlauer
Jade Martins
Stella Iurato
...  

Abstract: Genome-wide association studies (GWAS) identify genetic variants associated with quantitative traits or disease. GWAS therefore do not directly link variants to regulatory mechanisms, which are instead typically inferred in post-hoc analyses. In parallel, a recent deep learning-based method predicts the regulatory effect of each variant on currently up to 1,000 cell type-specific chromatin features. Here we describe “DeepWAS”, a new approach that directly integrates predictions of these regulatory effects of single variants into a multivariate GWAS setting. As a result, single variants associated with a trait or disease are, by design, coupled to their impact on a chromatin feature in a cell type. Up to 40,000 regulatory single-nucleotide polymorphisms (SNPs) were tested for association with multiple sclerosis (MS, 4,888 cases and 10,395 controls), major depressive disorder (MDD, 1,475 cases and 2,144 controls), and height (5,974 individuals), identifying 43-61 regulatory SNPs per trait, called deepSNPs, which are shown to reach at least nominal significance in large GWAS. MS- and height-specific deepSNPs resided in active chromatin and introns, whereas MDD-specific deepSNPs mapped mostly to intragenic regions and repressive chromatin states. We found deepSNPs to be enriched in public or cohort-matched expression and methylation quantitative trait loci and demonstrate the potential of the DeepWAS method to directly generate testable functional hypotheses based on genotype data alone. DeepWAS is an innovative GWAS approach with the power to identify individual SNPs in non-coding regions that have gene regulatory capacity and contribute jointly to disease risk. DeepWAS is available at https://github.com/cellmapslab/DeepWAS.

2016
Vol 119 (suppl_1)
Author(s):
Aditya Kumar
Stephanie Thomas
Kirsten Wong
Kevin Tenerelli
Valentina Lo Sardo
...  

Genome-wide association studies have identified single nucleotide polymorphisms (SNPs) at gene loci that affect cardiovascular function, and while mechanisms in protein-coding loci are obvious, those in non-coding loci are difficult to determine. 9p21 is a recently identified locus associated with increased risk of coronary artery disease (CAD) and myocardial infarction. Associations have implicated SNPs in altering smooth muscle and endothelial cell properties but have not identified adverse effects in cardiomyocytes (CMs) despite enhanced disease risk. Using induced pluripotent stem cell-derived CMs from patients who are homozygous risk/risk (R/R) or non-risk/non-risk (N/N) for 9p21 SNPs and either CAD positive or negative, we assessed CM function when cultured on hydrogels capable of mimicking the fibrotic stiffening associated with disease post-heart attack, i.e., “heart attack-in-a-dish” stiffening from 11 kilopascals (kPa) to 50 kPa. While all CMs, independent of genotype and disease, beat synchronously on soft matrices, R/R CMs cultured on dynamically stiffened hydrogels exhibited asynchronous contractions and had significantly lower correlation coefficients than N/N CMs in the same conditions. Dynamic stiffening reduced connexin 43 expression and gap junction assembly in R/R CMs but not N/N CMs. To eliminate patient-to-patient variability, we created an isogenic line (R/R KO) by deleting the 9p21 gene locus from an R/R patient line using TALEN-mediated gene editing. Deletion of the 9p21 locus restored synchronous contractility and organized connexin 43 junctions. As a non-coding locus, 9p21 appears to repress connexin transcription, leading to the phenotypes we observe, but only when the niche is stiffened as in disease. These data are the first to demonstrate that disease-specific niche remodeling, e.g., a “heart attack-in-a-dish” model, can differentially affect CM function depending on SNPs within a non-coding locus.


2021
Author(s):  
Jielin Xu ◽  
Yuan Hou ◽  
Yadi Zhou ◽  
Ming Hu ◽  
Feixiong Cheng

Human genome sequencing studies have identified numerous loci associated with complex diseases, including Alzheimer's disease (AD). Translating human genetic findings (i.e., genome-wide association studies [GWAS]) to pathobiology and therapeutic discovery, however, remains a major challenge. To address this critical problem, we present a network topology-based deep learning framework to identify disease-associated genes (NETTAG). NETTAG is capable of integrating multi-genomics data along with the protein-protein interactome to infer putative risk genes and drug targets impacted by GWAS loci. Specifically, we leverage non-coding GWAS loci effects on expression quantitative trait loci (eQTLs), histone-QTLs, and transcription factor binding-QTLs, enhancers and CpG islands, promoter regions, open chromatin, and promoter flanking regions. The key premise of NETTAG is that disease risk genes exhibit distinct functional characteristics compared to non-risk genes and can therefore be distinguished by their aggregated genomic features under the human protein interactome. Applying NETTAG to the latest AD GWAS data, we identified 156 putative AD-risk genes (e.g., APOE, BIN1, GSK3B, MARK4, and PICALM). We showed that predicted risk genes are 1) significantly enriched in AD-related pathobiological pathways, 2) more likely to be differentially expressed in the transcriptome and proteome of AD brains, and 3) enriched in druggable targets with approved medicines (e.g., choline and ibudilast). In summary, our findings suggest that understanding of human pathobiology and therapeutic development could benefit from a network-based deep learning methodology that utilizes GWAS findings under multimodal genomic analyses.


2020
Vol 117 (26)
pp. 15028-15035
Author(s):  
Ronald Yurko ◽  
Max G’Sell ◽  
Kathryn Roeder ◽  
Bernie Devlin

To correct for a large number of hypothesis tests, most researchers rely on simple multiple testing corrections. Yet, new methodologies of selective inference could potentially improve power while retaining statistical guarantees, especially those that enable exploration of test statistics using auxiliary information (covariates) to weight hypothesis tests for association. We explore one such method, adaptive P-value thresholding (AdaPT), in the framework of genome-wide association studies (GWAS) and gene expression/coexpression studies, with particular emphasis on schizophrenia (SCZ). Selected SCZ GWAS association P values play the role of the primary data for AdaPT; single-nucleotide polymorphisms (SNPs) are selected because they are gene expression quantitative trait loci (eQTLs). This natural pairing of SNPs and genes allows us to map the following covariate values to these pairs: GWAS statistics from genetically correlated bipolar disorder, the effect size of SNP genotypes on gene expression, and gene–gene coexpression, captured by subnetwork (module) membership. In all, 24 covariates per SNP/gene pair were included in the AdaPT analysis using flexible gradient boosted trees. We demonstrate a substantial increase in power to detect SCZ associations using gene expression information from the developing human prefrontal cortex. We interpret these results in light of recent theories about the polygenic nature of SCZ. Importantly, our entire process for identifying enrichment and creating features with independent complementary data sources can be implemented in many different high-throughput settings to ultimately improve power.
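For context, the kind of "simple multiple testing correction" that the abstract contrasts with AdaPT can be sketched as follows. This is a minimal illustration of the Benjamini-Hochberg step-up procedure, chosen here as an assumed representative baseline; it is not AdaPT and uses no covariates. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level alpha.

    Baseline sketch only: every test is treated alike, with no auxiliary
    covariates (unlike covariate-adaptive methods such as AdaPT).
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose P value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max hypotheses with the smallest P values.
    return sorted(order[:k_max])


# Hypothetical P values for four tests.
rejected = benjamini_hochberg([0.04, 0.001, 0.03, 0.5], alpha=0.05)
```

Covariate-adaptive methods differ from this baseline in that the rejection threshold may vary from test to test according to the auxiliary information.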


2016
Vol 2016
pp. 1-9
Author(s):  
Yu Toyoda ◽  
Tsuneaki Gomi ◽  
Hiroshi Nakagawa ◽  
Makoto Nagakura ◽  
Toshihisa Ishikawa

The importance of personalized medicine and healthcare is becoming increasingly recognized. Genetic polymorphisms associated with potential risks of various human genetic diseases as well as drug-induced adverse reactions have recently been well studied, and their underlying molecular mechanisms are being uncovered by functional genomics as well as genome-wide association studies. Knowledge of certain genetic polymorphisms is clinically important for our understanding of interindividual differences in drug response and/or disease risk. As such evidence accumulates, new clinical applications and practices are needed. In this context, the development of new technologies for simple, fast, accurate, and cost-effective genotyping is imperative. Here, we describe a simple isothermal genotyping method capable of detecting single nucleotide polymorphisms (SNPs) in the human ATP-binding cassette (ABC) transporter ABCC11 gene and its application to the clinical diagnosis of axillary osmidrosis. We have recently reported that axillary osmidrosis is linked with one SNP, 538G>A, in the ABCC11 gene. Our molecular biological and biochemical studies have revealed that this SNP greatly affects the protein expression level and the function of ABCC11. In this review, we highlight the clinical relevance and importance of this diagnostic strategy in axillary osmidrosis therapy.


2021
Vol 12
Author(s):
Yunpeng Wu
Ling Zhong
Ge Li
Lanwen Han
Junling Fu
...  

Background: Hypoadiponectinemia has been associated with various cardiometabolic disease states. Previous studies in adults have shown that adiponectin levels are regulated by specific genetic and behavioral or lifestyle factors. However, little is known about the influence of these factors on adiponectin levels in children, particularly as mitigated by pubertal development. Methods: We performed a cross-sectional analysis of data from 3,402 children aged 6-18 years from the Beijing Child and Adolescent Metabolic Syndrome (BCAMS) study. Pubertal progress was classified as prepubertal, midpuberty, and postpuberty. Six relevant single nucleotide polymorphisms (SNPs) were selected from previous genome-wide association studies of adiponectin in East Asians. Individual SNPs and two weighted genetic predisposition scores, as well as their interactions with 14 lifestyle factors, were analyzed to investigate their influence on adiponectin levels across puberty. The effect of these factors on adiponectin was analyzed using general linear models adjusted for age, sex, and BMI. Results: After adjustment for age, sex, and BMI, the associations of adiponectin levels with diet items and diet score were significant at prepuberty or postpuberty, while the effect of exercise on adiponectin levels was more prominent at mid- and postpuberty. Walking to school was found to be associated with increased adiponectin levels throughout puberty. Meanwhile, the effect of WDR11-FGFR2-rs3943077 was stronger at midpuberty (P = 0.002), and that of ADIPOQ-rs6773957 was stronger at postpuberty (P = 0.005), while CDH13-rs4783244 showed the strongest association with adiponectin levels at all pubertal stages (all P < 3.24 × 10^-15). We further found that the effects of diet score (P_interaction = 0.022) and exercise (P_interaction = 0.049) were stronger in children with higher genetic risk of hypoadiponectinemia, while higher diet score and exercise frequency attenuated the differences in adiponectin levels among children with different genetic risks. Conclusions: Our study confirmed that puberty modulates the associations of adiponectin with genetic variants, lifestyle factors, and gene-by-lifestyle interactions. These findings provide new insight into puberty-specific lifestyle suggestions, especially in genetically susceptible individuals.
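As an illustration of what a weighted genetic predisposition score generally looks like, the sketch below sums per-SNP risk-allele counts multiplied by per-SNP weights. The abstract does not state how the study's two scores were weighted, so the SNP identifiers, weights, and function name here are hypothetical placeholders rather than the study's values.

```python
def weighted_genetic_score(allele_counts, weights):
    """Sum of (risk-allele count x weight) over the SNPs in allele_counts.

    allele_counts: dict mapping SNP id -> number of risk alleles (0, 1, or 2).
    weights:       dict mapping SNP id -> per-allele weight, for example an
                   effect size from a prior association study (assumed here).
    """
    missing = set(allele_counts) - set(weights)
    if missing:
        raise KeyError(f"no weight for SNPs: {sorted(missing)}")
    for snp, count in allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1, or 2")
    return sum(weights[snp] * count for snp, count in allele_counts.items())


# Hypothetical child genotyped at three placeholder SNPs.
counts = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
example_weights = {"snp_a": 0.5, "snp_b": 0.25, "snp_c": 1.0}
score = weighted_genetic_score(counts, example_weights)
```

A score of this form can then enter a linear model as a single covariate, alongside adjustment variables such as age, sex, and BMI.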


Author(s):  
Jody Ye ◽  
Kathleen Gillespie ◽  
Santiago Rodriguez

Although genome-wide association studies (GWAS) have identified several hundred loci associated with autoimmune diseases, the mechanisms underlying these associations are still poorly understood. Human genetic variation is more complex than the common single nucleotide polymorphisms (SNPs) that are interrogated by GWAS arrays. Some structural variants, such as insertions-deletions, copy number variations, and minisatellites, are not well tagged by SNPs and cannot be fully explored by GWAS. Therefore, it is possible that some of these loci may have large effects on autoimmune disease risk. In addition, other layers of regulation, such as gene-gene interactions, epigenetic determinants, and gene-environment interactions, also contribute to the heritability of autoimmune diseases. This review discusses why studying these elements may allow us to gain a more comprehensive understanding of the aetiology of complex autoimmune traits.


2018
Author(s):
Jason Chesler Klein
Aidan Keith
Sarah J. Rice
Colin Shepherd
Vikram Agarwal
...  

Abstract: To date, genome-wide association studies have implicated at least 35 loci in osteoarthritis, but due to linkage disequilibrium, we have yet to pinpoint either the specific variants that underlie these associations or the mechanisms by which they contribute to disease risk. Here we functionally tested 1,605 single nucleotide variants associated with osteoarthritis for regulatory activity using a massively parallel reporter assay. We identified six single nucleotide polymorphisms (SNPs) with differential regulatory activity between the major and minor alleles. We show that our most significant hit, rs4730222, drives increased expression of an alternative isoform of HBP1 in a heterozygote chondrosarcoma cell line, a CRISPR-edited osteosarcoma cell line, and in chondrocytes derived from osteoarthritis patients.


2021
Author(s):
Sophie L Farrow
William Schierding
Sreemol Gokuladhas
Evgeniia Golovina
Tayaza M. Fadason
...  

The latest meta-analysis of genome-wide association studies (GWAS) identified 90 independent single nucleotide polymorphisms (SNPs) across 78 genomic regions associated with Parkinson's disease (PD), yet the mechanisms by which these variants influence the development of the disease remain largely elusive. To establish the functional gene regulatory networks associated with PD-SNPs, we utilised an approach combining spatial (chromosomal conformation capture) and functional (expression quantitative trait loci; eQTL) data. We identified 518 genes subject to regulation by 76 PD-SNPs across 49 tissues, encompassing 36 peripheral and 13 CNS tissues. Notably, one third of these genes were regulated via trans-acting mechanisms (distal; risk locus and gene separated by > 1 Mb, or on different chromosomes). Of particular interest is the identification of a novel trans-eQTL-gene connection between rs10847864 and SYNJ1 in the adult brain cortex, highlighting for the first time a convergence between familial studies and PD GWAS loci for SYNJ1 (PARK20). Furthermore, we identified 16 neurodevelopment-specific eQTL-gene regulatory connections within the foetal cortex, consistent with hypotheses suggesting a neurodevelopmental involvement in the pathogenesis of PD. Using Louvain clustering, we extracted nine significant and highly intra-connected clusters within the entire gene regulatory network. The nine clusters are enriched for specific biological processes and pathways, some of which have not previously been associated with PD. Together, our results not only contribute to an overall understanding of the mechanisms and impact of specific combinations of PD-SNPs, but also highlight the potential impact gene regulatory networks may have when elucidating aetiological subtypes of PD.
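The distal (trans) criterion stated in the abstract, a risk locus and gene more than 1 Mb apart or on different chromosomes, can be written as a small rule. The sketch below is an illustration under assumptions: each gene is reduced to a single coordinate, every non-distal pair is labelled "cis", and the coordinates in the example are hypothetical rather than taken from the study.

```python
DISTAL_THRESHOLD_BP = 1_000_000  # the 1 Mb cut-off quoted in the abstract


def classify_connection(snp_chrom, snp_pos, gene_chrom, gene_pos):
    """Label a SNP-gene pair as 'trans' (distal) or 'cis'.

    A pair is 'trans' if the SNP and gene lie on different chromosomes or
    are separated by more than 1 Mb; anything else is labelled 'cis' here,
    which is an assumption made for this sketch.
    """
    if snp_chrom != gene_chrom:
        return "trans"
    if abs(snp_pos - gene_pos) > DISTAL_THRESHOLD_BP:
        return "trans"
    return "cis"


# Hypothetical coordinates, not taken from the study.
examples = [
    classify_connection("chr12", 5_000_000, "chr21", 32_000_000),
    classify_connection("chr4", 90_000_000, "chr4", 90_400_000),
    classify_connection("chr4", 90_000_000, "chr4", 95_000_000),
]
```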


2018
Author(s):
Kengo Oishi
Tomihisa Niitsu
Nobuhisa Kanahara
Tasuku Hashimoto
Hideki Komatsu
...  

Summary Paragraph: Schizophrenia is a highly heritable mental disease [1] related to abnormal dopaminergic activities [2,3]. To elucidate the mechanisms underlying schizophrenia’s development, genomic studies have sought to identify the pathogenic genetic polymorphisms. Large-scale genome-wide association studies (GWAS) have reported potential candidate loci that contribute to schizophrenia’s development [4,5]. However, genetic risk profiles are not yet established. Here we show that the combination of three functional single nucleotide polymorphisms (SNPs) related to the key factors of dopaminergic signaling can be used to predict the risk of schizophrenia’s development, though none of the SNPs is known to be associated with the disease by itself. These functional SNPs were reported to demonstrate directional influences on their parent gene activity, perhaps characterizing the integrated properties of dopaminergic signaling. Interestingly, the risk combination presented here included the major genotype as well as the minor polymorphisms, suggesting a possible association of unaffected activities of some dopamine-related genes with the disease development. The phenotype speculated based on the allelic status seemed consistent with the conventional pathophysiological hypotheses, although recently developed predictive methods, such as the polygenic risk score, could miss this potent pathogenic role of carrying a normal genotype by evaluating only minor polymorphisms. Our results demonstrate the presence of a subtype in schizophrenia with the favored genetic background related to dopamine signaling. Our findings indicate the possibility that the combinations could characterize integrated biological functions (including neurotransmission) and therefore identify individuals with a disease risk. The biological microenvironment indicated by the functional SNPs could provide insight into the pathogenic mechanisms of developing schizophrenia. Furthermore, we believe that our approach will contribute to the development of innovative means to predict disease risks for other multi-factorial diseases and, in turn, to preventive medicine.


2020
Author(s):  
Alix Booms ◽  
Steven E. Pierce ◽  
Gerhard A. Coetzee

Abstract: Genome-wide association studies (GWAS) have uncovered thousands of single nucleotide polymorphisms (SNPs) that are associated with Parkinson’s disease (PD) risk. The functions of most of these SNPs, including the cell type they influence, and how they affect PD etiology remain largely unknown. To identify functional SNPs, we aligned PD risk SNPs within active regulatory regions of DNA in microglia, a cell type implicated in PD development. Out of 6,749 ‘SNPs of interest’ from the most recent PD GWAS meta-analysis, 73 were located in open regulatory chromatin as determined by both ATAC-seq and H3K27ac ChIP-seq. We highlight a subset of SNPs that are favorable candidates for further mechanistic studies. These SNPs are located in regulatory DNA at the SLC50A1, SNCA, BAG3, FBXL19, SETD1A, and NUCKS1 loci. A network analysis of the genes with risk SNPs in their promoters implicated substance transport, involving autophagy and lysosomal genes. Our study provides a more focused set of risk SNPs and their associated risk genes as candidates for follow-up studies, which will help identify mechanisms in microglia that increase the risk for PD.
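The filtering step described above, keeping SNPs that fall in regions marked by both ATAC-seq and H3K27ac ChIP-seq, amounts to an interval-overlap test. The following is a minimal sketch assuming peaks are supplied as half-open (start, end) intervals per chromosome; the data structures, function names, and example values are hypothetical and do not reproduce the authors' pipeline.

```python
def in_any_interval(pos, intervals):
    """True if pos lies inside at least one half-open [start, end) interval."""
    return any(start <= pos < end for start, end in intervals)


def snps_in_open_regulatory_chromatin(snps, atac_peaks, h3k27ac_peaks):
    """Return ids of SNPs covered by both an ATAC peak and an H3K27ac peak.

    snps:          list of (snp_id, chrom, pos) tuples.
    atac_peaks:    dict mapping chrom -> list of (start, end) intervals.
    h3k27ac_peaks: dict mapping chrom -> list of (start, end) intervals.
    """
    hits = []
    for snp_id, chrom, pos in snps:
        in_atac = in_any_interval(pos, atac_peaks.get(chrom, []))
        in_h3k27ac = in_any_interval(pos, h3k27ac_peaks.get(chrom, []))
        if in_atac and in_h3k27ac:
            hits.append(snp_id)
    return hits


# Hypothetical SNPs and peaks for illustration only.
snps = [("snp_1", "chr4", 150), ("snp_2", "chr4", 450), ("snp_3", "chr10", 150)]
atac = {"chr4": [(100, 200), (400, 500)]}
h3k27ac = {"chr4": [(120, 180)], "chr10": [(100, 200)]}
candidates = snps_in_open_regulatory_chromatin(snps, atac, h3k27ac)
```

The linear scan over peaks is adequate for a few thousand SNPs; a sorted or tree-based interval index would be the usual choice for genome-scale inputs.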

