Genoppi: a web application for interactive integration of experimental proteomics results with genetic datasets

2017 ◽  
Author(s):  
April Kim ◽  
Edyta Malolepsza ◽  
Justin Lim ◽  
Kasper Lage

Abstract Summary: Integrating protein-protein interaction experiments and genetic datasets can lead to new insight into the cellular processes implicated in diseases, but this integration is technically challenging. Here, we present Genoppi, a web application that integrates quantitative interaction proteomics data and results from genome-wide association studies or exome sequencing projects, to highlight biological relationships that might otherwise be difficult to discern. Written in R, Python and Bash script, Genoppi is a user-friendly framework easily deployed across Mac OS and Linux distributions. Availability: Genoppi is open source and available at https://github.com/lagelab/[email protected] and [email protected]

Diabetes ◽  
2021 ◽  
Vol 70 (Supplement 1) ◽  
pp. 26-OR
Author(s):  
K. ALAINE BROADAWAY ◽  
XIANYONG YIN ◽  
ALICE WILLIAMSON ◽  
EMMA WILSON ◽  
MAGIC INVESTIGATORS

2020 ◽  
Author(s):  
Yanjiao Jin ◽  
Jie Yang ◽  
Shuyue Zhang ◽  
Jin Li ◽  
Songlin Wang

Abstract Background: Oral diseases impact the majority of the world’s population. The following traits are common in oral inflammatory diseases: mouth ulcers, painful gums, bleeding gums, loose teeth, and toothache. Despite the prevalence of genome-wide association studies, the associations between these traits and common genomic variants, and whether pleiotropic loci are shared by some of these traits remain poorly understood. Methods: In this work, we conducted multi-trait joint analyses based on the summary statistics of genome-wide association studies of these five oral inflammatory traits from the UK Biobank, each of which is comprised of over 10,000 cases and over 300,000 controls. We estimated the genetic correlations between the five traits. We conducted fine-mapping and functional annotation based on multi-omics data to better understand the biological functions of the potential causal variants at each locus. To identify the pathways in which the candidate genes were mainly involved, we applied gene-set enrichment analysis, and further performed protein-protein interaction (PPI) analyses. Results: We identified 39 association signals that surpassed genome-wide significance, including three that were shared between two or more oral inflammatory traits, consistent with a strong correlation. Among these genome-wide significant loci, two were novel for both painful gums and toothache. We performed fine-mapping and identified causal variants at each novel locus. Further functional annotation based on multi-omics data suggested IL10 and IL12A/TRIM59 as potential candidate genes at the novel pleiotropic loci, respectively. Subsequent analyses of pathway enrichment and protein-protein interaction networks suggested the involvement of candidate genes at genome-wide significant loci in immune regulation. Conclusions: Our results highlighted the importance of immune regulation in the pathogenesis of oral inflammatory diseases. Some common immune-related pleiotropic loci or genetic variants are shared by multiple oral inflammatory traits. These findings will be beneficial for risk prediction, prevention, and therapy of oral inflammatory diseases.


Author(s):  
Elle M Weeks ◽  
Jacob C Ulirsch ◽  
Nathan Y Cheng ◽  
Brian L Trippe ◽  
Rebecca S Fine ◽  
...  

Genome-wide association studies (GWAS) are a valuable tool for understanding the biology of complex traits, but the associations found rarely point directly to causal genes. Here, we introduce a new method to identify the causal genes by integrating GWAS summary statistics with gene expression, biological pathway, and predicted protein-protein interaction data. We further propose an approach that effectively leverages both polygenic and locus-specific genetic signals by combining results across multiple gene prioritization methods, increasing confidence in prioritized genes. Using a large set of gold standard genes to evaluate our approach, we prioritize 8,402 unique gene-trait pairs with greater than 75% estimated precision across 113 complex traits and diseases, including known genes such as SORT1 for LDL cholesterol, SMIM1 for red blood cell count, and DRD2 for schizophrenia, as well as novel genes such as TTC39B for cholelithiasis. Our results demonstrate that a polygenic approach is a powerful tool for gene prioritization and, in combination with locus-specific signal, improves upon existing methods.


2021 ◽  
Author(s):  
Ronald J Yurko ◽  
Kathryn Roeder ◽  
Bernie Devlin ◽  
Max G'Sell

In genome-wide association studies (GWAS), it has become commonplace to test millions of SNPs for phenotypic association. Gene-based testing can improve power to detect weak signal by reducing multiple testing and pooling signal strength. While such tests account for linkage disequilibrium (LD) structure of SNP alleles within each gene, current approaches do not capture LD of SNPs falling in different nearby genes, which can induce correlation of gene-based test statistics. We introduce an algorithm to account for this correlation. When a gene's test statistic is independent of others, it is assessed separately; when test statistics for nearby genes are strongly correlated, their SNPs are agglomerated and tested as a locus. To provide insight into SNPs and genes driving association within loci, we develop an interactive visualization tool to explore localized signal. We demonstrate our approach in the context of weakly powered GWAS for autism spectrum disorder, which is contrasted to more highly powered GWAS for schizophrenia and educational attainment. To increase power for these analyses, especially those for autism, we use adaptive p-value thresholding (AdaPT), guided by high-dimensional metadata modeled with gradient boosted trees, highlighting when and how it can be most useful. Notably our workflow is based on summary statistics.
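The agglomeration idea described above (assess a gene alone when its test statistic is independent, merge nearby genes into a locus when their statistics are strongly correlated) can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify: the function name `agglomerate_genes`, the adjacent-gene chaining rule, the 0.75 threshold, and the toy correlations are all assumptions made for illustration.

```python
def agglomerate_genes(genes, adjacent_corr, threshold=0.75):
    """Group positionally ordered genes into loci (illustrative only).

    genes: gene names ordered by genomic position.
    adjacent_corr: adjacent_corr[i] is the correlation between the
        gene-based test statistics of genes[i] and genes[i + 1].
    threshold: hypothetical cutoff at or above which neighbours merge.
    """
    if len(adjacent_corr) != max(len(genes) - 1, 0):
        raise ValueError("need exactly one correlation per adjacent gene pair")

    loci = []
    for i, gene in enumerate(genes):
        if i > 0 and adjacent_corr[i - 1] >= threshold:
            # Strongly correlated with the previous gene: extend that locus.
            loci[-1].append(gene)
        else:
            # Weakly correlated (or first gene): start a new locus.
            loci.append([gene])
    return loci


# Hypothetical toy input: five genes and four adjacent correlations.
genes = ["G1", "G2", "G3", "G4", "G5"]
corr = [0.9, 0.2, 0.1, 0.85]
loci = agglomerate_genes(genes, corr)
print(loci)
```

In this toy input, G3 is weakly correlated with both neighbours and stays a single-gene unit, while G1/G2 and G4/G5 are each merged and would be tested as a locus.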


2020 ◽  
Author(s):  
Olivia C Leavy ◽  
Shwu-Fan Ma ◽  
Philip L Molyneaux ◽  
Toby M Maher ◽  
Justin M Oldham ◽  
...  

Genome-wide association studies have identified 14 genetic loci associated with susceptibility to idiopathic pulmonary fibrosis (IPF), a devastating lung disease with poor prognosis. Of these, the variant with the strongest association, rs35705950, is located in the promoter region of the MUC5B gene and has a risk allele (T) frequency of 30-35% in IPF cases. Here we present estimates of the proportion of disease liability explained by each of the 14 IPF risk variants as well as estimates of the proportion of cases that can be attributed to each variant. We estimate that rs35705950 explains 5.9-9.4% of disease liability, which is much lower than previously reported estimates. Of every 100,000 individuals with the rs35705950_GG genotype we estimate 30 will have IPF, whereas for every 100,000 individuals with the rs35705950_GT genotype 152 will have IPF. Quantifying the impact of genetic risk factors on disease liability improves our understanding of the underlying genetic architecture of IPF and provides insight into the impact of genetic factors in risk prediction modelling.
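The per-genotype figures quoted above imply a relative risk that can be checked with simple arithmetic. The Python sketch below is illustrative only: the two counts (30 and 152 per 100,000) come from the abstract, while the variable and function names are assumptions.

```python
# Counts per 100,000 individuals, taken from the abstract above.
PER = 100_000
cases_gg = 30    # rs35705950 GG genotype
cases_gt = 152   # rs35705950 GT genotype


def absolute_risk(cases: int, population: int = PER) -> float:
    """Fraction of individuals with a genotype expected to have IPF."""
    return cases / population


risk_gg = absolute_risk(cases_gg)
risk_gt = absolute_risk(cases_gt)

# Relative risk of GT versus GG: the ratio of the two absolute risks.
relative_risk = risk_gt / risk_gg

print(f"GG absolute risk: {risk_gg:.3%}")
print(f"GT absolute risk: {risk_gt:.3%}")
print(f"GT vs GG relative risk: {relative_risk:.2f}")
```

Under these figures, carrying one copy of the T allele corresponds to roughly a five-fold higher risk, while the absolute risk for both genotypes stays well below 1%.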


Author(s):  
Kristine A. Pattin ◽  
Jason H. Moore

Recent technological developments in the field of genetics have given rise to an abundance of research tools, such as genome-wide genotyping, that allow researchers to conduct genome-wide association studies (GWAS) for detecting genetic variants that confer increased or decreased susceptibility to disease. However, discovering epistatic, or gene-gene, interactions in high dimensional datasets is a problem due to the computational complexity that results from the analysis of all possible combinations of single-nucleotide polymorphisms (SNPs). A recently explored approach to this problem employs biological expert knowledge, such as pathway or protein-protein interaction information, to guide an analysis by the selection or weighting of SNPs based on this knowledge. Narrowing the evaluation to gene combinations that have been shown to interact experimentally provides a biologically concise reason why those two genes may be detected together statistically. This chapter discusses the challenges of discovering epistatic interactions in GWAS and how biological expert knowledge can be used to facilitate genome-wide genetic studies.
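The computational burden mentioned above, and the effect of restricting the search with expert knowledge, can be made concrete with a short Python sketch. The SNP names, gene names, the interaction list, and the 500,000-SNP panel size are hypothetical, chosen only to illustrate the counting argument; the filtering function is a toy, not a method from the chapter.

```python
from math import comb


def n_pairwise_tests(n_snps: int) -> int:
    """Number of unordered SNP pairs in an exhaustive two-way scan."""
    return comb(n_snps, 2)


def knowledge_guided_pairs(snp_to_gene, interacting_gene_pairs):
    """Keep only SNP pairs whose genes are listed as interacting.

    Toy enumeration over all pairs; a practical implementation would
    index SNPs by gene rather than scan every combination.
    """
    allowed = {frozenset(pair) for pair in interacting_gene_pairs}
    snps = sorted(snp_to_gene)
    return [
        (a, b)
        for i, a in enumerate(snps)
        for b in snps[i + 1:]
        if frozenset((snp_to_gene[a], snp_to_gene[b])) in allowed
    ]


# Hypothetical toy data: four SNPs mapped to three genes, one known interaction.
snp_to_gene = {"snp1": "GENE_A", "snp2": "GENE_B", "snp3": "GENE_C", "snp4": "GENE_A"}
interactions = [("GENE_A", "GENE_B")]

all_pairs = n_pairwise_tests(len(snp_to_gene))
kept = knowledge_guided_pairs(snp_to_gene, interactions)
print(f"exhaustive pairs: {all_pairs}, knowledge-guided pairs: {len(kept)}")

# Scale of an exhaustive scan for an assumed panel of 500,000 SNPs.
print(f"pairs for 500,000 SNPs: {n_pairwise_tests(500_000):,}")
```

The pair count grows quadratically with the number of SNPs, and higher-order combinations grow faster still, which is why prior knowledge is used to prune or weight the search space.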


2017 ◽  
Vol 29 (3) ◽  
pp. 713-726 ◽  
Author(s):  
Olivier Devuyst ◽  
Cristian Pattaro

The identification of genetic factors associated with kidney disease has the potential to provide critical insights into disease mechanisms. Genome-wide association studies have uncovered genomic regions associated with renal function metrics and risk of CKD. UMOD is among the most outstanding loci associated with CKD in the general population, because it has a large effect on eGFR and CKD risk that is consistent across different ethnic groups. The relevance of UMOD for CKD is clear, because the encoded protein, uromodulin (Tamm–Horsfall protein), is exclusively produced by the kidney tubule and has specific biochemical properties that mediate important functions in the kidney and urine. Rare mutations in UMOD are the major cause of autosomal dominant tubulointerstitial kidney disease, a condition that leads to CKD and ESRD. In this brief review, we use the UMOD paradigm to describe how population genetic studies can yield insight into the pathogenesis and prognosis of kidney diseases.


2014 ◽  
Vol 18 (1) ◽  
pp. 86-91 ◽  
Author(s):  
Aniket Mishra ◽  
Stuart Macgregor

Gene-based tests such as versatile gene-based association study (VEGAS) are commonly used following per-single nucleotide polymorphism (SNP) GWAS (genome-wide association studies) analysis. Two limitations of VEGAS were that the HapMap2 reference set was used to model the correlation between SNPs and only autosomal genes were considered. HapMap2 has now been superseded by the 1,000 Genomes reference set, and whereas early GWASs frequently ignored the X chromosome, it is now commonly included. Here we have developed VEGAS2, an extension that uses 1,000 Genomes data to model SNP correlations across the autosomes and chromosome X. VEGAS2 allows greater flexibility when defining gene boundaries. VEGAS2 offers both a user-friendly, web-based front end and a command line Linux version. The online version of VEGAS2 can be accessed through https://vegas2.qimrberghofer.edu.au/. The command line version can be downloaded from https://vegas2.qimrberghofer.edu.au/zVEGAS2offline.tgz. The command line version is developed in Perl, R and shell scripting languages; source code is available for further development.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Vincent L. Chen ◽  
Xiaomeng Du ◽  
Yanhua Chen ◽  
Annapurna Kuppa ◽  
Samuel K. Handelman ◽  
...  

Abstract Serum liver enzyme concentrations are the most frequently-used laboratory markers of liver disease, a major cause of mortality. We conduct a meta-analysis of genome-wide association studies of liver enzymes from UK BioBank and BioBank Japan. We identified 160 previously-unreported independent alanine aminotransferase, 190 aspartate aminotransferase, and 199 alkaline phosphatase genome-wide significant associations, with some affecting multiple different enzymes. Associated variants implicate genes that demonstrate diverse liver cell type expression and promote a range of metabolic and liver diseases. These findings provide insight into the pathophysiology of liver and other metabolic diseases that are associated with serum liver enzyme concentrations.

