ExPheWas: a browser for gene-based pheWAS associations

Author(s):  
Marc-André Legault ◽  
Louis-Philippe Lemieux Perreault ◽  
Marie-Pierre Dubé

Structured Abstract. Motivation: The relationship between protein coding genes and phenotypes has the potential to inform on the underlying molecular function in disease etiology. We conducted a phenome-wide association study (pheWAS) of protein coding genes using a principal components analysis-based approach in the UK Biobank. Results: We tested the association between 19,114 protein coding gene regions and 1,210 phenotypes including anthropometric measurements, laboratory biomarkers, cancer registry data, hospitalization and death record codes and algorithmically-defined cardiovascular outcomes. We report the pheWAS results in a user-friendly web-based browser. Taking atrial fibrillation, a common cardiac arrhythmia, as an example, ExPheWas identified genes that are known drug targets for the treatment of arrhythmias and genes involved in biological processes implicated in cardiac muscle function. We also identified MYOT as a possible atrial fibrillation gene. Availability and implementation: The ExPheWas browser and API are available at http://exphewas.statgen.org/.

2020 ◽  
Vol 49 (D1) ◽  
pp. D962-D968 ◽  
Author(s):  
Zhao Li ◽  
Lin Liu ◽  
Shuai Jiang ◽  
Qianpeng Li ◽  
Changrui Feng ◽  
...  

Abstract Expression profiles of long non-coding RNAs (lncRNAs) across diverse biological conditions provide significant insights into their biological functions, interacting targets as well as transcriptional reliability. However, a comprehensive resource that systematically characterizes the expression landscape of human lncRNAs by integrating their expression profiles across a wide range of biological conditions is lacking. Here, we present LncExpDB (https://bigd.big.ac.cn/lncexpdb), an expression database of human lncRNAs that is devoted to providing comprehensive expression profiles of lncRNA genes, exploring their expression features and capacities, identifying featured genes with potentially important functions, and building interactions with protein-coding genes across various biological contexts/conditions. Based on comprehensive integration and stringent curation, LncExpDB currently houses expression profiles of 101 293 high-quality human lncRNA genes derived from 1977 samples of 337 biological conditions across nine biological contexts. Consequently, LncExpDB estimates lncRNA genes’ expression reliability and capacities, identifies 25 191 featured genes, and further obtains 28 443 865 lncRNA-mRNA interactions. Moreover, user-friendly web interfaces enable interactive visualization of expression profiles across various conditions and easy exploration of featured lncRNAs and their interacting partners in specific contexts. Collectively, LncExpDB features comprehensive integration and curation of lncRNA expression profiles and thus will serve as a fundamental resource for functional studies on human lncRNAs.


2020 ◽  
Author(s):  
Marc-André Legault ◽  
Johanna Sandoval ◽  
Sylvie Provost ◽  
Amina Barhdadi ◽  
Louis-Philippe Lemieux Perreault ◽  
...  

ABSTRACT Background: Naturally occurring human genetic variants provide a valuable tool to identify drug targets and guide drug prioritization and clinical trial design. Ivabradine is a heart rate lowering drug with protective effects on heart failure despite increasing the risk of atrial fibrillation. In patients with coronary artery disease without heart failure, the drug does not protect against major cardiovascular adverse events, prompting questions about the ability of genetics to have predicted those effects. This study evaluates the effect of a mutation in HCN4, ivabradine’s drug target, on safety and efficacy endpoints. Methods: We used genetic association testing and Mendelian randomization to predict the effect of ivabradine and heart rate lowering on cardiovascular outcomes. Results: Using data from the UK Biobank and large GWAS consortia, we evaluated the effect of a heart rate-reducing genetic variant at the HCN4 locus encoding ivabradine’s drug target. These genetic association analyses showed increases in risk for atrial fibrillation (OR 1.09, 95% CI: 1.06-1.13, P=9.3 × 10⁻⁹) in the UK Biobank. In a cause-specific competing risk model to account for the increased risk of atrial fibrillation, the HCN4 variant reduced incident heart failure in participants that did not develop atrial fibrillation (HR 0.90, 95% CI: 0.83-0.98, P=0.013). In contrast, the same heart rate reducing HCN4 variant did not prevent a composite endpoint of myocardial infarction or cardiovascular death (OR 0.99, 95% CI: 0.93-1.04, P=0.61). Conclusion: Genetic modelling of ivabradine recapitulates its benefits in heart failure, promotion of atrial fibrillation, and neutral effect on myocardial infarction.
CONDENSED ABSTRACT The effects of drugs can sometimes be predicted from the effects of mutations in genes encoding drug targets. We tested the effect of a heart rate reducing allele at the HCN4 locus encoding ivabradine’s drug target and found results coherent with the SHIFT and SIGNIFY clinical trials of ivabradine. The genetic variant increased the risk of atrial fibrillation and cardioembolic stroke and protected against heart failure in a competing risk model accounting for the increased risk of atrial fibrillation. The variant had a neutral effect on a composite of myocardial infarction and cardiovascular death.


2020 ◽  
Author(s):  
Yura Kim ◽  
Mariam Naghavi ◽  
Ying-Tao Zhao

ABSTRACT The human genome contains more than 4000 genes that are longer than 100 kb. These long genes require more time and resources to make a transcript than shorter genes do. Long genes have also been linked to various human diseases. Long genes use specific mechanisms to facilitate their transcription and co-transcriptional processes, which results in unique features in their multi-omics profiles. Although these unique profiles are important for understanding long genes, no database provides an integrated view of, and easy access to, the multi-omics profiles of long genes. We leveraged publicly accessible multi-omics data and systematically analyzed the genomic conservation, histone modifications, chromatin organization, tissue-specific transcriptome, and single cell transcriptome of 992 protein-coding genes that are longer than 200 kb in the mouse genome. We also examined the evolutionary history of their gene lengths in 15 species that belong to six Classes and 11 Orders. To share the multi-omics profiles of long genes, we developed a user-friendly database, LongGeneDB (https://longgenedb.com), for users to search, browse, and download these profiles. LongGeneDB will be a useful data hub for the biomedical research community to understand long genes.


2019 ◽  
Vol 47 (W1) ◽  
pp. W283-W288 ◽  
Author(s):  
Jesús Murga-Moreno ◽  
Marta Coronado-Zamora ◽  
Sergi Hervas ◽  
Sònia Casillas ◽  
Antonio Barbadilla

Abstract The McDonald and Kreitman test (MKT) is one of the most powerful and widely used methods to detect and quantify recurrent natural selection using DNA sequence data. Here we present iMKT (acronym for integrative McDonald and Kreitman test), a novel web-based service performing four distinct MKT types. It allows the detection and estimation of four different selection regimes (adaptive, neutral, strongly deleterious and weakly deleterious) acting on any genomic sequence. iMKT can analyze both users' own population genomic data and pre-loaded Drosophila melanogaster and human sequences of protein-coding genes obtained from the largest population genomic datasets to date. Advanced options in the website allow testing complex hypotheses such as the application example shown here: do genes located in high recombination regions undergo higher rates of adaptation? We intend iMKT to become a reference tool for the study of evolutionary adaptation in massive population genomics datasets, especially in Drosophila and humans. iMKT is a free resource online at https://imkt.uab.cat.
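To make the idea behind the test concrete, the following is a minimal sketch of the classic MKT estimate of the proportion of adaptive substitutions, alpha = 1 - (Ds × Pn) / (Dn × Ps), where Dn and Ds are non-synonymous and synonymous divergence counts and Pn and Ps are the corresponding polymorphism counts. It is not iMKT's implementation and covers none of the extended MKT types the service offers; the function name `mkt_alpha` and the counts used below are hypothetical and chosen only for illustration.

```python
def mkt_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Classic McDonald-Kreitman estimate of the fraction of adaptive substitutions.

    dn, ds: non-synonymous and synonymous fixed differences (divergence).
    pn, ps: non-synonymous and synonymous polymorphic sites.
    """
    # The ratio is undefined when either denominator term is zero.
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive to compute alpha")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, for illustration only.
alpha = mkt_alpha(dn=7, ds=17, pn=2, ps=42)
print(f"alpha = {alpha:.3f}")
```

A positive alpha suggests an excess of non-synonymous divergence relative to polymorphism, which is the pattern expected under adaptive evolution; a value near zero is consistent with neutrality.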


2019 ◽  
Vol 47 (W1) ◽  
pp. W106-W113 ◽  
Author(s):  
Jana Marie Schwarz ◽  
Daniela Hombach ◽  
Sebastian Köhler ◽  
David N Cooper ◽  
Markus Schuelke ◽  
...  

Abstract RegulationSpotter is a web-based tool for the user-friendly annotation and interpretation of DNA variants located outside of protein-coding transcripts (extratranscriptic variants). It is designed for clinicians and researchers who wish to assess the potential impact of the considerable number of non-coding variants found in Whole Genome Sequencing runs. It annotates individual variants with underlying regulatory features in an intuitive way by assessing over 100 genome-wide annotations. Additionally, it calculates a score, which reflects the regulatory potential of the variant region. Its dichotomous classifications, ‘functional’ or ‘non-functional’, and a human-readable presentation of the underlying evidence allow a biologically meaningful interpretation of the score. The output shows key aspects of every variant and allows rapid access to more detailed information about its possible role in gene regulation. RegulationSpotter can either analyse single variants or complete VCF files. Variants located within protein-coding transcripts are automatically assessed by MutationTaster as well as by RegulationSpotter to account for possible intragenic regulatory effects. RegulationSpotter offers the possibility of using phenotypic data to focus on known disease genes or genomic elements interacting with them. RegulationSpotter is freely available at https://www.regulationspotter.org.


2020 ◽  
Author(s):  
Quanli Wang ◽  
Ryan S. Dhindsa ◽  
Keren Carss ◽  
Andrew R Harper ◽  
Abhishek Nag ◽  
...  

The UK Biobank (UKB) represents an unprecedented population-based study of 502,543 participants with detailed phenotypic data and linkage to medical records. While the release of genotyping array data for this cohort has bolstered genomic discovery for common variants, the contribution of rare variants to this broad phenotype collection remains relatively unknown. Here, we use exome sequencing data from 177,882 UKB participants to evaluate the association between rare protein-coding variants and 10,533 binary and 1,419 quantitative phenotypes. We performed both a variant-level phenome-wide association study (PheWAS) and a gene-level collapsing analysis-based PheWAS tailored to detecting the aggregate contribution of rare variants. The latter revealed 911 statistically significant gene-phenotype relationships, with a median odds ratio of 15.7 for binary traits. Among the binary trait associations identified using collapsing analysis, 83% were undetectable using single variant association tests, emphasizing the power of collapsing analysis to detect signal in the setting of high allelic heterogeneity. As a whole, these genotype-phenotype associations were significantly enriched for loss-of-function mediated traits and currently approved drug targets. Using these results, we summarise the contribution of rare variants to common diseases in the context of the UKB phenome and provide an example of how novel gene-phenotype associations can aid in therapeutic target prioritisation.


2019 ◽  
Author(s):  
Viola Fanfani ◽  
Luca Citi ◽  
Adrian L. Harris ◽  
Francesco Pezzella ◽  
Giovanni Stracquadanio

Abstract Genome-wide association studies (GWAS) have found hundreds of single nucleotide polymorphisms (SNPs) associated with increased risk of cancer. However, the amount of heritable risk explained by these variants is limited, thus leaving most of cancer heritability unexplained. Recent studies have shown that genomic regions associated with specific biological functions explain a large proportion of the heritability of many traits. Since cancer is mostly triggered by aberrant gene function, we hypothesised that SNPs located in protein-coding genes could explain a significant proportion of cancer heritability. To perform this analysis, we developed a new method, called Bayesian Gene HERitability Analysis (BAGHERA), to estimate the heritability explained by all the genotyped SNPs and by those located in protein coding genes directly from GWAS summary statistics. By applying BAGHERA to the 38 cancers reported in the UK Biobank, we identified 1,146 genes explaining a significant amount of cancer heritability. We found these genes to be tumour suppressors directly involved in the hallmark processes controlling the transformation from normal to cancer cell; moreover, these genes also harbour somatic driver mutations for many tumours, suggesting a two-hit model underpinning tumorigenesis. Our study provides new evidence for a functional role of SNPs in cancer and identifies new targets for risk assessment and patient stratification.


2020 ◽  
Author(s):  
Sezanur Rahman ◽  
Mehedi Hasan ◽  
Mohammad Enayet Hossain ◽  
Mohammed Ziaur Rahman ◽  
Mustafizur Rahman

Abstract Background: The human-to-human transmissive nature of SARS-CoV-2 makes Bangladesh, as well as the other South Asian regions, vulnerable to the ongoing pandemic of COVID-19 due to their high population densities. The present study was designed based on the genome wide analysis of Bangladeshi and other South Asian isolates. Complete sequences of SARS-CoV-2 were retrieved from the EpiCoV database in order to identify molecular features demonstrating the evolutionary trail and mutation rate. Results: The complete genome mutation rate of the Bangladeshi isolates was estimated to be 0.49E-3 nucleotide substitutions/site/year. Higher mutation rates were found in the non-structural protein-coding genes ORF6 (10.29E-3), ORF7a (31.81E-3), and ORF8 (18.35E-3). In contrast, the mutation rates of the structural protein-coding genes were relatively low: M (1.14E-3), S (1.47E-3), E (3.35E-3), and N (4.59E-3). Conclusions: A comparison of Bangladeshi and other South Asian isolates demonstrated that there were limited mutational changes in the SARS-CoV-2 genome. Knowledge of the Southeast Asian SARS-CoV-2 evolutionary genome will help in selecting future vaccine candidates and designing therapeutic drug targets.


Author(s):  
Hui Ling ◽  
Leonard Girnita ◽  
Octavian Buda ◽  
George A. Calin

Abstract Protein-coding genes comprise only 3% of the human genome, while the genes that are transcribed into RNAs but do not code for proteins occupy the majority of the genome. Once considered biological dark matter, non-coding RNAs are now being recognized as critical regulators in the cancer genome. Among the many types of non-coding RNAs, microRNAs, approximately 20 nucleotides in length, are the best characterized, and their mechanisms of action are well generalized. MicroRNAs exert oncogenic or tumor suppressor functions by regulating protein-coding genes via sequence complementarity. The expression of microRNAs is aberrantly regulated in all cancer types, and both academia and biotech companies have been keenly pursuing the potential of microRNAs as cancer biomarkers for early detection, prognosis, and therapeutic response. The key involvement of microRNAs in cancer has also prompted interest in exploring the therapeutic value of microRNAs as anticancer drugs and drug targets. MRX34, a liposome-formulated miRNA-34 mimic developed by Mirna Therapeutics, became the first microRNA therapeutic to enter clinical trials for the treatment of hepatocellular carcinoma, renal cell carcinoma, and melanoma. In this review, we present a general overview of microRNAs in cancer biology, the potential of microRNAs as cancer biomarkers and therapeutic targets, and associated challenges.


2021 ◽  
Vol 12 ◽  
Author(s):  
Yaozhong Liu ◽  
Na Liu ◽  
Fan Bai ◽  
Qiming Liu

Background: Atrial fibrillation (AF) is the most common arrhythmia. We aimed to construct competing endogenous RNA (ceRNA) networks associated with the susceptibility and persistence of AF by applying the weighted gene co-expression network analysis (WGCNA) and prioritize key genes using the random walk with restart on multiplex networks (RWR-M) algorithm. Methods: RNA sequencing results from 235 left atrial appendage samples were downloaded from the GEO database. The top 5,000 lncRNAs/mRNAs with the highest variance were used to construct a gene co-expression network using the WGCNA method. AF susceptibility- or persistence-associated modules were identified by correlating the module eigengene with the atrial rhythm phenotype. In a module-specific manner, ceRNA pairs of lncRNA–mRNA were predicted. The RWR-M algorithm was applied to calculate the proximity between lncRNAs and known AF protein-coding genes. Random forest classifiers, based on the expression value of key lncRNA-associated ceRNA pairs, were constructed and validated against an independent data set. Results: From the 21 identified modules, magenta and tan modules were associated with AF susceptibility, whereas turquoise and yellow modules were associated with AF persistence. ceRNA networks in magenta and tan modules were primarily involved in the inflammatory process, whereas ceRNA networks in turquoise and yellow modules were primarily associated with electrical remodeling. A total of 106 previously identified AF-associated protein-coding genes were found in the ceRNA networks, including 16 that were previously implicated in the genome-wide association study. Myocardial infarction–associated transcript (MIAT) and LINC00964 were prioritized as key lncRNAs through RWR-M. The classifiers based on their associated ceRNA pairs were able to distinguish AF from sinus rhythm with respective AUC values of 0.810 and 0.940 in the training set and 0.870 and 0.922 in the independent test set.
The AF-related single-nucleotide polymorphism rs35006907 was found in the intronic region of LINC00964 and negatively regulated the LINC00964 expression. Conclusion: Our study constructed AF susceptibility- and persistence-associated ceRNA networks, linked genetics with epigenetics, identified MIAT and LINC00964 as key lncRNAs, and constructed random forest classifiers based on their associated ceRNA pairs. These results will help us to better understand the mechanisms underlying AF from the ceRNA perspective and provide candidate therapeutic and diagnostic tools.

