Integrated multi-omics data analysis identifies a novel genetics-risk gene of IRF4 associated with prognosis of oral cavity cancer

Oral cavity cancer (OCC) is one of the most common carcinomas. Recent genome-wide association studies (GWAS) have reported numerous genetic variants associated with OCC susceptibility. However, the regulatory mechanisms through which these variants contribute to OCC remain largely unclear. By combining GWAS summary statistics (N = 4,151) with expression quantitative trait loci (eQTL) across 49 different tissues from the GTEx database, we performed an integrative genomics analysis to uncover novel risk genes associated with OCC. By leveraging various computational methods based on multi-omics data, risk genes were prioritized as promising candidate genes for drug repurposing in OCC. Using two independent computational algorithms, we found 14 risk genes whose genetically modulated expression showed a notable association with OCC. Among them, nine genes were newly identified, such as IRF4 (P = 2.5 × 10−9 and P = 1.06 × 10−4), TNS3 (P = 1.44 × 10−6 and P = 4.45 × 10−3), ZFP90 (P = 2.37 × 10−6 and P = 2.93 × 10−4), and DRD2 (P = 2.0 × 10−5 and P = 6.12 × 10−3). These 14 genes were significantly overrepresented in several cancer-related terms, and 10 of the 14 genes were enriched in 10 potential druggable gene categories. Based on differential gene expression analysis, the majority of these genes (71.43%) showed marked differential expression between OCC patients and paracancerous controls. By integrating multi-omics evidence from genetics, eQTL, and gene expression, we identified IRF4 as the novel risk gene with the highest-ranked risk score for OCC. Survival analysis showed that dysregulation of IRF4 expression was significantly associated with patient outcomes (P = 8.1 × 10−5). In summary, we prioritized 14 OCC-associated genes, including nine novel risk genes and especially IRF4, which provides a drug repurposing resource for developing therapeutic drugs for oral cancer.

A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome

10.1101/563379 ◽

2019 ◽

Cited By ~ 2

Author(s):

Tom G Richardson ◽

Gibran Hemani ◽

Tom R Gaunt ◽

Caroline L Relton ◽

George Davey Smith

Keyword(s):

Gene Expression ◽

Genetic Variants ◽

Complex Traits ◽

Mendelian Randomization ◽

Drug Repositioning ◽

Association Studies ◽

Thyroid Tissue ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Genome Wide

Background: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits. Results: Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10−08) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations. Conclusions: The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.
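
As a rough illustration of the Mendelian randomization principle mentioned in the abstract above, the sketch below computes a single-variant Wald ratio, one common estimator when only one instrument is available. The abstract does not say which estimator the authors used, so the function name `wald_ratio`, its parameters, and the input numbers are assumptions made purely for illustration.

```python
import math


def wald_ratio(beta_expression, beta_trait, se_trait):
    """Single-instrument MR estimate of the effect of expression on a trait.

    beta_expression: SNP effect on gene expression (eQTL effect size)
    beta_trait:      SNP effect on the trait (GWAS effect size)
    se_trait:        standard error of beta_trait

    Hypothetical helper for illustration; not taken from the study.
    """
    if beta_expression == 0:
        raise ValueError("the variant must affect expression to act as an instrument")
    estimate = beta_trait / beta_expression
    # First-order approximation: ignores uncertainty in the eQTL effect.
    se = abs(se_trait / beta_expression)
    z = estimate / se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Made-up inputs, compared against the genome-wide threshold quoted in the abstract.
estimate, se, p_value = wald_ratio(beta_expression=0.5, beta_trait=0.1, se_trait=0.02)
passes_threshold = p_value < 5e-8
```

A real analysis would also need the colocalization check described in the abstract, which this sketch does not attempt.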

Integrative Genomic Analysis for the Discovery of Biomarkers in Prostate Cancer

Biomarker Insights ◽

10.4137/bmi.s13729 ◽

2014 ◽

Vol 9 ◽

pp. BMI.S13729 ◽

Cited By ~ 5

Author(s):

Chindo Hicks ◽

Tejaswi Koganti ◽

Shankar Giri ◽

Memory Tekere ◽

Ritika Ramani ◽

...

Keyword(s):

Gene Expression ◽

Prostate Cancer ◽

Gene Expression Data ◽

Genetic Variants ◽

Association Studies ◽

Biological Pathways ◽

Great Success ◽

Genome Wide Association Studies ◽

Expression Data ◽

Increased Risk

Genome-wide association studies (GWAS) have achieved great success in identifying single nucleotide polymorphisms (SNPs, herein called genetic variants) and genes associated with risk of developing prostate cancer. However, GWAS do not typically link the genetic variants to the disease state or inform the broader context in which the genetic variants operate. Here, we present a novel integrative genomics approach that combines GWAS information with gene expression data to infer the causal association between gene expression and the disease and to identify the network states and biological pathways enriched for genetic variants. We identified gene regulatory networks and biological pathways enriched for genetic variants, including the prostate cancer, IGF-1, JAK2, androgen, and prolactin signaling pathways. The integration of GWAS information with gene expression data provides insights about the broader context in which genetic variants associated with an increased risk of developing prostate cancer operate.

Leveraging brain cortex-derived molecular data to elucidate epigenetic and transcriptomic drivers of neurological function and disease

10.1101/429134 ◽

2018 ◽

Author(s):

Charlie Hatcher ◽

Caroline L. Relton ◽

Tom R. Gaunt ◽

Tom G. Richardson

Keyword(s):

Gene Expression ◽

Dna Methylation ◽

Histone Acetylation ◽

Genetic Variants ◽

Large Scale ◽

Genetic Variant ◽

Association Studies ◽

Genome Wide Association Studies ◽

Epigenetic Mechanisms ◽

Trait Variation

Integrative approaches which harness large-scale molecular datasets can help develop mechanistic insight into findings from genome-wide association studies (GWAS). We have performed extensive analyses to uncover transcriptional and epigenetic processes which may play a role in neurological trait variation. This was undertaken by applying Bayesian multiple-trait colocalization systematically across the genome to identify genetic variants responsible for influencing intermediate molecular phenotypes as well as neurological traits. In this analysis we leveraged high-dimensional quantitative trait loci data derived from prefrontal cortex tissue (concerning gene expression, DNA methylation and histone acetylation) and GWAS findings for 5 neurological traits (Neuroticism, Schizophrenia, Educational Attainment, Insomnia and Alzheimer’s disease). There was evidence of colocalization for 118 associations, suggesting that the same underlying genetic variant influenced both nearby gene expression and neurological trait variation. Of these, 73 associations provided evidence that the genetic variant also influenced proximal DNA methylation and/or histone acetylation. These findings support previous evidence at loci where epigenetic mechanisms may putatively mediate effects of genetic variants on traits, such as KLC1 and schizophrenia. We also uncovered evidence implicating novel loci in neurological disease susceptibility, including genes expressed predominantly in brain tissue such as MDGA1, KIRREL3 and SLC12A5. An inverse relationship between DNA methylation and gene expression was observed more often than can be accounted for by chance, supporting previous findings implicating DNA methylation as a transcriptional repressor. Our study should prove valuable in helping future studies prioritise candidate genes and epigenetic mechanisms for in-depth functional follow-up analyses.

A novel quantile regression approach for eQTL discovery

10.1101/070052 ◽

2016 ◽

Author(s):

Xiaoyu Song ◽

Gen Li ◽

Iuliana Ionita-Laza ◽

Ying Wei

Keyword(s):

Gene Expression ◽

Genetic Variation ◽

Linear Regression ◽

Genetic Variants ◽

Molecular Mechanisms ◽

Association Studies ◽

Expression Level ◽

Special Focus ◽

Genome Wide Association Studies ◽

Genome Wide

Over the past decade, there has been a remarkable improvement in our understanding of the role of genetic variation in complex human diseases, especially via genome-wide association studies. However, the underlying molecular mechanisms are still poorly characterized, impeding the development of therapeutic interventions. Identifying genetic variants that influence the expression level of a gene, i.e. expression quantitative trait loci (eQTLs), can help us understand how genetic variants influence traits at the molecular level. While most eQTL studies focus on identifying mean effects on gene expression using linear regression, evidence suggests that genetic variation can impact the entire distribution of the expression level. Indeed, several studies have already investigated higher-order associations with a special focus on detecting heteroskedasticity. In this paper, we develop a Quantile Rank-score Based Test (QRBT) to identify eQTLs that are associated with the conditional quantile functions of gene expression. We have applied the proposed QRBT to the Genotype-Tissue Expression project, an international tissue bank for studying the relationship between genetic variation and gene expression in human tissues, and found that the proposed QRBT complements the existing methods and identifies new eQTLs with heterogeneous effects genome-wide across different quantile levels. Notably, we show that the eQTLs identified by QRBT but missed by linear regression are more likely to be tissue specific, and are also associated with greater enrichment in genome-wide significant SNPs from the GWAS catalog. An R package implementing QRBT is available on our website.
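
To make the idea of quantile-specific effects more concrete, the sketch below shows the check (pinball) loss that quantile regression minimises, and compares sample quantiles of expression between two genotype groups. It is not the QRBT described in the abstract; the function names and the toy expression values are assumptions chosen only to show how a variant can leave the median unchanged while shifting an upper quantile.

```python
def pinball_loss(residual, tau):
    """Check loss used by quantile regression at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def empirical_quantile(values, tau):
    """A sample quantile, which minimises the summed pinball loss over the sample."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(tau * len(ordered)))
    return ordered[index]


def quantile_effects(reference, carriers, taus):
    """Difference in expression quantiles between carriers and reference samples."""
    return {
        tau: empirical_quantile(carriers, tau) - empirical_quantile(reference, tau)
        for tau in taus
    }


# Toy data (made up): both groups share a median, but carriers have a heavier upper tail.
reference = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
carriers = [1, 2, 3, 4, 5, 6, 9, 12, 15, 18]
effects = quantile_effects(reference, carriers, taus=[0.1, 0.5, 0.9])
```

In this toy case the effect is zero at the lower and middle quantile levels and appears only at the upper level, which is the kind of heterogeneity a quantile-based test is designed to pick up.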

Functional Genetic Biomarkers of Alzheimer’s Disease and Gene Expression from Peripheral Blood

10.1101/2021.01.15.426891 ◽

2021 ◽

Author(s):

Andrew Ni ◽

Amish Sethi ◽

Keyword(s):

Gene Expression ◽

Alzheimer's Disease ◽

Peripheral Blood ◽

Genetic Variants ◽

Cell Activation ◽

Association Studies ◽

Gene Set Enrichment Analysis ◽

Machine Learning Techniques ◽

Genome Wide Association Studies

Detecting Alzheimer’s Disease (AD) at the earliest possible stage is key in advancing AD prevention and treatment but is challenged by normal aging processes in addition to other confounding neurodegenerative diseases. Recent genome-wide association studies (GWAS) have identified associated alleles, but it has been difficult to transition from non-coding genetic variants to underlying mechanisms of AD. Here, we sought to reveal functional genetic variants and diagnostic biomarkers underlying AD using machine learning techniques. We first developed a Random Forest (RF) classifier using microarray gene expression data sampled from the peripheral blood of 744 participants in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort. After initial feature selection, 5-fold cross-validation of the 100-gene RF classifier achieved an accuracy of 99.04%. The high accuracy of the RF classifier supports the possibility of a powerful and minimally invasive tool for screening of AD. Next, unsupervised clustering was used to validate and identify relationships among the differentially expressed genes (DEGs) selected by the RF, revealing 3 distinct AD clusters. Results suggest downregulation of global sulfatase and oxidoreductase activities in AD through mutations in SUMF1 and SMOX respectively. Then, we used Greedy Fast Causal Inference (GFCI) to find potential causes of AD within DEGs. In the causal graph, HLA-DPB1 and CYP4A11 emerge as hub genes, furthering the discussion of the immune system’s role in AD. Finally, we used Gene Set Enrichment Analysis (GSEA) to determine the biological pathways and processes underlying the DEGs that were highly correlated with AD. Cell activation in the immune system, glycosaminoglycan (GAG) binding, vascular dysfunction, oxidative stress, and the neuronal apoptotic process were revealed to be significantly enriched in AD. This study further advances the possibility of low-cost and noninvasive genetic screening for AD while also providing potential gene targets for further experimentation.
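
The 5-fold cross-validation mentioned in the abstract can be sketched as a partition of the 744 samples into five held-out folds. The snippet below covers only that partitioning step. The classifier, the feature selection, and the authors' actual splitting procedure are not described in the abstract, so the function name `k_fold_indices` and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists so that each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for held_out, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != held_out for idx in fold]
        yield train, test


# 744 is the cohort size quoted in the abstract; the seed is arbitrary.
splits = list(k_fold_indices(744, k=5, seed=1))
```

In a typical workflow a model would be fitted on each `train` list and scored on the matching `test` list, with the five accuracies then averaged. If feature selection is performed before splitting, the resulting accuracy can be optimistic, so it is usually repeated inside each training fold.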

Identification of Druggable Genes for Asthma by Integrated Genomic Network Analysis

Biomedicines ◽

10.3390/biomedicines10010113 ◽

2022 ◽

Vol 10 (1) ◽

pp. 113

Author(s):

Wirawan Adikusuma ◽

Wan-Hsuan Chou ◽

Min-Rou Lin ◽

Jafit Ting ◽

Lalu Muhammad Irham ◽

...

Keyword(s):

Severe Asthma ◽

Target Genes ◽

Inhaled Corticosteroids ◽

Association Studies ◽

Therapeutic Option ◽

Drug Repurposing ◽

Genome Wide Association Studies ◽

Biological Drugs ◽

Risk Genes ◽

Asthma Risk

Asthma is a common and heterogeneous disease characterized by chronic airway inflammation. Currently, the two main types of asthma medicines are inhaled corticosteroids and long-acting β2-adrenoceptor agonists (LABAs). In addition, biological drugs provide another therapeutic option, especially for patients with severe asthma. However, these drugs are less effective in preventing severe asthma exacerbations, and other drug options are still limited. Herein, we extracted asthma-associated single nucleotide polymorphisms (SNPs) from the genome-wide association study (GWAS) and phenome-wide association study (PheWAS) catalogs and prioritized candidate genes through five functional annotations. Genes enriched in more than two categories were defined as “biological asthma risk genes.” Then, DrugBank was used to match target genes with FDA-approved medications and identify candidate drugs for asthma. We discovered 139 biological asthma risk genes and identified 64 drugs targeting 22 of these genes. Seven of them were approved for asthma, including reslizumab, mepolizumab, theophylline, dyphylline, aminophylline, oxtriphylline, and enprofylline. We also found 17 drugs with clinical or preclinical evidence in treating asthma. In addition, eleven of the 40 candidate drugs were further identified as promising asthma therapies. Notably, IL6R is considered a target for asthma drug repurposing based on its high target scores. Through an in silico drug repurposing approach, we identified sarilumab and satralizumab as the most promising drugs for asthma treatment.
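
The prioritization rule in the abstract (genes supported by more than two of the five functional annotation categories) can be sketched as a simple counting score. The abstract does not name the five categories, so the category labels and gene names below are placeholders, and the function names are hypothetical.

```python
def annotation_scores(annotations):
    """Count the distinct annotation categories supporting each gene."""
    return {gene: len(set(categories)) for gene, categories in annotations.items()}


def select_risk_genes(annotations, more_than=2):
    """Return genes supported by more than `more_than` categories, best-supported first."""
    scores = annotation_scores(annotations)
    selected = [gene for gene, score in scores.items() if score > more_than]
    return sorted(selected, key=lambda gene: (-scores[gene], gene))


# Placeholder genes and categories, invented for illustration only.
example_annotations = {
    "GENE_A": ["category_1", "category_2", "category_3"],
    "GENE_B": ["category_1", "category_2"],
    "GENE_C": ["category_1", "category_2", "category_3", "category_4", "category_4"],
}
risk_genes = select_risk_genes(example_annotations)
```

Duplicate category entries are collapsed with `set` so that a gene cannot gain score from the same line of evidence twice. The selected genes would then be matched against a drug-target resource such as DrugBank, a step this sketch leaves out.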

Monocyte-specific changes in gene expression implicate LACTB2 and PLIN2 in Alzheimer’s disease

10.1101/2020.06.05.136275 ◽

2020 ◽

Author(s):

Janet C. Harwood ◽

Ganna Leonenko ◽

Rebecca Sims ◽

Valentina Escott-Price ◽

Julie Williams ◽

...

Keyword(s):

Gene Expression ◽

Alzheimer's Disease ◽

Association Studies ◽

Genetic Data ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Risk Genes ◽

Genome Wide ◽

Immune Pathways

More than 50 genetic loci have been identified as being associated with Alzheimer’s disease (AD) from genome-wide association studies (GWAS), and many of these are involved in immune pathways and lipid metabolism. Therefore, we performed a transcriptome-wide association study (TWAS) of immune-relevant cells to study the mis-regulation of genes implicated in AD. We used expression and genetic data from naive and induced CD14+ monocytes and two GWAS of AD to study genetically controlled gene expression in monocytes at different stages of differentiation, and compared the results with those from TWAS of brain and blood. We identified nine genes with statistically independent TWAS signals: seven are known AD risk genes from GWAS (BIN1, PTK2B, SPI1, MS4A4A, MS4A6E, APOE and PVR), and two, LACTB2 and PLIN2/ADRP, are novel candidate genes for AD. Three genes, SPI1, PLIN2 and LACTB2, are TWAS significant specifically in monocytes. LACTB2 is a mitochondrial endoribonuclease, and PLIN2/ADRP associates with intracellular neutral lipid storage droplets (LSDs), which have been shown to play a role in the regulation of the immune response. Notably, LACTB2 and PLIN2 were not detected from GWAS alone.

Prioritization of schizophrenia risk genes from GWAS results by integrating multi-omics data

Translational Psychiatry ◽

10.1038/s41398-021-01294-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dan He ◽

Cong Fan ◽

Mengling Qi ◽

Yuedong Yang ◽

David N. Cooper ◽

...

Keyword(s):

Molecular Mechanisms ◽

De Novo ◽

Association Studies ◽

Bayesian Framework ◽

Snp Markers ◽

Genome Wide Association Studies ◽

Omics Data ◽

De Novo Mutations ◽

Gwas Study ◽

Risk Genes

Schizophrenia (SCZ) is a polygenic disease with a heritability approaching 80%. Over 100 SCZ-related loci have so far been identified by genome-wide association studies (GWAS). However, the risk genes associated with these loci often remain unknown. We present a new risk gene predictor, rGAT-omics, that integrates multi-omics data under a Bayesian framework by combining the Hotelling and Box–Cox transformations. The Bayesian framework was constructed using gene ontology, tissue-specific protein–protein networks, and multi-omics data including differentially expressed genes in SCZ and controls, distance from genes to the index single-nucleotide polymorphisms (SNPs), and de novo mutations. The application of rGAT-omics to the 108 loci identified by a recent GWAS of SCZ predicted 103 high-risk genes (HRGs) that explain a high proportion of SCZ heritability (Enrichment = 43.44 and $p = 9.30 \times 10^{-9}$). HRGs were shown to be significantly ($p_{\mathrm{adj}} = 5.35 \times 10^{-7}$) enriched in genes associated with neurological activities, and more likely to be expressed in brain tissues and SCZ-associated cell types than background genes. The predicted HRGs included 16 novel genes not present in any existing databases of SCZ-associated genes or previously predicted to be SCZ risk genes by any other method. More importantly, 13 of these 16 genes were not the nearest to the index SNP markers and would therefore have been difficult to identify as risk genes by conventional approaches, while ten of the 16 genes are associated with neurological functions that make them prime candidates for pathological involvement in SCZ. Therefore, rGAT-omics has revealed novel insights into the molecular mechanisms underlying SCZ and could provide potential clues to future therapies.

An Integrative Genomics Approach to Biomarker Discovery in Breast Cancer

Cancer Informatics ◽

10.4137/cin.s6837 ◽

2011 ◽

Vol 10 ◽

pp. CIN.S6837 ◽

Cited By ~ 17

Author(s):

Chindo Hicks ◽

Rozana Asfour ◽

Antonio Pannuti ◽

Lucio Miele

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Gene Expression Data ◽

Genetic Variants ◽

Association Studies ◽

Biological Pathways ◽

Genome Wide Association Studies ◽

Expression Data ◽

Integrative Genomics ◽

Novel Genes

Genome-wide association studies (GWAS) have successfully identified genetic variants associated with risk for breast cancer. However, the molecular mechanisms through which the identified variants confer risk or influence phenotypic expression remain poorly understood. Here, we present a novel integrative genomics approach that combines GWAS information with gene expression data to assess the combined contribution of multiple genetic variants acting within genes and putative biological pathways, and to identify novel genes and biological pathways that could not be identified using traditional GWAS. The results show that genes containing SNPs associated with risk for breast cancer are functionally related and interact with each other in biological pathways relevant to breast cancer. Additionally, we identified novel genes that are co-expressed and interact with genes containing SNPs associated with breast cancer. Integrative analysis combining GWAS information with gene expression data provides functional bridges between GWAS findings and biological pathways involved in breast cancer.

Brain transcriptional regulatory architecture and schizophrenia etiology converge between East Asian and European ancestral populations

10.1101/2021.02.04.922880 ◽

2021 ◽

Author(s):

Sihan Liu ◽

Yu Chen ◽

Feiran Wang ◽

Yi Jiang ◽

Fangyuan Duan ◽

...

Keyword(s):

Gene Expression ◽

Large Scale ◽

Association Studies ◽

Allelic Frequency ◽

Genome Wide Association Studies ◽

Disease Etiology ◽

Risk Genes ◽

Regulatory Architecture ◽

Genome Wide ◽

Transcriptional Regulatory

Understanding the genetic architecture of gene expression and splicing in human brain is critical to unlocking the mechanisms of complex neuropsychiatric disorders like schizophrenia (SCZ). Large-scale brain transcriptomic studies are based primarily on populations of European (EUR) ancestry. The uniformity of mono-racial resources may limit important insights into the disease etiology. Here, we characterized the brain transcriptional regulatory architecture of East Asians (EAS; n=151), identifying 3,278 expression quantitative trait loci (eQTL) and 4,726 spliceQTL (sQTL). Comparing these to PsychENCODE/BrainGVEX confirmed our hypothesis that the transcriptional regulatory architecture in EAS and EUR brains aligns. Furthermore, distinctive allelic frequency and linkage disequilibrium impede QTL translation and gene-expression prediction accuracy. Integration of eQTL/sQTL with genome-wide association studies reveals common and novel SCZ risk genes. Pathway-based analyses showing shared SCZ biology point to synaptic and GTPase dysfunction as a prospective pathogenesis. This study elucidates the transcriptional landscape of the EAS brain and emphasizes an essential convergence between EAS and EUR populations.
