Gene Expression Prediction Using Deep Neural Networks

2019 ◽  
Vol 17 (3) ◽  
pp. 422-431
Author(s):  
Raju Bhukya ◽  
Achyuth Ashok

In the field of molecular biology, gene expression is a term that encompasses all the information contained in an organism’s genome. Although researchers have developed several clinical techniques to quantitatively measure the expression of genes in an organism, these techniques are too costly to be used extensively. The NIH LINCS program revealed that human gene expressions are highly correlated. Further research at the University of California, Irvine (UCI) led to the development of D-GEX, a multilayer perceptron (MLP) model trained to predict unknown target expressions from previously identified landmark expressions. However, owing to hardware limitations, its developers split the target genes into different sets and constructed separate models to profile the whole genome. This paper proposes an alternative solution that combines a deep autoencoder with an MLP to overcome this bottleneck and improve prediction performance. The microarray-based Gene Expression Omnibus (GEO) dataset was employed to train the neural networks. Experimental results show that this new model, abbreviated as E-GEX, outperforms D-GEX by 16.64% in terms of overall prediction accuracy on the GEO dataset. The models were further tested on an RNA-Seq-based 1000G dataset, and E-GEX was found to be 49.23% more accurate than D-GEX.

Author(s):  
Raju Bhukya

The gene expression of an organism contains all the information that characterises its observable traits. Researchers have invested abundant time and money in measuring expressions quantitatively in laboratories. Because such techniques are too expensive to be widely used, the correlation between expressions of certain genes was exploited to develop statistical solutions. Pioneered by the National Institutes of Health Library of Integrated Network-Based Cellular Signatures (NIH LINCS) program, expression inference techniques have seen many improvements over the years. The Deep Learning for Gene Expression (D-GEX) project at the University of California, Irvine approached the problem from a machine learning perspective, leading to the development of a multi-layer feedforward neural network that infers target gene expressions from clinically measured landmark expressions. Still, the huge number of genes to be inferred from a limited set of known expressions vexed the researchers. Ignoring possible correlation between target genes, they partitioned the target genes randomly and built separate networks to infer their expressions. This paper proposes that the dimensionality of the target set can be virtually reduced using deep autoencoders. Feedforward networks are then used to predict the coded representation of target expressions. In spite of the reconstruction error of the autoencoder, the overall prediction error on the microarray-based Gene Expression Omnibus (GEO) dataset was reduced by 6.6% compared to D-GEX. An improvement of 16.64% was obtained on cross-platform normalized data obtained by combining the GEO dataset and an RNA-Seq-based 1000G dataset.
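
The two-stage idea in this abstract can be written compactly. The symbols below are assumptions introduced here for illustration and are not notation from the paper: x is the vector of landmark expressions, y the vector of target expressions, f and g the encoder and decoder of the autoencoder, and h the feedforward network that predicts the code.

```latex
\begin{aligned}
z &= f(y), \qquad \tilde{y} = g(z) && \text{(autoencoder trained on target expressions)} \\
\hat{z} &= h(x), \qquad \hat{y} = g(\hat{z}) && \text{(predict the code from landmarks, then decode)} \\
\lVert y - \hat{y} \rVert &\le \lVert y - g(f(y)) \rVert + \lVert g(f(y)) - g(h(x)) \rVert
\end{aligned}
```

The first term on the right is the reconstruction error that the abstract mentions; the second is the error introduced by predicting the code from the landmark genes. The bound is only the triangle inequality, so it says nothing about how large either term is in practice.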


2020 ◽  
Vol 16 ◽  
pp. 117693432092057
Author(s):  
Lijun Yu ◽  
Meiyan Wei ◽  
Fengyan Li

Despite advances in the treatment of cervical cancer (CC), the prognosis of patients with CC still needs to be improved. This study aimed to explore candidate gene targets for CC. CC datasets were downloaded from the Gene Expression Omnibus database. Genes with similar expression trends in varying steps of CC development were clustered using Short Time-series Expression Miner (STEM) software. Gene functions were then analyzed using the Gene Ontology (GO) database and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. Protein interactions among genes of interest were predicted, followed by drug-target genes and prognosis-associated genes. The expressions of the predicted genes were determined using real-time quantitative polymerase chain reaction (RT-qPCR) and Western blotting. Red and green profiles with upward and downward gene expressions, respectively, were screened using STEM software. Genes with increased expression were significantly enriched in DNA replication, cell-cycle-related biological processes, and the p53 signaling pathway. Based on the predicted results of the Drug-Gene Interaction database, 17 drug-gene interaction pairs, including 3 red profile genes (TOP2A, RRM2, and POLA1) and 16 drugs, were obtained. The Cancer Genome Atlas data analysis showed that high POLA1 expression was significantly correlated with prolonged survival, indicating that POLA1 is protective against CC. RT-qPCR and Western blotting showed that the expressions of TOP2A, RRM2, and POLA1 gradually increased in the multistep process of CC. TOP2A, RRM2, and POLA1 may be targets for the treatment of CC. However, further studies are needed to validate our findings.


2014 ◽  
Vol 2014 ◽  
pp. 1-8
Author(s):  
Tzu-Hao Chang ◽  
Shih-Lin Wu ◽  
Wei-Jen Wang ◽  
Jorng-Tzong Horng ◽  
Cheng-Wei Chang

Microarrays are widely used to assess gene expressions. Most microarray studies focus primarily on identifying differential gene expressions between conditions (e.g., cancer versus normal cells), for discovering the major factors that cause diseases. Because previous studies have not identified the correlations of differential gene expression between conditions, crucial but abnormal regulations that cause diseases might have been disregarded. This paper proposes an approach for discovering the condition-specific correlations of gene expressions within biological pathways. Because analyzing gene expression correlations is time consuming, an Apache Hadoop cloud computing platform was implemented. Three microarray data sets of breast cancer were collected from the Gene Expression Omnibus, and pathway information from the Kyoto Encyclopedia of Genes and Genomes was applied for discovering meaningful biological correlations. The results showed that adopting the Hadoop platform considerably decreased the computation time. Several correlations of differential gene expressions were discovered between the relapse and nonrelapse breast cancer samples, and most of them were involved in cancer regulation and cancer-related pathways. The results showed that breast cancer recurrence might be highly associated with the abnormal regulations of these gene pairs, rather than with their individual expression levels. The proposed method was computationally efficient and reliable, and stable results were obtained when different data sets were used. The proposed method is effective in identifying meaningful biological regulation patterns between conditions.
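
As a rough illustration of what "condition-specific correlations of gene expressions within biological pathways" could look like in code, the sketch below computes the Pearson correlation of every gene pair in one pathway separately for two conditions and reports the difference. It is an assumption-laden, single-machine example: the function names, the data layout, and the toy values are hypothetical, and it does not reproduce the paper's Hadoop implementation or its statistical criteria.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / sqrt(var_x * var_y)


def differential_correlations(cond_a, cond_b, pathway_genes):
    """Map each gene pair in a pathway to (r_a, r_b, r_a - r_b).

    cond_a and cond_b map a gene name to its expression values across the
    samples of one condition (hypothetical layout chosen for this sketch).
    """
    results = {}
    for g1, g2 in combinations(sorted(pathway_genes), 2):
        r_a = pearson(cond_a[g1], cond_a[g2])
        r_b = pearson(cond_b[g1], cond_b[g2])
        results[(g1, g2)] = (r_a, r_b, r_a - r_b)
    return results


# Toy, made-up values (not data from the paper): two genes, four samples each.
relapse = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [2.0, 4.0, 6.0, 8.0]}
nonrelapse = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [8.0, 6.0, 4.0, 2.0]}
diff = differential_correlations(relapse, nonrelapse, {"GENE_A", "GENE_B"})
```

In this toy input the pair is perfectly positively correlated in one condition and perfectly negatively correlated in the other, which is the kind of change in pairwise regulation, rather than in individual expression levels, that the abstract points to.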


2007 ◽  
Vol 25 (18_suppl) ◽  
pp. 7669-7669 ◽  
Author(s):  
C. Huang ◽  
D. Liu ◽  
J. Nakano ◽  
S. Ishikawa ◽  
H. Yokomise ◽  
...  

Background: Thymidylate synthase (TS) expression is related to 5-FU sensitivity. Survivin expression is associated with tumor apoptosis, an indicator used to predict the efficacy of chemotherapy. Recently, TS and survivin have been reported to be E2F1 target genes. We investigated the clinical significance of E2F1 gene expression in relation to the gene expressions of TS and survivin in non-small cell lung cancer (NSCLC). Methods: One hundred and twenty-seven NSCLC patients were investigated. Quantitative RT-PCR was performed to evaluate gene expressions of E2F1, TS, and survivin. The Ki-67 proliferation index and the apoptotic index using the TUNEL method were also evaluated. Results: E2F1 gene expression was significantly higher in stage II to III tumors than in stage I tumors (p=0.006). E2F1 gene expression significantly correlated with the Ki-67 proliferation index (p<0.001), while no correlation was observed between E2F1 gene expression and the apoptotic index. Regarding E2F1 target genes, E2F1 gene expression significantly correlated with TS gene expression (p<0.001). E2F1 gene expression also significantly correlated with survivin gene expression (p<0.001). TS expression and survivin expression significantly correlated with the Ki-67 proliferation index (p<0.001 and p<0.001, respectively). There was a significant inverse relationship between survivin expression and the apoptotic index (p<0.001). Overall survival was significantly lower in patients with high-E2F1 tumors than in those with low-E2F1 tumors (p=0.002), especially among patients with stage II to III NSCLCs (p=0.018). Cox regression analysis demonstrated that E2F1 status was a significant prognostic factor for NSCLC patients (p=0.026). Conclusions: The present study revealed that E2F1 gene expression correlates with TS and survivin gene expressions and with tumor proliferation. E2F1 overexpression could produce more aggressive tumors with a high proliferation rate and chemo-resistance during the progression of NSCLCs. The suppression of E2F1 by RNA interference would be a useful strategy for cancer gene therapy. No significant financial relationships to disclose.


2020 ◽  
Author(s):  
Sida Zhou ◽  
Wanyu Sun ◽  
Xinyu Zhao ◽  
Yang Xu ◽  
Mengyu Zhang ◽  
...  

ABSTRACT Histone H3K4 methylation is catalysed by the multi-protein complex known as the Set1/COMPASS or MLL/COMPASS-like complex, an element that is highly evolutionarily conserved from yeast to humans. However, the components and mechanisms by which the COMPASS-like complex targets the H3K4 methylation of plant pathogenic genes in fungi remain elusive. Here we present a comprehensive analysis combining biochemical, molecular, and genome-wide approaches to characterize the roles of the COMPASS-like family in Magnaporthe oryzae, a model plant fungal pathogen. We purified and identified six conserved subunits of COMPASS from the rice blast fungus M. oryzae, i.e., MoBre2 (Cps60/ASH2L), MoSpp1 (Cps40/Cfp1), MoSwd2 (Cps35), MoSdc1 (Cps25/DPY30), MoSet1 (MLL/ALL) and MoRbBP5 (Cps50), using an affinity tag on MoBre2. We determined that the SPRY domain of MoBre2 can directly recognize the DPY30 domain of MoSdc1 in vitro. Furthermore, we found that deletion of the genes encoding the COMPASS subunits MoBre2, MoSpp1 and MoSwd2 caused similar defects in invasive hyphal development and pathogenicity. Genome-wide profiling of H3K4me3 revealed that it has remarkable co-occupancy at the TSS regions of target genes. Significantly, these target genes are often involved in spore germination and pathogenesis. Decreased gene expression caused by the deletion of the MoBre2, MoSwd2 or MoSpp1 gene was highly correlated with a decrease in H3K4me3. Taken together, these results suggest that MoBre2, MoSpp1, and MoSwd2 function as a whole COMPASS complex, contributing to fungal development and pathogenesis by regulating H3K4me3-targeted genes in M. oryzae.


F1000Research ◽  
2013 ◽  
Vol 2 ◽  
pp. 21 ◽  
Author(s):  
Y-h Taguchi

Background: miRNA regulation of target genes and promoter methylation are known to be the primary mechanisms underlying the epigenetic regulation of gene expression. However, how these two processes cooperatively regulate gene expression has not been extensively studied. Methods: Gene expression and promoter methylation profiles of 270 distinct human cell lines were obtained from Gene Expression Omnibus. P-values that describe both miRNA-targeted-gene promoter methylation and miRNA regulation of target genes were computed using the MiRaGE method proposed recently by the author. Results: Significant changes in promoter methylation were associated with miRNA targeting. It was also found that miRNA-targeted-gene promoter hypomethylation was related to differential target gene expression; the genes with miRNA-targeted-gene promoter hypomethylation were downregulated during cell senescence and upregulated during cellular differentiation. Promoter hypomethylation was especially enhanced for genes targeted by miR-548 miRNAs, which are non-conserved, primate-specific miRNAs that are typically expressed at lower levels than the frequently investigated conserved miRNAs. miRNA-targeted-gene promoter methylation may also be related to the seed region features of miRNA. Conclusions: It was found that promoter methylation was correlated to miRNA targeting. Furthermore, miRNA-targeted-gene promoter hypomethylation was especially enhanced in promoters of genes targeted by miRNAs that are not strongly expressed (e.g., miR-548 miRNAs) and was suggested to be highly related to some seed region features of miRNAs.


2020 ◽  
Vol 21 (3) ◽  
pp. 999
Author(s):  
Yung-Che Chen ◽  
Po-Yuan Hsu ◽  
Mao-Chang Su ◽  
Chien-Hung Chin ◽  
Chia-Wei Liou ◽  
...  

The purpose of this study is to explore the anti-inflammatory role of microRNAs (miR)-21 and miR-23 targeting the TLR/TNF-α pathway in response to chronic intermittent hypoxia with re-oxygenation (IHR) injury in patients with obstructive sleep apnea (OSA). Gene expression levels of miR-21/23a and their predicted target genes were assessed in peripheral blood mononuclear cells from 40 treatment-naive severe OSA patients and 20 matched subjects with primary snoring (PS). Human monocytic THP-1 cell lines were induced to undergo apoptosis under IHR exposures and transfected with miR-21-5p mimic. Both miR-21-5p and miR-23-3p gene expressions were decreased in OSA patients compared with those in PS subjects, while TNF-α gene expression was increased. Both miR-21-5p and miR-23-3p gene expressions were negatively correlated with apnea hypopnea index and oxygen desaturation index, while TNF-α gene expression was positively correlated with apnea hypopnea index. In vitro IHR treatment resulted in decreased miR-21-5p and miR-23-3p expressions. Apoptosis, cytotoxicity, and gene expressions of their predicted target genes—including TNF-α, ELF2, NFAT5, HIF-2α, IL6, IL6R, EDNRB, and TLR4—were all increased in response to IHR, while all were reversed with miR-21-5p mimic transfection under IHR condition. The findings provide biological insight into the mechanisms by which IHR-suppressed miRs protect against cell apoptosis by inhibiting inflammation, and indicate that over-expression of miR-21-5p may be a new therapy for OSA.


2020 ◽  
Author(s):  
Weijia Lu ◽  
Yunyu Wu ◽  
CanXiong Lu ◽  
Ting Zhu ◽  
ZhongLu Ren ◽  
...  

Abstract Objective: MicroRNAs (miRNAs) are considered to play an important role in the occurrence and development of ovarian cancer (OC). Although miRNAs have been widely recognized in ovarian cancer, the role of hsa-miR-30a-5p (miR-30a) in OC has not been fully elucidated. Methods: Through analysis of public datasets in the Gene Expression Omnibus (GEO) database and a literature review, the significance of miR-30a expression in OC was evaluated. Three mRNA datasets of OC and normal ovarian tissue, GSE14407, GSE18520 and GSE36668, were downloaded from GEO to find differentially expressed genes (DEGs). The target genes of hsa-miR-30a-5p were then predicted by miRWALK3.0 and TargetScan. Next, the gene overlap between the DEGs and the predicted target genes of miR-30a in OC was analyzed by Gene Ontology (GO) enrichment analysis. A protein-protein interaction (PPI) network was constructed with STRING and Cytoscape, and the effect of the hub genes on the prognosis of OC was analyzed. Results: A common pattern of up-regulation of miR-30a in OC was found. A total of 225 DEGs that were both OC-related and miR-30a-related were identified. Many DEGs are enriched in the interactions of intracellular matrix tissue, ion binding and biological process regulation. Among the 10 major hub genes analyzed by PPI, five were significantly related to poor overall survival of OC patients: the low expression of ESR1, MAPK10 and Tp53 and the high expression of YKT and NSF were related to poor prognosis of OC.
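
The overlap step described in the Methods can be sketched as a set intersection. This is a minimal illustration under stated assumptions: the function name and the placeholder gene identifiers are hypothetical, and the abstract does not say how the miRWALK3.0 and TargetScan predictions were combined, so the sketch simply takes one already-combined set of predicted targets.

```python
def overlapping_genes(degs, predicted_targets):
    """Return the sorted genes that are both differentially expressed and
    predicted miRNA targets.

    Both arguments are iterables of gene symbols. How the predictions of
    several tools are merged beforehand is left to the caller (assumption).
    """
    return sorted(set(degs) & set(predicted_targets))


# Placeholder identifiers, not genes or results from the study.
degs = ["GENE_1", "GENE_2", "GENE_3", "GENE_4"]
predicted_targets = ["GENE_2", "GENE_4", "GENE_5"]
shared = overlapping_genes(degs, predicted_targets)
```

The resulting list is what would then be passed on to enrichment analysis and PPI network construction in a workflow like the one the abstract outlines.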


2021 ◽  
Author(s):  
YiQun Ma ◽  
LISHI SHAO ◽  
CHEN SHI ◽  
JIAPING WANG

Abstract Background: Infection with hepatitis C virus (HCV) can cause hepatic fibrosis and cirrhosis, thereby significantly increasing the risk of HCC development. Many prior studies have shown that oncogenesis and cancer progression are governed by competing endogenous RNA (ceRNA) networks composed of long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and mRNAs. As such, we herein sought to identify and evaluate the prognostic relevance of a novel ceRNA network related to HCC associated with HCV. Methods: Differentially expressed genes (DEGs) in the GSE140845 Gene Expression Omnibus (GEO) dataset were identified using NetworkAnalyst, and were subjected to Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, and Reactome analyses. In addition, a protein-protein interaction (PPI) network was generated, and key hub genes were detected. Hub gene expression levels, as well as those of their upstream lncRNAs and miRNAs, were examined and associated survival analyses were conducted using appropriate bioinformatics databases. Predicted target relationships were additionally used to establish putative ceRNA networks for HCV-related HCC. Results: 372 upregulated and 360 downregulated significant DEGs were identified. Functional enrichment analyses suggested that DE-mRNAs were associated with nuclear division, the cell cycle, and ATPase activity. The top 11 genes with the greatest degree of connectivity among these DE-mRNAs were selected for subsequent prognostic evaluation. The differential expression of six of these candidate mRNAs (BUB1, BUB1B, CDC20, CDC45, CDK1, NDC80) in liver tissue was validated. After further analyses of the expression and prognostic relevance of the miRNAs and lncRNAs predicted to lie upstream of these DE-mRNAs, we identified 22 miRNAs and 4 lncRNAs significantly associated with poorer HCV-related HCC prognosis. By combining the results of these analyses, we also identified the BUB1-hsa-miR-193a-3p-MALAT1 ceRNA sub-network as being related to the survival of these patients. Conclusion: This study provides novel insights into the mRNA-miRNA-lncRNA ceRNA network and reveals potential lncRNA biomarkers in HCV-related HCC.
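
Selecting hub genes by "degree of connectivity" in a PPI network can be sketched as counting, for each gene, how many distinct interaction partners it has and keeping the top k. The sketch below is a hypothetical illustration: the function name, the edge-list representation, and the placeholder gene names are assumptions, and the study's actual tooling may rank nodes differently.

```python
from collections import Counter


def top_hub_genes(edges, k):
    """Return the k genes with the highest degree as (gene, degree) pairs.

    edges is an iterable of (gene_a, gene_b) interactions. Duplicate edges
    (in either orientation) and self-loops are ignored. Ties are broken
    alphabetically so the output is deterministic.
    """
    unique_edges = {tuple(sorted(edge)) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for gene_a, gene_b in unique_edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Placeholder network, not interactions from the study.
edges = [
    ("G1", "G2"),
    ("G1", "G3"),
    ("G1", "G4"),
    ("G2", "G3"),
    ("G3", "G2"),  # same interaction listed in reverse; counted once
]
hubs = top_hub_genes(edges, 2)
```

With the abstract's setting, k would be 11; here k is 2 only to keep the toy example small.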

