Predicting Methylation from Sequence and Gene Expression Using Deep Learning with Attention

2018 ◽  
Author(s):  
Alona Levy-Jurgenson ◽  
Xavier Tekpli ◽  
Vessela N. Kristensen ◽  
Zohar Yakhini

Abstract: DNA methylation has been extensively linked to alterations in gene expression, playing a key role in the manifestation of multiple diseases, most notably cancer. For this reason, researchers have long been measuring DNA methylation in living organisms. The relationship between methylation and expression, and between methylation in different genomic regions, is of great theoretical interest from a molecular biology perspective. Therefore, several models have been suggested to support the prediction of methylation status in samples. These models, however, have two main limitations: (a) they rely heavily on partially measured methylation levels as input, somewhat defeating the object, as one is required to collect measurements from the sample of interest before applying the model; and (b) they are largely based on human-mediated feature engineering, thus preventing the model from unveiling its own representations. To address these limitations we used deep learning, with an attention mechanism, to produce a general model that predicts DNA methylation for a given sample in any CpG position based solely on the sample's gene expression profile and the sequence surrounding the CpG. We show that our model is capable of generalizing to a completely separate test set of CpG positions and subjects. Depending on gene-CpG proximity conditions, our model can attain a Spearman correlation of up to 0.8 and MAE of 0.14 for thousands of CpG sites in the test data. We also identify and analyze several motifs and genes that our model suggests may be linked to methylation activity, such as Nodal and Hand1. Moreover, our approach, and most notably the use of attention mechanisms, offers a novel framework with which to extract valuable insights from gene expression data when combined with sequence information. The code and trained models are available at: https://github.com/YakhiniGroup/Methylation
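The two evaluation metrics quoted in this abstract, Spearman correlation and mean absolute error (MAE), can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the five methylation values below are hypothetical and chosen only to show how the two numbers are obtained from measured and predicted levels.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical methylation levels in [0, 1] for five CpG sites (illustrative only).
measured = [0.10, 0.35, 0.50, 0.80, 0.95]
predicted = [0.20, 0.30, 0.65, 0.60, 0.90]

rho = spearman(measured, predicted)
mae = mean_absolute_error(measured, predicted)
print(f"Spearman = {rho:.2f}, MAE = {mae:.2f}")
```

Spearman correlation rewards getting the ordering of sites right, while MAE measures how far the predicted levels are from the measured ones, so the two metrics can disagree and are reported together.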

2021 ◽  
Author(s):  
Tianyu Dong ◽  
Xiaoyan Wei ◽  
Qianting Qi ◽  
Peilei Chen ◽  
Yanqing Zhou ◽  
...  

Abstract Background: Epigenetic regulation plays a significant role in the accumulation of plant secondary metabolites. Terpenoids are the most abundant plant secondary metabolites, and iridoid glycosides, which are monoterpenoids, are among the main medicinal components of R. glutinosa. At present, research on iridoid glycosides focuses mainly on their pharmacology, accumulation and distribution, while the mechanism of their biosynthesis and the relationship between DNA methylation and plant terpene biosynthesis are seldom reported. Results: The expression of DXS, DXR, 10HGO, G10H and GPPS and the accumulation of iridoid glycosides first increased and then decreased as R. glutinosa matured. Under different concentrations of 5-azaC, the expression of DXS, DXR, 10HGO, G10H and GPPS and the accumulation of total iridoid glycosides were promoted, and the promoting effect was more significant at low concentrations (15 μM-50 μM). The 5mC content of genomic DNA decreased significantly, and the DNA methylation status of the R. glutinosa genome also changed. DNA demethylation promoted gene expression and increased the accumulation of iridoid glycosides, but excessive demethylation inhibited gene expression and decreased the accumulation of iridoid glycosides. Conclusion: The analysis of DNA methylation, gene expression, and accumulation of iridoid glycosides provides insights into the accumulation of terpenoids in R. glutinosa and lays a foundation for future studies on the effects of epigenetics on the synthesis of secondary metabolites.


Genes ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 931 ◽  
Author(s):  
Saurav Mallik ◽  
Soumita Seth ◽  
Tapas Bhadra ◽  
Zhongming Zhao

DNA methylation change has been useful for cancer biomarker discovery, classification, and potential treatment development. So far, existing methods use either differentially methylated CpG sites or combined CpG sites, namely differentially methylated regions, that can be mapped to genes. However, such methylation signal mapping has limitations. To address these limitations, in this study, we introduced a combinatorial framework using linear regression, differential expression, and a deep learning method for accurate biological interpretation of DNA methylation through integrating DNA methylation data and corresponding TCGA gene expression data. We demonstrated it for uterine cervical cancer. First, we pre-filtered outliers from the data set and then determined the predicted gene expression value from the pre-filtered methylation data through linear regression. We identified differentially expressed genes (DEGs) by an empirical Bayes test using Limma. Then we applied a deep learning method, “nnet”, to classify the cervical cancer labels from those DEGs and to determine classification metrics, including accuracy and area under the curve (AUC), through 10-fold cross-validation. We applied our approach to a uterine cervical cancer DNA methylation dataset (NCBI accession ID: GSE30760, 27,578 features covering 63 tumor and 152 matched normal samples). After linear regression and differential expression analysis, we obtained 6287 DEGs with false discovery rate (FDR) <0.001. After performing deep learning analysis, we obtained an average classification accuracy of 90.69% (±1.97%) for the uterine cervical cancer labels. This performance is better than that of other peer methods. We performed in-degree and out-degree hub gene network analysis using Cytoscape. We reported five top in-degree genes (PAIP2, GRWD1, VPS4B, CRADD and LLPH) and five top out-degree genes (MRPL35, FAM177A1, STAT4, ASPSCR1 and FABP7). After that, we performed KEGG pathway and Gene Ontology enrichment analysis of the DEGs using the WebGestalt tool (WEB-based Gene SeT AnaLysis Toolkit). In summary, our proposed framework, which integrates linear regression, differential expression, and deep learning, provides a robust approach to better interpret DNA methylation and gene expression data in disease studies.
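The 10-fold cross-validation step, which yields a mean accuracy with a spread such as the 90.69% (±1.97%) reported above, can be sketched as follows. This is only an illustration under stated assumptions: a simple nearest-centroid classifier stands in for the “nnet” model, the data are synthetic rather than GSE30760, and all function names are hypothetical.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier (not nnet): one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validate(X, y, k=10, seed=0):
    """Return mean and standard deviation of per-fold accuracy."""
    accuracies = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = fit_centroids(train_X, train_y)
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return mean(accuracies), stdev(accuracies)


# Synthetic two-class data with two features per sample (illustrative only).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(50)]
y = ["normal"] * 50 + ["tumor"] * 50

mean_acc, sd_acc = cross_validate(X, y, k=10)
print(f"accuracy = {mean_acc:.2%} (±{sd_acc:.2%})")
```

Each sample is held out exactly once, so the reported spread reflects variation across the ten held-out folds rather than across repeated training runs.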


2015 ◽  
Vol 137 (2) ◽  
Author(s):  
Julia C. Chen ◽  
Mardonn Chua ◽  
Raymond B. Bellon ◽  
Christopher R. Jacobs

Osteogenic lineage commitment is often evaluated by analyzing gene expression. However, many genes are transiently expressed during differentiation. The availability of genes for expression is influenced by epigenetic state, which affects the heterochromatin structure. DNA methylation, a form of epigenetic regulation, is stable and heritable. Therefore, analyzing methylation status may be less temporally dependent and more informative for evaluating lineage commitment. Here we analyzed the effect of mechanical stimulation on osteogenic differentiation by applying fluid shear stress for 24 hr to osteocytes and then applying the osteocyte-conditioned medium (CM) to progenitor cells. We analyzed gene expression and changes in DNA methylation after 24 hr of exposure to the CM using quantitative real-time polymerase chain reaction and bisulfite sequencing. With fluid shear stress stimulation, methylation decreased for both adipogenic and osteogenic markers, which typically increases availability of genes for expression. After only 24 hr of exposure to CM, we also observed increases in expression of later osteogenic markers that are typically observed to increase after seven days or more with biochemical induction. However, we observed a decrease or no change in early osteogenic markers and decreases in adipogenic gene expression. Treatment with a demethylating agent produced an increase in expression of all genes. The results indicate that fluid shear stress stimulation rapidly promotes the availability of genes for expression, but also specifically increases gene expression of later osteogenic markers.


2020 ◽  
Vol 14 ◽  
Author(s):  
Mette Soerensen ◽  
Dominika Marzena Hozakowska-Roszkowska ◽  
Marianne Nygaard ◽  
Martin J. Larsen ◽  
Veit Schwämmle ◽  
...  

2014 ◽  
Vol 34 (suppl_1) ◽  
Author(s):  
Jessilyn Dunn ◽  
Haiwei Qiu ◽  
Soyeon Kim ◽  
Daudi Jjingo ◽  
Ryan Hoffman ◽  
...  

Atherosclerosis preferentially occurs in arterial regions of disturbed blood flow (d-flow), which alters gene expression, endothelial function, and atherosclerosis. Here, we show that d-flow regulates genome-wide DNA methylation patterns in a DNA methyltransferase (DNMT)-dependent manner. We found that d-flow induced expression of DNMT1, but not DNMT3a or DNMT3b, in mouse arterial endothelium in vivo and in cultured endothelial cells by oscillatory shear (OS) compared to unidirectional laminar shear in vitro. The DNMT inhibitor 5-Aza-2’deoxycytidine (5Aza) or DNMT1 siRNA significantly reduced OS-induced endothelial inflammation. Moreover, 5Aza reduced lesion formation in two atherosclerosis models using ApoE-/- mice (western diet for 3 months and the partial carotid ligation model with western diet for 3 weeks). To identify the 5Aza mechanisms, we conducted two genome-wide studies: reduced representation bisulfite sequencing (RRBS) and transcript microarray using endothelial-enriched gDNA and RNA, respectively, obtained from the partially-ligated left common carotid artery (LCA exposed to d-flow) and the right contralateral control (RCA exposed to s-flow) of mice treated with 5Aza or vehicle. D-flow induced DNA hypermethylation in 421 gene promoters, which was significantly prevented by 5Aza in 335 genes. Systems biological analyses using the RRBS and the transcriptome data revealed 11 mechanosensitive genes whose promoters were hypermethylated by d-flow but rescued by 5Aza treatment. Of those, five genes contain hypermethylated cAMP-response-elements in their promoters, including the transcription factors HoxA5 and Klf3. Their methylation status could serve as a mechanosensitive master switch in endothelial gene expression. Our results demonstrate that d-flow controls epigenomic DNA methylation patterns in a DNMT-dependent manner, which in turn alters endothelial gene expression and induces atherosclerosis.


2015 ◽  
Vol 11 (7) ◽  
pp. 1786-1793 ◽  
Author(s):  
Yuanyuan Zhang ◽  
Junying Zhang

DNA methylation is essential not only in cellular differentiation but also in diseases.

