scholarly journals BiLSTM-5mC: A Bidirectional Long Short-Term Memory-Based Approach for Predicting 5-Methylcytosine Sites in Genome-Wide DNA Promoters

Molecules ◽  
2021 ◽  
Vol 26 (24) ◽  
pp. 7414
Author(s):  
Xin Cheng ◽  
Jun Wang ◽  
Qianyue Li ◽  
Taigang Liu

An important reason of cancer proliferation is the change in DNA methylation patterns, characterized by the localized hypermethylation of the promoters of tumor-suppressor genes together with an overall decrease in the level of 5-methylcytosine (5mC). Therefore, identifying the 5mC sites in the promoters is a critical step towards further understanding the diverse functions of DNA methylation in genetic diseases such as cancers and aging. However, most wet-lab experimental techniques are often time consuming and laborious for detecting 5mC sites. In this study, we proposed a deep learning-based approach, called BiLSTM-5mC, for accurately identifying 5mC sites in genome-wide DNA promoters. First, we randomly divided the negative samples into 11 subsets of equal size, one of which can form the balance subset by combining with the positive samples in the same amount. Then, two types of feature vectors encoded by the one-hot method, and the nucleotide property and frequency (NPF) methods were fed into a bidirectional long short-term memory (BiLSTM) network and a full connection layer to train the 22 submodels. Finally, the outputs of these models were integrated to predict 5mC sites by using the majority vote strategy. Our experimental results demonstrated that BiLSTM-5mC outperformed existing methods based on the same independent dataset.

2020 ◽  
Author(s):  
Abdolreza Nazemi ◽  
Johannes Jakubik ◽  
Andreas Geyer-Schulz ◽  
Frank J. Fabozzi

2021 ◽  
Vol 11 (14) ◽  
pp. 6625
Author(s):  
Yan Su ◽  
Kailiang Weng ◽  
Chuan Lin ◽  
Zeqin Chen

An accurate dam deformation prediction model is vital to a dam safety monitoring system, as it helps assess and manage dam risks. Most traditional dam deformation prediction algorithms ignore the interpretation and evaluation of variables and lack qualitative measures. This paper proposes a data processing framework that uses a long short-term memory (LSTM) model coupled with an attention mechanism to predict the deformation response of a dam structure. First, the random forest (RF) model is introduced to assess the relative importance of impact factors and screen input variables. Secondly, the density-based spatial clustering of applications with noise (DBSCAN) method is used to identify and filter the equipment based abnormal values to reduce the random error in the measurements. Finally, the coupled model is used to focus on important factors in the time dimension in order to obtain more accurate nonlinear prediction results. The results of the case study show that, of all tested methods, the proposed coupled method performed best. In addition, it was found that temperature and water level both have significant impacts on dam deformation and can serve as reliable metrics for dam management.


Sign in / Sign up

Export Citation Format

Share Document