Prediction Model of Organic Molecular Absorption Energies based on Deep Learning trained by Chaos-enhanced Accelerated Evolutionary algorithm

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Mengshan Li ◽  
Suyun Lian ◽  
Fan Wang ◽  
Yanying Zhou ◽  
Bingsheng Chen ◽  
...  

As an important physical property of molecules, absorption energy can characterize the electronic properties and structural information of molecules, so the accurate calculation of molecular absorption energies is highly valuable. Existing linear and nonlinear methods have low calculation accuracy because of large errors, especially for molecular systems with irregular, complicated structures. Thus, developing a prediction model for molecular absorption energies with enhanced accuracy, efficiency, and stability is highly beneficial. By combining deep learning and intelligent algorithms, we propose a prediction model based on the chaos-enhanced accelerated particle swarm optimization algorithm and a deep artificial neural network (CAPSO BP DNN) with a seven-layer 8-4-4-4-4-4-1 structure. Eight parameters related to molecular absorption energies are selected as inputs: the theoretically calculated value Ec of the absorption energy (B3LYP/STO-3G), molecular electron number Ne, oscillator strength Os, number of double bonds Ndb, total number of atoms Na, number of hydrogen atoms Nh, number of carbon atoms Nc, and number of nitrogen atoms NN. One parameter representing the molecular absorption energy is the output. A prediction experiment on organic molecular absorption energies indicates that CAPSO BP DNN exhibits favourable predictive performance, accuracy, and correlation. The tested absolute average relative error, predicted root-mean-square error, and squared correlation coefficient are 0.033, 0.0153, and 0.9957, respectively. Relative to other prediction models, the CAPSO BP DNN model exhibits good overall prediction performance and can provide a reference for other fields in materials science, chemistry, and physics, such as nonlinear prediction of chemical and physical properties, QSAR/QAPR, and chemical information modelling.
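The three error measures reported in this abstract can be sketched as follows. This is a minimal illustration that assumes the conventional definitions (mean of |prediction − target| / |target|, root of the mean squared error, and the square of the Pearson correlation); the abstract does not state the exact formulas, and the function names are hypothetical.

```python
import math

def absolute_average_relative_error(y_true, y_pred):
    # Mean of |pred - true| / |true|; assumes no target value is zero.
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    # Square root of the mean squared difference.
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def squared_correlation(y_true, y_pred):
    # Square of the Pearson correlation between targets and predictions.
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)

# Made-up illustrative values, not data from the paper.
targets = [2.0, 4.0]
predictions = [1.0, 5.0]
print(absolute_average_relative_error(targets, predictions))
print(root_mean_square_error(targets, predictions))
print(squared_correlation(targets, predictions))
```

Note that a high squared correlation alone does not imply small errors: in the toy values above the predictions are perfectly correlated with the targets yet still off by one unit each, which is why the abstract reports all three measures together.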

2020 ◽  
Author(s):  
Ryosuke Kojima ◽  
Shoichi Ishida ◽  
Masateru Ohta ◽  
Hiroaki Iwata ◽  
Teruki Honma ◽  
...  

Deep learning is developing as an important technology to perform various tasks in cheminformatics. In particular, graph convolutional neural networks (GCNs) have been reported to perform well in many types of prediction tasks related to molecules. Although GCN exhibits considerable potential in various applications, using it appropriately to obtain reasonable and reliable prediction results requires a thorough understanding of GCN and programming. To leverage the power of GCN for a range of users, from chemists to cheminformaticians, an open-source GCN tool, kGCN, is introduced. To support users with various levels of programming skills, kGCN includes three interfaces: a graphical user interface (GUI) employing KNIME for users with limited programming skills, such as chemists, as well as command-line and Python library interfaces for users with advanced programming skills, such as cheminformaticians. To support the three steps required for building a prediction model, i.e., pre-processing, model tuning, and interpretation of results, kGCN includes functions for typical pre-processing, Bayesian optimization for automatic model tuning, and visualization of the atomic contribution to prediction for interpretation of results. kGCN supports three types of approaches: single-task, multi-task, and multimodal predictions. The prediction of compound-protein interaction for four matrix metalloproteases, MMP-3, -9, -12 and -13, in inhibition assays is performed as a representative case study using kGCN. Additionally, kGCN provides the visualization of atomic contributions to the prediction. Such visualization is useful for the validation of the prediction models and the design of molecules based on the prediction model, realizing “explainable AI” for understanding the factors affecting AI prediction. kGCN is available at https://github.com/clinfo/kGCN.


2020 ◽  
Vol 11 ◽  
pp. 374
Author(s):  
Masahito Katsuki ◽  
Yukinari Kakizawa ◽  
Akihiro Nishikawa ◽  
Yasunaga Yamamoto ◽  
Toshiya Uchiyama

Background: Reliable prediction models of subarachnoid hemorrhage (SAH) outcomes are needed for treatment decision-making. The SAFIRE score, which uses only four variables, is a good prediction scoring system. However, building such prediction models requires a large number of samples and time-consuming statistical analysis. Deep learning (DL), a form of artificial intelligence, is attractive, but there have been no reports on prediction models for SAH outcomes using DL. We therefore built a prediction model using the DL software Prediction One (Sony Network Communications Inc., Tokyo, Japan) and compared it to the SAFIRE score. Methods: We used data from 153 consecutive aneurysmal SAH patients treated in our hospital between 2012 and 2019. A modified Rankin Scale (mRS) score of 0–3 at 6 months was defined as a favorable outcome. We randomly divided the patients into a training dataset of 102 patients and an external validation dataset of 51 patients. Prediction One built the prediction model using the training dataset with internal cross-validation. We used both the created model and the SAFIRE score to predict outcomes on the external validation set, and compared the areas under the curve (AUCs). Results: The model made by Prediction One using 28 variables had an AUC of 0.848, and its AUC for the validation dataset was 0.953 (95% CI 0.900–1.000). The AUCs calculated using the SAFIRE score were 0.875 for the training dataset and 0.960 for the validation dataset. Conclusion: We easily and quickly built prediction models using Prediction One, even with a small single-center dataset. The accuracy of the model was not markedly inferior to that of previous statistically derived prediction models.
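The AUC comparison in this abstract can be illustrated with a minimal sketch. It computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen favorable-outcome case is scored above a randomly chosen unfavorable one, with ties counted as one half. The function name and the sample values are hypothetical, and this is not the procedure used by Prediction One, which the abstract does not describe.

```python
def area_under_curve(labels, scores):
    # labels: 1 for a favorable outcome, 0 otherwise; scores: model outputs.
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case of each class")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up illustrative values, not patient data.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
print(area_under_curve(example_labels, example_scores))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a validation set of 51 patients; a sort-based rank formula would be the usual choice for larger datasets.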


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Youjin Jang ◽  
Inbae Jeong ◽  
Yong K. Cho

Purpose: The study seeks to identify the impact of variables in a deep learning-based bankruptcy prediction model, which has achieved superior performance to other prediction models but whose hidden processes cannot easily be interpreted. Design/methodology/approach: This study developed three LSTM-RNN–based models that predicted the probability of bankruptcy 1, 2 and 3 years in advance, using financial, construction market and macroeconomic variables as input variables. Then, the impacts of the input variables that affected prediction accuracy in each model were identified using the Shapley value and compared among the three models. This study also investigated the prediction accuracy using subsets of input variables grouped sequentially by high-impact ranking. Findings: The results showed that the prediction accuracies were largely impacted by “housing starts” in all models. As the prediction period increased, the effects of macroeconomic variables on prediction accuracy increased, whereas the impact of “return on assets” on prediction accuracy decreased. The study also found that the “current ratio” and “debt ratio” significantly influenced the prediction accuracies in all models. In addition, the results revealed that similar prediction accuracies could be achieved using only 8, 10, and 10 variables out of a total of 18 variables for the 1-, 2-, and 3-year prediction models, respectively. Originality/value: This study provides a Shapley value-based approach to identify the impact of each input variable in a deep-learning bankruptcy prediction model. The findings of this study can not only assist in obtaining better insights into the underlying concept of bankruptcy but also be used to select variables by removing those identified as less significant.
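The Shapley value referred to in this abstract can be sketched for a toy case. The code averages each variable's marginal contribution over every ordering of the variables, which is the exact definition and is feasible only for a handful of variables. The variable names are borrowed from the abstract, but the value function `toy_accuracy` and its numbers are invented for illustration and are not the authors' model or results.

```python
from itertools import permutations

def shapley_values(features, value):
    # value(coalition) returns the payoff (e.g. accuracy) for a frozenset of features.
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            extended = coalition | {f}
            # Marginal contribution of f when added to the current coalition.
            totals[f] += value(extended) - value(coalition)
            coalition = extended
    return {f: total / len(orderings) for f, total in totals.items()}

def toy_accuracy(coalition):
    # Hypothetical accuracy: a baseline plus per-variable gains and one interaction.
    accuracy = 0.5
    if "housing starts" in coalition:
        accuracy += 0.2
    if "current ratio" in coalition:
        accuracy += 0.1
    if "current ratio" in coalition and "debt ratio" in coalition:
        accuracy += 0.05
    return accuracy

variables = ["housing starts", "current ratio", "debt ratio"]
for name, contribution in shapley_values(variables, toy_accuracy).items():
    print(name, round(contribution, 3))
```

The contributions sum to the gap between the accuracy with all variables and the accuracy with none, and the interaction gain is split equally between the two variables that produce it. With 18 variables, as in the study, exact enumeration over all orderings is impractical, so sampling-based approximations are normally used instead.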


2018 ◽  
Vol 57 (3) ◽  
pp. 547-570 ◽  
Author(s):  
Wanli Xing ◽  
Dongping Du

Massive open online courses (MOOCs) show great potential to transform traditional education through the Internet. However, the high attrition rates in MOOCs have often been cited as a scale-efficacy tradeoff. Traditional educational approaches are usually unable to identify such a large number of at-risk students in danger of dropping out in time to support effective intervention design. While building dropout prediction models using learning analytics is promising for informing intervention design for these at-risk students, the results of current prediction model construction methods do not enable personalized intervention for these students. In this study, we take an initial step to optimize dropout prediction model performance toward intervention personalization for at-risk students in MOOCs. Specifically, based on a temporal prediction mechanism, this study proposes using a deep learning algorithm to construct the dropout prediction model and to produce a predicted dropout probability for each individual student. By taking advantage of the power of deep learning, this approach not only constructs more accurate dropout prediction models than baseline algorithms but also offers a way to personalize and prioritize intervention for at-risk students in MOOCs using individual dropout probabilities. The findings from this study and their implications are then discussed.


2017 ◽  
Vol 2017 ◽  
pp. 1-6 ◽  
Author(s):  
Guohui Li ◽  
Songling Zhang ◽  
Hong Yang

To address the irregularity of nonlinear signals and the difficulty of predicting them, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. Firstly, the original data is decomposed by ESMD to obtain a finite number of intrinsic mode functions (IMFs) and residuals. Secondly, fuzzy c-means is used to cluster the decomposed components, and then a deep belief network (DBN) is used to predict them. Finally, the reconstructed IMFs and residuals give the final prediction results. Six prediction models are compared: the DBN prediction model, EMD-DBN prediction model, EEMD-DBN prediction model, CEEMD-DBN prediction model, ESMD-DBN prediction model, and the model proposed in this paper. The same sunspot time series is predicted with all six models. The experimental results show that the proposed model has better prediction accuracy and smaller error.


2019 ◽  
Vol 18 (10) ◽  
pp. 2099-2107 ◽  
Author(s):  
Shenheng Guan ◽  
Michael F. Moran ◽  
Bin Ma

Deep learning models for prediction of three key LC-MS/MS properties from peptide sequences were developed. The LC-MS/MS properties or behaviors are indexed retention times (iRT), MS1 or survey scan charge state distributions, and sequence ion intensities of HCD spectra. A common core deep supervised learning architecture, bidirectional long short-term memory (LSTM) recurrent neural networks, was used to construct the three prediction models. Two featurization schemes were proposed and demonstrated to allow for efficient encoding of modifications. The iRT and charge state distribution models were each trained with on the order of 10^5 data points. An HCD sequence ion prediction model was trained with 2 × 10^6 experimental spectra. The iRT prediction model and HCD sequence ion prediction model provide improved accuracies over the state-of-the-art models available in the literature. The MS1 charge state distribution prediction model offers excellent performance. The prediction models can be used to enhance peptide identification and quantification in data-dependent acquisition and data-independent acquisition (DIA) experiments as well as to assist MRM (multiple reaction monitoring) and PRM (parallel reaction monitoring) experiment design.


2020 ◽  
Vol 10 (5) ◽  
pp. 1597 ◽  
Author(s):  
Yoojeong Song ◽  
Jongwoo Lee

In Korea, because of the high interest in stock investment, many researchers have attempted to predict stock prices using deep learning, and such studies have been conducted continuously. However, the type of stock data that is suitable for deep learning has not been established, and it has not been confirmed that the developed stock prediction models can actually result in a profit. To date, designing a good deep learning model depends on how well the user can extract the features that represent all the characteristics of the training data. Among the various available features for training and test data, we determined that the use of event binary features can make stock price prediction models perform better. An event binary feature is a 0 or 1 value describing whether an indicator is satisfied (1) or not (0) for any given day and stock. We proposed and compared stock price prediction models with three different feature combinations to verify the importance of binary features. As a result, we derived a prediction model that beat the market, as measured by the Korean stock indices KOSPI (Korea Composite Stock Price Index) and KOSDAQ (Korean Securities Dealers Automated Quotations). The results suggest that deep learning is suitable for stock price prediction.
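The event binary feature defined in this abstract can be sketched as below. The indicator used here, "closing price above the average of the preceding `window` closes", is a hypothetical choice made only for illustration; the abstract does not say which indicators the authors used, and the prices are invented.

```python
def event_binary_feature(closes, window=3):
    # Returns one 0/1 value per day: 1 if the indicator is satisfied, else 0.
    # Hypothetical indicator: close is above the mean of the previous `window` closes.
    features = []
    for day, price in enumerate(closes):
        if day < window:
            # Not enough history to evaluate the indicator; treat as not satisfied.
            features.append(0)
            continue
        previous_mean = sum(closes[day - window:day]) / window
        features.append(1 if price > previous_mean else 0)
    return features

# Made-up closing prices for a single stock.
closing_prices = [10, 11, 12, 13, 9, 14]
print(event_binary_feature(closing_prices))
```

Encoding each indicator as a 0/1 column in this way gives the network one value per indicator for each day and stock, rather than leaving it to learn the threshold from raw prices.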


Water ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 1942 ◽  
Author(s):  
Kanghyeok Lee ◽  
Changhyun Choi ◽  
Do Hyoung Shin ◽  
Hung Soo Kim

Heavy rain damage prediction models were developed with a deep learning technique for predicting the damage to a region before heavy rain damage occurs. As the dependent variable, a damage scale comprising three categories (minor, significant, severe) was used, and meteorological data from the 7 days before the damage were used as independent variables. A deep neural network (DNN), convolutional neural network (CNN), and recurrent neural network (RNN), which are representative deep learning techniques, were employed for the model development. Each model was trained and tested 30 times to evaluate the predictive performance. In this evaluation, the DNN-based model and the CNN-based model showed good performance, and the RNN-based model showed relatively low performance. For the DNN-based model, the convergence epoch of the training showed a relatively wide distribution, which may lead to difficulties in selecting an epoch suitable for practical use. Therefore, the CNN-based model would be acceptable for heavy rain damage prediction in terms of accuracy and robustness. These results demonstrated the applicability of deep learning in the development of the damage prediction model. The proposed prediction model can be used for disaster management as basic data for decision making.

