Different molecular enumeration influences in deep learning: an example using aqueous solubility

Author(s):  
Jen-Hao Chen ◽  
Yufeng Jane Tseng

Abstract Aqueous solubility is a key property driving many chemical and biological phenomena, and it affects experimental and computational attempts to assess those phenomena. Accurate prediction of solubility is essential and challenging, even with modern computational algorithms. Fingerprint-based, feature-based and molecular graph-based representations have all been used with different deep learning methods for aqueous solubility prediction. It has been clearly demonstrated that different molecular representations impact model prediction and explainability. In this work, we reviewed different representations and focused on using graph and line notations for modeling. In general, one canonical chemical structure is used to represent one molecule when computing its properties. We carefully examined the commonly used simplified molecular-input line-entry specification (SMILES) notation representing a single molecule and proposed using the full enumeration of SMILES strings to achieve better accuracy. A convolutional neural network (CNN) was used. The full enumeration of SMILES can improve the representation of a molecule and describe the molecule from all possible angles. This CNN model can be very robust when dealing with large datasets, since no additional explicit chemistry knowledge is necessary to predict the solubility. Also, it has traditionally been hard to use a neural network to explain the contribution of chemical substructures to a single property. We demonstrated the use of attention in the decoding network to detect the part of a molecule that is relevant to solubility, which can be used to explain the contribution from the CNN.

2021 ◽  
Vol 11 ◽  
Author(s):  
Xiandong Leng ◽  
Eghbal Amidi ◽  
Sitai Kou ◽  
Hassam Cheema ◽  
Ebunoluwa Otegbeye ◽  
...  

We have developed a novel photoacoustic microscopy/ultrasound (PAM/US) endoscope to image post-treatment rectal cancer for surgical management of residual tumor after radiation and chemotherapy. Paired with a deep-learning convolutional neural network (CNN), the PAM images accurately differentiated pathological complete responders (pCR) from incomplete responders. However, the role of CNNs compared with traditional histogram-feature-based classifiers needs further exploration. In this work, we compare the performance of the CNN models to generalized linear models (GLM) across 24 ex vivo specimens and 10 in vivo patient examinations. First-order statistical features were extracted from histograms of PAM and US images to train, validate, and test GLM models, while PAM and US images were used directly to train, validate, and test CNN models. The PAM-CNN model performed best, with an AUC of 0.96 (95% CI: 0.95-0.98), compared to the best PAM-GLM model, which used kurtosis and reached an AUC of 0.82 (95% CI: 0.82-0.83). We also found that both CNNs and GLMs derived from photoacoustic data outperformed those utilizing ultrasound alone. We conclude that deep-learning neural networks paired with photoacoustic images are the optimal analysis framework for determining the presence of residual cancer in the treated human rectum.
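To illustrate the kind of first-order statistical features mentioned in the abstract above, here is a minimal Python sketch that computes mean, variance, skewness, and kurtosis from a flat list of pixel intensities. The function name `first_order_features`, the use of population moments, and the non-excess definition of kurtosis are assumptions made for illustration; the abstract does not specify how the authors computed their features.

```python
def first_order_features(pixels):
    """Return first-order statistics of a flat list of pixel intensities.

    Assumption: population (divide-by-n) central moments and
    non-excess kurtosis (a normal distribution gives 3.0).
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("pixels must not be empty")
    mean = sum(pixels) / n
    # Second, third and fourth central moments.
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for a constant image")
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }
```

A feature such as `first_order_features(image_pixels)["kurtosis"]` could then serve as a single input to a generalized linear model, whereas a CNN would consume the image directly.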


2018 ◽  
Vol 19 (0) ◽  
pp. 1-6
Author(s):  
Amane Suzuki ◽  
Yuichiro Kikura ◽  
Kenichi Tanaka ◽  
Kimito Funatsu

2021 ◽  
Author(s):  
Yuanyuan Jiang ◽  
Jiali Guo ◽  
Yjing Liu ◽  
Yanzhi Guo ◽  
Menglong Li ◽  
...  

Cocrystals play an important role in various fields. However, choosing a coformer remains an experimental challenge. In this work, we develop a novel graph neural network (GNN) based deep learning framework to rapidly predict the formation of cocrystals. A large and reliable data set is first constructed, which contains 7871 samples. A complementary feature representation is proposed by combining the molecular graph with molecular descriptors from prior knowledge. A new GNN learning architecture is then explored to effectively embed the prior knowledge into the “end-to-end” learning on the molecular graph, in which a multi-head attention mechanism is introduced to further optimize the feature space. Consequently, our model achieves 98.86% accuracy, greatly surpassing some traditional machine learning models and classic GNN models. Furthermore, the out-of-distribution prediction on energetic cocrystals also reaches 97.11% accuracy, showing strong generalization.


2020 ◽  
Author(s):  
Kelong Mao ◽  
Peilin Zhao ◽  
Tingyang Xu ◽  
Yu Rong ◽  
Xi Xiao ◽  
...  

Abstract With a massive number of possible synthetic routes in chemistry, retrosynthesis prediction is still a challenge for researchers. Recently, retrosynthesis prediction has been formulated as a Machine Translation (MT) task. Namely, since each molecule can be represented as a Simplified Molecular-Input Line-Entry System (SMILES) string, the process of retrosynthesis is treated as analogous to language translation from the product to the reactants. However, MT models applied to SMILES data usually ignore the natural atomic connections and the topology of molecules. To place more chemically plausible constraints on atom representation learning for better performance, in this paper we propose a Graph Enhanced Transformer (GET) framework, which adopts both the sequential and graphical information of molecules. Four different GET designs are proposed, which fuse the SMILES representations with atom embeddings learned from our improved Graph Neural Network (GNN). Empirical results show that our model significantly outperforms the vanilla Transformer model in test accuracy.


2021 ◽  
Vol 11 (24) ◽  
pp. 11595
Author(s):  
Abdolmaged Alkhulaifi ◽  
Arshad Jamal ◽  
Irfan Ahmad

Traffic signs are essential for the safe and efficient movement of vehicles through the transportation network. Poor sign visibility can lead to accidents. One of the key properties used to measure the visibility of a traffic sign is retro-reflection, which indicates how much light a traffic sign reflects back to the driver. The retro-reflection of the traffic sign degrades over time until it reaches a point where the traffic sign has to be changed or repaired. Several studies have explored the idea of modeling the sign degradation level to help the authorities in effective scheduling of sign maintenance. However, previous studies utilized simpler models and proposed multiple models for different combinations of the sheeting type and color used for the traffic sign. In this study, we present a neural network based deep learning model for traffic sign retro-reflectivity prediction. Data utilized in this study was collected using a handheld retro-reflectometer GR3 from field surveys of traffic signs. Sign retro-reflective measurements (i.e., the RA values) were taken for different sign sheeting brands, grades, colors, orientation angles, observation angles, and aging periods. Feature-based sensitivity analysis was conducted to identify variables’ relative importance in determining retro-reflectivity. Results show that the sheeting color and observation angle were the most significant variables, whereas sign orientation was the least important. Considering all the features, RA prediction results obtained from one-hot encoding outperformed other models reported in the literature. The findings of this study demonstrate the feasibility and robustness of the proposed neural network based deep learning model in predicting the sign retro-reflectivity.
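As a rough illustration of the one-hot encoding mentioned in the abstract above, the following Python sketch turns a categorical sheeting color into a binary vector and concatenates it with numeric features. The category list `SHEETING_COLORS`, the function names, and the choice of numeric features are hypothetical; the abstract does not describe the actual feature layout used by the authors.

```python
# Hypothetical category list, for illustration only.
SHEETING_COLORS = ["white", "yellow", "red"]


def one_hot(value, categories):
    """Encode a categorical value as a list with a single 1.0 entry."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def encode_sign(color, observation_angle_deg, age_years):
    """Build a flat feature vector: one-hot color followed by numeric features."""
    return one_hot(color, SHEETING_COLORS) + [observation_angle_deg, age_years]
```

Encoding color this way lets a single network handle every sheeting color, rather than requiring a separate model per color as in the earlier studies the abstract describes.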


Author(s):  
Shamik Tiwari

The classification of plants is one of the most important aims for botanists, since plants play a significant part in the natural life cycle. In this work, a leaf-based automatic plant classification framework is investigated. The aim is to compare two different deep learning approaches: a Deep Neural Network (DNN) and a deep Convolutional Neural Network (CNN). For the DNN, hybrid shape and texture features are used as hand-crafted features, while for the CNN, non-handcrafted features are used for classification. The proposed frameworks are evaluated on a public leaf database. The simulation results confirm that the deep CNN-based framework demonstrates classification performance superior to that of the hand-crafted-feature-based approach.

