Incorporating symbolic domain knowledge into graph neural networks

Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models

Journal of Cheminformatics ◽

10.1186/s13321-020-00479-8 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Dejun Jiang ◽

Zhenxing Wu ◽

Chang-Yu Hsieh ◽

Guangyong Chen ◽

Ben Liao ◽

...

Keyword(s):

Neural Networks ◽

Computational Efficiency ◽

Domain Knowledge ◽

Prediction Models ◽

Computational Cost ◽

Large Dataset ◽

Predictive Capacity ◽

Classification Tasks ◽

Graph Neural Networks ◽

Public Datasets

Abstract Graph neural networks (GNN) has been considered as an attractive modelling method for molecular property prediction, and numerous studies have shown that GNN could yield more promising results than traditional descriptor-based methods. In this study, based on 11 public datasets covering various property endpoints, the predictive capacity and computational efficiency of the prediction models developed by eight machine learning (ML) algorithms, including four descriptor-based models (SVM, XGBoost, RF and DNN) and four graph-based models (GCN, GAT, MPNN and Attentive FP), were extensively tested and compared. The results demonstrate that on average the descriptor-based models outperform the graph-based models in terms of prediction accuracy and computational efficiency. SVM generally achieves the best predictions for the regression tasks. Both RF and XGBoost can achieve reliable predictions for the classification tasks, and some of the graph-based models, such as Attentive FP and GCN, can yield outstanding performance for a fraction of larger or multi-task datasets. In terms of computational cost, XGBoost and RF are the two most efficient algorithms and only need a few seconds to train a model even for a large dataset. The model interpretations by the SHAP method can effectively explore the established domain knowledge for the descriptor-based models. Finally, we explored use of these models for virtual screening (VS) towards HIV and demonstrated that different ML algorithms offer diverse VS profiles. All in all, we believe that the off-the-shelf descriptor-based models still can be directly employed to accurately predict various chemical endpoints with excellent computability and interpretability.

Could Graph Neural Networks Learn Better Molecular Representation for Drug Discovery? A Comparison Study of Descriptor-based and Graph-based Models

10.21203/rs.3.rs-81439/v1 ◽

2020 ◽

Author(s):

Dejun Jiang ◽

Zhenxing Wu ◽

Chang-Yu Hsieh ◽

Guangyong Chen ◽

Ben Liao ◽

...

Keyword(s):

Neural Networks ◽

Computational Efficiency ◽

Domain Knowledge ◽

Prediction Models ◽

Computational Cost ◽

Large Dataset ◽

Predictive Capacity ◽

Classification Tasks ◽

Graph Neural Networks ◽

Public Datasets

Graph Neural Architecture Search

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/195 ◽

2020 ◽

Cited By ~ 2

Author(s):

Yang Gao ◽

Hong Yang ◽

Peng Zhang ◽

Chuan Zhou ◽

Yue Hu

Keyword(s):

Neural Networks ◽

Network Architecture ◽

Domain Knowledge ◽

Search Space ◽

Recurrent Network ◽

Validation Data ◽

Data Set ◽

Neural Architecture ◽

Real World Datasets ◽

Graph Neural Networks

Graph neural networks (GNNs) emerged recently as a powerful tool for analyzing non-Euclidean data such as social network data. Despite their success, the design of graph neural networks requires heavy manual work and domain knowledge. In this paper, we present a graph neural architecture search method (GraphNAS) that enables automatic design of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and trains the recurrent network with policy gradient to maximize the expected accuracy of the generated architectures on a validation data set. Furthermore, to improve the search efficiency of GraphNAS on big networks, GraphNAS restricts the search space from an entire architecture space to a sequential concatenation of the best search results built on each single architecture layer. Experiments on real-world datasets demonstrate that GraphNAS can design a novel network architecture that rivals the best human-invented architecture in terms of validation set accuracy. Moreover, in a transfer learning task we observe that graph neural architectures designed by GraphNAS, when transferred to new datasets, still gain improvement in terms of prediction accuracy.

Could Graph Neural Networks Learn Better Molecular Representation for Drug Discovery? A Comparison Study of Descriptor-based and Graph-based Models

10.21203/rs.3.rs-79416/v1 ◽

2020 ◽

Author(s):

Dejun Jiang ◽

Zhenxing Wu ◽

Chang-Yu Hsieh ◽

Guangyong Chen ◽

Ben Liao ◽

...

Keyword(s):

Neural Networks ◽

Computational Efficiency ◽

Domain Knowledge ◽

Prediction Models ◽

Computational Cost ◽

Large Dataset ◽

Predictive Capacity ◽

Classification Tasks ◽

Graph Neural Networks ◽

Public Datasets

Graph Neural Networks for Prediction of Fuel Ignition Quality

10.26434/chemrxiv.12280325.v1 ◽

2020 ◽

Author(s):

Artur Schweidtmann ◽

Jan Rittig ◽

Andrea König ◽

Martin Grohe ◽

Alexander Mitsos ◽

...

Keyword(s):

Neural Networks ◽

Octane Number ◽

Molecular Graph ◽

Chemical Properties ◽

Graph Representation ◽

Structure Property ◽

Oxygenated Hydrocarbons ◽

Physico Chemical ◽

Ignition Quality ◽

Graph Neural Networks

Prediction of combustion-related properties of (oxygenated) hydrocarbons is an important and challenging task for which quantitative structure-property relationship (QSPR) models are frequently employed. Recently, a machine learning method, graph neural networks (GNNs), has shown promising results for the prediction of structure-property relationships. GNNs utilize a graph representation of molecules, where atoms correspond to nodes and bonds to edges containing information about the molecular structure. More specifically, GNNs learn physico-chemical properties as a function of the molecular graph in a supervised learning setup using a backpropagation algorithm. This end-to-end learning approach eliminates the need for selection of molecular descriptors or structural groups, as it learns optimal fingerprints through graph convolutions and maps the fingerprints to the physico-chemical properties by deep learning. We develop GNN models for predicting three fuel ignition quality indicators, i.e., the derived cetane number (DCN), the research octane number (RON), and the motor octane number (MON), of oxygenated and non-oxygenated hydrocarbons. In light of limited experimental data in the order of hundreds, we propose a combination of multi-task learning, transfer learning, and ensemble learning. The results show competitive performance of the proposed GNN approach compared to state-of-the-art QSPR models making it a promising field for future research. The prediction tool is available via a web front-end at www.avt.rwth-aachen.de/gnn.

Conversational Emotion Recognition Using Self-Attention Mechanisms and Graph Neural Networks

10.21437/interspeech.2020-1703 ◽

2020 ◽

Author(s):

Zheng Lian ◽

Jianhua Tao ◽

Bin Liu ◽

Jian Huang ◽

Zhanlei Yang ◽

...

Keyword(s):

Neural Networks ◽

Emotion Recognition ◽

Graph Neural Networks

Graph Pooling in Graph Neural Networks with Node Feature Correlation

Proceedings of the 3rd International Conference on Data Science and Information Technology ◽

10.1145/3414274.3414490 ◽

2020 ◽

Author(s):

Jianjian Jiang ◽

Fangyuan Lei ◽

Qingyun Dai ◽

Zhengmin Li

Keyword(s):

Neural Networks ◽

Feature Correlation ◽

Graph Neural Networks

Streaming Graph Neural Networks via Continual Learning

Proceedings of the 29th ACM International Conference on Information & Knowledge Management ◽

10.1145/3340531.3411963 ◽

2020 ◽

Author(s):

Junshan Wang ◽

Guojie Song ◽

Yi Wu ◽

Liang Wang

Keyword(s):

Neural Networks ◽

Graph Neural Networks ◽

Continual Learning

Gated Knowledge Graph Neural Networks for Top-N Recommendation System

2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD) ◽

10.1109/cscwd49262.2021.9437829 ◽

2021 ◽

Author(s):

Nan Mu ◽

Daren Zha ◽

Rui Gong

Keyword(s):

Neural Networks ◽

Recommendation System ◽

Knowledge Graph ◽

Graph Neural Networks

MutualRec: Joint friend and item recommendations with mutualistic attentional graph neural networks

Journal of Network and Computer Applications ◽

10.1016/j.jnca.2020.102954 ◽

2020 ◽

pp. 102954

Author(s):

Yang Xiao ◽

Qingqi Pei ◽

Tingting Xiao ◽

Lina Yao ◽

Huan Liu

Keyword(s):

Neural Networks ◽

Graph Neural Networks
