Molecular Graph Contrastive Learning with Parameterized Explainable Augmentations

ABSTRACTLearning generalizable, transferable, and robust representations for molecule data has always been a challenge. The recent success of contrastive learning (CL) for self-supervised graph representation learning provides a novel perspective to learn molecule representations. The most prevailing graph CL framework is to maximize the agreement of representations in different augmented graph views. However, existing graph CL frameworks usually adopt stochastic augmentations or schemes according to pre-defined rules on the input graph to obtain different graph views in various scales (e.g. node, edge, and subgraph), which may destroy topological semantemes and domain prior in molecule data, leading to suboptimal performance. Therefore, designing parameterized, learnable, and explainable augmentation is quite necessary for molecular graph contrastive learning. A well-designed parameterized augmentation scheme can preserve chemically meaningful structural information and intrinsically essential attributes for molecule graphs, which helps to learn representations that are insensitive to perturbation on unimportant atoms and bonds. In this paper, we propose a novel Molecular Graph Contrastive Learning with Parameterized Explainable Augmentations, MolCLE for brevity, that self-adaptively incorporates chemically significative information from both topological and semantic aspects of molecular graphs. Specifically, we apply deep neural networks to parameterize the augmentation process for both the molecular graph topology and atom attributes, to highlight contributive molecular substructures and recognize underlying chemical semantemes. Comprehensive experiments on a variety of real-world datasets demonstrate that our proposed method consistently outperforms compared baselines, which verifies the effectiveness of the proposed framework. Detailedly, our self-supervised MolCLE model surpasses many supervised counterparts, and meanwhile only uses hundreds of thousands of parameters to achieve comparative results against the state-of-the-art baseline, which has tens of millions of parameters. We also provide detailed case studies to validate the explainability of augmented graph views.CCS CONCEPTS• Mathematics of computing → Graph algorithms; • Applied computing → Bioinformatics; • Computing methodologies → Neural networks; Unsupervised learning.

Download Full-text

Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/204 ◽

2021 ◽

Author(s):

Ming Jin ◽

Yizhen Zheng ◽

Yuan-Fang Li ◽

Chen Gong ◽

Chuan Zhou ◽

...

Keyword(s):

State Of The Art ◽

Representation Learning ◽

Vital Role ◽

Graph Representation ◽

Input Graph ◽

Global Perspectives ◽

Multi Scale ◽

Recent Success ◽

Real World Datasets ◽

Siamese Networks

Graph representation learning plays a vital role in processing graph-structured data. However, prior arts on graph representation learning heavily rely on labeling information. To overcome this problem, inspired by the recent success of graph contrastive learning and Siamese networks in visual representation learning, we propose a novel self-supervised approach in this paper to learn node representations by enhancing Siamese self-distillation with multi-scale contrastive learning. Specifically, we first generate two augmented views from the input graph based on local and global perspectives. Then, we employ two objectives called cross-view and cross-network contrastiveness to maximize the agreement between node representations across different views and networks. To demonstrate the effectiveness of our approach, we perform empirical experiments on five real-world datasets. Our method not only achieves new state-of-the-art results but also surpasses some semi-supervised counterparts by large margins. Code is made available at https://github.com/GRAND-Lab/MERIT

Download Full-text

UniGNN: a Unified Framework for Graph and Hypergraph Neural Networks

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/353 ◽

2021 ◽

Author(s):

Jing Huang ◽

Jie Yang

Keyword(s):

Neural Networks ◽

Message Passing ◽

State Of The Art ◽

Representation Learning ◽

Graph Representation ◽

Challenging Problem ◽

Unified Framework ◽

Real World Datasets ◽

Graph Neural Networks ◽

Research Domains

Hypergraph, an expressive structure with flexibility to model the higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt the powerful GNN-variants directly into hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize general GNN models into hypergraphs. In this framework, meticulously-designed architectures aiming to deepen GNNs can also be incorporated into hypergraphs with the least effort. Extensive experiments have been conducted to demonstrate the effectiveness of UniGNN on multiple real-world datasets, which outperform the state-of-the-art approaches with a large margin. Especially for the DBLP dataset, we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://github.com/OneForward/UniGNN.

Download Full-text

Graph Transformer for Graph-to-Sequence Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6243 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7464-7471

Author(s):

Deng Cai ◽

Wai Lam

Keyword(s):

Neural Networks ◽

Information Exchange ◽

State Of The Art ◽

Structural Information ◽

Representation Learning ◽

Graph Representation ◽

Text Generation ◽

Proposed Model ◽

Graph Neural Networks ◽

Meaning Representation

The dominant graph-to-sequence transduction models employ graph neural networks for graph representation learning, where the structural information is reflected by the receptive field of neurons. Unlike graph neural networks that restrict the information exchange between immediate neighborhood, we propose a new model, known as Graph Transformer, that uses explicit relation encoding and allows direct communication between two distant nodes. It provides a more efficient way for global graph structure modeling. Experiments on the applications of text generation from Abstract Meaning Representation (AMR) and syntax-based neural machine translation show the superiority of our proposed model. Specifically, our model achieves 27.4 BLEU on LDC2015E86 and 29.7 BLEU on LDC2017T10 for AMR-to-text generation, outperforming the state-of-the-art results by up to 2.2 points. On the syntax-based translation tasks, our model establishes new single-model state-of-the-art BLEU scores, 21.3 for English-to-German and 14.1 for English-to-Czech, improving over the existing best results, including ensembles, by over 1 BLEU.

Download Full-text

Unsupervised Representation Learning by Predicting Random Distances

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/408 ◽

2020 ◽

Author(s):

Hu Wang ◽

Guansong Pang ◽

Chunhua Shen ◽

Congbo Ma

Keyword(s):

Neural Networks ◽

Anomaly Detection ◽

Unsupervised Learning ◽

Large Scale ◽

Deep Neural Networks ◽

State Of The Art ◽

Representation Learning ◽

Great Success ◽

Learning Tasks ◽

Real World Datasets

Deep neural networks have gained great success in a broad range of tasks due to its remarkable capability to learn semantically rich features from high-dimensional data. However, they often require large-scale labelled data to successfully learn such features, which significantly hinders their adaption in unsupervised learning tasks, such as anomaly detection and clustering, and limits their applications to critical domains where obtaining massive labelled data is prohibitively expensive. To enable unsupervised learning on those domains, in this work we propose to learn features without using any labelled data by training neural networks to predict data distances in a randomly projected space. Random mapping is a theoretically proven approach to obtain approximately preserved distances. To well predict these distances, the representation learner is optimised to learn genuine class structures that are implicitly embedded in the randomly projected space. Empirical results on 19 real-world datasets show that our learned representations substantially outperform a few state-of-the-art methods for both anomaly detection and clustering tasks. Code is available at: \url{https://git.io/RDP}

Download Full-text

Multi-hop Attention Graph Neural Networks

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/425 ◽

2021 ◽

Author(s):

Guangtao Wang ◽

Rex Ying ◽

Jing Huang ◽

Jure Leskovec

Keyword(s):

Neural Networks ◽

Large Scale ◽

Performance Metrics ◽

State Of The Art ◽

Structural Information ◽

Representation Learning ◽

Attention Mechanism ◽

Graph Representation ◽

Knowledge Graph ◽

Graph Neural Networks

Self-attention mechanism in graph neural networks (GNNs) led to state-of-the-art performance on many graph representation learning tasks. Currently, at every layer, attention is computed between connected pairs of nodes and depends solely on the representation of the two nodes. However, such attention mechanism does not account for nodes that are not directly connected but provide important network context. Here we propose Multi-hop Attention Graph Neural Network (MAGNA), a principled way to incorporate multi-hop context information into every layer of attention computation. MAGNA diffuses the attention scores across the network, which increases the receptive field for every layer of the GNN. Unlike previous approaches, MAGNA uses a diffusion prior on attention values, to efficiently account for all paths between the pair of disconnected nodes. We demonstrate in theory and experiments that MAGNA captures large-scale structural information in every layer, and has a low-pass effect that eliminates noisy high-frequency information from graph data. Experimental results on node classification as well as the knowledge graph completion benchmarks show that MAGNA achieves state-of-the-art results: MAGNA achieves up to 5.7% relative error reduction over the previous state-of-the-art on Cora, Citeseer, and Pubmed. MAGNA also obtains the best performance on a large-scale Open Graph Benchmark dataset. On knowledge graph completion MAGNA advances state-of-the-art on WN18RR and FB15k-237 across four different performance metrics.

Download Full-text

Graph Neural Networks for Prediction of Fuel Ignition Quality

10.26434/chemrxiv.12280325.v1 ◽

2020 ◽

Author(s):

Artur Schweidtmann ◽

Jan Rittig ◽

Andrea König ◽

Martin Grohe ◽

Alexander Mitsos ◽

...

Keyword(s):

Neural Networks ◽

Octane Number ◽

Molecular Graph ◽

Chemical Properties ◽

Graph Representation ◽

Structure Property ◽

Oxygenated Hydrocarbons ◽

Physico Chemical ◽

Ignition Quality ◽

Graph Neural Networks

<div>Prediction of combustion-related properties of (oxygenated) hydrocarbons is an important and challenging task for which quantitative structure-property relationship (QSPR) models are frequently employed. Recently, a machine learning method, graph neural networks (GNNs), has shown promising results for the prediction of structure-property relationships. GNNs utilize a graph representation of molecules, where atoms correspond to nodes and bonds to edges containing information about the molecular structure. More specifically, GNNs learn physico-chemical properties as a function of the molecular graph in a supervised learning setup using a backpropagation algorithm. This end-to-end learning approach eliminates the need for selection of molecular descriptors or structural groups, as it learns optimal fingerprints through graph convolutions and maps the fingerprints to the physico-chemical properties by deep learning. We develop GNN models for predicting three fuel ignition quality indicators, i.e., the derived cetane number (DCN), the research octane number (RON), and the motor octane number (MON), of oxygenated and non-oxygenated hydrocarbons. In light of limited experimental data in the order of hundreds, we propose a combination of multi-task learning, transfer learning, and ensemble learning. The results show competitive performance of the proposed GNN approach compared to state-of-the-art QSPR models making it a promising field for future research. The prediction tool is available via a web front-end at www.avt.rwth-aachen.de/gnn.</div>

Download Full-text

A Novel Method to Predict Drug-Target Interactions Based on Large-Scale Graph Representation Learning

Cancers ◽

10.3390/cancers13092111 ◽

2021 ◽

Vol 13 (9) ◽

pp. 2111

Author(s):

Bo-Wei Zhao ◽

Zhu-Hong You ◽

Lun Hu ◽

Zhen-Hao Guo ◽

Lei Wang ◽

...

Keyword(s):

Drug Target ◽

Large Scale ◽

Computational Models ◽

Structural Information ◽

Characteristic Curve ◽

Representation Learning ◽

Graph Representation ◽

Convolutional Network ◽

Novel Method

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with the time-consuming and labor-intensive in vivo experimental methods, the computational models can provide high-quality DTI candidates in an instant. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes can be aggregated by the graph convolutional network (GCN); on the other hand, the high-order neighbor information of nodes can be learned by the graph embedding method called DeepWalk. Finally, the two kinds of feature are fed into the random forest classifier to train and predict potential DTIs. The results show that our method obtained area under the receiver operating characteristic curve (AUROC) of 0.9455 and area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.

Download Full-text

Deep neural networks-based relevant latent representation learning for hyperspectral image classification

Pattern Recognition ◽

10.1016/j.patcog.2021.108224 ◽

2021 ◽

pp. 108224

Author(s):

Akrem Sellami ◽

Salvatore Tabbone

Keyword(s):

Neural Networks ◽

Image Classification ◽

Deep Neural Networks ◽

Hyperspectral Image ◽

Representation Learning ◽

Hyperspectral Image Classification

Download Full-text

Label Distribution for Learning with Noisy Labels

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/356 ◽

2020 ◽

Author(s):

Yun-Peng Liu ◽

Ning Xu ◽

Yu Zhang ◽

Xin Geng

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Learning Algorithm ◽

State Of The Art ◽

Confidence Estimation ◽

Novel Method ◽

Real World Datasets ◽

Label Distribution ◽

Noisy Labels

The performances of deep neural networks (DNNs) crucially rely on the quality of labeling. In some situations, labels are easily corrupted, and therefore some labels become noisy labels. Thus, designing algorithms that deal with noisy labels is of great importance for learning robust DNNs. However, it is difficult to distinguish between clean labels and noisy labels, which becomes the bottleneck of many methods. To address the problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on label distribution. Then, the boundary between clean labels and noisy labels becomes clear according to confidence scores. To verify the effectiveness of the method, LDCE is combined with the existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm against state-of-the-art methods.

Download Full-text

An Attention-Based Graph Neural Network for Heterogeneous Structural Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5833 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4132-4139

Author(s):

Huiting Hong ◽

Hantao Guo ◽

Yucheng Lin ◽

Xiaoqing Yang ◽

Zang Li ◽

...

Keyword(s):

Neural Network ◽

Structural Information ◽

Representation Learning ◽

Graph Representation ◽

Heterogeneous Information ◽

Domain Experts ◽

Proposed Model ◽

Meta Path ◽

Low Dimensional ◽

Public Datasets

In this paper, we focus on graph representation learning of heterogeneous information network (HIN), in which various types of vertices are connected by various types of relations. Most of the existing methods conducted on HIN revise homogeneous graph embedding models via meta-paths to learn low-dimensional vector space of HIN. In this paper, we propose a novel Heterogeneous Graph Structural Attention Neural Network (HetSANN) to directly encode structural information of HIN without meta-path and achieve more informative representations. With this method, domain experts will not be needed to design meta-path schemes and the heterogeneous information can be processed automatically by our proposed model. Specifically, we implicitly represent heterogeneous information using the following two methods: 1) we model the transformation between heterogeneous vertices through a projection in low-dimensional entity spaces; 2) afterwards, we apply the graph neural network to aggregate multi-relational information of projected neighborhood by means of attention mechanism. We also present three extensions of HetSANN, i.e., voices-sharing product attention for the pairwise relationships in HIN, cycle-consistency loss to retain the transformation between heterogeneous entity spaces, and multi-task learning with full use of information. The experiments conducted on three public datasets demonstrate that our proposed models achieve significant and consistent improvements compared to state-of-the-art solutions.

Download Full-text