Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/204 ◽

2021 ◽

Author(s):

Ming Jin ◽

Yizhen Zheng ◽

Yuan-Fang Li ◽

Chen Gong ◽

Chuan Zhou ◽

...

Keyword(s):

State Of The Art ◽

Representation Learning ◽

Vital Role ◽

Graph Representation ◽

Input Graph ◽

Global Perspectives ◽

Multi Scale ◽

Recent Success ◽

Real World Datasets ◽

Siamese Networks

Graph representation learning plays a vital role in processing graph-structured data. However, prior work on graph representation learning relies heavily on label information. To overcome this problem, inspired by the recent success of graph contrastive learning and Siamese networks in visual representation learning, we propose a novel self-supervised approach to learn node representations by enhancing Siamese self-distillation with multi-scale contrastive learning. Specifically, we first generate two augmented views from the input graph based on local and global perspectives. Then, we employ two objectives, called cross-view and cross-network contrastiveness, to maximize the agreement between node representations across different views and networks. To demonstrate the effectiveness of our approach, we perform empirical experiments on five real-world datasets. Our method not only achieves new state-of-the-art results but also surpasses some semi-supervised counterparts by large margins. Code is available at https://github.com/GRAND-Lab/MERIT
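
As a rough illustration of what maximizing agreement between node representations across two views can look like, the sketch below computes an InfoNCE-style loss between two sets of node embeddings. It is a toy under stated assumptions, not the MERIT implementation: the helper names `cosine` and `cross_view_loss`, the temperature value, the toy embeddings, and the choice of InfoNCE itself are illustrative and not taken from the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_view_loss(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss (hypothetical helper): node i in view A should
    match node i in view B and differ from every other node in view B."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp over all candidates, shifted by the max for stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)

# Toy embeddings for three nodes under two augmented views (made-up numbers).
view_1 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]

loss_aligned = cross_view_loss(view_1, aligned)
loss_shuffled = cross_view_loss(view_1, shuffled)
```

When the second view keeps each node close to its counterpart, the loss is lower than when the rows are shuffled, which is the behavior a contrastive objective of this kind rewards.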


Molecular Graph Contrastive Learning with Parameterized Explainable Augmentations

10.1101/2021.12.03.471150 ◽

2021 ◽

Author(s):

Yingheng Wang ◽

Yaosen Min ◽

Erzhuo Shao ◽

Ji Wu

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Structural Information ◽

Molecular Graph ◽

Representation Learning ◽

Graph Representation ◽

Input Graph ◽

Recent Success ◽

Real World Datasets ◽

Comparative Results

Learning generalizable, transferable, and robust representations for molecule data has always been a challenge. The recent success of contrastive learning (CL) for self-supervised graph representation learning provides a novel perspective for learning molecule representations. The most prevalent graph CL framework maximizes the agreement of representations in different augmented graph views. However, existing graph CL frameworks usually apply stochastic augmentations or schemes based on pre-defined rules to the input graph to obtain different graph views at various scales (e.g., node, edge, and subgraph), which may destroy topological semantemes and domain priors in molecule data, leading to suboptimal performance. Therefore, designing parameterized, learnable, and explainable augmentations is necessary for molecular graph contrastive learning. A well-designed parameterized augmentation scheme can preserve chemically meaningful structural information and intrinsically essential attributes of molecular graphs, which helps to learn representations that are insensitive to perturbations of unimportant atoms and bonds. In this paper, we propose Molecular Graph Contrastive Learning with Parameterized Explainable Augmentations, MolCLE for brevity, which self-adaptively incorporates chemically significant information from both topological and semantic aspects of molecular graphs. Specifically, we apply deep neural networks to parameterize the augmentation process for both the molecular graph topology and atom attributes, to highlight contributive molecular substructures and recognize underlying chemical semantemes. Comprehensive experiments on a variety of real-world datasets demonstrate that our proposed method consistently outperforms the compared baselines, which verifies the effectiveness of the proposed framework. In detail, our self-supervised MolCLE model surpasses many supervised counterparts while using only hundreds of thousands of parameters to achieve results comparable to the state-of-the-art baseline, which has tens of millions of parameters. We also provide detailed case studies to validate the explainability of augmented graph views.

CCS Concepts: Mathematics of computing → Graph algorithms; Applied computing → Bioinformatics; Computing methodologies → Neural networks; Unsupervised learning.


UniGNN: a Unified Framework for Graph and Hypergraph Neural Networks

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/353 ◽

2021 ◽

Author(s):

Jing Huang ◽

Jie Yang

Keyword(s):

Neural Networks ◽

Message Passing ◽

State Of The Art ◽

Representation Learning ◽

Graph Representation ◽

Challenging Problem ◽

Unified Framework ◽

Real World Datasets ◽

Graph Neural Networks ◽

Research Domains

The hypergraph, an expressive structure with the flexibility to model higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt powerful GNN variants directly to hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize general GNN models to hypergraphs. In this framework, meticulously designed architectures aiming to deepen GNNs can also be incorporated into hypergraphs with the least effort. Extensive experiments on multiple real-world datasets demonstrate the effectiveness of UniGNN, which outperforms state-of-the-art approaches by a large margin. In particular, on the DBLP dataset, we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://github.com/OneForward/UniGNN.
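
One way to picture message passing on a hypergraph is as two aggregation steps: vertices to hyperedges, then hyperedges back to vertices. The sketch below uses plain mean aggregation on scalar features with no learned weights. The two-stage framing, the function name `hypergraph_mean_pass`, and the toy data are assumptions for illustration, since the abstract does not spell out the update rule.

```python
def hypergraph_mean_pass(features, hyperedges):
    """One round of two-stage mean aggregation (illustrative, no learned weights).

    features:   dict mapping vertex -> float feature
    hyperedges: list of vertex sets; an ordinary graph is the special case
                where every set has exactly two vertices.
    """
    # Stage 1: each hyperedge averages the features of its member vertices.
    edge_msgs = [sum(features[v] for v in e) / len(e) for e in hyperedges]

    # Stage 2: each vertex averages the messages of the hyperedges containing it.
    updated = {}
    for v in features:
        incident = [m for e, m in zip(hyperedges, edge_msgs) if v in e]
        # Isolated vertices keep their original feature.
        updated[v] = sum(incident) / len(incident) if incident else features[v]
    return updated

# Made-up toy hypergraph: one 3-vertex hyperedge and one 2-vertex hyperedge.
features = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 10.0}
hyperedges = [{"a", "b", "c"}, {"c", "d"}]
updated = hypergraph_mean_pass(features, hyperedges)
```

Vertex `c` belongs to both hyperedges, so its updated value blends the two hyperedge messages, while `a`, `b`, and `d` each receive a single message.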


TripletProt: Deep Representation Learning of Proteins based on Siamese Networks

10.1101/2020.05.11.088237 ◽

2020 ◽

Author(s):

Esmaeil Nourani ◽

Ehsaneddin Asgari ◽

Alice C. McHardy ◽

Mohammad R.K. Mofrad

Keyword(s):

Functional Annotation ◽

Cellular Localization ◽

State Of The Art ◽

Language Model ◽

Representation Learning ◽

Learning Problems ◽

PPI Network ◽

New Approach ◽

Protein Protein Interaction ◽

Siamese Networks

We introduce TripletProt, a new approach for protein representation learning based on Siamese neural networks. We evaluate TripletProt comprehensively on protein functional annotation tasks, including sub-cellular localization (14 categories) and gene ontology prediction (more than 2000 classes), which are both challenging multi-class, multi-label classification problems. We compare the performance of TripletProt with state-of-the-art approaches, including a recurrent language model-based approach (i.e., UniRep) and a method based on both the protein-protein interaction (PPI) network and sequence (i.e., DeepGO). TripletProt showed an overall improvement in F1 score on these functional annotation tasks while relying solely on the PPI network. TripletProt and, more generally, Siamese networks offer great potential for protein informatics and can be widely applied to similar tasks.


Hierarchical and Unsupervised Graph Representation Learning with Loukas’s Coarsening

Algorithms ◽

10.3390/a13090206 ◽

2020 ◽

Vol 13 (9) ◽

pp. 206

Author(s):

Louis Béthune ◽

Yacouba Kaloga ◽

Pierre Borgnat ◽

Aurélien Garivier ◽

Amaury Habrard

Keyword(s):

State Of The Art ◽

Back Propagation ◽

Representation Learning ◽

Graph Representation ◽

High Quality ◽

Attributed Graphs ◽

Information Maximization ◽

Classification Tasks ◽

Micro Structures ◽

Mutual Information Maximization

We propose a novel algorithm for unsupervised graph representation learning with attributed graphs. It combines three advantages addressing some current limitations of the literature: (i) The model is inductive: it can embed new graphs without re-training in the presence of new data; (ii) The method takes into account both micro-structures and macro-structures by looking at the attributed graphs at different scales; (iii) The model is end-to-end differentiable: it is a building block that can be plugged into deep learning pipelines and allows for back-propagation. We show that combining a coarsening method having strong theoretical guarantees with mutual information maximization suffices to produce high quality embeddings. We evaluate them on classification tasks with common benchmarks of the literature. We show that our algorithm is competitive with state of the art among unsupervised graph representation learning methods.


Multi-scale Information Diffusion Prediction with Reinforced Recurrent Networks

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/560 ◽

2019 ◽

Cited By ~ 7

Author(s):

Cheng Yang ◽

Jian Tang ◽

Maosong Sun ◽

Ganqu Cui ◽

Zhiyuan Liu

Keyword(s):

Information Diffusion ◽

State Of The Art ◽

Sequential Data ◽

Recurrent Networks ◽

Multi Scale ◽

Structural Context ◽

Learning Techniques ◽

Proposed Model ◽

Real World Datasets ◽

Diffusion Prediction

Information diffusion prediction is an important task that studies how information items spread among users. With the success of deep learning techniques, recurrent neural networks (RNNs) have shown a powerful capability for modeling information diffusion as sequential data. However, previous work has focused on either microscopic diffusion prediction, which aims to guess the next influenced user, or macroscopic diffusion prediction, which estimates the total number of influenced users during the diffusion process. To the best of our knowledge, no previous work has proposed a unified model for both microscopic and macroscopic scales. In this paper, we propose a novel multi-scale diffusion prediction model based on reinforcement learning (RL). RL incorporates the macroscopic diffusion size information into the RNN-based microscopic diffusion model by addressing the non-differentiable problem. We also employ an effective structural context extraction strategy to utilize the underlying social graph information. Experimental results show that our proposed model outperforms state-of-the-art baseline models on both microscopic and macroscopic diffusion predictions on three real-world datasets.


Star Topology Convolution for Graph Representation Learning

10.36227/techrxiv.12805799.v2 ◽

2020 ◽

Author(s):

Chong Wu ◽

Zhenan Feng ◽

Jiangbin Zheng ◽

Houwang Zhang ◽

Jiawang Cao ◽

...

Keyword(s):

Protein Identification ◽

State Of The Art ◽

Feature Space ◽

Representation Learning ◽

Graph Representation ◽

Global Features ◽

Star Topology ◽

Identification Methods ◽

Benchmark Datasets ◽

Deep Layers

We present a novel graph convolutional method called star topology convolution (STC). This method makes graph convolution more similar to conventional convolutional neural networks (CNNs) in Euclidean feature space. Unlike most existing spectral convolutional methods, this method learns subgraphs which have a star topology rather than a fixed graph. It has fewer parameters in its convolutional filter and is inductive so that it is more flexible and can be applied to large and evolving graphs. As for CNNs in Euclidean feature space, the convolutional filter is localized and maintains a good weight sharing property. By introducing deep layers, the method can learn global features like a CNN. To validate the method, STC was compared to state-of-the-art spectral convolutional and spatial convolutional methods in a supervised learning setting on three benchmark datasets: Cora, Citeseer and Pubmed. The experimental results show that STC outperforms the other methods. STC was also applied to protein identification tasks and outperformed traditional and advanced protein identification methods.


Graph Debiased Contrastive Learning with Joint Representation Clustering

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/473 ◽

2021 ◽

Author(s):

Han Zhao ◽

Xu Yang ◽

Zhenru Wang ◽

Erkun Yang ◽

Cheng Deng

Keyword(s):

State Of The Art ◽

False Negative ◽

Poor Performance ◽

Representation Learning ◽

Graph Representation ◽

Learning Framework ◽

Clustering And Classification ◽

Class Information ◽

Classification Tasks ◽

Joint Representation

By contrasting positive-negative counterparts, graph contrastive learning has become a prominent technique for unsupervised graph representation learning. However, existing methods fail to consider the class information and will introduce false-negative samples in the random negative sampling, causing poor performance. To this end, we propose a graph debiased contrastive learning framework, which can jointly perform representation learning and clustering. Specifically, representations can be optimized by aligning with clustered class information, and simultaneously, the optimized representations can promote clustering, leading to more powerful representations and clustering results. More importantly, we randomly select negative samples from the clusters which are different from the positive sample's cluster. In this way, as the supervisory signals, the clustering results can be utilized to effectively decrease the false-negative samples. Extensive experiments on five datasets demonstrate that our method achieves new state-of-the-art results on graph clustering and classification tasks.
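
The negative-sampling rule described in this abstract, drawing negatives only from clusters other than the positive sample's cluster, can be sketched as follows. This is a minimal illustration that assumes cluster assignments are already available as a list of integer labels; the function name `sample_negatives` and its parameters are hypothetical, and the paper's joint clustering step is not reproduced.

```python
import random

def sample_negatives(anchor, cluster_of, k, rng=random):
    """Pick up to k negatives for `anchor` from nodes in other clusters.

    cluster_of: list where cluster_of[i] is the cluster label of node i
    (assumed to come from some clustering step not shown here).
    """
    candidates = [
        node for node, label in enumerate(cluster_of)
        if label != cluster_of[anchor]
    ]
    # Nodes sharing the anchor's cluster are never drawn, which is what
    # reduces false negatives compared with uniform random sampling.
    return rng.sample(candidates, min(k, len(candidates)))

cluster_of = [0, 0, 1, 1, 2, 0]   # made-up assignments for six nodes
negatives = sample_negatives(anchor=0, cluster_of=cluster_of, k=2,
                             rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the draw reproducible; with uniform sampling over all nodes, nodes 1 and 5 (same cluster as the anchor) could have been chosen as negatives.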


GRAHIES: Multi-Scale Graph Representation Learning with Latent Hierarchical Structure

2019 IEEE First International Conference on Cognitive Machine Intelligence (CogMI) ◽

10.1109/cogmi48466.2019.00011 ◽

2019 ◽

Author(s):

Lei Yu ◽

Qi Zhang ◽

Donna Dillenberger ◽

Ling Liu ◽

Calton Pu ◽

...

Keyword(s):

Hierarchical Structure ◽

Representation Learning ◽

Graph Representation ◽

Multi Scale


Multi-Aspect Embedding for Attribute-Aware Trajectories

Symmetry ◽

10.3390/sym11091149 ◽

2019 ◽

Vol 11 (9) ◽

pp. 1149

Author(s):

Thapana Boonchoo ◽

Xiang Ao ◽

Qing He

Keyword(s):

Real World ◽

Execution Time ◽

State Of The Art ◽

Representation Learning ◽

Learning Approach ◽

Trajectory Data ◽

Trajectory Mining ◽

Trajectory Similarity ◽

Effectiveness And Efficiency ◽

Real World Datasets

With the proliferation of trajectory data produced by advanced GPS-enabled devices, trajectories are growing in complexity and beginning to carry additional attributes beyond the coordinates alone. This creates the potential to define similarity between two attribute-aware trajectories. However, most existing trajectory similarity approaches focus only on location-based proximity and fail to capture the semantic similarities encompassed by these additional asymmetric attributes (aspects) of trajectories. In this paper, we propose multi-aspect embedding for attribute-aware trajectories (MAEAT), a representation learning approach for trajectories that simultaneously models similarities according to their multiple aspects. MAEAT is built upon a sentence embedding algorithm and directly learns a whole-trajectory embedding by predicting the context aspect tokens of a given trajectory. Two token generation methods are proposed to extract multiple aspects from the raw trajectories, and a regularization is devised to control the relative importance of the aspects. Extensive experiments on benchmark and real-world datasets show the effectiveness and efficiency of the proposed MAEAT compared to state-of-the-art and baseline methods. The results of MAEAT support representative downstream trajectory mining and management tasks well, and the algorithm outperforms the other compared methods in execution time by at least two orders of magnitude.


Unsupervised Structural Graph Node Representation Learning

10.18122/td/1754/boisestate ◽

2020 ◽

Author(s):

Mikel Joaristi

Keyword(s):

Real World ◽

State Of The Art ◽

Structural Information ◽

Representation Learning ◽

Graph Representation ◽

Learning Methods ◽

Structural Graph ◽

Connectivity Information ◽

Latent Space ◽

Previous State

Unsupervised Graph Representation Learning methods learn a numerical representation of the nodes in a graph. The generated representations encode meaningful information about the nodes' properties, making them a powerful tool for tasks in many areas of study, such as social sciences, biology or communication networks. These methods are particularly interesting because they facilitate the direct use of standard Machine Learning models on graphs. Graph representation learning methods can be divided into two main categories depending on the information they encode: methods preserving the nodes' connectivity information, and methods preserving the nodes' structural information. Connectivity-based methods focus on encoding relationships between nodes, with neighboring nodes being closer together in the resulting latent space. On the other hand, structure-based methods generate a latent space where nodes serving a similar structural function in the network are encoded close to each other, regardless of whether they are connected or even close to each other in the graph. While many works focus on preserving nodes' connectivity information, only a few study the problem of encoding nodes' structure, especially in an unsupervised way. In this dissertation, we demonstrate that properly encoding nodes' structural information is fundamental for many real-world applications, as it can be leveraged to successfully solve many tasks where connectivity-based methods fail. One concrete example is presented first. In this example, the task consists of detecting malicious entities in a real-world financial network. We show that connectivity information is not enough to solve this problem, and that leveraging structural information provides considerable performance improvements. This particular example pinpoints the need for further research in the area of structural graph representation learning, together with the limitations of the previous state-of-the-art.
We use the acquired knowledge as a starting point and inspiration for the research and development of three independent unsupervised structural graph representation learning methods: Structural Iterative Representation learning approach for Graph Nodes (SIR-GN), Structural Iterative Lexicographic Autoencoded Node Representation (SILA), and Sparse Structural Node Representation (SparseStruct). We show how each of our methods tackles specific limitations of the previous state-of-the-art in structural graph representation learning, such as scalability, representation meaning, and the lack of formal proof guaranteeing the preservation of structural properties. We provide an extensive experimental section where we compare our three proposed methods to the current state-of-the-art in both connectivity-based and structure-based representation learning methods. Finally, in this dissertation, we look at extensions of the basic structural graph representation learning problem. We study the problem of temporal structural graph representation. We also provide a method for representation explainability.
