scholarly journals Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning

Author(s):  
Ming Jin ◽  
Yizhen Zheng ◽  
Yuan-Fang Li ◽  
Chen Gong ◽  
Chuan Zhou ◽  
...  

Graph representation learning plays a vital role in processing graph-structured data. However, prior arts on graph representation learning heavily rely on labeling information. To overcome this problem, inspired by the recent success of graph contrastive learning and Siamese networks in visual representation learning, we propose a novel self-supervised approach in this paper to learn node representations by enhancing Siamese self-distillation with multi-scale contrastive learning. Specifically, we first generate two augmented views from the input graph based on local and global perspectives. Then, we employ two objectives called cross-view and cross-network contrastiveness to maximize the agreement between node representations across different views and networks. To demonstrate the effectiveness of our approach, we perform empirical experiments on five real-world datasets. Our method not only achieves new state-of-the-art results but also surpasses some semi-supervised counterparts by large margins. Code is made available at https://github.com/GRAND-Lab/MERIT

2021 ◽  
Author(s):  
Yingheng Wang ◽  
Yaosen Min ◽  
Erzhuo Shao ◽  
Ji Wu

ABSTRACTLearning generalizable, transferable, and robust representations for molecule data has always been a challenge. The recent success of contrastive learning (CL) for self-supervised graph representation learning provides a novel perspective to learn molecule representations. The most prevailing graph CL framework is to maximize the agreement of representations in different augmented graph views. However, existing graph CL frameworks usually adopt stochastic augmentations or schemes according to pre-defined rules on the input graph to obtain different graph views in various scales (e.g. node, edge, and subgraph), which may destroy topological semantemes and domain prior in molecule data, leading to suboptimal performance. Therefore, designing parameterized, learnable, and explainable augmentation is quite necessary for molecular graph contrastive learning. A well-designed parameterized augmentation scheme can preserve chemically meaningful structural information and intrinsically essential attributes for molecule graphs, which helps to learn representations that are insensitive to perturbation on unimportant atoms and bonds. In this paper, we propose a novel Molecular Graph Contrastive Learning with Parameterized Explainable Augmentations, MolCLE for brevity, that self-adaptively incorporates chemically significative information from both topological and semantic aspects of molecular graphs. Specifically, we apply deep neural networks to parameterize the augmentation process for both the molecular graph topology and atom attributes, to highlight contributive molecular substructures and recognize underlying chemical semantemes. Comprehensive experiments on a variety of real-world datasets demonstrate that our proposed method consistently outperforms compared baselines, which verifies the effectiveness of the proposed framework. Detailedly, our self-supervised MolCLE model surpasses many supervised counterparts, and meanwhile only uses hundreds of thousands of parameters to achieve comparative results against the state-of-the-art baseline, which has tens of millions of parameters. We also provide detailed case studies to validate the explainability of augmented graph views.CCS CONCEPTS• Mathematics of computing → Graph algorithms; • Applied computing → Bioinformatics; • Computing methodologies → Neural networks; Unsupervised learning.


Author(s):  
Jing Huang ◽  
Jie Yang

Hypergraph, an expressive structure with flexibility to model the higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt the powerful GNN-variants directly into hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize general GNN models into hypergraphs. In this framework, meticulously-designed architectures aiming to deepen GNNs can also be incorporated into hypergraphs with the least effort. Extensive experiments have been conducted to demonstrate the effectiveness of UniGNN on multiple real-world datasets, which outperform the state-of-the-art approaches with a large margin. Especially for the DBLP dataset, we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://github.com/OneForward/UniGNN.


2020 ◽  
Author(s):  
Esmaeil Nourani ◽  
Ehsaneddin Asgari ◽  
Alice C. McHardy ◽  
Mohammad R.K. Mofrad

AbstractWe introduce TripletProt, a new approach for protein representation learning based on the Siamese neural networks. We evaluate TripletProt comprehensively in protein functional annotation tasks including sub-cellular localization (14 categories) and gene ontology prediction (more than 2000 classes), which are both challenging multi-class multi-label classification machine learning problems. We compare the performance of TripletProt with the state-of-the-art approaches including recurrent language model-based approach (i.e., UniRep), as well as protein-protein interaction (PPI) network and sequence-based method (i.e., DeepGO). Our TripletProt showed an overall improvement of F1 score in the above mentioned comprehensive functional annotation tasks, solely relying on the PPI network. TripletProt and in general Siamese Network offer great potentials for the protein informatics tasks and can be widely applied to similar tasks.


Algorithms ◽  
2020 ◽  
Vol 13 (9) ◽  
pp. 206
Author(s):  
Louis Béthune ◽  
Yacouba Kaloga ◽  
Pierre Borgnat ◽  
Aurélien Garivier ◽  
Amaury Habrard

We propose a novel algorithm for unsupervised graph representation learning with attributed graphs. It combines three advantages addressing some current limitations of the literature: (i) The model is inductive: it can embed new graphs without re-training in the presence of new data; (ii) The method takes into account both micro-structures and macro-structures by looking at the attributed graphs at different scales; (iii) The model is end-to-end differentiable: it is a building block that can be plugged into deep learning pipelines and allows for back-propagation. We show that combining a coarsening method having strong theoretical guarantees with mutual information maximization suffices to produce high quality embeddings. We evaluate them on classification tasks with common benchmarks of the literature. We show that our algorithm is competitive with state of the art among unsupervised graph representation learning methods.


Author(s):  
Cheng Yang ◽  
Jian Tang ◽  
Maosong Sun ◽  
Ganqu Cui ◽  
Zhiyuan Liu

Information diffusion prediction is an important task which studies how information items spread among users. With the success of deep learning techniques, recurrent neural networks (RNNs) have shown their powerful capability in modeling information diffusion as sequential data. However, previous works focused on either microscopic diffusion prediction which aims at guessing the next influenced user or macroscopic diffusion prediction which estimates the total numbers of influenced users during the diffusion process. To the best of our knowledge, no previous works have suggested a unified model for both microscopic and macroscopic scales. In this paper, we propose a novel multi-scale diffusion prediction model based on reinforcement learning (RL). RL incorporates the macroscopic diffusion size information into the RNN-based microscopic diffusion model by addressing the non-differentiable problem. We also employ an effective structural context extraction strategy to utilize the underlying social graph information. Experimental results show that our proposed model outperforms state-of-the-art baseline models on both microscopic and macroscopic diffusion predictions on three real-world datasets.


2020 ◽  
Author(s):  
Chong Wu ◽  
Zhenan Feng ◽  
Jiangbin Zheng ◽  
Houwang Zhang ◽  
Jiawang Cao ◽  
...  

<div><div><div><p>We present a novel graph convolutional method called star topology convolution (STC). This method makes graph convolution more similar to conventional convolutional neural networks (CNNs) in Euclidean feature space. Unlike most existing spectral convolutional methods, this method learns subgraphs which have a star topology rather than a fixed graph. It has fewer parameters in its convolutional filter and is inductive so that it is more flexible and can be applied to large and evolving graphs. As for CNNs in Euclidean feature space, the convolutional filter is localized and maintains a good weight sharing property. By introducing deep layers, the method can learn global features like a CNN. To validate the method, STC was compared to state-of-the-art spectral convolutional and spatial convolutional methods in a supervised learning setting on three benchmark datasets: Cora, Citeseer and Pubmed. The experimental results show that STC outperforms the other methods. STC was also applied to protein identification tasks and outperformed traditional and advanced protein identification methods.</p></div></div></div>


Author(s):  
Han Zhao ◽  
Xu Yang ◽  
Zhenru Wang ◽  
Erkun Yang ◽  
Cheng Deng

By contrasting positive-negative counterparts, graph contrastive learning has become a prominent technique for unsupervised graph representation learning. However, existing methods fail to consider the class information and will introduce false-negative samples in the random negative sampling, causing poor performance. To this end, we propose a graph debiased contrastive learning framework, which can jointly perform representation learning and clustering. Specifically, representations can be optimized by aligning with clustered class information, and simultaneously, the optimized representations can promote clustering, leading to more powerful representations and clustering results. More importantly, we randomly select negative samples from the clusters which are different from the positive sample's cluster. In this way, as the supervisory signals, the clustering results can be utilized to effectively decrease the false-negative samples. Extensive experiments on five datasets demonstrate that our method achieves new state-of-the-art results on graph clustering and classification tasks.


Symmetry ◽  
2019 ◽  
Vol 11 (9) ◽  
pp. 1149
Author(s):  
Thapana Boonchoo ◽  
Xiang Ao ◽  
Qing He

Motivated by the proliferation of trajectory data produced by advanced GPS-enabled devices, trajectory is gaining in complexity and beginning to embroil additional attributes beyond simply the coordinates. As a consequence, this creates the potential to define the similarity between two attribute-aware trajectories. However, most existing trajectory similarity approaches focus only on location based proximities and fail to capture the semantic similarities encompassed by these additional asymmetric attributes (aspects) of trajectories. In this paper, we propose multi-aspect embedding for attribute-aware trajectories (MAEAT), a representation learning approach for trajectories that simultaneously models the similarities according to their multiple aspects. MAEAT is built upon a sentence embedding algorithm and directly learns whole trajectory embedding via predicting the context aspect tokens when given a trajectory. Two kinds of token generation methods are proposed to extract multiple aspects from the raw trajectories, and a regularization is devised to control the importance among aspects. Extensive experiments on the benchmark and real-world datasets show the effectiveness and efficiency of the proposed MAEAT compared to the state-of-the-art and baseline methods. The results of MAEAT can well support representative downstream trajectory mining and management tasks, and the algorithm outperforms other compared methods in execution time by at least two orders of magnitude.


2020 ◽  
Author(s):  
Mikel Joaristi

Unsupervised Graph Representation Learning methods learn a numerical representation of the nodes in a graph. The generated representations encode meaningful information about the nodes' properties, making them a powerful tool for tasks in many areas of study, such as social sciences, biology or communication networks. These methods are particularly interesting because they facilitate the direct use of standard Machine Learning models on graphs. Graph representation learning methods can be divided into two main categories depending on the information they encode, methods preserving the nodes connectivity information, and methods preserving nodes' structural information. Connectivity-based methods focus on encoding relationships between nodes, with neighboring nodes being closer together in the resulting latent space. On the other hand, structure-based methods generate a latent space where nodes serving a similar structural function in the network are encoded close to each other, independently of them being connected or even close to each other in the graph. While there are a lot of works that focus on preserving nodes' connectivity information, only a few works study the problem of encoding nodes' structure, specially in an unsupervised way. In this dissertation, we demonstrate that properly encoding nodes' structural information is fundamental for many real-world applications, as it can be leveraged to successfully solve many tasks where connectivity-based methods fail. One concrete example is presented first. In this example, the task consists of detecting malicious entities in a real-world financial network. We show that to solve this problem, connectivity information is not enough and show how leveraging structural information provides considerable performance improvements. This particular example pinpoints the need for further research on the area of structural graph representation learning, together with the limitations of the previous state-of-the-art. We use the acquired knowledge as a starting point and inspiration for the research and development of three independent unsupervised structural graph representation learning methods: Structural Iterative Representation learning approach for Graph Nodes (SIR-GN), Structural Iterative Lexicographic Autoencoded Node Representation (SILA), and Sparse Structural Node Representation (SparseStruct). We show how each of our methods tackles specific limitations on the previous state-of-the-art on structural graph representation learning such as scalability, representation meaning, and lack of formal proof that guarantees the preservation of structural properties. We provide an extensive experimental section where we compare our three proposed methods to the current state-of-the-art on both connectivity-based and structure-based representation learning methods. Finally, in this dissertation, we look at extensions of the basic structural graph representation learning problem. We study the problem of temporal structural graph representation. We also provide a method for representation explainability.


Sign in / Sign up

Export Citation Format

Share Document