Graph Debiased Contrastive Learning with Joint Representation Clustering

Author(s):  
Han Zhao ◽  
Xu Yang ◽  
Zhenru Wang ◽  
Erkun Yang ◽  
Cheng Deng

By contrasting positive and negative counterparts, graph contrastive learning has become a prominent technique for unsupervised graph representation learning. However, existing methods fail to consider class information and introduce false-negative samples through random negative sampling, causing poor performance. To this end, we propose a graph debiased contrastive learning framework that jointly performs representation learning and clustering. Specifically, representations are optimized by aligning with clustered class information, and simultaneously the optimized representations promote clustering, leading to more powerful representations and clustering results. More importantly, we randomly select negative samples from clusters different from the positive sample's cluster. In this way, the clustering results serve as supervisory signals that effectively reduce the number of false-negative samples. Extensive experiments on five datasets demonstrate that our method achieves new state-of-the-art results on graph clustering and classification tasks.
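
As a concrete illustration of the cluster-guided negative sampling described above, here is a minimal sketch, assuming cluster assignments from the joint clustering step are available as an array; the function name and setup are illustrative, not from the paper's code:

```python
import numpy as np

def sample_debiased_negatives(anchor_idx, cluster_labels, num_negatives, rng=None):
    """Sample negatives only from clusters different from the anchor's cluster.

    Minimal sketch of cluster-guided negative sampling; `cluster_labels`
    would come from the joint clustering step.
    """
    rng = rng or np.random.default_rng()
    anchor_cluster = cluster_labels[anchor_idx]
    # Candidate negatives: all nodes assigned to a different cluster,
    # which filters out likely false negatives from the same class.
    candidates = np.flatnonzero(cluster_labels != anchor_cluster)
    return rng.choice(candidates, size=num_negatives, replace=False)

# Example: 10 nodes in 3 clusters; negatives for node 0 avoid its own cluster.
labels = np.array([0, 0, 1, 1, 1, 2, 2, 0, 1, 2])
print(sample_debiased_negatives(0, labels, num_negatives=4))
```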

Algorithms ◽  
2020 ◽  
Vol 13 (9) ◽  
pp. 206
Author(s):  
Louis Béthune ◽  
Yacouba Kaloga ◽  
Pierre Borgnat ◽  
Aurélien Garivier ◽  
Amaury Habrard

We propose a novel algorithm for unsupervised graph representation learning with attributed graphs. It combines three advantages that address current limitations in the literature: (i) the model is inductive: it can embed new graphs without re-training when new data arrives; (ii) the method accounts for both micro-structures and macro-structures by looking at the attributed graphs at different scales; (iii) the model is end-to-end differentiable: it is a building block that can be plugged into deep learning pipelines and allows for back-propagation. We show that combining a coarsening method with strong theoretical guarantees and mutual information maximization suffices to produce high-quality embeddings. We evaluate them on classification tasks with common benchmarks from the literature and show that our algorithm is competitive with state-of-the-art unsupervised graph representation learning methods.
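
The mutual information maximization component can be sketched with a Deep Graph Infomax-style bilinear discriminator that contrasts node embeddings against a (coarsened) graph summary; this is an illustrative sketch under assumed dimensions, not the paper's implementation:

```python
import torch
import torch.nn as nn

class BilinearMIEstimator(nn.Module):
    """Jensen-Shannon-style mutual information estimator, as used in
    Deep Graph Infomax-like objectives. Names and dimensions are
    illustrative assumptions."""

    def __init__(self, dim):
        super().__init__()
        self.bilinear = nn.Bilinear(dim, dim, 1)

    def forward(self, local_h, summary, corrupted_h):
        # Positive pairs: true node embeddings vs. the (coarsened) summary.
        pos = self.bilinear(local_h, summary.expand_as(local_h))
        # Negative pairs: corrupted embeddings vs. the same summary.
        neg = self.bilinear(corrupted_h, summary.expand_as(corrupted_h))
        logits = torch.cat([pos, neg], dim=0).squeeze(-1)
        targets = torch.cat([torch.ones(len(pos)), torch.zeros(len(neg))])
        return nn.functional.binary_cross_entropy_with_logits(logits, targets)

# Toy usage: 5 nodes, 8-dim embeddings; "corruption" by shuffling rows.
h = torch.randn(5, 8)
loss = BilinearMIEstimator(8)(h, h.mean(0, keepdim=True), h[torch.randperm(5)])
```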


Author(s):  
Yuqiao Yang ◽  
Xiaoqiang Lin ◽  
Geng Lin ◽  
Zengfeng Huang ◽  
Changjian Jiang ◽  
...  

In this paper, we explore learning representations of legislation and legislators for the prediction of roll call results. The most popular approach to this task, the ideal point model, relies on historical voting information for representation learning of legislators and largely ignores the context information of the legislative data. We therefore propose to incorporate context information to learn dense representations for both legislators and legislation. For legislators, we incorporate relations among them via graph convolutional neural networks (GCNs) for representation learning. For legislation, we utilize its narrative description via recurrent neural networks (RNNs) for representation learning. To align the two kinds of representations in the same vector space, we introduce a triplet loss for joint training. Experimental results on a self-constructed dataset show the effectiveness of our model for roll call results prediction compared to state-of-the-art baselines.
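
The alignment objective can be sketched as a standard triplet loss over the shared vector space, pulling a legislator's embedding toward a bill they supported and away from one they opposed; variable names and dimensions here are illustrative assumptions, not the paper's code:

```python
import torch
import torch.nn.functional as F

def roll_call_triplet_loss(legislator, bill_pos, bill_neg, margin=1.0):
    """Triplet loss aligning legislator and legislation embeddings in a
    shared space. Minimal sketch of the joint-training objective."""
    d_pos = F.pairwise_distance(legislator, bill_pos)  # legislator vs. yea bill
    d_neg = F.pairwise_distance(legislator, bill_neg)  # legislator vs. nay bill
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage: a batch of 4 legislators and bills in a shared 16-dim space,
# e.g. GCN outputs for legislators and RNN outputs for bill descriptions.
loss = roll_call_triplet_loss(torch.randn(4, 16), torch.randn(4, 16), torch.randn(4, 16))
```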


2020 ◽  
Author(s):  
Chong Wu ◽  
Zhenan Feng ◽  
Jiangbin Zheng ◽  
Houwang Zhang ◽  
Jiawang Cao ◽  
...  

We present a novel graph convolutional method called star topology convolution (STC). This method makes graph convolution more similar to conventional convolutional neural networks (CNNs) in Euclidean feature space. Unlike most existing spectral convolutional methods, this method learns on subgraphs that have a star topology rather than on a fixed graph. It has fewer parameters in its convolutional filter and is inductive, so it is more flexible and can be applied to large and evolving graphs. As with CNNs in Euclidean feature space, the convolutional filter is localized and maintains a good weight-sharing property. By introducing deep layers, the method can learn global features like a CNN. To validate the method, STC was compared to state-of-the-art spectral and spatial convolutional methods in a supervised learning setting on three benchmark datasets: Cora, Citeseer and Pubmed. The experimental results show that STC outperforms the other methods. STC was also applied to protein identification tasks and outperformed traditional and advanced protein identification methods.
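
To make the star-subgraph idea concrete, here is a rough spatial-style sketch of shared filtering over star neighborhoods (a center node plus sampled neighbors). Note that the published STC operates on each star subgraph in the spectral domain; this simplified version only illustrates localized, weight-shared filtering, and all names are illustrative:

```python
import torch
import torch.nn as nn

class StarConv(nn.Module):
    """Sketch of a star-topology convolution: each node's receptive field
    is the star subgraph formed by the node and its sampled neighbors,
    filtered with weights shared across all stars, as in a CNN."""

    def __init__(self, in_dim, out_dim, num_neighbors):
        super().__init__()
        self.center_w = nn.Linear(in_dim, out_dim)
        self.neighbor_w = nn.Linear(in_dim, out_dim)  # shared across stars
        self.k = num_neighbors

    def forward(self, x, neighbor_idx):
        # x: (N, in_dim) node features; neighbor_idx: (N, k) sampled neighbors.
        center = self.center_w(x)
        neighbors = self.neighbor_w(x[neighbor_idx]).mean(dim=1)
        return torch.relu(center + neighbors)

# Toy usage: 6 nodes with 8-dim features, 2 sampled neighbors per star.
x = torch.randn(6, 8)
idx = torch.randint(0, 6, (6, 2))
out = StarConv(8, 16, 2)(x, idx)
```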


Author(s):  
Jing Huang ◽  
Jie Yang

Hypergraphs, expressive structures with the flexibility to model higher-order correlations among entities, have recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt powerful GNN variants directly to hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message-passing process in graph and hypergraph neural networks, which can generalize general GNN models to hypergraphs. In this framework, meticulously designed architectures aiming to deepen GNNs can also be incorporated into hypergraphs with minimal effort. Extensive experiments on multiple real-world datasets demonstrate the effectiveness of UniGNN, which outperforms state-of-the-art approaches by a large margin. In particular, on the DBLP dataset we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing-based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://github.com/OneForward/UniGNN.
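
The unified message-passing view can be sketched as two aggregation stages: node features flow into hyperedges, and hyperedge messages flow back to their incident nodes. The mean-aggregation sketch below only illustrates this two-stage scheme under assumed inputs; it is not UniGNN's exact update rule:

```python
import numpy as np

def hypergraph_message_pass(x, hyperedges):
    """Two-stage hypergraph message passing: (1) aggregate node features
    into each hyperedge, (2) aggregate incident hyperedge messages back
    into each node. Minimal mean-aggregation sketch."""
    # Stage 1: hyperedge messages as the mean of their member nodes.
    edge_msgs = [x[list(e)].mean(axis=0) for e in hyperedges]
    # Stage 2: each node averages the messages of hyperedges containing it.
    out = np.zeros_like(x)
    counts = np.zeros(len(x))
    for msg, e in zip(edge_msgs, hyperedges):
        for v in e:
            out[v] += msg
            counts[v] += 1
    return out / np.maximum(counts, 1)[:, None]

# Toy usage: 5 nodes with 4-dim features, 3 hyperedges of varying size.
x = np.random.randn(5, 4)
h = hypergraph_message_pass(x, [{0, 1, 2}, {2, 3}, {1, 3, 4}])
```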


2020 ◽  
Author(s):  
Mikel Joaristi

Unsupervised graph representation learning methods learn a numerical representation of the nodes in a graph. The generated representations encode meaningful information about the nodes' properties, making them a powerful tool for tasks in many areas of study, such as the social sciences, biology, or communication networks. These methods are particularly interesting because they allow standard machine learning models to be used directly on graphs. Graph representation learning methods can be divided into two main categories depending on the information they encode: methods preserving nodes' connectivity information, and methods preserving nodes' structural information. Connectivity-based methods focus on encoding relationships between nodes, with neighboring nodes being closer together in the resulting latent space. Structure-based methods, on the other hand, generate a latent space where nodes serving a similar structural function in the network are encoded close to each other, regardless of whether they are connected or even near each other in the graph. While many works focus on preserving nodes' connectivity information, only a few study the problem of encoding nodes' structure, especially in an unsupervised way.

In this dissertation, we demonstrate that properly encoding nodes' structural information is fundamental for many real-world applications, as it can be leveraged to successfully solve many tasks where connectivity-based methods fail. A concrete example is presented first: detecting malicious entities in a real-world financial network. We show that connectivity information is not enough to solve this problem and that leveraging structural information provides considerable performance improvements. This example pinpoints the need for further research on structural graph representation learning, together with the limitations of the previous state-of-the-art.

We use the acquired knowledge as a starting point and inspiration for the research and development of three independent unsupervised structural graph representation learning methods: Structural Iterative Representation learning approach for Graph Nodes (SIR-GN), Structural Iterative Lexicographic Autoencoded Node Representation (SILA), and Sparse Structural Node Representation (SparseStruct). We show how each of our methods tackles specific limitations of the previous state-of-the-art in structural graph representation learning, such as scalability, representation meaning, and the lack of a formal proof guaranteeing the preservation of structural properties. We provide an extensive experimental section comparing our three proposed methods to the current state-of-the-art in both connectivity-based and structure-based representation learning. Finally, we look at extensions of the basic structural graph representation learning problem: we study temporal structural graph representation and provide a method for representation explainability.
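
To make the connectivity/structure distinction concrete, here is a minimal sketch of iterative structural feature construction in the spirit of methods like SIR-GN (illustrative only, not the dissertation's algorithms): starting from the degree, each iteration aggregates neighbor statistics, so nodes with the same structural role end up with identical vectors even when they are never connected.

```python
import numpy as np

def iterative_structural_features(adj, num_iters=3):
    """Sketch of unsupervised structural node representation: start from a
    purely structural feature (degree) and iteratively aggregate neighbor
    statistics. Nodes with similar structural roles converge to similar
    vectors regardless of where they sit in the graph."""
    deg = adj.sum(axis=1, keepdims=True).astype(float)
    feats = deg
    for _ in range(num_iters):
        # Mean of neighbors' current features; isolated nodes keep zeros.
        neigh = adj @ feats / np.maximum(deg, 1)
        feats = np.concatenate([feats, neigh], axis=1)
    return feats

# Two disjoint triangles: all six nodes get identical structural vectors
# even though nodes in one triangle never neighbor nodes in the other.
A = np.zeros((6, 6), int)
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[u, v] = A[v, u] = 1
print(iterative_structural_features(A))
```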


2020 ◽  
Vol 34 (04) ◽  
pp. 7007-7014
Author(s):  
Shichao Zhu ◽  
Lewei Zhou ◽  
Shirui Pan ◽  
Chuan Zhou ◽  
Guiying Yan ◽  
...  

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in many graph data analysis tasks. However, they still suffer from two limitations for graph representation learning. First, they exploit non-smoothed node features, which may result in suboptimal embeddings and degraded performance for graph classification. Second, they only exploit neighbor information and ignore global topological knowledge. To overcome both limitations, in this paper we propose a novel, flexible, end-to-end framework, Graph Smoothing Splines Neural Networks (GSSNN), for graph classification. By exploiting smoothing splines, which are widely used to learn smooth fitting functions in regression, we develop an effective feature smoothing and enhancement module, Scaled Smoothing Splines (S3), to learn graph embeddings. To integrate global topological information, we design a novel scoring module that exploits closeness, degree, and self-attention values to select important node features as knots for the smoothing splines. These knots can potentially be used to interpret classification results. In extensive experiments on biological and social datasets, we demonstrate that our model achieves state-of-the-art results and learns more robust graph representations. Furthermore, we show that the S3 module can easily be plugged into existing GNNs to improve their performance.
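
A minimal sketch of the knot-selection idea follows: rank nodes by a combination of degree, closeness, and self-attention values, then keep the top-k node features as spline knots. The additive scoring and all names here are illustrative assumptions; the paper's module is learned end-to-end:

```python
import torch

def select_spline_knots(features, degree, closeness, attn, k):
    """Rank nodes by degree + closeness + self-attention and keep the
    top-k node features as knots for the smoothing splines. The additive
    score is a simple illustrative surrogate."""
    score = degree + closeness + attn           # combine the three signals
    top = torch.topk(score, k).indices          # indices of the k knot nodes
    return features[top], top

# Toy usage: 10 nodes with 8-dim features, 3 knots.
f = torch.randn(10, 8)
knots, idx = select_spline_knots(f, torch.rand(10), torch.rand(10), torch.rand(10), 3)
```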


2020 ◽  
Vol 34 (04) ◽  
pp. 6438-6445
Author(s):  
Yuan Wu ◽  
Yuhong Guo

With the advent of deep learning, the performance of text classification models has improved significantly. Nevertheless, successfully training a good classification model requires a sufficient amount of labeled data, while annotating data is expensive and time-consuming. With the rapid growth of digital data, similar classification tasks typically occur in multiple domains, while the availability of labeled data can vary greatly across domains: some domains may have abundant labeled data, while others may have only a limited amount (or none). Meanwhile, text classification tasks are highly domain-dependent: a text classifier trained in one domain may not perform well in another. To address these issues, in this paper we propose a novel dual adversarial co-learning approach for multi-domain text classification (MDTC). The approach learns shared-private networks for feature extraction and deploys dual adversarial regularizations to align features across different domains and between labeled and unlabeled data simultaneously, under a discrepancy-based co-learning framework, aiming to improve the classifiers' generalization capacity with the learned features. We conduct experiments on multi-domain sentiment classification datasets; the results show that the proposed approach achieves state-of-the-art MDTC performance.
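
The shared-private architecture can be sketched as one shared encoder for domain-invariant features plus a per-domain private encoder, with their outputs concatenated for classification. This sketch omits the paper's dual adversarial regularizers; the bag-of-words input and all dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SharedPrivateEncoder(nn.Module):
    """Shared-private feature extraction sketch for multi-domain text
    classification: a shared encoder captures domain-invariant features,
    a per-domain private encoder captures domain-specific ones."""

    def __init__(self, vocab, hidden, num_domains, num_classes):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(vocab, hidden), nn.ReLU())
        self.private = nn.ModuleList(
            nn.Sequential(nn.Linear(vocab, hidden), nn.ReLU())
            for _ in range(num_domains)
        )
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, x, domain):
        # Concatenate shared and domain-specific features for classification.
        feat = torch.cat([self.shared(x), self.private[domain](x)], dim=-1)
        return self.classifier(feat)

# Toy usage: 2 domains, binary sentiment, 1000-dim bag-of-words batch.
model = SharedPrivateEncoder(1000, 64, num_domains=2, num_classes=2)
logits = model(torch.randn(4, 1000), domain=0)
```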


2021 ◽  
Author(s):  
Jiahua Rao ◽  
Shuangjia Zheng ◽  
Ying Song ◽  
Jianwen Chen ◽  
Chengtao Li ◽  
...  

Summary
Recently, novel representation learning algorithms have shown potential for predicting molecular properties. However, unified frameworks have not yet emerged for fairly measuring algorithmic progress, and the experimental procedures of different representation models often lack rigor and are hardly reproducible. Herein, we have developed MolRep, unifying 16 state-of-the-art models across 4 popular molecular representations for application and comparison. Furthermore, we ran more than 12.5 million experiments to optimize hyperparameters for each method on 12 common benchmark data sets. As a result, CMPNN achieves the best results, ranking 1st in 5 of the 12 tasks with an average rank of 1.75. Relatively, ECC performs well on classification tasks and MAT on regression (each ranked 1st in 3 tasks), with average ranks of 2.71 and 2.6, respectively.

Availability
The source code is available at: https://github.com/biomed-AI/MolRep

Supplementary information
Supplementary data are available online.


Author(s):  
Ming Jin ◽  
Yizhen Zheng ◽  
Yuan-Fang Li ◽  
Chen Gong ◽  
Chuan Zhou ◽  
...  

Graph representation learning plays a vital role in processing graph-structured data. However, prior work on graph representation learning relies heavily on labeling information. To overcome this problem, inspired by the recent success of graph contrastive learning and Siamese networks in visual representation learning, we propose a novel self-supervised approach to learn node representations by enhancing Siamese self-distillation with multi-scale contrastive learning. Specifically, we first generate two augmented views of the input graph based on local and global perspectives. Then, we employ two objectives, cross-view and cross-network contrastiveness, to maximize the agreement between node representations across different views and networks. To demonstrate the effectiveness of our approach, we perform empirical experiments on five real-world datasets. Our method not only achieves new state-of-the-art results but also surpasses some semi-supervised counterparts by large margins. Code is made available at https://github.com/GRAND-Lab/MERIT
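
The cross-view agreement objective can be sketched as a generic InfoNCE-style contrastive loss between node embeddings from the two augmented views; this is a standard formulation for illustration, not MERIT's exact loss:

```python
import torch
import torch.nn.functional as F

def cross_view_contrastive_loss(z1, z2, temperature=0.5):
    """InfoNCE-style agreement between node representations from two
    augmented graph views: node i in view 1 should match node i in
    view 2 against all other nodes."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature        # (N, N) cross-view similarities
    targets = torch.arange(len(z1))           # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: 8 nodes embedded under two augmentations of the same graph.
loss = cross_view_contrastive_loss(torch.randn(8, 16), torch.randn(8, 16))
```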

