Node classification problem based on potential information and representation learning between nodes

Author(s):  
Mingming Li ◽  
Yan Yang
2020 ◽  
Vol 36 (4) ◽  
pp. 305-323
Author(s):  
Quan Hoang Nguyen ◽  
Ly Vu ◽  
Quang Uy Nguyen

Sentiment classification (SC) aims to determine whether a document conveys a positive or negative opinion. Due to the rapid development of the digital world, SC has become an important research topic that affects many aspects of our lives. In machine-learning-based SC, the representation of the document strongly influences its accuracy. Word Embedding (WE)-based techniques, i.e., Word2vec techniques, have proved beneficial for the SC problem. However, Word2vec is often not sufficient to represent the semantics of documents with complex Vietnamese sentences. In this paper, we propose a new representation learning model, called a two-channel vector, to learn a higher-level feature of a document in SC. Our model uses two neural networks to learn the semantic feature, i.e., Word2vec, and the syntactic feature, i.e., the Part-of-Speech (POS) tag. The two features are then combined and fed into a Softmax function to make the final classification. We carry out intensive experiments on 4 recent Vietnamese sentiment datasets to evaluate the performance of the proposed architecture. The experimental results demonstrate that the proposed model significantly improves the accuracy of SC compared to two single models and a state-of-the-art ensemble method.
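
For concreteness, here is a minimal PyTorch sketch of the two-channel idea described above: one channel encodes a document's Word2vec vectors (semantic), the other encodes its POS-tag sequence (syntactic), and the two document features are concatenated before a softmax classifier. The GRU encoders, embedding sizes, and class count are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class TwoChannelClassifier(nn.Module):
    def __init__(self, w2v_dim=300, n_pos_tags=30, pos_dim=32,
                 hidden=128, n_classes=2):
        super().__init__()
        self.pos_emb = nn.Embedding(n_pos_tags, pos_dim)
        # Semantic channel: encodes the sequence of Word2vec vectors.
        self.sem_rnn = nn.GRU(w2v_dim, hidden, batch_first=True)
        # Syntactic channel: encodes the sequence of POS-tag embeddings.
        self.syn_rnn = nn.GRU(pos_dim, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, w2v_seq, pos_seq):
        # w2v_seq: (batch, seq_len, w2v_dim); pos_seq: (batch, seq_len) tag ids
        _, h_sem = self.sem_rnn(w2v_seq)
        _, h_syn = self.syn_rnn(self.pos_emb(pos_seq))
        feats = torch.cat([h_sem[-1], h_syn[-1]], dim=-1)  # combine channels
        return torch.log_softmax(self.out(feats), dim=-1)

model = TwoChannelClassifier()
scores = model(torch.randn(4, 20, 300), torch.randint(0, 30, (4, 20)))
print(scores.shape)  # (4, 2): positive/negative scores per document
```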


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Shicong Chen ◽  
Deyu Yuan ◽  
Shuhua Huang ◽  
Yang Chen

The goal of network representation learning is to extract deep-level abstractions from data features, which can also be viewed as a process of transforming high-dimensional data into low-dimensional features. Learning the mapping functions between the two vector spaces is an essential problem. In this paper, we propose a new similarity index based on traditional machine learning that integrates the concepts of common neighbors, local paths, and preferential attachment. Furthermore, to apply link prediction methods to node classification, we establish an architecture named the multitask graph autoencoder. Specifically, in the context of structural deep network embedding, the architecture designs a high-order loss function that measures node similarity from multiple angles, so that the model can make up for the deficiency of the second-order loss function. Through parameter fine-tuning, the high-order loss function is introduced into the optimized autoencoder. Experiments show that the framework is generally applicable to the majority of classical similarity indexes.
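
As an illustration of a similarity index that mixes the three ingredients named above (common neighbors, local path, preferential attachment), the following sketch combines them with weighted coefficients. The weights alpha/beta/gamma, the local-path damping factor eps, and the use of networkx are assumptions for illustration and do not reproduce the paper's exact formula.

```python
import networkx as nx
import numpy as np

def combined_similarity(G, u, v, alpha=1.0, beta=1.0, gamma=1.0, eps=0.01):
    nodes = list(G.nodes())
    idx = {n: i for i, n in enumerate(nodes)}
    A = nx.to_numpy_array(G, nodelist=nodes)

    cn = len(list(nx.common_neighbors(G, u, v)))        # common neighbors
    A2, A3 = A @ A, A @ A @ A
    lp = A2[idx[u], idx[v]] + eps * A3[idx[u], idx[v]]  # local path index
    pa = G.degree(u) * G.degree(v)                      # preferential attachment

    return alpha * cn + beta * lp + gamma * pa

G = nx.karate_club_graph()
print(combined_similarity(G, 0, 33))
```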


2020 ◽  
Vol 34 (03) ◽  
pp. 2991-2999 ◽  
Author(s):  
Xiao Shen ◽  
Quanyu Dai ◽  
Fu-lai Chung ◽  
Wei Lu ◽  
Kup-Sze Choi

In this paper, we study the task of cross-network node classification, which leverages the abundant labeled nodes of a source network to help classify unlabeled nodes in a target network. Existing domain adaptation algorithms generally fail to model network structural information, and current network embedding models mainly focus on single-network applications; thus, neither can be directly applied to the cross-network node classification problem. This motivates us to propose an adversarial cross-network deep network embedding (ACDNE) model that integrates adversarial domain adaptation with deep network embedding so as to learn network-invariant node representations that also preserve the network structural information well. In ACDNE, the deep network embedding module utilizes two feature extractors to jointly preserve attributed affinity and topological proximity between nodes. In addition, a node classifier is incorporated to make node representations label-discriminative. Moreover, an adversarial domain adaptation technique is employed to make node representations network-invariant. Extensive experimental results demonstrate that the proposed ACDNE model achieves state-of-the-art performance in cross-network node classification.
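
The adversarial domain adaptation ingredient is typically realized with a gradient reversal layer, so that a network (domain) discriminator and the embedder are trained jointly while the embedder is pushed toward network-invariant representations. The PyTorch sketch below illustrates only that mechanism; the layer sizes and the lambda coefficient are assumptions, and it is not ACDNE's full architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing back to the embedder.
        return -ctx.lam * grad_output, None

class DomainAdversarialHead(nn.Module):
    def __init__(self, emb_dim=128, lam=1.0):
        super().__init__()
        self.lam = lam
        self.discriminator = nn.Sequential(
            nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, node_emb):
        # The discriminator tries to tell source from target nodes, while the
        # reversed gradient trains the embedder to make them indistinguishable.
        reversed_emb = GradReverse.apply(node_emb, self.lam)
        return self.discriminator(reversed_emb)

head = DomainAdversarialHead()
print(head(torch.randn(16, 128)).shape)  # (16, 2) source/target logits
```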


2020 ◽  
Vol 10 (20) ◽  
pp. 7214
Author(s):  
Cheng-Te Li ◽  
Hong-Yu Lin

Network representation learning (NRL) is crucial for generating effective node features for downstream tasks such as node classification (NC) and link prediction (LP). However, existing NRL methods neither properly identify which neighbor nodes should be pushed together or apart in the embedding space, nor model the coarse-grained community knowledge hidden behind the network topology. In this paper, we propose a novel NRL framework, Structural Hierarchy Enhancement (SHE), to deal with these two issues. The main idea is to construct a structural hierarchy from the network based on community detection and to utilize this hierarchy to perform level-wise NRL. In addition, lower-level node embeddings are passed to higher levels so that community knowledge can be incorporated into NRL. Experiments conducted on benchmark network datasets show that SHE significantly boosts NRL performance in both NC and LP compared to other hierarchical NRL methods.
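
One way to picture the hierarchy construction is to coarsen the network level by level with a community detection algorithm, turning each community into a super-node of the next level. The sketch below uses networkx's greedy modularity communities for this purpose; the choice of algorithm and the unweighted coarsening are assumptions for illustration rather than SHE's exact procedure.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def coarsen_by_communities(G):
    # Detect communities, then build a coarse graph with one node per community.
    communities = list(greedy_modularity_communities(G))
    membership = {n: c for c, comm in enumerate(communities) for n in comm}
    coarse = nx.Graph()
    coarse.add_nodes_from(range(len(communities)))
    for u, v in G.edges():
        cu, cv = membership[u], membership[v]
        if cu != cv:
            coarse.add_edge(cu, cv)
    return coarse, membership

G = nx.karate_club_graph()
coarse, membership = coarsen_by_communities(G)
print(coarse.number_of_nodes(), "communities form the next hierarchy level")
```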


Mathematics ◽  
2021 ◽  
Vol 9 (15) ◽  
pp. 1767
Author(s):  
Xin Xu ◽  
Yang Lu ◽  
Yupeng Zhou ◽  
Zhiguo Fu ◽  
Yanjie Fu ◽  
...  

Network representation learning aims to learn low-dimensional, compact, and distributed representation vectors of nodes in networks. Because obtaining node label information is expensive, many unsupervised network representation learning methods have been proposed, among which random-walk strategies are widely used. However, existing random-walk-based methods face several challenges: (1) it is hard to explain what network knowledge the sampled walking paths capture; (2) the mixture of different kinds of information in networks causes adverse effects; and (3) methods that rely on hyper-parameters generalize poorly across different networks. This paper proposes an information-explainable, random-walk-based unsupervised network representation learning framework named Probabilistic Accepted Walk (PAW), which obtains network representations from the perspective of the stationary distribution of networks. In the framework, we design two stationary distributions, based on nodes' self-information and the local information of networks, to guide the proposed random walk strategy to learn representation vectors through sampled node paths. Extensive experimental results demonstrate that PAW obtains more expressive representations than six widely used unsupervised network representation learning baselines on four real-world networks in single-label and multi-label node classification tasks.
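
To illustrate the acceptance mechanism, the following sketch runs a random walk in which a proposed step to a neighbor is accepted with a probability derived from a target distribution over nodes, here taken (as an assumption) to be the self-information of a node's degree. PAW's actual stationary distributions and acceptance rule may differ; this only mirrors the "probabilistic accepted" idea.

```python
import math
import random
from collections import Counter
import networkx as nx

def accepted_walk(G, start, length):
    n = G.number_of_nodes()
    deg_counts = Counter(dict(G.degree()).values())

    def self_info(node):
        # Self-information of observing this node's degree in the network.
        p = deg_counts[G.degree(node)] / n
        return -math.log(p) + 1e-9  # epsilon guards against log(1) = 0

    walk, current = [start], start
    while len(walk) < length:
        proposal = random.choice(list(G.neighbors(current)))
        accept = min(1.0, self_info(proposal) / self_info(current))
        if random.random() < accept:
            current = proposal  # accepted step; otherwise stay in place
        walk.append(current)
    return walk

G = nx.karate_club_graph()
print(accepted_walk(G, 0, 10))
```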


2020 ◽  
Vol 34 (04) ◽  
pp. 5810-5817
Author(s):  
Masoumeh Soflaei ◽  
Hongyu Guo ◽  
Ali Al-Bashabsheh ◽  
Yongyi Mao ◽  
Richong Zhang

We consider the problem of learning a neural network classifier. Under the information bottleneck (IB) principle, we associate with this classification problem a representation learning problem, which we call "IB learning". We show that IB learning is, in fact, equivalent to a special class of quantization problems. Classical results in rate-distortion theory then suggest that IB learning can benefit from a "vector quantization" approach, namely, simultaneously learning the representations of multiple input objects. Such an approach, assisted by some variational techniques, results in a novel learning framework, "Aggregated Learning", for classification with neural network models. In this framework, several objects are jointly classified by a single neural network. The effectiveness of this framework is verified through extensive experiments on standard image recognition and text classification tasks.
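
A minimal sketch of the aggregated classification idea: m input objects are concatenated and passed through one shared network that emits m predictions at once. The layer sizes, m = 4, and the flattened 784-dimensional input are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AggregatedClassifier(nn.Module):
    def __init__(self, in_dim=784, n_classes=10, m=4, hidden=512):
        super().__init__()
        self.m, self.n_classes = m, n_classes
        self.net = nn.Sequential(
            nn.Linear(m * in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, m * n_classes))

    def forward(self, x):  # x: (batch, m, in_dim) -- m objects per example
        batch = x.size(0)
        out = self.net(x.reshape(batch, -1))
        return out.view(batch, self.m, self.n_classes)  # one prediction per object

model = AggregatedClassifier()
scores = model(torch.randn(8, 4, 784))
print(scores.shape)  # (8, 4, 10): class scores for each of the 4 objects
```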


2022 ◽  
Vol 31 (2) ◽  
pp. 1-34
Author(s):  
Patrick Keller ◽  
Abdoul Kader Kaboré ◽  
Laura Plein ◽  
Jacques Klein ◽  
Yves Le Traon ◽  
...  

Recent successes in training word embeddings for Natural Language Processing (NLP) tasks have encouraged a wave of research on representation learning for source code that builds on similar NLP methods. The overall objective is to produce code embeddings that capture as much program semantics as possible. State-of-the-art approaches invariably rely on a syntactic representation (i.e., raw lexical tokens, abstract syntax trees, or intermediate representation tokens) to generate embeddings, which are criticized in the literature as non-robust or non-generalizable. In this work, we investigate a novel embedding approach based on the intuition that source code has visual patterns of semantics. We further use these patterns to address the outstanding challenge of identifying semantic code clones. We propose the WySiWiM ("What You See Is What It Means") approach, in which visual representations of source code are fed into powerful pre-trained image classification neural networks from the field of computer vision to benefit from the practical advantages of transfer learning. We evaluate the proposed embedding approach on the task of vulnerable code prediction and on two variations of the task of semantic code clone identification: code clone detection (a binary classification problem) and code classification (a multi-class classification problem). Experiments on BigCloneBench (Java) and Open Judge (C) show that, although simple, our WySiWiM approach performs as effectively as state-of-the-art approaches such as ASTNN or TBCNN. We also show, using data from NVD and SARD, that the WySiWiM representation can be used to learn a vulnerable code detector with reasonable performance (accuracy ∼90%). We further explore the influence of different steps in our approach, such as the choice of visual representations or the classification algorithm, and discuss the promises and limitations of this research direction.
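
A minimal sketch of the rendering half of this idea: draw a code snippet onto a white canvas with Pillow and hand the resulting image to a standard torchvision ResNet whose final layer is replaced for the target task. The rendering details, image size, and two-class head are assumptions; WySiWiM's actual pipeline (including which pre-trained weights are loaded) may differ.

```python
import torch.nn as nn
from PIL import Image, ImageDraw
from torchvision import models, transforms

def render_code(code: str, size=(224, 224)) -> Image.Image:
    # Render the raw text of the snippet onto a white canvas.
    img = Image.new("RGB", size, "white")
    ImageDraw.Draw(img).multiline_text((4, 4), code, fill="black")
    return img

snippet = "def add(a, b):\n    return a + b"
x = transforms.ToTensor()(render_code(snippet)).unsqueeze(0)  # (1, 3, 224, 224)

backbone = models.resnet18()  # in practice, load ImageNet weights for transfer learning
backbone.fc = nn.Linear(backbone.fc.in_features, 2)  # e.g. clone / not-clone
print(backbone(x).shape)  # (1, 2)
```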


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Siquan Yu ◽  
Jiaxin Liu ◽  
Zhi Han ◽  
Yong Li ◽  
Yandong Tang ◽  
...  

Image clustering is a complex procedure that is significantly affected by the choice of image representation. Most existing image clustering methods treat representation learning and clustering separately, which usually brings two problems. On the one hand, image representations are difficult to select and the learned representations may not be suitable for clustering. On the other hand, these methods inevitably involve a separate clustering step, which can introduce errors and hurt the clustering results. To tackle these problems, we present a new clustering method that efficiently builds an image representation and precisely discovers cluster assignments. For this purpose, the image clustering task is regarded as a binary pairwise classification problem with local structure preservation. Specifically, we propose an approach to image clustering based on a fully convolutional autoencoder and deep adaptive clustering (DAC). To extract the essential representation and maintain the local structure, a fully convolutional autoencoder is applied. To map features into the clustering space and obtain a suitable image representation, the DAC algorithm participates in the training of the autoencoder. Our method can learn an image representation that is suitable for clustering and discover the precise clustering label for each image. A series of real-world image clustering experiments verify the effectiveness of the proposed algorithm.
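
To make the two ingredients concrete, the sketch below pairs a small convolutional autoencoder with a DAC-style pairwise target: image pairs whose feature similarity exceeds an upper threshold are treated as belonging to the same cluster, and pairs below a lower threshold as belonging to different clusters. The architecture sizes and thresholds are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def dac_pairwise_targets(z, upper=0.9, lower=0.5):
    # Cosine similarity between flattened features of every image pair.
    flat = F.normalize(z.flatten(1), dim=1)
    sim = flat @ flat.t()
    pos = (sim > upper).float()   # confidently the same cluster
    neg = (sim < lower).float()   # confidently different clusters
    mask = pos + neg              # ambiguous pairs are ignored this round
    return pos, mask

model = ConvAutoencoder()
x = torch.rand(8, 1, 28, 28)
z, recon = model(x)
recon_loss = F.mse_loss(recon, x)      # autoencoder reconstruction term
pos, mask = dac_pairwise_targets(z)    # DAC-style pairwise supervision
print(recon_loss.item(), pos.shape)    # scalar, (8, 8)
```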

