Exponential Family Graph Embeddings

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5737 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3357-3364

Author(s):

Abdulkadir Celikkanat ◽

Fragkiskos D. Malliaros

Keyword(s):

Random Walk ◽

Exponential Family ◽

Representation Learning ◽

Learning Problems ◽

Interaction Patterns ◽

Network Representation ◽

Learning Tasks ◽

Learning Techniques ◽

Real World Datasets ◽

Low Dimensional

Representing networks in a low-dimensional latent space is a crucial task with many interesting applications in graph learning problems, such as link prediction and node classification. A widely applied network representation learning paradigm combines random walks for sampling context nodes with the traditional Skip-Gram model to capture center-context node relationships. In this paper, we focus on exponential family distributions to capture rich interaction patterns between nodes in random walk sequences. We introduce the generic exponential family graph embedding model, which generalizes random walk-based network representation learning techniques to exponential family conditional distributions. We study three particular instances of this model, analyzing their properties and showing their relationship to existing unsupervised learning models. Our experimental evaluation on real-world datasets demonstrates that the proposed techniques outperform well-known baseline methods in two downstream machine learning tasks.
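The random walk plus Skip-Gram paradigm mentioned in this abstract can be illustrated with a minimal Python sketch. It only covers the sampling stage: a uniform random walk and the extraction of center-context pairs inside a window. The function names (`random_walk`, `center_context_pairs`), the toy graph, and the window size are assumptions made for illustration; the abstract does not specify the walk strategy or the exponential family models fitted on these pairs.

```python
import random


def random_walk(adj, start, length, rng):
    """Sample a uniform random walk of at most `length` nodes (illustrative helper)."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def center_context_pairs(walk, window):
    """Collect (center, context) pairs whose positions differ by at most `window`."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical toy graph given as adjacency lists.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
walk = random_walk(adj, start=0, length=6, rng=rng)
pairs = center_context_pairs(walk, window=2)
```

A Skip-Gram style model would then be trained on `pairs`; the choice of conditional distribution over these pairs is where the paper's exponential family generalization would enter.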


An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks

Mathematics ◽

10.3390/math9151767 ◽

2021 ◽

Vol 9 (15) ◽

pp. 1767

Author(s):

Xin Xu ◽

Yang Lu ◽

Yupeng Zhou ◽

Zhiguo Fu ◽

Yanjie Fu ◽

...

Keyword(s):

Random Walk ◽

Representation Learning ◽

Local Information ◽

Learning Framework ◽

Network Representation ◽

Label Node ◽

Label Information ◽

Classification Tasks ◽

Node Classification ◽

Low Dimensional

Network representation learning aims to learn low-dimensional, compressible, and distributed representational vectors of nodes in networks. Because obtaining label information for nodes is expensive, many unsupervised network representation learning methods have been proposed, among which the random walk strategy is one of the most widely used approaches. However, existing random walk based methods face several challenges: 1. they do not sufficiently explain what network knowledge the sampled walk paths capture; 2. mixing different kinds of information in a network has adverse effects; 3. methods with hyper-parameters generalize poorly across different networks. This paper proposes an information-explainable random walk based unsupervised network representation learning framework named Probabilistic Accepted Walk (PAW) to obtain network representations from the perspective of the stationary distribution of networks. In the framework, we design two stationary distributions, based on nodes’ self-information and on the local information of networks, to guide our proposed random walk strategy, which learns representational vectors by sampling paths of nodes. Numerous experimental results demonstrate that PAW obtains more expressive representations than six other widely used unsupervised network representation learning baselines on four real-world networks in single-label and multi-label node classification tasks.
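One standard way to make a random walk converge to a chosen stationary distribution is a Metropolis-Hastings acceptance step. The Python sketch below shows that generic technique only; the abstract does not state PAW's actual acceptance rule, so the function `metropolis_walk`, the toy graph, and the uniform target are all assumptions.

```python
import random


def metropolis_walk(adj, target, start, steps, rng):
    """Random walk whose long-run node frequencies approach `target`.

    Proposal: a uniformly chosen neighbor. The proposal is accepted with the
    Metropolis-Hastings probability, otherwise the walk stays in place.
    """
    walk = [start]
    current = start
    for _ in range(steps):
        proposal = rng.choice(adj[current])
        # Ratio pi(v) * q(v -> u) / (pi(u) * q(u -> v)) with q(u -> v) = 1 / deg(u).
        ratio = (target[proposal] * len(adj[current])) / (
            target[current] * len(adj[proposal])
        )
        if rng.random() < min(1.0, ratio):
            current = proposal
        walk.append(current)
    return walk


# Hypothetical undirected toy graph and a uniform target distribution.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
target = {node: 0.25 for node in adj}
walk = metropolis_walk(adj, target, start=0, steps=2000, rng=random.Random(0))
```

Replacing `target` with a distribution derived from node-level information would bias which nodes the sampled paths visit, which is the general idea of guiding a walk by a stationary distribution.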


Fairness in Network Representation by Latent Structural Heterogeneity in Observational Data

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5792 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3809-3816

Author(s):

Xin Du ◽

Yulong Pei ◽

Wouter Duivesteijn ◽

Mykola Pechenizkiy

Keyword(s):

Machine Learning ◽

Observational Data ◽

Representation Learning ◽

Structural Heterogeneity ◽

Heterogeneous Distribution ◽

Network Representation ◽

Representation Model ◽

Real World Datasets ◽

Synthetic Datasets ◽

Low Dimensional

While recent advances in machine learning have focused heavily on the fairness of algorithmic decision making, the fairness of representation, and especially the fairness of network representation, remains underexplored. Network representation learning learns a function mapping nodes to low-dimensional vectors. Structural properties, e.g. communities and roles, are preserved in the latent embedding space. In this paper, we argue that latent structural heterogeneity in the observational data could bias the classical network representation model. The unknown heterogeneous distribution across subgroups raises new challenges for fairness in machine learning. Pre-defined groups with sensitive attributes cannot properly tackle the potential unfairness of network representation. We propose a method that automatically discovers subgroups that are unfairly treated by the network representation model. The fairness measure we propose can evaluate complex targets with multi-degree interactions. We conduct randomly controlled experiments on synthetic datasets and verify our methods on real-world datasets. Both quantitative and qualitative results show that our method is effective at recovering the fairness of network representations. Our research offers insight into how structural heterogeneity across subgroups restricted by attributes affects the fairness of network representation learning.


The Network Representation Learning Algorithm Based on Semi-Supervised Random Walk

IEEE Access ◽

10.1109/access.2020.3044367 ◽

2020 ◽

Vol 8 ◽

pp. 222956-222965

Author(s):

Dong Liu ◽

Qinpeng Li ◽

Yan Ru ◽

Jun Zhang

Keyword(s):

Random Walk ◽

Learning Algorithm ◽

Representation Learning ◽

Network Representation


Network Representation Learning-Based Drug Mechanism Discovery and Anti-Inflammatory Response Against COVID-19

10.26434/chemrxiv.12531314.v3 ◽

2021 ◽

Author(s):

Wang Xiaoqi ◽

Bin Xin ◽

Zhijian Xu ◽

Kenli LI ◽

Fei Li ◽

...

Keyword(s):

Inflammatory Response ◽

Inflammatory Responses ◽

Representation Learning ◽

Binding Modes ◽

Network Representation ◽

Drug Mechanism ◽

Docking Program ◽

Therapeutic Development ◽

Anti Inflammatory ◽

Low Dimensional

Recent studies have demonstrated that an excessive inflammatory response is an important factor in the death of COVID-19 patients. In this study, we propose a network representation learning-based methodology, termed AIdrug2cov, to discover drug mechanisms and anti-inflammatory responses for patients with COVID-19. This work explores the multi-hub characteristic of a heterogeneous drug network integrating 8 unique networks. Inspired by the multi-hub characteristic, we design three billion special meta paths to train a deep representation model for learning low-dimensional vectors that integrate long-range structure dependency and complex semantic relations among network nodes. Using the representation vectors, AIdrug2cov identifies 40 potential targets and 22 high-confidence drugs that bind to tumor necrosis factor (TNF)-α or interleukin (IL)-6 to prevent excessive inflammatory responses in COVID-19 patients. Finally, we analyze mechanisms of action based on PubMed publications and ongoing clinical trials, and explore the possible binding modes between the newly predicted drugs and targets via a docking program. In addition, the results in 5 pharmacological applications suggest that AIdrug2cov significantly outperforms 5 other state-of-the-art network representation approaches, further demonstrating the applicability of AIdrug2cov in the drug development field. In summary, AIdrug2cov is practically useful for accelerating COVID-19 therapeutic development. The source code and data can be downloaded from https://github.com/pengsl-lab/AIdrug2cov.git.


Feature Hashing for Network Representation Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/390 ◽

2018 ◽

Cited By ~ 2

Author(s):

Qixiang Wang ◽

Shanfeng Wang ◽

Maoguo Gong ◽

Yue Wu

Keyword(s):

Link Prediction ◽

Feature Space ◽

Representation Learning ◽

Learning Approaches ◽

Network Representation ◽

Proximity Matrix ◽

Low Dimensional ◽

Vector Representations ◽

Feature Hashing ◽

Node Embeddings

The goal of network representation learning is to embed nodes so as to encode the proximity structures of a graph into a continuous low-dimensional feature space. In this paper, we propose a novel algorithm called node2hash, based on feature hashing, for generating node embeddings. This approach follows the encoder-decoder framework, which has two main mapping functions. The first is an encoder that maps each node into a high-dimensional vector. The second is a decoder that hashes these vectors into a lower-dimensional feature space. More specifically, we first derive a proximity measurement called expected distance as the target, which combines the position distribution and co-occurrence statistics of nodes over random walks to build a proximity matrix. We then introduce a set of T different hash functions into feature hashing to generate uniformly distributed vector representations of nodes from the proximity matrix. Compared with existing state-of-the-art network representation learning approaches, node2hash shows competitive performance on multi-class node classification and link prediction tasks on three real-world networks from various domains.
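The decoder step described here, hashing a high-dimensional proximity row into a smaller feature space with T hash functions, can be sketched with the generic signed hashing trick. This is not the node2hash implementation: the helper names (`_stable_hash`, `hash_features`), the use of MD5 as a seeded hash, and the plain summation over the T hash functions are assumptions for illustration.

```python
import hashlib


def _stable_hash(seed, index, salt):
    """Deterministic integer hash of (salt, seed, index); an illustrative helper."""
    data = f"{salt}:{seed}:{index}".encode()
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def hash_features(row, dim, num_hashes):
    """Hash a high-dimensional row into `dim` buckets using `num_hashes` seeded hashes.

    Each hash function picks a bucket and a sign for every input index;
    the signed values are summed per bucket (an assumed aggregation rule).
    """
    out = [0.0] * dim
    for seed in range(num_hashes):
        for index, value in enumerate(row):
            if value == 0.0:
                continue  # zero entries contribute nothing
            bucket = _stable_hash(seed, index, "bucket") % dim
            sign = 1.0 if _stable_hash(seed, index, "sign") % 2 == 0 else -1.0
            out[bucket] += sign * value
    return out


# Hypothetical proximity row for one node over six other nodes.
proximity_row = [0.0, 2.0, 0.0, 1.0, 0.0, 3.0]
embedding = hash_features(proximity_row, dim=4, num_hashes=3)
```

Because the mapping is linear in the input row, nodes with similar proximity rows receive similar hashed vectors, up to collisions between buckets.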


TransPath: Representation Learning for Heterogeneous Information Networks via Translation Mechanism

10.20944/preprints201801.0147.v1 ◽

2018 ◽

Author(s):

Yang Fang ◽

Xiang Zhao ◽

Zhen Tan

Keyword(s):

Large Scale ◽

Representation Learning ◽

Information Networks ◽

Heterogeneous Information ◽

Structure Information ◽

Heterogeneous Information Networks ◽

Network Representation ◽

Meta Path ◽

Translation Mechanism ◽

Real World Datasets

In this paper, we propose a novel network representation learning model, TransPath, to encode heterogeneous information networks (HINs). Traditional network representation learning models aim to learn the embeddings of a homogeneous network. TransPath is able to capture the rich semantic and structural information of an HIN via meta-paths. We take advantage of the translation mechanism used in knowledge graphs, regarding a meta-path, instead of an edge, as a translating operation from the first node to the last node. Moreover, we propose a user-guided meta-path sampling strategy that takes users' preferences as guidance. This explores the semantics of a path more precisely and improves model efficiency by avoiding noisy and meaningless meta-paths. We evaluate our model on two large-scale real-world datasets, DBLP and YELP, and on two benchmark tasks, similarity search and node classification. We observe that TransPath outperforms other state-of-the-art baselines consistently and significantly.
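The translation idea can be made concrete with a small Python sketch in the style of translation-based knowledge graph embeddings: a meta-path vector added to the first node's vector should land near the last node's vector. The function `translation_distance`, the Euclidean norm, and the toy vectors are assumptions; the abstract does not give TransPath's exact scoring function.

```python
import math


def translation_distance(first, path, last):
    """Euclidean distance between (first + path) and last; smaller means a better fit."""
    return math.sqrt(sum((f + p - l) ** 2 for f, p, l in zip(first, path, last)))


# Hypothetical 2-dimensional embeddings.
first_node = [1.0, 0.0]
meta_path = [0.5, 1.0]
last_node = [1.5, 1.0]
unrelated_node = [-2.0, 3.0]

good = translation_distance(first_node, meta_path, last_node)
bad = translation_distance(first_node, meta_path, unrelated_node)
```

Training such a model typically pushes distances like `good` below distances like `bad` for sampled positive and negative node pairs.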


An Attributed Network Representation Learning Method Based on Biased Random Walk

Procedia Computer Science ◽

10.1016/j.procs.2020.06.088 ◽

2020 ◽

Vol 174 ◽

pp. 291-298

Author(s):

Wei Dou ◽

Weiyu Zhang ◽

Ziqiang Weng

Keyword(s):

Random Walk ◽

Representation Learning ◽

Learning Method ◽

Network Representation ◽

Biased Random Walk ◽

Attributed Network


Inferring the Disease-Associated miRNAs Based on Network Representation Learning and Convolutional Neural Networks

International Journal of Molecular Sciences ◽

10.3390/ijms20153648 ◽

2019 ◽

Vol 20 (15) ◽

pp. 3648 ◽

Cited By ~ 9

Author(s):

Xuan ◽

Sun ◽

Wang ◽

Zhang ◽

Pan

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Prediction Models ◽

Prediction Method ◽

Feature Space ◽

Representation Learning ◽

Superior Performance ◽

Network Representation ◽

Disease Associations ◽

Low Dimensional

Identification of disease-associated miRNAs (disease miRNAs) is critical for understanding etiology and pathogenesis. Most previous methods focus on integrating the similarity and association information contained in heterogeneous miRNA-disease networks. However, these methods establish only shallow prediction models that fail to capture complex relationships among miRNA similarities, disease similarities, and miRNA-disease associations. We propose a prediction method, called CNNMDA, based on network representation learning and convolutional neural networks to predict disease miRNAs. CNNMDA deeply integrates the similarity information of miRNAs and diseases, miRNA-disease associations, and representations of miRNAs and diseases in a low-dimensional feature space. The new deep learning framework was built to learn the original and global representation of a miRNA-disease pair. First, diverse biological premises about miRNAs and diseases were combined to construct the embedding layer in the left part of the framework, from a biological perspective. Second, the various connection edges in the miRNA-disease network, such as similarity and association connections, depend on each other. Therefore, it was necessary to learn the low-dimensional representations of the miRNA and disease nodes based on the entire network. The right part of the framework learned the low-dimensional representation of each miRNA and disease node based on non-negative matrix factorization, and these representations were used to establish the corresponding embedding layer. Finally, the left and right embedding layers passed through convolutional modules to deeply learn the complex and non-linear relationships among the similarities and associations between miRNAs and diseases. Experimental results based on cross validation indicated that CNNMDA yields superior performance compared to several state-of-the-art methods. Furthermore, case studies on lung, breast, and pancreatic neoplasms demonstrated the powerful ability of CNNMDA to discover potential disease miRNAs.


HIN_DRL: A random walk based dynamic network representation learning method for heterogeneous information networks

Expert Systems with Applications ◽

10.1016/j.eswa.2020.113427 ◽

2020 ◽

Vol 158 ◽

pp. 113427

Author(s):

LU Meilian ◽

YE Danna

Keyword(s):

Random Walk ◽

Dynamic Network ◽

Representation Learning ◽

Information Networks ◽

Learning Method ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Network Representation


Classification of Big Data: Machine Learning Problems and Challenges in Network Intrusion Prediction

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.36.25381 ◽

2018 ◽

Vol 7 (4.36) ◽

pp. 1189

Author(s):

Yasser Mohammad Al-Sharo ◽

Ghazi Shakah ◽

Mutasem Sh. Alkhaswneh ◽

Bajes Zeyad Aljunaeidi ◽

Malik Bader Alazzam

Keyword(s):

Machine Learning ◽

Big Data ◽

Representation Learning ◽

Learning Problems ◽

Traffic Information ◽

Classification Problems ◽

Network Intrusion ◽

Learning Techniques ◽

Geometric Techniques

This paper focuses on the main difficulty of classifying Big Data in network intrusion traffic. It also explains the system challenges posed by the Big Data problems associated with network intrusion prediction. Predicting a possible intrusion in a network requires the continuous collection of traffic data and the ability to learn its characteristics on the fly. The continuous collection of traffic data in the network in turn leads to Big Data problems caused by the volume, variability, and other properties of Big Data. Learning the characteristics of a network requires machine learning techniques that can capture global knowledge of the traffic patterns. The properties of Big Data lead to significant system challenges for implementing a machine learning framework. The paper also discusses the problems and challenges of handling Big Data classification using geometric representation-learning techniques together with existing Big Data networking technologies. In particular, the study explains the challenges of combining supervised learning techniques, machine lifelong learning techniques, and representation-learning techniques with Big Data technologies such as Hive, Hadoop, and Cloud, which form the basis for solving classification problems in network traffic.
