scholarly journals Effective Decoding in Graph Auto-Encoder Using Triadic Closure

2020 ◽  
Vol 34 (01) ◽  
pp. 906-913
Author(s):  
Han Shi ◽  
Haozheng Fan ◽  
James T. Kwok

The (variational) graph auto-encoder and its variants have been popularly used for representation learning on graph-structured data. While the encoder is often a powerful graph convolutional network, the decoder reconstructs the graph structure by only considering two nodes at a time, thus ignoring possible interactions among edges. On the other hand, structured prediction, which considers the whole graph simultaneously, is computationally expensive. In this paper, we utilize the well-known triadic closure property which is exhibited in many real-world networks. We propose the triad decoder, which considers and predicts the three edges involved in a local triad together. The triad decoder can be readily used in any graph-based auto-encoder. In particular, we incorporate this to the (variational) graph auto-encoder. Experiments on link prediction, node clustering and graph generation show that the use of triads leads to more accurate prediction, clustering and better preservation of the graph characteristics.

Information ◽  
2020 ◽  
Vol 11 (11) ◽  
pp. 525
Author(s):  
Franz Hell ◽  
Yasser Taha ◽  
Gereon Hinz ◽  
Sabine Heibei ◽  
Harald Müller ◽  
...  

Recent advancements in deep neural networks for graph-structured data have led to state-of-the-art performance in recommender system benchmarks. Adapting these methods to pharmacy product cross-selling recommendation tasks with a million products and hundreds of millions of sales remains a challenge, due to the intricate medical and legal properties of pharmaceutical data. To tackle this challenge, we developed a graph convolutional network (GCN) algorithm called PharmaSage, which uses graph convolutions to generate embeddings for pharmacy products, which are then used in a downstream recommendation task. In the underlying graph, we incorporate both cross-sales information from the sales transaction within the graph structure, as well as product information as node features. Via modifications to the sampling involved in the network optimization process, we address a common phenomenon in recommender systems, the so-called popularity bias: popular products are frequently recommended, while less popular items are often neglected and recommended seldomly or not at all. We deployed PharmaSage using real-world sales data and trained it on 700,000 articles represented as nodes in a graph with edges between nodes representing approximately 100 million sales transactions. By exploiting the pharmaceutical product properties, such as their indications, ingredients, and adverse effects, and combining these with large sales histories, we achieved better results than with a purely statistics based approach. To our knowledge, this is the first application of deep graph embeddings for pharmacy product cross-selling recommendation at this scale to date.


2020 ◽  
Vol 12 (22) ◽  
pp. 9621
Author(s):  
Shichen Huang ◽  
Chunfu Shao ◽  
Juan Li ◽  
Xiong Yang ◽  
Xiaoyu Zhang ◽  
...  

Extraction of traffic features constitutes a key research direction in traffic safety planning. In previous traffic tasks, road network features are extracted manually. In contrast, Network Representation Learning aims to automatically learn low-dimensional node representations. Enlightened by feature learning in Natural Language Processing, representation learning of urban nodes is studied as a supervised task in this paper. Following this line of thinking, a deep learning framework, called StreetNode2VEC, is proposed for learning feature representations for nodes in the road network based on travel routes, and then model parameter calibration is performed. We explain the effectiveness of features from visualization, similarity analysis, and link prediction. In visualization, the features of nodes naturally present a clustered pattern, and different clusters correspond to different regions in the road network. Meanwhile, the features of nodes still retain their spatial information in similarity analysis. The proposed method StreetNode2VEC obtains a AUC score of 0.813 in link prediction, which is greater than that obtained from Graph Convolutional Network (GCN) and Node2vec. This suggests that the features of nodes can be used to effectively and credibly predict whether a link should be established between two nodes. Overall, our work provides a new way of representing road nodes in the road network, which have potential in the traffic safety planning field.


2021 ◽  
Vol 25 (3) ◽  
pp. 711-738
Author(s):  
Phu Pham ◽  
Phuc Do

Link prediction on heterogeneous information network (HIN) is considered as a challenge problem due to the complexity and diversity in types of nodes and links. Currently, there are remained challenges of meta-path-based link prediction in HIN. Previous works of link prediction in HIN via network embedding approach are mainly focused on exploiting features of node rather than existing relations in forms of meta-paths between nodes. In fact, predicting the existence of new links between non-linked nodes is absolutely inconvincible. Moreover, recent HIN-based embedding models also lack of thorough evaluations on the topic similarity between text-based nodes along given meta-paths. To tackle these challenges, in this paper, we proposed a novel approach of topic-driven multiple meta-path-based HIN representation learning framework, namely W-MMP2Vec. Our model leverages the quality of node representations by combining multiple meta-paths as well as calculating the topic similarity weight for each meta-path during the processes of network embedding learning in content-based HINs. To validate our approach, we apply W-TMP2Vec model in solving several link prediction tasks in both content-based and non-content-based HINs (DBLP, IMDB and BlogCatalog). The experimental outputs demonstrate the effectiveness of proposed model which outperforms recent state-of-the-art HIN representation learning models.


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with the time-consuming and labor-intensive in vivo experimental methods, the computational models can provide high-quality DTI candidates in an instant. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes can be aggregated by the graph convolutional network (GCN); on the other hand, the high-order neighbor information of nodes can be learned by the graph embedding method called DeepWalk. Finally, the two kinds of feature are fed into the random forest classifier to train and predict potential DTIs. The results show that our method obtained area under the receiver operating characteristic curve (AUROC) of 0.9455 and area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.


Author(s):  
Jung-Hoon Cho ◽  
Seung Woo Ham ◽  
Dong-Kyu Kim

With the growth of the bike-sharing system, the problem of demand forecasting has become important to the bike-sharing system. This study aims to develop a novel prediction model that enhances the accuracy of the peak hourly demand. A spatiotemporal graph convolutional network (STGCN) is constructed to consider both the spatial and temporal features. One of the model’s essential steps is determining the main component of the adjacency matrix and the node feature matrix. To achieve this, 131 days of data from the bike-sharing system in Seoul are used and experiments conducted on the models with various adjacency matrices and node feature matrices, including public transit usage. The results indicate that the STGCN models reflecting the previous demand pattern to the adjacency matrix show outstanding performance in predicting demand compared with the other models. The results also show that the model that includes bus boarding and alighting records is more accurate than the model that contains subway records, inferring that buses have a greater connection to bike-sharing than the subway. The proposed STGCN with public transit data contributes to the alleviation of unmet demand by enhancing the accuracy in predicting peak demand.


2020 ◽  
Vol 54 (1) ◽  
pp. 1-2
Author(s):  
Shubhanshu Mishra

Information extraction (IE) aims at extracting structured data from unstructured or semi-structured data. The thesis starts by identifying social media data and scholarly communication data as a special case of digital social trace data (DSTD). This identification allows us to utilize the graph structure of the data (e.g., user connected to a tweet, author connected to a paper, author connected to authors, etc.) for developing new information extraction tasks. The thesis focuses on information extraction from DSTD, first, using only the text data from tweets and scholarly paper abstracts, and then using the full graph structure of Twitter and scholarly communications datasets. This thesis makes three major contributions. First, new IE tasks based on DSTD representation of the data are introduced. For scholarly communication data, methods are developed to identify article and author level novelty [Mishra and Torvik, 2016] and expertise. Furthermore, interfaces for examining the extracted information are introduced. A social communication temporal graph (SCTG) is introduced for comparing different communication data like tweets tagged with sentiment, tweets about a search query, and Facebook group posts. For social media, new text classification categories are introduced, with the aim of identifying enthusiastic and supportive users, via their tweets. Additionally, the correlation between sentiment classes and Twitter meta-data in public corpora is analyzed, leading to the development of a better model for sentiment classification [Mishra and Diesner, 2018]. Second, methods are introduced for extracting information from social media and scholarly data. For scholarly data, a semi-automatic method is introduced for the construction of a large-scale taxonomy of computer science concepts. The method relies on the Wikipedia category tree. The constructed taxonomy is used for identifying key computer science phrases in scholarly papers, and tracking their evolution over time. Similarly, for social media data, machine learning models based on human-in-the-loop learning [Mishra et al., 2015], semi-supervised learning [Mishra and Diesner, 2016], and multi-task learning [Mishra, 2019] are introduced for identifying sentiment, named entities, part of speech tags, phrase chunks, and super-sense tags. The machine learning models are developed with a focus on leveraging all available data. The multi-task models presented here result in competitive performance against other methods, for most of the tasks, while reducing inference time computational costs. Finally, this thesis has resulted in the creation of multiple open source tools and public data sets (see URL below), which can be utilized by the research community. The thesis aims to act as a bridge between research questions and techniques used in DSTD from different domains. The methods and tools presented here can help advance work in the areas of social media and scholarly data analysis.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Weiwei Gu ◽  
Fei Gao ◽  
Xiaodan Lou ◽  
Jiang Zhang

AbstractIn this paper, we propose graph attention based network representation (GANR) which utilizes the graph attention architecture and takes graph structure as the supervised learning information. Compared with node classification based representations, GANR can be used to learn representation for any given graph. GANR is not only capable of learning high quality node representations that achieve a competitive performance on link prediction, network visualization and node classification but it can also extract meaningful attention weights that can be applied in node centrality measuring task. GANR can identify the leading venture capital investors, discover highly cited papers and find the most influential nodes in Susceptible Infected Recovered Model. We conclude that link structures in graphs are not limited on predicting linkage itself, it is capable of revealing latent node information in an unsupervised way once a appropriate learning algorithm, like GANR, is provided.


Author(s):  
Shengsheng Qian ◽  
Jun Hu ◽  
Quan Fang ◽  
Changsheng Xu

In this article, we focus on fake news detection task and aim to automatically identify the fake news from vast amount of social media posts. To date, many approaches have been proposed to detect fake news, which includes traditional learning methods and deep learning-based models. However, there are three existing challenges: (i) How to represent social media posts effectively, since the post content is various and highly complicated; (ii) how to propose a data-driven method to increase the flexibility of the model to deal with the samples in different contexts and news backgrounds; and (iii) how to fully utilize the additional auxiliary information (the background knowledge and multi-modal information) of posts for better representation learning. To tackle the above challenges, we propose a novel Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks (KMAGCN) to capture the semantic representations by jointly modeling the textual information, knowledge concepts, and visual information into a unified framework for fake news detection. We model posts as graphs and use a knowledge-aware multi-modal adaptive graph learning principal for the effective feature learning. Compared with existing methods, the proposed KMAGCN addresses challenges from three aspects: (1) It models posts as graphs to capture the non-consecutive and long-range semantic relations; (2) it proposes a novel adaptive graph convolutional network to handle the variability of graph data; and (3) it leverages textual information, knowledge concepts and visual information jointly for model learning. We have conducted extensive experiments on three public real-world datasets and superior results demonstrate the effectiveness of KMAGCN compared with other state-of-the-art algorithms.


Sign in / Sign up

Export Citation Format

Share Document