Network Structure and Transfer Behaviors Embedding via Deep Prediction Model

Author(s):  
Xin Sun ◽  
Zenghui Song ◽  
Junyu Dong ◽  
Yongbo Yu ◽  
Claudia Plant ◽  
...  

Network-structured data are becoming increasingly common in many applications. However, such data pose great challenges to feature engineering due to their high non-linearity and sparsity. How to transform the link-connected nodes of a huge network into feature representations is a critical issue. The local and global structure, basic properties of real-world networks, are reflected by the dynamical transfer behaviors from node to node. In this work, we propose a deep embedding framework that preserves the transfer possibilities among network nodes. We first suggest a degree-weight biased random walk model to capture the transfer behaviors of the network. A network structure embedding layer is then added to a conventional Long Short-Term Memory network to exploit its sequence prediction ability. To preserve the local network neighborhood, we further perform a Laplacian-supervised space optimization on the embedded feature representations. Experimental studies are conducted on various real-world datasets, including social networks and citation networks. The results show that the learned representations can be used effectively as features in a variety of tasks, such as clustering, visualization, and classification, and achieve promising performance compared with state-of-the-art models.
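The degree-weight biased walk described above can be sketched as a random walk whose next step favours high-degree neighbours. The paper's exact weighting scheme is not reproduced here; the degree-proportional bias below is an illustrative assumption:

```python
import random

def degree_biased_walk(adj, start, length, rng=random.Random(0)):
    """One walk over adjacency dict `adj`; the next node is drawn with
    probability proportional to the candidate neighbour's degree
    (illustrative bias only, not the paper's exact weighting)."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj[walk[-1]]
        if not nbrs:
            break  # dead end: stop the walk early
        weights = [len(adj[n]) for n in nbrs]  # degree of each neighbour
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

# Toy graph: node 2 is a hub, so walks are pulled towards it.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3, 4], 3: [2], 4: [2]}
walk = degree_biased_walk(adj, 0, 10)
```

Sequences produced this way could then be fed to the LSTM-based predictor as training corpora, in the spirit of the framework above.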

2021 ◽  
Vol 14 (11) ◽  
pp. 1979-1991
Author(s):  
Zifeng Yuan ◽  
Huey Eng Chua ◽  
Sourav S Bhowmick ◽  
Zekun Ye ◽  
Wook-Shin Han ◽  
...  

Canned patterns (i.e., small subgraph patterns) in visual graph query interfaces (a.k.a. GUIs) facilitate efficient query formulation by enabling a pattern-at-a-time construction mode. However, existing GUIs for querying large networks either do not expose any canned patterns or, if they do, select them manually based on domain knowledge. Unfortunately, manual generation of canned patterns is not only labor-intensive but may also lack the diversity needed to support efficient visual formulation of a wide range of subgraph queries. In this paper, we present a novel, generic, and extensible framework called TATTOO that takes a data-driven approach to automatically select canned patterns for a GUI from large networks. Specifically, it first decomposes the underlying network into truss-infested and truss-oblivious regions. Candidate canned patterns capturing different real-world query topologies are then generated from these regions. Canned patterns based on a user-specified plug are selected for the GUI from these candidates by maximizing coverage and diversity and by minimizing the cognitive load of the pattern set. Experimental studies with real-world datasets demonstrate the benefits of TATTOO. Importantly, this work takes a concrete step towards realizing plug-and-play visual graph query interfaces for large networks.
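The coverage/diversity/cognitive-load trade-off lends itself to a greedy selection sketch. Everything below is an illustrative assumption, not TATTOO's actual algorithm: coverage sets are given as input, and pattern size stands in as a crude proxy for cognitive load.

```python
def select_patterns(candidates, budget):
    """Greedily pick up to `budget` patterns. `candidates` maps a
    pattern name to the set of query topologies it covers; the gain
    is new coverage per unit of (assumed) cognitive load."""
    chosen, covered = [], set()
    for _ in range(budget):
        best, best_gain = None, 0.0
        for name, cov in candidates.items():
            if name in chosen:
                continue
            gain = len(cov - covered) / (1 + len(cov))  # marginal coverage / load
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # nothing adds coverage any more
        chosen.append(best)
        covered |= candidates[best]
    return chosen

# Toy candidate pool covering six hypothetical query topologies.
candidates = {"triangle": {1, 2, 3}, "star": {3, 4, 5, 6}, "edge": {1}}
chosen = select_patterns(candidates, budget=2)
```

Greedy selection is a natural fit here because coverage is submodular; TATTOO's real objective also folds in diversity, which this sketch omits.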


Author(s):  
Shijie Zhang ◽  
Hongzhi Yin ◽  
Qinyong Wang ◽  
Tong Chen ◽  
Hongxu Chen ◽  
...  

On e-commerce platforms, understanding the relationships (e.g., substitute and complement) among products from users' explicit feedback, such as online transactions, is of great importance for boosting sales. However, the significance of such relationships is usually neglected by existing recommender systems. In this paper, we propose a semi-supervised deep embedding model, namely, the Substitute Products Embedding Model (SPEM), which models the substitutable relationships between products by preserving second-order proximity, negative first-order proximity, and semantic similarity in a product co-purchasing graph based on users' purchasing behaviours. With SPEM, the learned representations of two substitutable products align closely in the latent embedding space. Extensive experiments on real-world datasets are conducted, and the results verify that our model outperforms state-of-the-art baselines.
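Second-order proximity in a co-purchase graph can be illustrated with a neighbourhood-overlap score: two products co-purchased with many of the same items are plausible substitutes. The Jaccard form below is a simplified stand-in for the proximity SPEM actually preserves in its learned embedding:

```python
def second_order_similarity(adj, u, v):
    """Jaccard overlap of the co-purchase neighbourhoods of products
    `u` and `v`; high overlap suggests substitutability."""
    nu, nv = set(adj[u]), set(adj[v])
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

# Toy co-purchase graph: A and B share two co-purchased items, C shares none.
adj = {"A": ["c1", "c2", "c3"], "B": ["c2", "c3", "c4"], "C": ["c9"]}
sim_ab = second_order_similarity(adj, "A", "B")
sim_ac = second_order_similarity(adj, "A", "C")
```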


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Alaa Sagheer ◽  
Mostafa Kotb

Abstract Currently, most real-world time series datasets are multivariate and are rich in dynamical information about the underlying system. Such datasets are attracting much attention; therefore, the need for accurate modelling of such high-dimensional datasets is increasing. Recently, the deep architecture of the recurrent neural network (RNN) and its variant, long short-term memory (LSTM), have been proven to be more accurate than traditional statistical methods in modelling time series data. Despite the reported advantages of the deep LSTM model, its performance in modelling multivariate time series (MTS) data has not been satisfactory, particularly when attempting to process highly non-linear and long-interval MTS datasets. The reason is that the supervised learning approach initializes the neurons randomly in such recurrent networks, which prevents the neurons from properly learning the latent features of the correlated variables in the MTS dataset. In this paper, we propose a pre-trained LSTM-based stacked autoencoder (LSTM-SAE) approach, trained in an unsupervised fashion, to replace the random weight initialization strategy adopted in deep LSTM recurrent networks. For evaluation purposes, two different case studies involving real-world datasets are investigated, where the performance of the proposed approach compares favourably with the deep LSTM approach. In addition, the proposed approach outperforms several reference models on the same case studies. Overall, the experimental results clearly show that the unsupervised pre-training approach improves the performance of deep LSTM and leads to better and faster convergence than other models.
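The layer-wise pre-training idea can be summarised as a training schedule: each LSTM layer is first trained as an autoencoder reconstructing the output of the layer below it, then frozen, and finally the whole stack is fine-tuned on the supervised task. The sketch below only generates that schedule; the actual LSTM training is framework code omitted here, and the step names are illustrative:

```python
def layerwise_pretrain_schedule(n_layers):
    """Return the order of training phases for an LSTM stacked
    autoencoder: one unsupervised reconstruction phase per layer,
    followed by joint supervised fine-tuning."""
    steps = []
    for k in range(n_layers):
        target = "input" if k == 0 else f"layer {k - 1} output"
        steps.append(("pretrain_layer", k, f"reconstruct {target}"))
    steps.append(("fine_tune", "all layers", "supervised objective"))
    return steps

steps = layerwise_pretrain_schedule(3)
```

The key contrast with random initialization is that every layer starts fine-tuning from weights that already encode the correlated variables' latent features.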


2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i804-i812
Author(s):  
Sergio Doria-Belenguer ◽  
Markus K. Youssef ◽  
René Böttcher ◽  
Noël Malod-Dognin ◽  
Nataša Pržulj

Abstract Motivation Molecular interactions have been successfully modeled and analyzed as networks, where nodes represent molecules and edges represent the interactions between them. These networks revealed that molecules with similar local network structure also have similar biological functions. The most sensitive measures of network structure are based on graphlets. However, graphlet-based methods thus far are only applicable to unweighted networks, whereas real-world molecular networks may have weighted edges that can represent the probability of an interaction occurring in the cell. This information is commonly discarded when applying thresholds to generate unweighted networks, which may lead to information loss. Results We introduce probabilistic graphlets as a tool for analyzing the local wiring patterns of probabilistic networks. To assess their performance compared to unweighted graphlets, we generate synthetic networks based on different well-known random network models and edge probability distributions and demonstrate that probabilistic graphlets outperform their unweighted counterparts in distinguishing network structures. Then we model different real-world molecular interaction networks as weighted graphs, with probabilities as edge weights, and analyze them with our new weighted graphlet-based methods. We show that, due to their probabilistic nature, probabilistic graphlet-based methods more robustly capture biological information in these data, while showing higher sensitivity in identifying condition-specific functions than their unweighted counterparts. Availability and implementation Our implementation of probabilistic graphlets is available at https://github.com/Serdobe/Probabilistic_Graphlets. Supplementary information Supplementary data are available at Bioinformatics online.
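The core idea of scoring graphlets by edge probabilities rather than thresholding can be illustrated on the simplest graphlet, the triangle. Assuming independent edges (an assumption this sketch makes, not a claim from the paper), a triangle on nodes {a, b, c} occurs with probability p(ab)·p(bc)·p(ac):

```python
from itertools import combinations

def expected_triangles(nodes, p):
    """Expected triangle count in a probabilistic graph. `p` maps an
    unordered node pair (frozenset) to its edge probability; missing
    pairs have probability 0. Independent edges are assumed."""
    def prob(a, b):
        return p.get(frozenset((a, b)), 0.0)
    return sum(prob(a, b) * prob(b, c) * prob(a, c)
               for a, b, c in combinations(nodes, 3))

# Three nodes, each edge present with probability 0.5.
p_half = {frozenset(e): 0.5 for e in [(1, 2), (2, 3), (1, 3)]}
exp_half = expected_triangles([1, 2, 3], p_half)

# With certain edges the count reduces to the unweighted one.
p_sure = {frozenset(e): 1.0 for e in [(1, 2), (2, 3), (1, 3)]}
exp_sure = expected_triangles([1, 2, 3], p_sure)
```

Thresholding the half-probability graph at 0.5 would report either 0 or 1 triangle; the expectation 0.125 retains the uncertainty instead of discarding it.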


2021 ◽  
Vol 15 (4) ◽  
pp. 1-24
Author(s):  
Suhansanu Kumar ◽  
Hari Sundaram

This article introduces a novel task-independent sampler for attributed networks. The problem is important because, while data mining tasks on network content are common, sampling on internet-scale networks is costly. Link-trace samplers such as Snowball sampling, Forest Fire, Random Walk, and Metropolis–Hastings Random Walk are widely used for sampling from networks. The design of these attribute-agnostic samplers focuses on preserving salient properties of network structure, and they are not optimized for tasks on node content. This article makes three contributions. First, we propose a task-independent, attribute-aware link-trace sampler grounded in information theory. Our sampler greedily adds to the sample the node with the most informative (i.e., surprising) neighborhood. The sampler tends to rapidly explore the attribute space, maximally reducing the surprise of unseen nodes. Second, we prove that content sampling is an NP-hard problem. A well-known greedy algorithm approximates the optimal solution within a factor of 1 − 1/e but requires full access to the entire graph. Third, we show through empirical counterfactual analysis that in many real-world datasets, network structure does not hinder the performance of surprise-based link-trace samplers. Experimental results over 18 real-world datasets reveal that surprise-based samplers are sample-efficient and outperform state-of-the-art attribute-agnostic samplers by a wide margin (e.g., a 45% performance improvement in clustering tasks).
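The greedy "most surprising neighborhood" step can be sketched as follows. Scoring a single categorical attribute by its observed frequency in the sample so far is a toy simplification of the information-theoretic surprise the article defines:

```python
import math

def surprise_sampler(adj, attr, seed, k):
    """Greedy link-trace sampling: repeatedly add, from the current
    frontier, the node whose attribute value is rarest in the sample
    so far (highest surprise, -log2 of its smoothed frequency)."""
    sample, counts = [seed], {attr[seed]: 1}
    while len(sample) < k:
        frontier = {n for s in sample for n in adj[s]} - set(sample)
        if not frontier:
            break  # no unvisited neighbours reachable by link tracing
        def surprise(n):
            freq = counts.get(attr[n], 0) + 1  # add-one smoothing
            return -math.log2(freq / (len(sample) + 1))
        best = max(sorted(frontier), key=surprise)
        sample.append(best)
        counts[attr[best]] = counts.get(attr[best], 0) + 1
    return sample

# Toy star graph: node 2 carries the rare attribute, so it is picked first.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
attr = {0: "red", 1: "red", 2: "blue", 3: "red"}
sample = surprise_sampler(adj, attr, seed=0, k=2)
```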


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Jie Chen ◽  
Xin Wang ◽  
Shu Zhao ◽  
Yanping Zhang

It is meaningful for a researcher to find suitable collaborators for complex academic tasks. Academic collaborator recommendation models are usually based on network embeddings of academic collaborator networks. Most of them focus on the network structure, text information, or a combination of the two. Latent semantic relationships, which indicate similarity between researchers, exist in the text information of nodes in the academic collaborator network but are often ignored. How to capture these latent semantic relationships is a challenge. In this paper, we propose a content-enhanced network embedding model for academic collaborator recommendation, namely, CNEacR. We build a content-enhanced academic collaborator network based on the weighted text representation of each researcher; this network contains both intrinsic collaboration relationships and latent semantic relationships. Firstly, the weighted text representation of each researcher is obtained from their text information. Secondly, a content-enhanced academic collaborator network is built from the similarity of the researchers' weighted text representations together with the intrinsic collaboration relationships. Thirdly, each researcher is represented as a latent vector using network representation learning. Finally, the top-k most similar researchers are recommended for each target researcher. Experimental results on real-world datasets show that CNEacR achieves better performance than academic collaborator recommendation baselines.
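The second step, fusing collaboration edges with latent semantic edges, can be sketched as below. The bag-of-words vectors, the cosine measure, and the threshold `tau` are all illustrative assumptions; CNEacR's weighted text representation is not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse term-weight dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def content_enhanced_edges(text_vec, collab_edges, tau):
    """Union of intrinsic collaboration edges and latent semantic
    edges between researchers whose text similarity reaches `tau`."""
    edges = set(collab_edges)
    names = sorted(text_vec)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(text_vec[a], text_vec[b]) >= tau:
                edges.add((a, b))
    return edges

# Toy example: alice and bob share vocabulary but have never collaborated.
text_vec = {"alice": {"graph": 1, "embed": 1},
            "bob": {"graph": 1, "embed": 1},
            "carol": {"biology": 1}}
edges = content_enhanced_edges(text_vec, {("alice", "carol")}, tau=0.9)
```

The resulting graph would then be fed to any network representation learner, giving alice and bob nearby embeddings despite the missing collaboration edge.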


Author(s):  
Qiannan Zhu ◽  
Xiaofei Zhou ◽  
Jia Wu ◽  
Jianlong Tan ◽  
Li Guo

Multilingual knowledge graphs constructed by entity alignment are indispensable resources for numerous AI-related applications. Most existing entity alignment methods use only triplet-based knowledge to find aligned entities across multilingual knowledge graphs; they usually ignore the neighborhood subgraph knowledge of entities, which carries richer alignment information. In this paper, we incorporate neighborhood subgraph-level information about entities and propose a neighborhood-aware attentional representation method, NAEA, for multilingual knowledge graphs. NAEA devises an attention mechanism that learns neighbor-level representations by aggregating neighbors' representations in a weighted combination. The attention mechanism enables entities not only to capture the different impacts of their neighbors on themselves but also to attend over their neighbors' feature representations with different importance. We evaluate our model on two real-world datasets, DBP15K and DWY100K, and the experimental results show that NAEA significantly and consistently outperforms state-of-the-art entity alignment models.
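The weighted aggregation of neighbor representations can be sketched as a softmax-weighted sum. The compatibility scorer is supplied by the caller, since NAEA's learned attention function is not reproduced here; this is only a minimal sketch of the aggregation step.

```python
import math

def neighbor_attention(entity, neighbors, score):
    """Softmax-weighted aggregation of neighbour embeddings.
    `neighbors` is a list of (name, vector) pairs and
    `score(entity, name)` is any compatibility function."""
    raw = [score(entity, n) for n, _ in neighbors]
    m = max(raw)
    exps = [math.exp(s - m) for s in raw]  # numerically stable softmax
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(neighbors[0][1])
    return [sum(w * vec[d] for w, (_, vec) in zip(weights, neighbors))
            for d in range(dim)]

# With a constant scorer, attention reduces to a plain average.
out = neighbor_attention("e1",
                         [("a", [1.0, 0.0]), ("b", [0.0, 1.0])],
                         score=lambda e, n: 0.0)
```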


2019 ◽  
Author(s):  
Majid Manoochehri

Memory span in humans has been intensely studied for more than a century. Despite the critical role of memory span in our cognitive system, and hence the importance of understanding the fundamental determinants of its evolution, few studies have investigated it from an evolutionary perspective. Overall, we know hardly anything about the evolution of memory components. In the present study, I briefly review the experimental studies of memory span in humans and non-human animals and briefly discuss some of the relevant evolutionary hypotheses.


2021 ◽  
Vol 186 (Supplement_1) ◽  
pp. 445-451
Author(s):  
Yifei Sun ◽  
Navid Rashedi ◽  
Vikrant Vaze ◽  
Parikshit Shah ◽  
Ryan Halter ◽  
...  

ABSTRACT Introduction Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performance. Materials and Methods Five classification methods, including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory, are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply linear regression to predict continuous MAP values over the next 60 minutes. Results The support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP 60 minutes into the future with a root mean square error (a frequently used measure of the differences between predicted and observed values) of 10 mmHg. After converting continuous MAP predictions into AHE binary predictions, we achieve 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion We were able to predict AHE 30 minutes in advance with precision and recall above 80% on the large real-world dataset. The predictions of the regression model provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, compared to predicting AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
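The conversion of a continuous MAP forecast into a binary AHE prediction can be sketched as a thresholding rule. The 60 mmHg / 90%-of-window rule below mirrors the common Physionet AHE definition; the study's exact conversion rule is not stated in the abstract, so these parameters are assumptions.

```python
def ahe_alert(map_forecast, threshold=60.0, frac=0.9):
    """Flag an acute hypotensive episode if at least `frac` of the
    forecast window of MAP values (mmHg) falls below `threshold`."""
    below = sum(1 for m in map_forecast if m < threshold)
    return below >= frac * len(map_forecast)

stable = ahe_alert([70.0] * 10)        # MAP well above threshold
hypotensive = ahe_alert([55.0] * 10)   # sustained low MAP
```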


2021 ◽  
Vol 21 (3) ◽  
pp. 1-17
Author(s):  
Wu Chen ◽  
Yong Yu ◽  
Keke Gai ◽  
Jiamou Liu ◽  
Kim-Kwang Raymond Choo

In existing ensemble learning algorithms (e.g., random forest), each base learner's model needs the entire dataset for sampling and training. However, this may not be practical in many real-world applications, and it incurs additional computational costs. To achieve better efficiency, we propose a decentralized framework: Multi-Agent Ensemble. The framework leverages edge computing to facilitate ensemble learning techniques by balancing access restrictions (small sub-datasets) against accuracy enhancement. Specifically, network edge nodes (learners) are used for classification and prediction in our framework. Data are distributed to multiple base learners, which exchange data via an interaction mechanism to achieve improved prediction. The proposed approach relies on a distributed training model rather than conventional centralized learning. Findings from experimental evaluations using 20 real-world datasets suggest that Multi-Agent Ensemble outperforms other ensemble approaches in terms of accuracy, even though the base learners require fewer samples (i.e., a significant reduction in computation costs).
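At prediction time, combining base learners that were each trained on a small sub-dataset can be sketched as a majority vote. The paper's interaction mechanism is reduced to plain voting here, so this is only a minimal sketch of the ensemble step:

```python
from collections import Counter

def ensemble_predict(learners, x):
    """Majority vote over base learners, each assumed to have been
    trained at an edge node on its own small sub-dataset. `learners`
    are plain predict functions mapping an input to a label."""
    votes = Counter(f(x) for f in learners)
    return votes.most_common(1)[0][0]

# Toy base learners: three threshold stumps with different cut points.
learners = [lambda x: x > 1, lambda x: x > 2, lambda x: x > 3]
pred_high = ensemble_predict(learners, 2.5)  # two of three vote True
pred_low = ensemble_predict(learners, 0.5)   # all three vote False
```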

