Graph representation learning aims to learn low-dimensional representations for nodes in graphs, and has proven very useful in several downstream tasks. In this article, we propose a new model, Graph Community Infomax (GCI), that adversarially learns representations for nodes in attributed networks. Unlike other adversarial network embedding models, which assume that the data follow some prior distribution and generate fake examples, GCI utilizes the community information of networks, using nodes as positive (or real) examples and negative (or fake) examples at the same time. An autoencoder is applied to learn the embedding vectors for nodes and reconstruct the adjacency matrix, and a discriminator is used to maximize the mutual information between nodes and communities. Experiments on several real-world and synthetic networks show that GCI outperforms various network embedding methods on community detection tasks.
While most network embedding techniques model the proximity between nodes in a network, recently there has been significant interest in structural embeddings that are based on node equivalences, a notion rooted in sociology: equivalences or positions are collections of nodes that have similar roles (i.e., similar functions, ties or interactions with nodes in other positions) irrespective of their distance or reachability in the network. Unlike the proximity-based methods that are rigorously evaluated in the literature, the evaluation of structural embeddings is less mature. It relies on small synthetic or real networks with labels that are not perfectly defined, and its connection to sociological equivalences has hitherto been vague and tenuous. With new node embedding methods being developed at a breakneck pace, proper evaluation and systematic characterization of existing approaches will be essential to progress.
To fill this gap, we set out to understand what types of equivalences structural embeddings capture. We are the first to contribute rigorous intrinsic and extrinsic evaluation methodology for structural embeddings, along with carefully designed, diverse datasets of varying sizes. We observe a number of different evaluation variables that can lead to different results (e.g., choice of similarity measure, classifier, and label definitions). We find that degree distributions within nodes’ local neighborhoods can serve as simple yet effective baselines in their own right and can guide the future development of structural embeddings. We hope that our findings can influence the design of further node embedding methods and also pave the way for more comprehensive and fair evaluation of structural embedding methods.
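To make the degree-distribution baseline concrete, the following is a minimal sketch and not the authors' implementation. It assumes each node is described by its own degree plus a histogram of its neighbors' degrees; the function names and the toy graph are hypothetical.

```python
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def structural_features(adj):
    """Feature vector per node: [own degree] + histogram of neighbor degrees."""
    max_degree = max(len(nbrs) for nbrs in adj.values())
    features = {}
    for node, nbrs in adj.items():
        counts = Counter(len(adj[n]) for n in nbrs)
        histogram = [counts.get(d, 0) for d in range(max_degree + 1)]
        features[node] = [len(nbrs)] + histogram
    return features


# Toy graph (assumption): two star hubs "a" and "b" joined by a path.
# The hubs are three hops apart but play the same structural role.
edges = [
    ("a", "a1"), ("a", "a2"), ("a", "a3"), ("a", "p1"),
    ("p1", "p2"), ("p2", "b"),
    ("b", "b1"), ("b", "b2"), ("b", "b3"),
]
adj = build_adjacency(edges)
features = structural_features(adj)
print(features["a"], features["b"], features["p1"])
```

The two hubs receive identical feature vectors even though they are not adjacent, which is the kind of role similarity that proximity-based embeddings do not capture by design.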
Network-based information has been widely explored and exploited in the information retrieval literature. Attributed networks, consisting of nodes, edges, and attributes describing properties of nodes, are a basic type of network-based data and are especially useful for many applications. Examples include user profiling in social networks and item recommendation in user-item purchase networks. Learning useful and expressive representations of entities in attributed networks can provide more effective building blocks for downstream network-based tasks such as link prediction and attribute inference. In practice, input features of attributed networks are normalized as unit directional vectors. However, most network embedding techniques ignore the directional nature of inputs and focus on learning representations in a Gaussian or Euclidean space, which, we hypothesize, might lead to less effective representations. To obtain more effective representations of attributed networks, we investigate the problem of mapping an attributed network with unit-normalized directional features into a non-Gaussian and non-Euclidean space. Specifically, we propose a hyperspherical variational co-embedding for attributed networks (HCAN), which is based on generalized variational auto-encoders for heterogeneous data with multiple types of entities. HCAN jointly learns latent embeddings for both nodes and attributes in a unified hyperspherical space such that the affinities between nodes and attributes can be captured effectively. We argue that this is a crucial feature in many real-world applications of attributed networks. Previous Gaussian network embedding algorithms break the assumption of an uninformative prior, which leads to unstable results and poor performance. In contrast, HCAN embeds nodes and attributes as von Mises-Fisher distributions, and allows one to capture the uncertainty of the inferred representations. Experimental results on eight datasets show that HCAN yields better performance in a number of applications compared with nine state-of-the-art baselines.
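As a small illustration of what "unit directional vectors" means, the sketch below L2-normalizes feature vectors onto the unit hypersphere and evaluates the unnormalized von Mises-Fisher log-density, which is kappa times the dot product of the mean direction and the point. This is not the HCAN model; the function names and the toy vectors are assumptions made for the example.

```python
import math


def l2_normalize(vec):
    """Project a vector onto the unit hypersphere (keep direction, drop length)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(u, v):
    """Dot product; for unit vectors this equals the cosine similarity."""
    return sum(a * b for a, b in zip(u, v))


def vmf_unnormalized_log_density(x, mu, kappa):
    """log f(x) up to an additive constant for a von Mises-Fisher
    distribution with unit mean direction mu and concentration kappa."""
    return kappa * dot(mu, x)


# Toy features (assumption): n1 and n2 differ only in magnitude.
raw = {"n1": [3.0, 4.0], "n2": [6.0, 8.0], "n3": [-4.0, 3.0]}
unit = {name: l2_normalize(vec) for name, vec in raw.items()}

print(dot(unit["n1"], unit["n2"]))  # same direction
print(dot(unit["n1"], unit["n3"]))  # orthogonal directions
```

After normalization only the direction of a feature vector carries information, which is why a distribution defined on the sphere is a natural fit for such inputs.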
User recommendation aims to recommend users of potential interest in a social network. Previous works have mainly focused on undirected social networks with symmetric relationships such as friendship, whereas recent advances have addressed asymmetric relationships such as the following and followed-by relationships. Among the few existing direction-aware user recommendation methods, the random walk strategy has been widely adopted to extract the asymmetric proximity between users. However, based on our analysis of real-world directed social networks, we argue that the asymmetric proximity captured by existing random-walk-based methods is insufficient due to the imbalance between the in-degree and out-degree of nodes.
To tackle this challenge, we propose InfoWalk, a novel informative walk strategy that efficiently captures the asymmetric proximity based solely on random walks. By transferring the direction information into the weights of each step, InfoWalk is able to overcome the limitation of edges while simultaneously maintaining both the direction and proximity. Based on the asymmetric proximity captured by InfoWalk, we further propose the qualitative (DNE-L) and quantitative (DNE-T) directed network embedding methods, capable of preserving the two properties in the embedding space. Extensive experiments conducted on six real-world benchmark datasets demonstrate the superiority of the proposed DNE model over several state-of-the-art approaches in various tasks.
WordNets built for low-resource languages, such as Assamese, often use the expansion methodology. This may result in missing lexical entries and missing synonymy relations. Because the Assamese WordNet is itself built from the Hindi WordNet using the expansion method, it too has missing synonymy relations. As WordNets can be visualized as a network of unique words connected by synonymy relations, link prediction from complex network analysis is an effective way of predicting missing relations in such a network. Hence, to predict the missing synonyms in the Assamese WordNet, link prediction methods were used in the current work and proved effective. It is also observed that, for discovering missing relations in the Assamese WordNet, simple local proximity-based methods might be more effective than global and complex supervised models using network embedding. Further, it is noticed that although a set of retrieved words are not synonyms per se, they are semantically related to the target word and may be categorized as semantic cohorts.
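The local proximity-based methods mentioned above can be sketched with three standard indices: common neighbors, the Jaccard coefficient, and Adamic-Adar. The abstract does not say which indices were used, so this choice is an assumption, and the word labels `w1` to `w5` are hypothetical placeholders rather than WordNet entries.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {word: set(synonyms)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    # A shared neighbor of two distinct nodes has degree >= 2, so log > 0.
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])


def rank_candidates(adj, score):
    """Score every non-adjacent pair and sort from most to least likely."""
    pairs = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    scored = [(pair, score(adj, *pair)) for pair in pairs]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy synonymy network (assumption).
edges = [("w1", "w2"), ("w1", "w3"), ("w2", "w3"),
         ("w2", "w4"), ("w3", "w4"), ("w4", "w5")]
adj = build_adjacency(edges)
ranked = rank_candidates(adj, jaccard)
print(ranked[0])
```

Here the pair `("w1", "w4")` ranks first because the two words share both of their neighbors, so it is the strongest candidate for a missing synonymy link under these indices.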
Information security is one of the key issues in e-commerce Internet of Things (IoT) platform research. Collusive spamming groups on e-commerce platforms can write a large number of fake reviews for the evaluated products over a period of time, which seriously affects the purchase decisions of consumers and undermines fair competition among merchants. To address this problem, we propose a network embedding based approach to detect collusive spamming groups. First, we use the idea of a meta-graph to construct a heterogeneous information network based on the user review dataset. Second, we exploit a modified DeepWalk algorithm to learn low-dimensional vector representations of user nodes in the heterogeneous information network and employ clustering methods to obtain candidate spamming groups. Finally, we leverage an indicator weighting strategy to calculate the spamming score of each candidate group, and the top-k groups with the highest spamming scores are considered to be the collusive spamming groups. The experimental results on two real-world review datasets show that the overall detection performance of the proposed approach is much better than that of baseline methods.
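Two steps of the pipeline above lend themselves to a short sketch: generating the truncated random walks that DeepWalk feeds to a skip-gram model, and selecting the top-k groups by score. The code uses plain uniform walks, not the modified DeepWalk of the paper, and omits the embedding and clustering steps. The graph, scores, and function names are hypothetical.

```python
import random


def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Generate uniform truncated random walks starting from every node."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    walks = []
    for _ in range(walks_per_node):
        nodes = sorted(adj)
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break  # dead end: stop the walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def top_k_groups(scores, k):
    """Return the k group ids with the highest spamming scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy user graph (assumption): edges link users who reviewed the same product.
adj = {
    "u1": ["u2", "u3"],
    "u2": ["u1", "u3"],
    "u3": ["u1", "u2", "u4"],
    "u4": ["u3"],
}
walks = random_walks(adj, walk_length=5, walks_per_node=2)

# Hypothetical spamming scores for three candidate groups.
group_scores = {"g1": 0.2, "g2": 0.9, "g3": 0.5}
suspects = top_k_groups(group_scores, k=2)
print(len(walks), suspects)
```

Each walk plays the role of a sentence whose words are user ids, which is what allows a word-embedding model to learn node vectors from the walk corpus.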
Networking is the use of physical links to connect individual isolated workstations or hosts together to form data links for the purpose of resource sharing and communication. In the field of web service application and consumer environment optimization, it has been shown that the introduction of network embedding methods can effectively alleviate problems such as data sparsity in the recommendation process. However, existing network embedding methods mostly target a specific network structure and do not fundamentally integrate multiple relational networks. Therefore, this paper proposes a service recommendation model based on the hybrid embedding of multiple networks and designs a multinetwork hybrid embedding recommendation algorithm. First, the user social relationship network and the user-service heterogeneous information network are constructed; then, the embedding vectors of users and services in the same vector space are obtained through multinetwork hybrid embedding learning; finally, the representation vectors of users and services are applied to recommend services to target users. To verify the effectiveness of the proposed method, a comparative analysis is conducted against a variety of representative service recommendation methods on three publicly available datasets. The experimental results demonstrate that the multinetwork hybrid embedding method can effectively combine multirelationship networks to improve service recommendation quality in terms of both efficiency and accuracy.
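The final step, recommending services from user and service vectors that share one space, can be sketched as a nearest-by-score lookup. The abstract does not specify the scoring function, so the dot product used here is an assumption, as are the function name and the toy vectors.

```python
def recommend(user_vec, service_vecs, k, exclude=()):
    """Rank services by dot product with the user vector and return the top k.

    `exclude` lists services the user has already used (hypothetical parameter).
    """
    scores = {
        service: sum(a * b for a, b in zip(user_vec, vec))
        for service, vec in service_vecs.items()
        if service not in exclude
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy embeddings in a shared 2-dimensional space (assumption).
user = [1.0, 0.0]
services = {"s1": [0.9, 0.1], "s2": [0.1, 0.9], "s3": [0.5, 0.5]}

print(recommend(user, services, k=2))
print(recommend(user, services, k=2, exclude={"s1"}))
```

Because users and services live in the same space, no separate mapping between the two is needed at recommendation time; any similarity measure defined on that space could replace the dot product.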