Hierarchical stochastic graphlet embedding for graph-based pattern recognition

2019 ◽  
Vol 32 (15) ◽  
pp. 11579-11596
Author(s):  
Anjan Dutta ◽  
Pau Riba ◽  
Josep Lladós ◽  
Alicia Fornés

Abstract
Despite being very successful within the pattern recognition and machine learning communities, graph-based methods are often unusable because of the lack of mathematical operations defined in the graph domain. Graph embedding, which maps graphs to a vector space, has been proposed as a way to tackle these difficulties by enabling the use of standard machine learning techniques. However, it is well known that graph embedding functions usually suffer from a loss of structural information. In this paper, we consider the hierarchical structure of a graph as a way to mitigate this loss of information. The hierarchical structure is constructed by topologically clustering the graph nodes and considering each cluster as a node at the upper hierarchical level. Once this hierarchical structure is constructed, we consider several configurations to define the mapping into a vector space given a classical graph embedding; in particular, we propose to make use of the stochastic graphlet embedding (SGE). Broadly speaking, SGE produces a distribution of uniformly sampled low-to-high-order graphlets as a way to embed graphs into the vector space. The coarse-to-fine structure of a graph hierarchy and the statistics captured by the SGE complement each other and include important structural information with varied contexts. Together, these two techniques substantially mitigate the information loss usually involved in graph embedding, yielding a more robust graph representation. This has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform state-of-the-art methods.
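As an illustration of the sampling idea behind SGE, the sketch below uniformly grows small connected graphlets from a toy graph and histograms them into a distribution. It is only a sketch under our own assumptions: the (node count, edge count) pair used as a graphlet key is a crude stand-in for the hashed graphlet signatures the actual SGE uses, and all names are ours.

```python
import random
from collections import Counter

def sample_graphlet(adj, size, rng):
    """Grow one connected graphlet by repeatedly adding a random neighbour."""
    nodes = [rng.choice(list(adj))]
    while len(nodes) < size:
        frontier = [v for u in nodes for v in adj[u] if v not in nodes]
        if not frontier:
            break
        nodes.append(rng.choice(frontier))
    # crude signature: (node count, edge count) among the sampled nodes
    edges = sum(1 for i, u in enumerate(nodes) for v in nodes[i + 1:] if v in adj[u])
    return (len(nodes), edges)

def graphlet_embedding(adj, sizes=(3, 4), samples=200, seed=0):
    """Normalised histogram of sampled graphlet signatures."""
    rng = random.Random(seed)
    counts = Counter(sample_graphlet(adj, s, rng)
                     for s in sizes for _ in range(samples))
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}

# toy graph: a triangle joined to a short path
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
emb = graphlet_embedding(adj)
```

Two graphs can then be compared through any vector distance between their distributions, which is what makes standard classifiers applicable.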

2021 ◽  
Vol 95 ◽  
pp. 107383
Author(s):  
Ali.H. Alrubayi ◽  
M.A. Ahmed ◽  
A.A. Zaidan ◽  
A.S. Albahri ◽  
B.B. Zaidan ◽  
...  

2021 ◽  
Author(s):  
Rogini Runghen ◽  
Daniel B Stouffer ◽  
Giulio Valentino Dalla Riva

Collecting network interaction data is difficult. Non-exhaustive sampling and complex hidden processes often result in an incomplete data set. Thus, identifying potentially present but unobserved interactions is crucial both for understanding the structure of large-scale data and for predicting how previously unseen elements will interact. Recent studies in network analysis have shown that accounting for metadata (such as node attributes) can improve both our understanding of how nodes interact with one another and the accuracy of link prediction. However, the dimension of the object we need to learn to predict interactions in a network grows quickly with the number of nodes, so the problem becomes computationally and conceptually challenging for large networks. Here, we present a new predictive procedure combining a graph embedding method with machine learning techniques to predict interactions based on node metadata. Graph embedding methods project the nodes of a network onto a low-dimensional latent feature space. The positions of the nodes in the latent feature space can then be used to predict interactions between nodes. Learning a mapping from the nodes' metadata to their positions in the latent feature space corresponds to a classic, low-dimensional machine learning problem. In our study we used the Random Dot Product Graph model to estimate the embedding of an observed network, and we tested different neural network architectures to predict the positions of nodes in the latent feature space. Flexible machine learning techniques for mapping the nodes onto their latent positions make it possible to account for multivariate and possibly complex node metadata. To illustrate the utility of the proposed procedure, we apply it to a large dataset of tourist visits to destinations across New Zealand.
We found that our procedure accurately predicts interactions for both existing nodes and nodes newly added to the network, while being computationally feasible even for very large networks. Overall, our study highlights that by exploiting the properties of a well understood statistical model for complex networks and combining it with standard machine learning techniques, we can simplify the link prediction problem when incorporating multivariate node metadata. Our procedure can be immediately applied to different types of networks, and to a wide variety of data from different systems. As such, both from a network science and data science perspective, our work offers a flexible and generalisable procedure for link prediction.
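A minimal sketch of the embedding step this abstract describes: under the Random Dot Product Graph model, each node has a latent position and the probability of an edge is the dot product of the endpoints' positions. The adjacency spectral embedding and toy block graph below are our own illustration, not the authors' code; the neural-network mapping from metadata to latent positions is omitted.

```python
import numpy as np

def rdpg_embed(A, d):
    """Adjacency spectral embedding: scale the top-d eigenvectors of A."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(vals)[::-1][:d]            # d largest eigenvalues first
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def edge_score(X, i, j):
    """In an RDPG, the edge propensity of (i, j) is a dot product."""
    return float(X[i] @ X[j])

# toy network: a dense triangle {0,1,2} attached to a sparse tail 3-4
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
X = rdpg_embed(A, d=2)
score_near = edge_score(X, 0, 1)   # nodes in the same dense block
score_far = edge_score(X, 0, 4)    # nodes far apart in the graph
```

Ranking unobserved pairs by this score is the basic link-prediction recipe; the paper's contribution is learning to place *new* nodes into this latent space from their metadata alone.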


2019 ◽  
Vol 1 (2) ◽  
pp. 684-697
Author(s):  
Mario Manzo ◽  
Alessandro Rozza

Graph-embedding algorithms map a graph into a vector space with the aim of preserving its structure and its intrinsic properties. Unfortunately, many of them cannot encode the neighborhood information of the nodes well, especially from a topological perspective. To address this limitation, we propose a novel graph-embedding method called Deep-Order Proximity and Structural Information Embedding (DOPSIE). It provides topology and depth information at the same time through the analysis of the graph structure. Topological information is provided through clustering coefficients (CCs), which are connected to other structural properties, such as transitivity, density, characteristic path length, and efficiency, all useful for representation in the vector space. The combination of individual node properties and neighborhood information constitutes an optimal network representation. Our experimental results show that DOPSIE outperforms state-of-the-art embedding methodologies in different classification problems.
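The clustering coefficient that DOPSIE builds on has a standard definition: the fraction of a node's neighbour pairs that are themselves adjacent. A minimal pure-Python version, our own illustration rather than the DOPSIE code:

```python
def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are directly connected."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0                      # undefined for degree < 2; use 0
    links = sum(1 for i, u in enumerate(nbrs)
                for w in nbrs[i + 1:] if w in adj[u])
    return 2.0 * links / (k * (k - 1))

# toy graph: triangle {0,1,2} with a pendant node 3 on node 2
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cc = {v: clustering_coefficient(adj, v) for v in adj}
# node 0: its neighbours 1 and 2 are connected -> 1.0
# node 2: only 1 of its 3 neighbour pairs is connected -> 1/3
```

Averaging these per-node values gives the graph's transitivity-related summary mentioned in the abstract.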


Author(s):  
YUESHENG HE ◽  
YUAN YAN TANG

Graphical avatars have gained popularity in many application domains, such as three-dimensional (3D) animation movies and animated simulations for product design. However, editing avatars' behaviors in a 3D graphical environment remains a challenging research topic. Since hand-crafted methods are time-consuming and inefficient, automatic actions of the avatars are required, and achieving such autonomous behaviors calls for artificial intelligence. In this paper, we present a novel approach to constructing a system of automatic avatars in 3D graphical environments based on machine learning techniques. A specific framework is created for controlling the behaviors of avatars, classifying the differences among environments and using a hierarchical structure to describe the actions. Because the interactions between avatars and environments must be simulated once the environment has been classified, Reinforcement Learning is used to compute the policy that controls the avatar intelligently in the 3D environment across different situations. Our approach thus addresses problems such as how the levels of the missions are defined and how the learning algorithm is used to control the avatars. The main contributions of this paper are a hierarchical structure to control avatars automatically, a method for avatars to recognize the environment, and an approach for making the policy of avatars' actions intelligent.
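The policy-computation step mentioned above can be illustrated with the simplest reinforcement-learning setup: tabular Q-learning on a toy corridor where an avatar must walk to a goal state. This is a generic sketch, not the authors' system; the environment, rewards, and hyperparameters are all invented for illustration.

```python
import random

def q_learning(n_states=5, episodes=300, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning: avatar starts at state 0, goal is the last state."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]      # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0  # reward only at the goal
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning()
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(len(Q))]
```

After training, the greedy policy moves right in every non-terminal state; in the paper's setting the states and actions would instead come from the classified 3D environment and the avatar's action hierarchy.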


Author(s):  
Samir Bandyopadhyay ◽  
Shawni Dutta

Cardiovascular disease (CVD) can cause sudden loss of life. It affects the heart and blood vessels of the body, so early detection of the disease is necessary to secure a patient's life. In this chapter, two distinctly different methods are proposed for the detection of heart disease. The first is a syntactic pattern recognition approach based on grammatical concepts, and the second is a machine learning approach. In the syntactic pattern recognition approach, the ECG wave from different leads is first decomposed into pattern primitives based on diagnostic criteria. These primitives are then used as terminals of the proposed grammar and serve as its input. The parsing table is created in tabular form and finally indicates whether the patient has one of the diseases or is normal; five diseases besides the normal case are considered. Different machine learning (ML) approaches may be used for detecting patients with CVD and for assisting healthcare systems. Such approaches learn and exploit the patterns discovered in large databases: they are applied to a set of information in order to recognize the underlying relationship patterns in it, and unknown incoming patterns can then be tested against the learned model. Due to its self-adaptive structure, deep learning (DL) can process information with minimal preprocessing; DL exemplifies the use of neural networks. A predictive model following DL techniques is used for analyzing and assessing patients with heart disease: a hybrid approach based on a convolutional layer and a gated recurrent unit (GRU) is used in this chapter for diagnosing heart disease.
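To make the hybrid convolutional + GRU idea concrete, the sketch below runs a 1-D convolution over a stand-in ECG signal and feeds the resulting features through a hand-written GRU cell. The weights are random and untrained; this is only a shape-level illustration of the architecture under our own assumptions, not the chapter's model.

```python
import numpy as np

def conv1d(signal, kernel):
    """Valid-mode 1-D convolution: the 'convolutional layer' front end."""
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: gates decide how much of the past state to keep."""
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sig(Wz @ x + Uz @ h)                 # update gate
    r = sig(Wr @ x + Ur @ h)                 # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h)) # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
ecg = rng.standard_normal(16)                       # stand-in for one ECG lead
feat = conv1d(ecg, np.array([0.25, 0.5, 0.25]))     # smoothed local features
h = np.zeros(4)                                     # hidden state, size 4
W = {n: rng.standard_normal((4, 1)) * 0.1 for n in "zrh"}
U = {n: rng.standard_normal((4, 4)) * 0.1 for n in "zrh"}
for x in feat:                                      # run the GRU over time
    h = gru_cell(np.array([x]), h,
                 W["z"], U["z"], W["r"], U["r"], W["h"], U["h"])
```

The final hidden state `h` would feed a small classifier head (disease vs. normal) in a trained version of the model.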


Author(s):  
Boris Sovetov ◽  
Tatiana Tatarnikova ◽  
Ekaterina Poymanova

Introduction: The implementation of a data storage process requires timely scaling of the infrastructure to accommodate the data received for storage. Given the rapid accumulation of data, new models of storage capacity management are needed, which should take into account the hierarchical structure of the data storage, the various requirements for file storage, and the restrictions on storage media size. Purpose: To propose a model for timely scaling of the storage infrastructure based on predictive estimates of the moment when a data storage medium is fully filled. Results: A model of storage capacity management is presented, based on the analysis of storage system state patterns. A pattern is a matrix, each cell of which reflects the filling state of the storage medium at the appropriate level in the hierarchical structure of the storage system. A matrix cell is characterized by the real, limit, and maximum values of its carrier capacity. To solve the scaling problem for a data storage system means to predict the moments when the limit capacity and the maximum capacity of the data carrier are reached. The difference between the predictive estimates is the time the administrator has to connect extra media. It is proposed to calculate the predictive estimates programmatically, using machine learning methods. It is shown that for short-term prediction, machine learning methods have lower accuracy than ARIMA, an integrated model of autoregression and moving average; for long-term forecasts, however, machine learning methods provide results commensurate with those from ARIMA. Practical relevance: The proposed model is necessary for the timely allocation of storage capacity for incoming data. Implementing this model at the storage input makes it possible to automate the process of connecting media, which helps prevent the loss of data entering the system.
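The scaling decision described above reduces to forecasting when a fill-level series crosses the limit and maximum capacities. As a deliberately simple stand-in for the paper's ML and ARIMA forecasters, the sketch below fits a linear trend and extrapolates; all numbers are invented for illustration.

```python
def fit_trend(fill_levels):
    """Least-squares line through daily fill levels (fractions of capacity)."""
    n = len(fill_levels)
    xs = range(n)
    mx, my = sum(xs) / n, sum(fill_levels) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, fill_levels))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def days_until(level, fill_levels):
    """Predicted day index at which the medium reaches `level`."""
    slope, intercept = fit_trend(fill_levels)
    return (level - intercept) / slope

fills = [0.10, 0.14, 0.19, 0.22, 0.27, 0.31]  # ~4% of capacity per day
t_limit = days_until(0.80, fills)             # limit capacity reached
t_max = days_until(0.95, fills)               # maximum capacity reached
margin = t_max - t_limit                      # time left to connect extra media
```

The gap between the two crossing times is exactly the administrator's window described in the abstract; the paper's point is that ML models and ARIMA give comparable long-horizon estimates of it.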

