Embedding-based Silhouette community detection

2020, Vol 109 (11), pp. 2161-2193
Author(s): Blaž Škrlj, Jan Kralj, Nada Lavrač

Abstract: Mining complex data in the form of networks is of increasing interest in many scientific disciplines. Network communities correspond to densely connected subnetworks, and often represent key functional parts of real-world systems. This paper proposes embedding-based Silhouette community detection (SCD), an approach for detecting communities based on clustering of network node embeddings, i.e. real-valued representations of nodes derived from their neighborhoods. We investigate the performance of the proposed SCD approach on 234 synthetic networks, as well as on a real-life social network. Even though SCD is not based on any form of modularity optimization, it performs comparably to or better than state-of-the-art community detection algorithms, such as InfoMap and Louvain. Further, we demonstrate that SCD’s outputs can be used along with domain ontologies in semantic subgroup discovery, yielding human-understandable explanations of communities detected in a real-life protein interaction network. Being embedding-based, SCD is widely applicable and can be tested out-of-the-box as part of many existing network learning and exploration pipelines.
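The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea its name suggests: score candidate clusterings of node embeddings with the mean Silhouette coefficient and keep the best one. The function name `silhouette_score`, the toy `embeddings`, and the two hand-written candidate partitions are illustrative assumptions, not the authors' implementation; a real pipeline would obtain embeddings from a network embedding method and candidates from a clustering algorithm.

```python
import math


def silhouette_score(points, labels):
    """Mean Silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy 2-D "embeddings" (assumed data): two tight, well-separated groups.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

# Hypothetical candidate community assignments to compare.
candidates = {
    "two_groups": [0, 0, 0, 1, 1, 1],
    "three_groups": [0, 0, 1, 2, 2, 2],
}

# Keep the candidate with the highest mean Silhouette coefficient.
best = max(candidates, key=lambda name: silhouette_score(embeddings, candidates[name]))
```

Because the score depends only on distances between embeddings, this selection step never looks at the graph's edges, which is consistent with the abstract's remark that SCD is not a modularity-optimization method.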

2020, Vol 34 (35), pp. 2050408
Author(s): Sumit Gupta, Dhirendra Pratap Singh

Many real-life problems and application data can be represented as graphs. As technology advances rapidly, applications generate vast amounts of valuable data, which increases the size of the graphs that represent them. How to extract meaningful information from these data has become a hot research topic, and methodical algorithms are required to extract useful information from the raw data. These unstructured graphs are not scattered in nature; rather, they show relationships between their basic entities. Identifying communities based on these relationships improves the understanding of the applications represented by graphs. Community detection algorithms are one solution: they divide the graph into small clusters whose nodes are densely connected within the cluster and sparsely connected across clusters. During the last decade, many algorithms have been proposed, which fall into two broad categories: non-overlapping and overlapping community detection algorithms. The goal of this paper is to offer a comparative analysis of the various community detection algorithms. We bring together the state-of-the-art community detection algorithms from these two classes into a single article, along with their accessible benchmark data sets. Finally, we present a comparison of these algorithms with respect to two parameters: time efficiency and how accurately the communities are detected.


Author(s): Amany A. Naem, Neveen I. Ghali

Antlion Optimization (ALO) is one of the latest population-based optimization methods and has performed well in a variety of applications. The ALO algorithm mimics the way antlions hunt ants in nature. Community detection in social networks is essential to understanding the structure of those networks. Identifying network communities can be viewed as the problem of clustering a set of nodes into communities. k-median clustering is one of the popular techniques applied in clustering. The network clustering problem can be formalized as an optimization problem in which a quality objective function is optimized, one that captures the intuition of a cluster as a set of nodes with better internal connectivity than external connectivity. In this paper, a hybrid of antlion optimization and k-median is proposed for solving the community detection problem and is named K-median Modularity ALO. Experimental results on real-life networks show that the hybrid of antlion optimization and k-median can successfully detect an optimized community structure when modularity is used as the objective function.
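For readers unfamiliar with the objective mentioned above, here is a minimal sketch of the standard Newman-Girvan modularity for an undirected, unweighted graph without self-loops. The function name `modularity`, the toy edge list, and the two partitions are assumptions made for illustration; the sketch shows only the objective being optimized, not the antlion or k-median search itself.

```python
def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every node to its community label
    """
    m = len(edges)
    degree = {}
    internal = {}  # number of edges with both endpoints in the same community
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            internal[community[u]] = internal.get(community[u], 0) + 1

    degree_sum = {}  # total degree of the nodes in each community
    for node, deg in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + deg

    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(
        internal.get(label, 0) / m - (total / (2 * m)) ** 2
        for label, total in degree_sum.items()
    )


# Toy graph (assumed data): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "all" for node in range(6)}

q_split = modularity(edges, split)    # one community per triangle
q_merged = modularity(edges, merged)  # everything in one community
```

On this toy graph the per-triangle partition scores higher than the single-community partition, which is the behavior an optimizer using modularity as its fitness function would exploit.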


2016, Vol 7 (3), pp. 50-70
Author(s): Nidhi Arora, Hema Banati

Various evolutionary approaches have been extensively applied to evolve densely connected communities in complex networks. However, these techniques have primarily been single-objective optimization techniques, which optimize only one specific feature of the network while missing other important features. Multiobjective optimization techniques can overcome this drawback by simultaneously optimizing multiple features of a network. This paper proposes MGSO, a multiobjective variant of the Group Search Optimization (GSO) algorithm, to globally search for and evolve densely connected communities. It uses the inherent animal food-searching behavior of GSO to simultaneously optimize two negatively correlated objective functions and overcomes the drawbacks of single-objective community detection (CD) algorithms. The algorithm reduces random initializations, which results in fast convergence. It was applied to 6 real-world and 33 synthetic network datasets, and the results were compared with various state-of-the-art community detection algorithms. The results show the efficacy of MGSO in finding accurate community structures.


Author(s): Swarup Chattopadhyay, Tanmay Basu, Asit K. Das, Kuntal Ghosh, Late C. A. Murthy

Abstract: Automated community detection is an important problem in the study of complex networks. The idea of community detection is closely related to the concept of data clustering in pattern recognition. Data clustering refers to the task of grouping similar objects and segregating dissimilar objects. The community detection problem can be thought of as finding groups of densely interconnected nodes with few connections to nodes outside the group. A node similarity measure is proposed here that finds the similarity between two nodes by considering both neighbors and non-neighbors of these two nodes. Subsequently, a method is introduced for identifying communities in complex networks using this node similarity measure and the notion of data clustering. The significant characteristic of the proposed method is that it does not need any prior knowledge about the actual communities of a network. Extensive experiments on several real-world and artificial networks with known ground-truth communities are reported. The proposed method is compared with various state-of-the-art community detection algorithms using several criteria, viz. normalized mutual information, F-measure, etc. Moreover, it has been successfully applied to improve the effectiveness of a recommender system, a tool that is rapidly becoming crucial in e-commerce applications. The empirical results suggest that the proposed technique has the potential to improve the performance of a recommender system and hence may be useful for other e-commerce applications.
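Normalized mutual information, one of the evaluation criteria named above, compares a detected partition with the ground-truth one. The sketch below uses one common normalization, 2·I(X;Y) / (H(X) + H(Y)); the abstract does not say which variant the authors used, so that choice, the function name, and the toy label lists are assumptions.

```python
import math
from collections import Counter


def normalized_mutual_information(truth, predicted):
    """NMI between two labelings, normalized as 2*I / (H_truth + H_predicted)."""
    n = len(truth)
    count_t = Counter(truth)
    count_p = Counter(predicted)
    joint = Counter(zip(truth, predicted))

    # Mutual information between the two label assignments
    mutual = sum(
        (n_tp / n) * math.log(n_tp * n / (count_t[t] * count_p[p]))
        for (t, p), n_tp in joint.items()
    )
    # Entropy of each labeling
    h_t = -sum((c / n) * math.log(c / n) for c in count_t.values())
    h_p = -sum((c / n) * math.log(c / n) for c in count_p.values())

    if h_t + h_p == 0:
        return 1.0  # both labelings put every node in one community
    return 2 * mutual / (h_t + h_p)


ground_truth = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]  # same grouping, different label names
unrelated = [0, 1, 0, 1]   # statistically independent of the ground truth

nmi_same = normalized_mutual_information(ground_truth, relabelled)
nmi_unrelated = normalized_mutual_information(ground_truth, unrelated)
```

The score is invariant to how communities are named, which is why it suits comparisons where detected community labels are arbitrary.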


Data, 2019, Vol 4 (4), pp. 149
Author(s): Amulyashree Sridhar, Sharvani GS, AH Manjunatha Reddy, Biplab Bhattacharjee, Kalyan Nagaraj

Exploring gene networks is crucial for identifying significant biological interactions occurring in a disease condition. These interactions can be identified by modeling the tie structure of networks. Such tie orientations are often detected within embedded community structures. However, most of the prevailing community detection modules are intended to capture information from nodes and their attributes, usually ignoring the ties. In this study, a modularity maximization algorithm is proposed based on a nonlinear representation of local tangent space alignment (LTSA). Initially, the tangent coordinates are computed locally to identify k-nearest neighbors across the genes. These local neighbors are further optimized by generating a nonlinear network embedding function for detecting gene communities based on eigenvector decomposition. Experimental results suggest that this algorithm detects gene modules with a better modularity index of 0.9256, compared to other traditional community detection algorithms. Furthermore, co-expressed genes across these communities are identified by discovering the characteristic tie structures. These detected ties are known to have substantial biological influence in the progression of schizophrenia, thereby signifying the influence of tie patterns in biological networks. This technique can logically be extended to other disease networks for detecting substantial gene “hotspots”.


2021, Vol 11 (1)
Author(s): László Hajdu, Miklós Krész, András Bóta

Abstract: Both community detection and influence maximization are well-researched fields of network science. Here, we investigate how several popular community detection algorithms can be used as part of a heuristic approach to influence maximization. The heuristic is based on the community value, a node-based metric defined on the outputs of overlapping community detection algorithms. This metric is used to select nodes as high influence candidates for expanding the set of influential nodes. Our aim in this paper is twofold. First, we evaluate the performance of eight frequently used overlapping community detection algorithms on this specific task to show how much improvement can be gained compared to the originally proposed method of Kempe et al. Second, selecting the community detection algorithm(s) with the best performance, we propose a variant of the influence maximization heuristic with significantly reduced runtime, at the cost of slightly reduced quality of the output. We use both artificial benchmarks and real-life networks to evaluate the performance of our approach.


2020, Vol 13 (2), pp. 128-136
Author(s): Seema Rani, Monica Mehrotra

Background: Complex systems are commonly modeled as networks. Communities that inherently exist in networks help explain how the networks are organized. Community discovery in networks has attracted the attention of researchers from multiple disciplines. The community detection problem has been modeled as an optimization problem. Broadly, existing community detection algorithms have adopted modularity as the function to optimize. However, modularity is not able to identify communities that are small compared to the size of the network. Methods: This paper addresses the problem of the resolution limit posed by modularity. The modular density measure succeeds in countering the resolution limit problem. Finding network communities with maximum modular density is an NP-hard problem. In this work, the discrete bat algorithm with modular density as the optimization function is recommended. Results: Experiments are conducted on three real-world datasets. To determine consistency, ten independent runs of the proposed algorithm have been carried out. The experimental results show that our proposed algorithm produces a high-quality community structure that includes small communities. Conclusion: The results are compared with traditional and evolutionary community detection algorithms. The final outcome shows the superiority of the discrete bat algorithm with modular density as the optimization function with respect to the number of communities, maximum modularity, and average modularity.
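For orientation, the two objectives contrasted above are commonly written as follows. The abstract does not state the exact forms used, so this is a sketch under assumed notation: $A$ is the adjacency matrix, $m$ the number of edges, $k_i$ the degree of node $i$, $c_i$ its community, and $V_1, \dots, V_K$ the communities with complements $\bar{V}_c$.

```latex
% Modularity (the objective with the resolution limit)
Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)

% Modular density (one commonly used definition)
D = \sum_{c=1}^{K} \frac{L(V_c, V_c) - L(V_c, \bar{V}_c)}{|V_c|},
\qquad
L(S, T) = \sum_{i \in S,\; j \in T} A_{ij}
```

In the second expression each community's internal-minus-external edge count is divided by the community's size, so small but dense communities are not penalized simply for being small relative to the whole network.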


Information, 2020, Vol 11 (4), pp. 199
Author(s): Christos Makris, Georgios Pispirigos, Ioannis Orestis Rizos

Presently, due to the extended availability of gigantic information networks and the beneficial application of graph analysis in various scientific fields, the need for efficient and highly scalable community detection algorithms has never been greater. Despite the significant amount of published research, the existing methods (such as Girvan–Newman, random-walk edge betweenness, vertex centrality, InfoMap, spectral clustering, etc.) have virtually been proven incapable of handling real-life social graphs due to the intrinsic computational restrictions that lead to mediocre performance and poor scalability. The purpose of this article is to introduce a novel, distributed community detection methodology which, in accordance with the community prediction concept, leverages the reduced complexity and the decreased variance of bagging ensemble methods to unveil the subjacent community hierarchy. The proposed approach has been thoroughly tested, meticulously compared against different classic community detection algorithms, and practically proven exceptionally scalable, eminently efficient, and promisingly accurate in unfolding the underlying community structure.


2021
Author(s): Xi Chen, Ralf van der Lans, Michael Trusov

This paper presents a structural discrete choice model with social influence for large-scale social networks. The model is based on an incomplete information game and permits individual-specific parameters of consumers. It is challenging to apply this type of model to real-life scenarios for two reasons: (1) the computation of the Bayesian–Nash equilibrium is highly demanding; and (2) the identification of social influence requires the use of excluded variables that are oftentimes unavailable. To address these challenges, we derive the unique equilibrium conditions of the game, which allow us to employ a stochastic Bayesian estimation procedure that is scalable to large social networks. To facilitate the identification, we utilize community-detection algorithms to divide the network into different groups that, in turn, can be used to construct excluded variables. We validate the proposed structural model with the login decisions of more than 25,000 users of an online social game. Importantly, this data set also contains promotions that were exogenously determined and targeted to only a subgroup of consumers. This information allows us to perform exogeneity tests to validate our identification strategy using community-detection algorithms. Finally, we demonstrate the managerial usefulness of the proposed methodology for improving the strategies of targeting influential consumers in large social networks. This paper was accepted by Matthew Shum, marketing.


2022, Vol 16 (4), pp. 1-22
Author(s): Liudmila Prokhorenkova, Alexey Tikhonov, Nelly Litvak

Information diffusion, spreading of infectious diseases, and spreading of rumors are fundamental processes occurring in real-life networks. In many practical cases, one can observe when nodes become infected, but the underlying network, over which a contagion or information propagates, is hidden. Inferring properties of the underlying network is important since these properties can be used for constraining infections, forecasting, viral marketing, and so on. Moreover, for many applications, it is sufficient to recover only coarse high-level properties of this network rather than all its edges. This article conducts a systematic and extensive analysis of the following problem: Given only the infection times, find communities of highly interconnected nodes. This task significantly differs from the well-studied community detection problem since we do not observe a graph to be clustered. We carry out a thorough comparison between existing and new approaches on several large datasets and cover methodological challenges specific to this problem. One of the main conclusions is that the most stable performance and the most significant improvement on the current state-of-the-art are achieved by our proposed simple heuristic approaches agnostic to a particular graph structure and epidemic model. We also show that some well-known community detection algorithms can be enhanced by including edge weights based on the cascade data.

