Efficient detection of communities with significant overlaps in networks: Partial community merger algorithm

Detecting communities in large-scale social networks is a challenging task when each vertex may belong to multiple communities. Such behavior of vertices, and the strong overlaps among communities that it implies, render many detection algorithms invalid. We develop a Partial Community Merger Algorithm (PCMA) for detecting communities with significant overlaps as well as slightly overlapping and disjoint ones. It is a bottom-up approach that reconstructs complete communities by properly reassembling the partial information about communities revealed in the ego networks of vertices. We propose a novel similarity measure of communities and an efficient merger process to address the two key issues in implementing this approach: noise control and merger order. PCMA is tested against two benchmarks, and overall it outperforms all compared algorithms in both accuracy and efficiency. It is applied to two huge online social networks, Friendster and Sina Weibo. Millions of communities are detected, and they are of higher quality than the corresponding metadata groups. We find that the latter should not be regarded as the ground truth of structural communities. The significant overlapping pattern found in the detected communities confirms the need for new algorithms, such as PCMA, to handle multiple memberships of vertices in social networks.
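The bottom-up idea described above can be illustrated with a minimal sketch. It is not the PCMA implementation: the Jaccard similarity, the greedy first-match merge order, the 0.5 threshold, and the function names are all assumptions made for illustration, standing in for the similarity measure and merger process that the abstract mentions but does not specify. The ego network is assumed here to include the ego vertex itself.

```python
def ego_network(adj, ego):
    """Induced subgraph on a vertex and its direct neighbors.

    adj maps each vertex to the set of its neighbors (undirected graph).
    """
    members = {ego} | adj[ego]
    return {u: adj[u] & members for u in members}


def jaccard(a, b):
    """Jaccard similarity of two vertex sets (a stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_partial_communities(parts, threshold=0.5):
    """Greedily merge partial communities whose similarity reaches the threshold.

    Each partial community joins the first sufficiently similar merged set,
    otherwise it starts a new one. Vertices may end up in several sets.
    """
    merged = []
    for part in parts:
        part = set(part)
        for existing in merged:
            if jaccard(part, existing) >= threshold:
                existing |= part
                break
        else:
            merged.append(part)
    return merged


# Two triangles joined by the edge (2, 3).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
parts = [{0, 1, 2}, {1, 2, 0}, {3, 4, 5}, {2, 3, 4, 5}]
communities = merge_partial_communities(parts)
```

In this toy input, vertex 2 ends up in both merged sets, which is the kind of overlap a disjoint partitioning method cannot express.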


Efficient Vector Partitioning Algorithms for Graph Clustering

Journal of Data Intelligence ◽

10.26421/jdi1.2-1 ◽

2020 ◽

Vol 1 (2) ◽

pp. 101-123

Author(s):

Hiroaki Shiokawa ◽

Yasunori Futamura

Keyword(s):

Social Networks ◽

Large Scale ◽

Clustering Algorithm ◽

Ground Truth ◽

Graph Clustering ◽

Mining Communities ◽

Fine Grained ◽

Efficient Vector ◽

Public Datasets ◽

Many Core

This paper addresses the problem of finding clusters in graph-structured data such as Web graphs and social networks. Graph clustering is one of the fundamental techniques for understanding the structures present in such complex graphs. In the Web and data mining communities, modularity-based graph clustering algorithms have been used successfully in many applications. However, it is difficult for modularity-based methods to find fine-grained clusters hidden in large-scale graphs; the methods fail to reproduce the ground truth. In this paper, we present a novel modularity-based algorithm, CAV, that shows better clustering results than the traditional algorithm. The proposed algorithm incorporates cohesiveness-aware vector partitioning into graph spectral analysis to improve clustering accuracy. Additionally, this paper presents a novel efficient algorithm, P-CAV, that further improves the clustering speed of CAV; P-CAV is an extension of CAV that uses thread-based parallelization on a many-core CPU. Our extensive experiments on synthetic and public datasets demonstrate the superior performance of our approaches over state-of-the-art approaches.
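For readers unfamiliar with the objective that modularity-based methods optimize, the following minimal sketch computes the standard Newman modularity Q of a given partition. It is not CAV or P-CAV; the function name and the toy graph are assumptions for illustration, and the graph is assumed to be undirected, unweighted, and free of self-loops.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ e_c / m - (d_c / (2m))^2 ].

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every vertex to its community label.
    e_c is the number of edges inside community c, d_c the sum of degrees
    of its vertices, and m the total number of edges.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree = defaultdict(int)
    internal = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    degree_sum = defaultdict(int)
    for node, d in degree.items():
        degree_sum[community_of[node]] += d
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(edges, two_groups)  # 5/14 for this toy graph
```

Placing every vertex in a single community gives Q = 0, so the positive value for the two-triangle split reflects that the split has more internal edges than expected by chance.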


A Survey of Malicious Accounts Detection in Large-Scale Online Social Networks

2018 IEEE 4th International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing, (HPSC) and IEEE International Conference on Intelligent Data and Security (IDS) ◽

10.1109/bds/hpsc/ids18.2018.00043 ◽

2018 ◽

Author(s):

Yang Xin ◽

Chensu Zhao ◽

Hongliang Zhu ◽

Mingcheng Gao

Keyword(s):

Social Networks ◽

Online Social Networks ◽

Large Scale


Hashkat: large-scale simulations of online social networks

Social Network Analysis and Mining ◽

10.1007/s13278-017-0424-7 ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 5

Author(s):

Kevin Ryczko ◽

Adam Domurad ◽

Nicholas Buhagiar ◽

Isaac Tamblyn

Keyword(s):

Social Networks ◽

Online Social Networks ◽

Large Scale ◽

Large Scale Simulations


Community Detection in Large-Scale Social Networks

Advances in Wireless Technologies and Telecommunication - Graph Theoretic Approaches for Analyzing Large-Scale Social Networks ◽

10.4018/978-1-5225-2814-2.ch012 ◽

2018 ◽

pp. 189-206 ◽

Cited By ~ 2

Author(s):

S Rao Chintalapudi ◽

M. H. M. Krishna Prasad

Keyword(s):

Social Networks ◽

Community Detection ◽

Large Scale ◽

The Other ◽

Edge Density ◽

Challenging Problem ◽

Overlapping Community Detection ◽

Clustering Problem ◽

Detection Algorithms ◽

Overlapping Community

Community structure is one of the most important properties of social networks. Detecting such structures is a challenging problem in social network analysis. A community is a collection of nodes that are more densely connected to each other than to the rest of the network. Community detection is similar to a clustering problem in which the intra-cluster edge density is higher than the inter-cluster edge density. Community detection algorithms fall into two categories: disjoint community detection, in which a node can be a member of at most one community, and overlapping community detection, in which a node can be a member of more than one community. This chapter reviews the state-of-the-art disjoint and overlapping community detection algorithms. The measures needed to evaluate disjoint and overlapping community detection algorithms are also discussed in detail.
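The density comparison in this definition can be made concrete with a small sketch. The function name and the toy graph are assumptions for illustration; density is taken here as the fraction of possible vertex pairs that are actually connected, for an undirected simple graph.

```python
def edge_densities(edges, community, nodes):
    """Return (intra_density, inter_density) for one candidate community.

    intra_density: edges inside the community / possible pairs inside it.
    inter_density: edges crossing the boundary / possible pairs across it.
    """
    community = set(community)
    outside = set(nodes) - community
    intra_edges = sum(1 for u, v in edges if u in community and v in community)
    cut_edges = sum(1 for u, v in edges if (u in community) != (v in community))
    n_c = len(community)
    possible_intra = n_c * (n_c - 1) // 2
    possible_inter = n_c * len(outside)
    intra = intra_edges / possible_intra if possible_intra else 0.0
    inter = cut_edges / possible_inter if possible_inter else 0.0
    return intra, inter


# Two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
intra, inter = edge_densities(edges, {0, 1, 2}, range(6))
```

For the triangle {0, 1, 2}, all three internal pairs are connected while only one of the nine possible boundary pairs is, so the set satisfies the definition above.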


Prediction of Social Influence for Provenance of Misinformation in Online Social Network Using Big Data Approach

The Computer Journal ◽

10.1093/comjnl/bxaa132 ◽

2020 ◽

Author(s):

Kumaran P ◽

Rajeswari Sridhar

Keyword(s):

Social Networks ◽

Social Network ◽

Law Enforcement ◽

Online Social Networks ◽

Large Scale ◽

Online Social Network ◽

Root Cause ◽

Minimum Number ◽

Targeted Marketing ◽

Implicit And Explicit

Online social networks (OSNs) are platforms that play an essential role in identifying misinformation such as false rumors, insults, pranks, hoaxes, spear phishing and computational propaganda. Detection of misinformation finds applications in areas such as law enforcement, to pinpoint culprits who spread rumors that harm society, and targeted marketing in e-commerce, to identify the users who originate messages of dissatisfaction about products or services that harm an organization's reputation. Identifying and detecting misinformation is crucial in complex social networks. Because misinformation in a social network is identified by designing and placing monitors, computing the minimum number of monitors needed to detect misinformation is a nontrivial task in a complex social network. The proposed approach determines the top suspected sources of misinformation using a tweet polarity-based ranking system in tandem with sarcasm detection (both implicit and explicit sarcasm) and optimization approaches on a large-scale incomplete network. The algorithm subsequently uses this determined feature to place the minimum set of monitors in the network for detecting misinformation. The proposed work focuses on the timely detection of misinformation by limiting the distance between the suspected sources and the monitors. It also determines the root cause of misinformation (provenance) by using a combination of network-based and content-based approaches. The proposed work is compared with state-of-the-art work, and the proposed algorithm is observed to produce better results than existing methods.


Graph compaction in analyzing large scale online social networks

2017 IEEE International Conference on Communications (ICC) ◽

10.1109/icc.2017.7996910 ◽

2017 ◽

Cited By ~ 1

Author(s):

Sima Das ◽

Jennifer Leopold ◽

Susmita Ghosh ◽

Sajal K. Das

Keyword(s):

Social Networks ◽

Online Social Networks ◽

Large Scale


Distributed Coverage of Ego Networks in F2F Online Social Networks

2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld) ◽

10.1109/uic-atc-scalcom-cbdcom-iop-smartworld.2016.0078 ◽

2016 ◽

Cited By ~ 3

Author(s):

Andrea De Salve ◽

Barbara Guidi ◽

Paolo Mori ◽

Laura Ricci

Keyword(s):

Social Networks ◽

Online Social Networks ◽

Ego Networks


Online Social Networks (OSN) Evolution Model Based on Homophily and Preferential Attachment

Symmetry ◽

10.3390/sym10110654 ◽

2018 ◽

Vol 10 (11) ◽

pp. 654 ◽

Cited By ~ 1

Author(s):

Jebran Khan ◽

Sungchang Lee

Keyword(s):

Social Networks ◽

Structural Properties ◽

Online Social Networks ◽

Large Scale ◽

Preferential Attachment ◽

Real Life ◽

Synthetic Data ◽

Evolution Model ◽

Scale Invariant ◽

Scale Free

In this paper, we propose a new scale-free social network (SN) evolution model that is based on homophily combined with preferential attachment. Our model enables SN researchers to generate synthetic SN data for the evaluation of multi-facet SN models that depend on users' attributes and similarities. Homophily is one of the key factors in the formation of interactive relationships in SNs. The synthetic graph generated by our model is scale-invariant and has symmetric relationships. The model is dynamic and robust to changes in input parameters, such as the number of nodes and the nodes' attributes, in that it conserves its structural properties. Simulation and evaluation of models for large-scale SN applications need large datasets. One way to obtain SN data is to generate synthetic data by using SN evolution models. Various SN evolution models have been proposed in previous research to approximate real-life SN graphs. These models are based on SN structural properties such as preferential attachment. The data generated by these models are suitable for evaluating SN models that are structure dependent, but not for evaluating models that depend on SN users' attributes and similarities. In our proposed model, users' attributes and similarities are used to synthesize SN graphs. We evaluated the resultant synthetic graph by analyzing its structural properties. In addition, we validated our model by comparing its measures with publicly available real-life SN datasets and previous SN evolution models. Simulation results show our resultant graph to be a close representation of real-life SN graphs with users' attributes.
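As a rough illustration of how homophily can be combined with preferential attachment, the sketch below grows an undirected graph in which each new node picks neighbors with probability proportional to degree times attribute similarity. It is not the authors' model: the single scalar attribute in [0, 1], the similarity 1 - |a_i - a_j|, the multiplicative weighting, the initial clique, and the function name are all assumptions made for illustration.

```python
import random


def grow_network(n, m, seed=0):
    """Grow an undirected graph with n nodes, adding m edges per new node.

    Each node gets a random attribute in [0, 1]. A new node chooses each
    target with weight degree(target) * similarity(new, target).
    Returns (attrs, edges) where edges is a sorted list of (u, v) with u < v.
    """
    if m < 1 or n < m + 1:
        raise ValueError("need m >= 1 and n >= m + 1")
    rng = random.Random(seed)
    attrs = [rng.random() for _ in range(n)]
    degree = [0] * n
    edges = set()
    # Seed graph: a clique on the first m + 1 nodes, so every degree is positive.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.add((u, v))
            degree[u] += 1
            degree[v] += 1
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            pool = [c for c in range(new) if c not in targets]
            # Small constant keeps weights positive when attributes differ by 1.
            weights = [degree[c] * (1.0 - abs(attrs[new] - attrs[c]) + 1e-9) for c in pool]
            targets.add(rng.choices(pool, weights=weights, k=1)[0])
        for t in targets:
            edges.add((t, new))
            degree[t] += 1
            degree[new] += 1
    return attrs, sorted(edges)


attrs, edges = grow_network(n=50, m=2, seed=42)
```

Because every new node adds exactly m distinct edges, the edge count is m(m + 1)/2 + (n - m - 1)m regardless of the seed, which gives a simple sanity check on the generator.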


A new viral marketing strategy with the competition in the large-scale online social networks

2016 IEEE RIVF International Conference on Computing & Communication Technologies, Research, Innovation, and Vision for the Future (RIVF) ◽

10.1109/rivf.2016.7800260 ◽

2016 ◽

Cited By ~ 4

Author(s):

Canh V. Pham ◽

Dung K. Ha ◽

Dung Q. Ngo ◽

Quang C. Vu ◽

Huan X. Hoang

Keyword(s):

Social Networks ◽

Marketing Strategy ◽

Online Social Networks ◽

Large Scale ◽

Viral Marketing


The structure of online social networks modulates the rate of lexical change

10.31234/osf.io/be8q7 ◽

2021 ◽

Author(s):

Jian Zhu ◽

David Jurgens

Keyword(s):

Social Networks ◽

Online Communities ◽

Online Social Networks ◽

Large Scale ◽

Scale Analysis ◽

New Words ◽

Large Scale Analysis ◽

Lexical Change ◽

Underlying Network ◽

The Many

New words are regularly introduced to communities, yet not all of these words persist in a community's lexicon. Among the many factors contributing to lexical change, we focus on the understudied effect of social networks. We conduct a large-scale analysis of over 80k neologisms in 4420 online communities across a decade. Using Poisson regression and survival analysis, our study demonstrates that the community's network structure plays a significant role in lexical change. Apart from overall size, properties including dense connections, the lack of local clusters and more external contacts promote lexical innovation and retention. Unlike offline communities, these topic-based communities do not experience strong lexical levelling despite increased contact but accommodate more niche words. Our work provides support for the sociolinguistic hypothesis that lexical change is partially shaped by the structure of the underlying network but also uncovers findings specific to online communities.
