Local Graph Edge Partitioning

Graph edge partitioning, which is essential for the efficiency of distributed graph computation systems, divides a graph into several balanced partitions within a given size to minimize the number of vertices to be cut. Existing graph partitioning models can be classified into two categories: offline and streaming graph partitioning models. The former requires global graph information during the partitioning, which is expensive in terms of time and memory for large-scale graphs. The latter creates partitions based solely on the received graph information. However, the streaming model may result in a lower partitioning quality compared with the offline model. Therefore, this study introduces a Local Graph Edge Partitioning model, which considers only the local information (i.e., a portion of a graph instead of the entire graph) during the partitioning. Considering only the local graph information is meaningful because acquiring complete information for large-scale graphs is expensive. Based on the Local Graph Edge Partitioning model, two local graph edge partitioning algorithms—Two-stage Local Partitioning and Adaptive Local Partitioning—are given. Experimental results obtained on 14 real-world graphs demonstrate that the proposed algorithms outperform rival algorithms in most tested cases. Furthermore, the proposed algorithms are proven to significantly improve the efficiency of the real graph computation system GraphX.
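The objective described in this abstract (balanced edge partitions with as few cut vertices as possible) can be made concrete with a small evaluation routine. The following is a minimal sketch, not the authors' algorithm: the function name `evaluate_edge_partition`, the input format, and the replication-factor metric are assumptions introduced here for illustration. A vertex counts as cut when its incident edges land in more than one partition.

```python
from collections import defaultdict


def evaluate_edge_partition(edge_to_part):
    """Score an edge partition (hypothetical helper, for illustration only).

    edge_to_part maps each edge (u, v) to the id of the partition it was
    assigned to. Returns the set of cut vertices, the number of edges per
    partition, and the replication factor (average number of partitions
    in which a vertex appears).
    """
    parts_of_vertex = defaultdict(set)
    part_sizes = defaultdict(int)
    for (u, v), part in edge_to_part.items():
        parts_of_vertex[u].add(part)
        parts_of_vertex[v].add(part)
        part_sizes[part] += 1

    # A vertex is cut if its edges are spread over more than one partition.
    cut_vertices = {x for x, ps in parts_of_vertex.items() if len(ps) > 1}
    replication_factor = (
        sum(len(ps) for ps in parts_of_vertex.values()) / len(parts_of_vertex)
    )
    return cut_vertices, dict(part_sizes), replication_factor


# A 4-cycle split into two partitions of two edges each.
assignment = {(1, 2): 0, (2, 3): 0, (3, 4): 1, (4, 1): 1}
cut, sizes, rf = evaluate_edge_partition(assignment)
print(cut, sizes, rf)
```

In this toy case both partitions hold two edges, so the split is balanced, and vertices 1 and 3 are the ones shared between partitions.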

Scene text removal via cascaded text stroke detection and erasing

Computational Visual Media ◽ 10.1007/s41095-021-0242-8 ◽ 2021 ◽ Vol 8 (2) ◽ pp. 273-287

Author(s): Xuewei Bian ◽ Chaoqun Wang ◽ Weize Quan ◽ Juntao Ye ◽ Xiaopeng Zhang ◽ ...

Keyword(s): Performance Improvement ◽ Real World ◽ Large Scale ◽ State Of The Art ◽ The State ◽ Experimental Results ◽ Processing Unit ◽ Final Model ◽ Scene Text ◽ End To End

Recent learning-based approaches show promising performance improvements for the scene text removal task, but they usually leave remnants of text and produce visually unpleasant results. In this work, a novel end-to-end framework is proposed based on accurate text stroke detection. Specifically, the text removal problem is decoupled into text stroke detection and stroke removal; we design separate networks to solve these two subproblems, the latter being a generative network. These two networks are combined as a processing unit, which is cascaded to obtain our final model for text removal. Experimental results demonstrate that the proposed method substantially outperforms the state of the art for locating and erasing scene text. A new large-scale real-world dataset with 12,120 images has been constructed and is being made available to facilitate research, as current publicly available datasets are mainly synthetic and so cannot properly measure the performance of different methods.

Local Graph Edge Partitioning with a Two-Stage Heuristic Method

2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS) ◽ 10.1109/icdcs.2019.00031 ◽ 2019 ◽ Cited By ~ 2

Author(s): Shengwei Ji ◽ Chenyang Bu ◽ Lei Li ◽ Xindong Wu

Keyword(s): Heuristic Method ◽ Two Stage ◽ Local Graph ◽ Edge Partitioning

Distributed Hybrid Two-Stage Multi-Sensor Fusion for Cooperative Modulation Classification in Large-Scale Wireless Sensor Networks

Sensors ◽ 10.3390/s19194339 ◽ 2019 ◽ Vol 19 (19) ◽ pp. 4339 ◽ Cited By ~ 2

Author(s): Markovic ◽ Sokolovic ◽ Dukic

Keyword(s): Large Scale ◽ Partial Information ◽ Complete Information ◽ Decision Fusion ◽ Fusion Process ◽ Modulation Classification ◽ Two Stage ◽ Multiple Sensors ◽ Cluster Data ◽ Large Scale Networks

Recent studies have shown that the performance of modulation classification (MC) improves considerably when multiple sensors are deployed in a cooperative manner. Such cooperative MC solutions are based on the centralized fusion of independent features or decisions made at the sensors. Essentially, cooperative MC employs multiple uncorrelated observations of the unknown signal to gather more complete information than single-sensor reception provides, and this information is used in the fusion process to refine the MC decision. However, the non-cooperative nature of MC inherently induces a large loss in cooperative MC performance, because the quality measure for the MC results obtained at individual sensors is unreliable (which causes partial information loss during centralized fusion). In this paper, a distributed two-stage fusion concept for cooperative MC using multiple sensors is proposed. It is shown that the proposed distributed fusion, which combines feature (cumulant) fusion and decision fusion, facilitates the preservation of information during the fusion process and thus considerably improves MC performance. A clustered architecture is employed, with the influence of mismatched references restricted to the intra-cluster data fusion in the first stage. The adopted distributed concept represents a flexible and scalable solution that is suitable for implementation in large-scale networks.

Modified Password Guessing Methods Based on TarGuess-I

Wireless Communications and Mobile Computing ◽ 10.1155/2020/8837210 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-22

Author(s): Zhijie Xie ◽ Min Zhang ◽ Yuqi Guo ◽ Zhenhan Li ◽ Hongjun Wang

Keyword(s): Real World ◽ Relative Position ◽ Large Scale ◽ Experimental Results ◽ Demographic Information ◽ Modified Model ◽ Password Security ◽ Personally Identifiable Information

TarGuess-I, proposed by Wang et al. at ACM CCS 2016, is a leading online targeted password guessing model that uses users’ personally identifiable information (PII). It has attracted widespread attention in password security owing to its superior guessing performance. Yet, after analyzing users’ vulnerable behaviors of choosing popular passwords and constructing passwords from their PII, we find that this model does not take into account popular passwords, keyboard patterns, or special strings. Special strings are strings that are related to a user but do not appear in the user’s demographic information. Thus, we propose TarGuess-I+KPX, a modified password guessing model with three semantic methods: (1) identifying popular passwords by generating top-300 lists from similar websites, (2) recognizing keyboard patterns by relative position, and (3) catching special strings by extracting continuous characters from user-generated PII. We conduct a series of evaluations on six large-scale real-world leaked password datasets. The experimental results show that our modified model outperforms TarGuess-I by 2.62% within 100 guesses.
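The abstract does not say how keyboard patterns are recognized "by relative position", so the following is only one plausible reading, sketched under stated assumptions: a string is treated as a keyboard pattern when every pair of consecutive characters sits on the same or a neighbouring key of a simplified QWERTY grid. The grid `ROWS`, the function `is_keyboard_walk`, and the `min_len` threshold are all hypothetical and are not taken from the paper.

```python
# Simplified QWERTY layout (assumption: row stagger and shifted symbols ignored).
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]
POS = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}


def is_keyboard_walk(s, min_len=4):
    """Return True if each consecutive key pair is at most one row and one
    column apart on the grid (a repeated key also counts as adjacent)."""
    s = s.lower()
    if len(s) < min_len or any(ch not in POS for ch in s):
        return False
    for a, b in zip(s, s[1:]):
        (r1, c1), (r2, c2) = POS[a], POS[b]
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            return False
    return True


for candidate in ["qwerty", "1qaz", "password"]:
    print(candidate, is_keyboard_walk(candidate))
```

Under these assumptions, horizontal walks such as `qwerty` and vertical walks such as `1qaz` are flagged, while an ordinary word such as `password` is not.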

Abstractive Text Summarization by Incorporating Reader Comments

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v33i01.33016399 ◽ 2019 ◽ Vol 33 ◽ pp. 6399-6406 ◽ Cited By ~ 4

Author(s): Shen Gao ◽ Xiuying Chen ◽ Piji Li ◽ Zhaochun Ren ◽ Lidong Bing ◽ ...

Keyword(s): Real World ◽ Large Scale ◽ Text Summarization ◽ Experimental Results ◽ Semantic Gap ◽ Main Aspect ◽ Adversarial Learning ◽ Large Scale Dataset ◽ Reader Comments ◽ Abstractive Summarization

In the field of neural abstractive summarization, conventional sequence-to-sequence models often summarize the wrong aspect of the document rather than its main aspect. To tackle this problem, we propose the task of reader-aware abstractive summary generation, which uses reader comments to help the model produce a better summary of the main aspect. Unlike the traditional abstractive summarization task, reader-aware summarization faces two main challenges: (1) comments are informal and noisy; (2) jointly modeling the news document and the reader comments is difficult. To address these challenges, we design an adversarial learning model named reader-aware summary generator (RASG), which consists of four components: (1) a sequence-to-sequence summary generator; (2) a reader attention module that captures the reader-focused aspects; (3) a supervisor that models the semantic gap between the generated summary and the reader-focused aspects; (4) a goal tracker that produces the goal for each generation step. The supervisor and the goal tracker are used to guide the training of our framework in an adversarial manner. Extensive experiments are conducted on our large-scale real-world text summarization dataset, and the results show that RASG achieves state-of-the-art performance in terms of both automatic metrics and human evaluations. The experimental results also demonstrate the effectiveness of each module in our framework. We release our large-scale dataset for further research.

Combining ant colony optimization with 1-opt local search method for solving constrained forest transportation planning problems

Artificial Intelligence Research ◽ 10.5430/air.v6n2p27 ◽ 2017 ◽ Vol 6 (2) ◽ pp. 27

Author(s): Pengpeng Lin ◽ Ruxin Dai ◽ Marco A. Contreras ◽ Jun Zhang

Keyword(s): Local Search ◽ Ant Colony Optimization ◽ Real World ◽ Transportation Planning ◽ Large Scale ◽ Ant Colony ◽ Multiple Time ◽ Two Stage ◽ Timber Sales ◽ Planning Problems

We developed a two-stage approach (ACOLS) that combines the ant colony optimization (ACO) algorithm with a 1-opt local search to solve forest transportation planning problems (FTPPs), considering fixed and variable costs and treating the sediment yields expected to erode from road surfaces as side constraints. The ACOLS was designed to improve ACO performance and ensure applicability to real-world, large-scale FTPPs with multiple time periods. It consists of three major routines: i) a least-cost route-finding process from all timber sales simultaneously, ii) a two-stage search process developed to quickly find feasible (stage I) and high-quality (stage II) solutions, and iii) a 1-opt local search refinement to further improve solution quality. The ACOLS was first applied to a medium-scale hypothetical FTPP, for which four cases with increasing levels of sediment constraint were considered. To test for robustness, the ACOLS was then applied to ten different problem instances based on the same topology as the hypothetical FTPP. Lastly, the ACOLS was applied to a real-world, large-scale FTPP with thousands of road segments, hundreds of timber sales, and multiple products and planning periods. Feasible solutions were found for all cases, indicating the usefulness of our approach in providing managers with an efficient tool to address large-scale transportation problems.

Efficient network immunization under limited knowledge

National Science Review ◽ 10.1093/nsr/nwaa229 ◽ 2020 ◽ Cited By ~ 1

Author(s): Yangyang Liu ◽ Hillel Sanhedrai ◽ GaoGao Dong ◽ Louis M Shekhtman ◽ Fan Wang ◽ ...

Keyword(s): Real World ◽ Large Scale ◽ Complete Information ◽ Analytical Framework ◽ Epidemic Spreading ◽ Immunization Strategy ◽ Scale Free ◽ Limited Knowledge ◽ Small N ◽ Large Scale Networks

Targeted immunization of centralized nodes in large-scale networks has attracted significant attention. However, in real-world scenarios, knowledge and observations of the network may be limited, thereby precluding a full assessment of the optimal nodes to immunize (or quarantine) in order to avoid epidemic spreading such as that of the current coronavirus disease (COVID-19) epidemic. Here, we study a novel immunization strategy where only $n$ nodes are observed at a time and the most central among these $n$ nodes is immunized. This process can globally immunize a network. We find that even for small $n$ ($\approx 10$) there is significant improvement in the immunization (quarantine), which is very close to the levels of immunization with full knowledge. We develop an analytical framework for our method and determine the critical percolation threshold $p_c$ and the size of the giant component $P_\infty$ for networks with arbitrary degree distributions $P(k)$. In the limit of $n \to \infty$ we recover prior work on targeted immunization, whereas for $n = 1$ we recover the known case of random immunization. Between these two extremes, we observe that, as $n$ increases, $p_c$ increases quickly towards its optimal value under targeted immunization with complete information. In particular, we find a new general scaling relationship between $|p_c(\infty) - p_c(n)|$ and $n$ as $|p_c(\infty) - p_c(n)| \sim n^{-1}\exp(-\alpha n)$. For scale-free (SF) networks, where $P(k) \sim k^{-\gamma}$, $2 < \gamma < 3$, we find that $p_c$ has a transition from zero to nonzero when $n$ increases from $n = 1$ to $O(\log N)$ (where $N$ is the size of the network). Thus, for SF networks, having knowledge of $\approx \log N$ nodes and immunizing the most optimal among them can dramatically reduce epidemic spreading. We also demonstrate our limited knowledge immunization strategy on several real-world networks and confirm that in these real networks, $p_c$ increases significantly even for small $n$.
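The selection rule described in this abstract (observe n nodes at a time, immunize the most central among them) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that "most central" means highest degree in the original network, that observed nodes are drawn uniformly at random from those not yet immunized, and that the graph is given as an adjacency dictionary. The function name and its parameters are hypothetical.

```python
import random


def limited_knowledge_immunization(adj, n, budget, seed=0):
    """Immunize up to `budget` nodes; each step observes n random nodes
    that are not yet immunized and picks the one with the highest degree.

    adj: dict mapping node -> set of neighbours (assumed input format).
    Degrees are taken from the original graph and not updated as nodes
    are removed; that simplification is an assumption of this sketch.
    """
    rng = random.Random(seed)
    remaining = set(adj)
    immunized = []
    for _ in range(min(budget, len(adj))):
        observed = rng.sample(sorted(remaining), min(n, len(remaining)))
        target = max(observed, key=lambda v: len(adj[v]))
        immunized.append(target)
        remaining.discard(target)
    return immunized


# Star graph: hub 0 connected to leaves 1..5.
star = {0: {1, 2, 3, 4, 5}, **{leaf: {0} for leaf in range(1, 6)}}
print(limited_knowledge_immunization(star, n=6, budget=1))
print(limited_knowledge_immunization(star, n=1, budget=3))
```

With n equal to the network size every node is observed, so the rule reduces to targeted immunization and picks the hub first; with n = 1 it reduces to random immunization, matching the two limits named in the abstract.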

Key Node Ranking in Complex Networks: A Novel Entropy and Mutual Information-Based Approach

Entropy ◽ 10.3390/e22010052 ◽ 2019 ◽ Vol 22 (1) ◽ pp. 52 ◽ Cited By ~ 3

Author(s): Yichuan Li ◽ Weihong Cai ◽ Yao Li ◽ Xin Du

Keyword(s): Network Analysis ◽ Mutual Information ◽ Real World ◽ Large Scale ◽ Local Information ◽ Global Information ◽ Network Characteristics ◽ Node Ranking ◽ Digital Network ◽ Key Nodes

Many problems across different fields can be solved effectively by modeling them with complex network analysis. Finding key nodes is one of the most important and challenging problems in network analysis. Previous studies have proposed methods to identify key nodes; however, they rely mainly on a limited field of local information, lack large-scale access to global information, and are usually NP-hard. In this paper, a novel entropy and mutual information-based centrality approach (EMI) is proposed, which attempts to capture a far wider range and a greater abundance of information for assessing how vital a node is. We have developed countermeasures to assess the influence of nodes: EMI is no longer confined to neighbor nodes, and both topological and digital network characteristics are taken into account. We employ mutual information to fix a flaw that exists in many methods. Experiments on real-world connected networks demonstrate the outstanding performance of the proposed approach in both correctness and efficiency compared with previous approaches.
