A parametric approximation algorithm for spatial group keyword queries

With the application of big data, various queries arise for information retrieval. Spatial group keyword queries aim to find a set of spatial objects that covers the query keywords and minimizes a goal function, such as the total distance between the objects and the query point. This problem is widely found in database applications and is known to be NP-hard. Efficient algorithms for solving this problem can only provide approximate solutions, and most of these algorithms achieve a fixed approximation ratio (the upper bound of the ratio of an approximate goal value to the optimal goal value). Thus, to obtain a self-adjusting algorithm, we propose an approximation algorithm that achieves a parametric approximation ratio. The algorithm trades off the approximation ratio against time consumption, enabling users to assign an arbitrary query accuracy. Additionally, it runs in an on-the-fly manner, making it scalable to large-scale applications. The efficiency and scalability of the algorithm were further validated using benchmark datasets.
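To make the problem statement concrete, the following is a minimal sketch of a spatial group keyword query, assuming a simple greedy weighted set-cover heuristic. It is not the parametric algorithm proposed in the paper. The function name, the `(name, location, keywords)` tuple layout, and the sample data are all hypothetical.

```python
import math

def greedy_group_keyword_query(objects, query_point, query_keywords):
    """Greedy weighted set cover (an assumed stand-in, not the paper's method).

    objects: list of (name, (x, y), keyword_set) tuples.
    Returns the names of the chosen objects, or None if a keyword cannot be covered.
    """
    uncovered = set(query_keywords)
    chosen = []
    while uncovered:
        best, best_ratio = None, math.inf
        for name, location, keywords in objects:
            gain = len(uncovered & keywords)
            if gain == 0:
                continue
            # Cost per newly covered keyword: distance to the query point / gain.
            ratio = math.dist(location, query_point) / gain
            if ratio < best_ratio:
                best, best_ratio = (name, keywords), ratio
        if best is None:
            return None  # no object covers any remaining keyword
        chosen.append(best[0])
        uncovered -= best[1]
    return chosen

sample_objects = [
    ("a", (1.0, 0.0), {"cafe"}),
    ("b", (1.5, 0.0), {"cafe", "wifi"}),
    ("c", (5.0, 0.0), {"wifi", "park"}),
    ("d", (1.0, 1.0), {"park"}),
]
result = greedy_group_keyword_query(sample_objects, (0.0, 0.0), {"cafe", "wifi", "park"})
```

The greedy rule has a fixed approximation behaviour. The abstract describes something different: an algorithm whose approximation ratio is a tunable parameter.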

Large-scale Semantic Parsing without Question-Answer Pairs

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00190 ◽

2014 ◽

Vol 2 ◽

pp. 377-392 ◽

Cited By ~ 40

Author(s):

Siva Reddy ◽

Mirella Lapata ◽

Mark Steedman

Keyword(s):

Natural Language ◽

Large Scale ◽

Graph Matching ◽

State Of The Art ◽

The State ◽

Semantic Parsing ◽

Matching Problem ◽

Weak Supervision ◽

Benchmark Datasets

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.

Incremental Community Detection on Large Complex Attributed Network

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3451216 ◽

2021 ◽

Vol 15 (6) ◽

pp. 1-20

Author(s):

Zhe Chen ◽

Aixin Sun ◽

Xiaokui Xiao

Keyword(s):

Community Detection ◽

Large Scale ◽

Network Data ◽

Topological Information ◽

Community Membership ◽

Attributed Network ◽

Benchmark Datasets ◽

Modularity Maximization ◽

Large Scale Networks

Community detection on network data is a fundamental task and has many applications in industry. Network data in industry can be very large, with incomplete and complex attributes, and, more importantly, it keeps growing. This calls for a community detection technique that can handle both attribute and topological information on large-scale networks and that is also incremental. In this article, we propose inc-AGGMMR, an incremental community detection framework that effectively addresses the challenges arising from scalability, mixed attributes, incomplete values, and network evolution. By constructing an augmented graph, we map attributes into the network by introducing attribute centers and belongingness edges. The communities are then detected by modularity maximization. During this process, we adjust the weights of belongingness edges to balance the contributions of attribute and topological information to the detection of communities. The weight adjustment mechanism enables incremental updates of the community membership of all vertices. We evaluate inc-AGGMMR on five benchmark datasets against eight strong baselines. We also provide a case study that incrementally detects communities on a PayPal payment network, which contains users with transactions. The results demonstrate inc-AGGMMR’s effectiveness and practicability.
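The augmented-graph idea can be illustrated with a small sketch. It assumes that each (attribute, value) pair becomes one "attribute center" node, that missing values simply get no belongingness edge, and that topological edges have weight 1. These choices, the function names, and the toy data are hypothetical and are not taken from inc-AGGMMR. The second function evaluates standard weighted modularity for a given partition and does not maximize it.

```python
from collections import defaultdict

def build_augmented_graph(edges, attributes, belonging_weight=1.0):
    """Return {(u, v): weight} with attribute-center nodes added (assumed scheme)."""
    weighted = {(u, v): 1.0 for u, v in edges}
    for vertex, attrs in attributes.items():
        for key, value in attrs.items():
            if value is None:  # incomplete value: add no belongingness edge
                continue
            center = ("attr", key, value)
            weighted[(vertex, center)] = belonging_weight
    return weighted

def modularity(weighted, community_of):
    """Weighted modularity Q = sum_c [ L_c / m - (d_c / (2m))^2 ]."""
    total = sum(weighted.values())
    internal = defaultdict(float)   # edge weight inside each community
    strength = defaultdict(float)   # summed weighted degree of each community
    for (u, v), w in weighted.items():
        strength[community_of[u]] += w
        strength[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    return sum(internal[c] / total - (strength[c] / (2 * total)) ** 2
               for c in strength)

toy_edges = [(1, 2), (3, 4)]
toy_attributes = {1: {"city": "A"}, 2: {"city": "A"},
                  3: {"city": "B"}, 4: {"city": None}}
graph = build_augmented_graph(toy_edges, toy_attributes)
partition = {1: 0, 2: 0, ("attr", "city", "A"): 0,
             3: 1, 4: 1, ("attr", "city", "B"): 1}
q = modularity(graph, partition)
```

Changing `belonging_weight` changes how strongly shared attribute values pull vertices into the same community. This is the balance between attribute and topological information that the abstract describes.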

Algebraic multigrid methods

Acta Numerica ◽

10.1017/s0962492917000083 ◽

2017 ◽

Vol 26 ◽

pp. 591-721 ◽

Cited By ~ 37

Author(s):

Jinchao Xu ◽

Ludmil Zikatanov

Keyword(s):

Minimization Problem ◽

Large Scale ◽

Multigrid Method ◽

Approximate Solutions ◽

Multigrid Methods ◽

Algebraic Multigrid ◽

Unified Framework ◽

Smoothed Aggregation ◽

Algebraic Multigrid Methods ◽

Coarse Space

This paper provides an overview of AMG methods for solving large-scale systems of equations, such as those from discretizations of partial differential equations. AMG is often understood as the acronym of ‘algebraic multigrid’, but it can also be understood as ‘abstract multigrid’. Indeed, we demonstrate in this paper how and why an algebraic multigrid method can be better understood at a more abstract level. In the literature, there are many different algebraic multigrid methods that have been developed from different perspectives. In this paper we try to develop a unified framework and theory that can be used to derive and analyse different algebraic multigrid methods in a coherent manner. Given a smoother $R$ for a matrix $A$, such as Gauss–Seidel or Jacobi, we prove that the optimal coarse space of dimension $n_{c}$ is the span of the eigenvectors corresponding to the first $n_{c}$ eigenvalues of $\bar{R}A$ (with $\bar{R}=R+R^{T}-R^{T}AR$). We also prove that this optimal coarse space can be obtained via a constrained trace-minimization problem for a matrix associated with $\bar{R}A$, and demonstrate that coarse spaces of most existing AMG methods can be viewed as approximate solutions of this trace-minimization problem. Furthermore, we provide a general approach to the construction of quasi-optimal coarse spaces, and we prove that under appropriate assumptions the resulting two-level AMG method for the underlying linear system converges uniformly with respect to the size of the problem, the coefficient variation and the anisotropy. Our theory applies to most existing multigrid methods, including the standard geometric multigrid method, classical AMG, energy-minimization AMG, unsmoothed and smoothed aggregation AMG and spectral AMGe.
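The coarse-space statement can be written out as follows. This sketch assumes that "first" means the smallest eigenvalues. The symbols $\phi_j$, $\mu_j$, $V_c^{\mathrm{opt}}$, $\omega$ and $D$ are introduced here for illustration and do not appear in the abstract. The last line specializes $\bar{R}$ to weighted Jacobi, where $R$ is symmetric.

```latex
% Symmetrized smoother, as given in the abstract
\bar{R} = R + R^{T} - R^{T} A R

% Eigenpairs of \bar{R}A, ordered so that "first" means smallest (assumption)
\bar{R} A \phi_j = \mu_j \phi_j, \qquad \mu_1 \le \mu_2 \le \dots \le \mu_n

% Optimal coarse space of dimension n_c
V_c^{\mathrm{opt}} = \operatorname{span}\{\phi_1, \dots, \phi_{n_c}\}

% Example: weighted Jacobi, R = \omega D^{-1} with D = \operatorname{diag}(A), so R^{T} = R
\bar{R} = 2\omega D^{-1} - \omega^{2} D^{-1} A D^{-1}
```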

AN APPROXIMATION ALGORITHM FOR LOCATING MAXIMAL DISKS WITHIN CONVEX POLYGONS

International Journal of Computational Geometry & Applications ◽

10.1142/s0218195911003858 ◽

2011 ◽

Vol 21 (06) ◽

pp. 661-684

Author(s):

HIROFUMI AOTA ◽

TAKURO FUKUNAGA ◽

HIROSHI NAGAMOCHI

Keyword(s):

Approximation Algorithm ◽

Convex Polygon ◽

Computation Time ◽

Approximation Ratio ◽

Convex Polygons ◽

The Given

This paper considers the problem of placing a given number of disks in a container so that the area covered by the disks is maximized. In this problem, the radii of the disks can be changed arbitrarily as long as the disks do not extend outside the container, and the disks are allowed to overlap with each other. We present an approximation algorithm for this problem assuming that the container is a convex polygon. Our algorithm achieves an approximation ratio of (0.78 - ϵ) for any small ϵ > 0. Since the computation time of our algorithm depends exponentially on the number of corners of the convex polygon, we also give a heuristic to reduce the number of corners.

Efficient Algorithms for Max-Weighted Point Sweep Coverage on Lines

Sensors ◽

10.3390/s21041457 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1457

Author(s):

Dieyan Liang ◽

Hong Shen

Keyword(s):

Optimal Algorithm ◽

Approximate Solutions ◽

Approximation Ratio ◽

Mobile Sensors ◽

Np Hard ◽

Coverage Problem ◽

Polynomial Time Approximation Algorithm ◽

Weighted Point ◽

Eulerian Graph ◽

Set Of Points

As an important application of wireless sensor networks (WSNs), the deployment of mobile sensors to periodically monitor (sweep cover) a set of points of interest (PoIs) arises in various settings, such as environmental monitoring and data collection. For a set of PoIs in an Eulerian graph, the point sweep coverage problem of deploying the fewest sensors to periodically cover the PoIs is known to be NP-hard, even if all sensors have the same velocity. In this paper, we consider the problem of finding the set of PoIs on a line, periodically covered by a given set of mobile sensors, that has the maximum sum of weights. We first prove that the problem is NP-hard when the sensors have different velocities. Optimal and approximate solutions are then presented for sensors with the same and different velocities, respectively. For M sensors and N PoIs, the optimal algorithm for the case when all sensors have the same velocity runs in O(MN) time; our polynomial-time approximation algorithm for the case when the sensors have a constant number of velocities achieves an approximation ratio of 1/2; for the general case of arbitrary velocities, 1/(2α)- and (1/2)(1−1/e)-approximation algorithms are presented, where the integer α≥2 is the tradeoff factor between time complexity and approximation ratio.

Efficient Heuristics for Large-Scale Vehicle Routing Problems Using Particle Swarm Optimization

International Journal of Green Computing ◽

10.4018/jgc.2012070103 ◽

2012 ◽

Vol 3 (2) ◽

pp. 34-50

Author(s):

A. Chandramouli ◽

L. Vivek Srinivasan ◽

T. T. Narendran

Keyword(s):

Particle Swarm Optimization ◽

Vehicle Routing ◽

Large Scale ◽

Particle Swarm ◽

Computational Effort ◽

Swarm Optimization ◽

Routing Problem ◽

Customer Base ◽

Benchmark Datasets ◽

Problem Instances

This paper addresses the Capacitated Vehicle Routing Problem (CVRP) with a homogeneous fleet of vehicles serving a large customer base. The authors propose a multi-phase heuristic that clusters the nodes based on proximity, orients them along a route, and allots vehicles. For the final phase of determining the routes for each vehicle, they have developed a Particle Swarm Optimization (PSO) approach. Benchmark datasets as well as hypothetical datasets have been used for computational trials. The proposed heuristic is found to perform exceedingly well even for large problem instances, both in terms of quality of solutions and in terms of computational effort.

A Comprehensive Taxonomy of Dynamic Texture Representation

ACM Computing Surveys ◽

10.1145/3487892 ◽

2023 ◽

Vol 55 (1) ◽

pp. 1-39

Author(s):

Thanh Tuan Nguyen ◽

Thanh Phuong Nguyen

Keyword(s):

Large Scale ◽

Environmental Changes ◽

State Of The Art ◽

The State ◽

Future Research ◽

Research Activities ◽

Potential Applications ◽

Benchmark Datasets ◽

Negative Impacts ◽

Made In

Representing dynamic textures (DTs) plays an important role in many real-world computer vision applications. Due to the turbulent and non-directional motions of DTs, along with the negative impacts of different factors (e.g., environmental changes, noise, and illumination), efficiently analyzing DTs poses considerable challenges for state-of-the-art approaches. Over the past 20 years, many different techniques have been introduced to address these well-known issues and improve performance. Those methods have made valuable contributions, but the problems have not been fully resolved, particularly for recognizing DTs on large-scale datasets. In this article, we present a comprehensive taxonomy of DT representation in order to give a thorough overview of the existing methods along with overall evaluations of their performance. Accordingly, we arrange the methods into six canonical categories. For each category, we briefly present its principal methodology and its related variants. The effectiveness of the state-of-the-art methods is then investigated and thoroughly discussed with respect to quantitative and qualitative evaluations in classifying DTs on benchmark datasets. Finally, we point out several potential applications and the remaining challenges that should be addressed in future work. In comparison with the two existing shallow DT surveys (the first, made in 2005, is out of date, while the newer one, published in 2016, gives an inadequate overview), we believe that our proposed comprehensive taxonomy not only provides a better view of DT representation for the target readers but also stimulates future research activities.

Detection of Botnet Based Attacks on Network

Handbook of Research on Network Forensics and Analysis Techniques - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-4100-4.ch007 ◽

2018 ◽

pp. 101-116

Author(s):

Prachi

Keyword(s):

Large Scale ◽

Flow Analysis ◽

Traffic Analysis ◽

High Accuracy ◽

Machine Learning Techniques ◽

Botnet Detection ◽

Learning Techniques ◽

Proposed Model ◽

Benchmark Datasets ◽

Traffic Flow Analysis

This chapter describes how botnets, which are increasingly the leading cyber threat on the web, also serve as the key platform for carrying out large-scale distributed attacks. Although a substantial amount of research has been done in the fields of botnet detection and analysis, bot-masters keep adopting new techniques, such as code encryption and obfuscation, that make botnets more sophisticated, destructive and hard to detect. This chapter proposes a new model to detect botnet behavior on the basis of traffic analysis and machine learning techniques. Traffic analysis does not depend upon payload analysis, so the proposed technique is immune to code encryption and other evasion techniques generally used by bot-masters. This chapter analyzes benchmark datasets as well as traffic generated in real time to determine the feasibility of botnet detection using traffic flow analysis. Experimental results indicate that the proposed model is able to classify network traffic as botnet or normal traffic with high accuracy and a low false-positive rate.

A Combination of Spatial Pyramid and Inverted Index for Large-Scale Image Retrieval

Computer Vision ◽

10.4018/978-1-5225-5204-8.ch054 ◽

2018 ◽

pp. 1307-1321

Author(s):

Vinh-Tiep Nguyen ◽

Thanh Duc Ngo ◽

Minh-Triet Tran ◽

Duy-Dinh Le ◽

Duc Anh Duong

Keyword(s):

Image Retrieval ◽

Large Scale ◽

Spatial Information ◽

Real Life ◽

Inverted Index ◽

Bag Of Words ◽

Visual Words ◽

Benchmark Datasets ◽

Large Scale Image Retrieval ◽

Inverted Indexing

Large-scale image retrieval has shown remarkable potential in real-life applications. The standard approach is based on Inverted Indexing, with images represented using the Bag-of-Words model. However, one major limitation of both the Inverted Index and the Bag-of-Words representation is that they ignore the spatial information of visual words in image representation and comparison. As a result, retrieval accuracy is decreased. In this paper, the authors investigate an approach that integrates spatial information into the Inverted Index to improve accuracy while maintaining a short retrieval time. Experiments conducted on several benchmark datasets (Oxford Building 5K, Oxford Building 5K+100K and Paris 6K) demonstrate the effectiveness of the proposed approach.
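One way to picture the combination is an inverted index whose keys carry a spatial-pyramid cell in addition to the visual word. The sketch below assumes feature coordinates normalized to [0, 1), a two-level pyramid, and a scoring weight of 2^level. These choices, the function names, and the sample data are hypothetical and are not the authors' design.

```python
from collections import defaultdict, Counter

def pyramid_keys(word, x, y, levels=2):
    """Yield (level, cell_x, cell_y, word) keys; level 0 covers the whole image."""
    for level in range(levels):
        cells = 2 ** level
        cell_x = min(int(x * cells), cells - 1)
        cell_y = min(int(y * cells), cells - 1)
        yield (level, cell_x, cell_y, word)

def build_index(images, levels=2):
    """images: {image_id: [(visual_word, x, y), ...]} -> inverted index."""
    index = defaultdict(Counter)
    for image_id, features in images.items():
        for word, x, y in features:
            for key in pyramid_keys(word, x, y, levels):
                index[key][image_id] += 1
    return index

def search(index, query_features, levels=2):
    """Rank images by weighted matches; finer cells count more (assumed weighting)."""
    scores = Counter()
    for word, x, y in query_features:
        for key in pyramid_keys(word, x, y, levels):
            weight = 2 ** key[0]
            for image_id, count in index.get(key, {}).items():
                scores[image_id] += weight * count
    return scores.most_common()

sample_images = {
    "img1": [(7, 0.1, 0.1), (9, 0.8, 0.8)],
    "img2": [(7, 0.9, 0.9), (9, 0.2, 0.2)],
}
sample_index = build_index(sample_images)
ranking = search(sample_index, [(7, 0.2, 0.2), (9, 0.7, 0.9)])
```

Both sample images contain the same visual words, so a plain Bag-of-Words index would score them equally. The cell component of the key ranks `img1` first because its words appear in the same regions as in the query.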