The Power of Subsampling in Submodular Maximization

Author(s):  
Christopher Harshaw ◽  
Ehsan Kazemi ◽  
Moran Feldman ◽  
Amin Karbasi

We propose subsampling as a unified algorithmic technique for submodular maximization in centralized and online settings. The idea is simple: independently sample elements from the ground set and use simple combinatorial techniques (such as greedy or local search) on these sampled elements. We show that this approach leads to optimal/state-of-the-art results despite being much simpler than existing methods. In the usual offline setting, we present SampleGreedy, which obtains a [Formula: see text]-approximation for maximizing a submodular function subject to a p-extendible system using [Formula: see text] evaluation and feasibility queries, where k is the size of the largest feasible set. The approximation ratio improves to p + 1 and p for monotone submodular and linear objectives, respectively. In the streaming setting, we present Sample-Streaming, which obtains a [Formula: see text]-approximation for maximizing a submodular function subject to a p-matchoid using O(k) memory and [Formula: see text] evaluation and feasibility queries per element, where m is the number of matroids defining the p-matchoid. The approximation ratio improves to 4p for monotone submodular objectives. We empirically demonstrate the effectiveness of our algorithms on video summarization, location summarization, and movie recommendation tasks.
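The subsample-then-greedy idea described above can be illustrated with a minimal sketch. This is not the authors' SampleGreedy: it assumes a plain cardinality constraint (at most `k` elements) in place of a general p-extendible system, and the sampling probability `q`, the function names, and the coverage objective are all hypothetical choices made for illustration.

```python
import random

def sample_greedy(ground_set, f, k, q, seed=0):
    """Keep each element independently with probability q, then run
    plain greedy on the kept elements under the constraint |S| <= k.
    (Assumption: a cardinality constraint stands in for the
    p-extendible system of the abstract.)"""
    rng = random.Random(seed)
    sampled = [e for e in ground_set if rng.random() < q]
    solution = []
    current = f(solution)
    while len(solution) < k:
        best_gain, best_elem = 0, None
        for e in sampled:
            if e in solution:
                continue
            gain = f(solution + [e]) - current
            if gain > best_gain:  # only strictly positive gains are accepted
                best_gain, best_elem = gain, e
        if best_elem is None:  # no remaining element improves the objective
            break
        solution.append(best_elem)
        current += best_gain
    return solution

def coverage(sets):
    """Build a coverage function (monotone submodular) from a dict of sets."""
    def f(selection):
        covered = set()
        for i in selection:
            covered |= sets[i]
        return len(covered)
    return f

# Hypothetical toy instance: pick at most 2 of 4 sets to cover many items.
toy_sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1}}
toy_f = coverage(toy_sets)
print(sample_greedy(list(toy_sets), toy_f, k=2, q=1.0))
```

With `q=1.0` nothing is discarded and the sketch reduces to ordinary greedy; smaller values of `q` shrink the set that greedy has to scan, which is where the query savings come from.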

Author(s):  
Ganquan Shi ◽  
Shuyang Gu ◽  
Weili Wu

[Formula: see text]-submodular maximization is a generalization of submodular maximization in which we select [Formula: see text] disjoint subsets instead of a single subset. Motivated by its practical value and applications, we consider [Formula: see text]-submodular maximization under two kinds of constraints. For total size and individual size difference constraints, we present a [Formula: see text]-approximation algorithm for maximizing a nonnegative k-submodular function, running in time [Formula: see text] in the worst case. In particular, if [Formula: see text] is a multiple of [Formula: see text], the approximation ratio can be reduced to [Formula: see text], with running time [Formula: see text] in the worst case. Moreover, this algorithm can be applied to [Formula: see text]-bisubmodular functions, achieving a [Formula: see text]-approximation in time [Formula: see text]. Furthermore, if [Formula: see text] is a multiple of 2, the approximation ratio can be reduced to [Formula: see text], with running time [Formula: see text] in the worst case. For the individual size constraint, there is a [Formula: see text]-approximation algorithm for maximizing a nonnegative [Formula: see text]-submodular function and a nonnegative [Formula: see text]-bisubmodular function, running in time [Formula: see text] and [Formula: see text], respectively, in the worst case.


Author(s):  
Xuefeng Chen ◽  
Xin Cao ◽  
Yifeng Zeng ◽  
Yixiang Fang ◽  
Bin Yao

Region search is an important problem in location-based services due to its wide range of applications. In this paper, we study the problem of optimal region search with submodular maximization (ORS-SM). This problem treats a region as a connected subgraph. We compute an objective value over the locations in the region using a submodular function, and a budget value by summing the costs of the edges in the region, and we aim to find the region with the largest objective value subject to a budget constraint. ORS-SM supports many applications, such as the most diversified region search. We prove that the problem is NP-hard and develop two approximation algorithms with guaranteed error bounds. We conduct experiments on two applications using three real-world datasets. The results demonstrate that our algorithms can achieve high-quality solutions and are faster than a state-of-the-art method by orders of magnitude.


Author(s):  
Kai Han ◽  
Shuang Cui ◽  
Tianshuai Zhu ◽  
Enpei Zhang ◽  
Benwei Wu ◽  
...  

Data summarization, i.e., selecting representative subsets of manageable size out of massive data, is often modeled as a submodular optimization problem. Although there exist extensive algorithms for submodular optimization, many of them incur large computational overheads and hence are not suitable for mining big data. In this work, we consider the fundamental problem of (non-monotone) submodular function maximization with a knapsack constraint, and propose simple yet effective and efficient algorithms for it. Specifically, we propose a deterministic algorithm with approximation ratio 6 and a randomized algorithm with approximation ratio 4, and show that both of them can be accelerated to achieve nearly linear running time at the cost of weakening the approximation ratio by an additive factor of ε. We then consider a more restrictive setting without full access to the whole dataset, and propose streaming algorithms with approximation ratios of 8+ε and 6+ε that make one pass and two passes over the data stream, respectively. As a by-product, we also propose a two-pass streaming algorithm with an approximation ratio of 2+ε when the considered submodular function is monotone. To the best of our knowledge, our algorithms achieve the best performance bounds compared to the state-of-the-art approximation algorithms with efficient implementation for the same problem. Finally, we evaluate our algorithms in two concrete submodular data summarization applications for revenue maximization in social networks and image summarization, and the empirical results show that our algorithms outperform the existing ones in terms of both effectiveness and efficiency.


2006 ◽  
Vol 14 (2) ◽  
pp. 223-253 ◽  
Author(s):  
Frédéric Lardeux ◽  
Frédéric Saubion ◽  
Jin-Kao Hao

This paper presents GASAT, a hybrid algorithm for the satisfiability problem (SAT). The main feature of GASAT is that it includes a recombination stage based on a specific crossover and a tabu search stage. We have conducted experiments to evaluate the different components of GASAT and to compare its overall performance with state-of-the-art SAT algorithms. These experiments show that GASAT provides very competitive results.


Author(s):  
Zhicheng Liu ◽  
Hong Chang ◽  
Ran Ma ◽  
Donglei Du ◽  
Xiaoyan Zhang

We consider a two-stage submodular maximization problem subject to a cardinality constraint and k matroid constraints, where the objective function is the expected difference of a nonnegative monotone submodular function and a nonnegative monotone modular function. We give two bi-factor approximation algorithms for this problem. The first is a deterministic $\left(\frac{1}{k+1}\left(1-\frac{1}{e^{k+1}}\right), 1\right)$-approximation algorithm, and the second is a randomized $\left(\frac{1}{k+1}\left(1-\frac{1}{e^{k+1}}\right)-\varepsilon, 1\right)$-approximation algorithm with improved time efficiency.
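As a quick numerical illustration of the first factor in the deterministic guarantee above (the values of k are chosen only as examples, and the figures are rounded to three decimals):

$$k = 1:\ \frac{1}{2}\left(1 - \frac{1}{e^{2}}\right) \approx 0.432, \qquad k = 2:\ \frac{1}{3}\left(1 - \frac{1}{e^{3}}\right) \approx 0.317.$$

The factor shrinks roughly like $\frac{1}{k+1}$ as the number of matroid constraints grows, since the term $1 - e^{-(k+1)}$ quickly approaches 1.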


Author(s):  
JUNAID BABER ◽  
NITIN AFZULPURKAR ◽  
SHIN'ICHI SATOH

The rapid growth of video databases has pushed the industry toward efficient and effective frameworks for video retrieval and indexing. Video segmentation into scenes is widely used for video summarization, partitioning, indexing and retrieval. In this paper, we propose a framework for scene detection based mainly on entropy and Speeded Up Robust Features (SURF). First, we detect fade and abrupt boundaries based on frame entropy analysis and SURF feature matching. Fade boundaries are a good indication of scenes beginning or ending in many videos and dramas, and are detected by frame entropy analysis. Before abrupt boundary detection, frames that are obviously not abrupt boundaries, such as blank screens, images affected by high intensity, and sliding credits, are removed. Candidate boundaries are detected first so that SURF features can be applied efficiently, and SURF features between candidate boundaries and their adjacent frames are then used to detect the abrupt boundaries. Second, key frames are extracted from abrupt shots. We compare our key frame extraction with other well-known algorithms and show the effectiveness of the key frames. Finally, scene boundaries are detected using a sliding window of size K over the key frames in temporal order. In experimental evaluation on the TRECVID-2007 shot boundary test set, the shot boundary detection algorithm achieves substantial improvements over state-of-the-art methods, with a precision of 99% and a recall of 97.8%. Experimental results for video segmentation into scenes are also promising compared to well-known state-of-the-art techniques.
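The abstract does not say how frame entropy is computed. A minimal sketch, assuming the Shannon entropy of a frame's grayscale intensity histogram (the function name and the flat list-of-pixels input are hypothetical), might look like this:

```python
import math
from collections import Counter

def frame_entropy(pixels):
    """Shannon entropy, in bits, of the intensity histogram of one frame.
    `pixels` is a flat sequence of grayscale values (an assumed input
    format, not one taken from the paper)."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A blank frame has zero entropy; a frame split evenly between two
# intensities has exactly one bit.
blank_frame = [0] * 16
two_tone_frame = [0, 255] * 8
print(frame_entropy(blank_frame), frame_entropy(two_tone_frame))
```

Under this assumption, near-uniform frames such as blank screens score close to zero, which is one plausible way entropy could flag fades and other frames that carry little visual information.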


2013 ◽  
Vol 46 ◽  
pp. 687-716 ◽  
Author(s):  
S. Cai ◽  
K. Su ◽  
C. Luo ◽  
A. Sattar

The Minimum Vertex Cover (MVC) problem is a prominent NP-hard combinatorial optimization problem of great importance in both theory and application. Local search has proved successful for this problem. However, there are two main drawbacks in state-of-the-art MVC local search algorithms. First, they select a pair of vertices to exchange simultaneously, which is time-consuming. Second, although they use edge weighting techniques to diversify the search, these algorithms lack mechanisms for decreasing the weights. To address these issues, we propose two new strategies: two-stage exchange and edge weighting with forgetting. The two-stage exchange strategy selects the two vertices to exchange separately and performs the exchange in two stages. The strategy of edge weighting with forgetting not only increases the weights of uncovered edges, but also periodically decreases some weight for each edge. These two strategies are used in designing a new MVC local search algorithm, which is referred to as NuMVC. We conduct extensive experimental studies on the standard benchmarks, namely DIMACS and BHOSLIB. Experiments comparing NuMVC with state-of-the-art heuristic algorithms show that NuMVC is at least competitive with its nearest competitor, PLS, on the DIMACS benchmark, and clearly dominates all competitors on the BHOSLIB benchmark. In addition, experimental results indicate that NuMVC finds an optimal solution much faster than the current best exact algorithm for Maximum Clique on random instances as well as on some structured ones. Moreover, we study the effectiveness of the two strategies and the run-time behaviour through experimental analysis.
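The edge weighting with forgetting strategy can be sketched roughly as follows. The abstract only states that weights of uncovered edges increase and that weights are periodically decreased; the trigger used here (mean weight exceeding `threshold`) and the decay rule (multiply by `rho` and round down) are assumptions made for illustration, as are all names.

```python
import math

def update_edge_weights(weights, cover, threshold, rho):
    """One weighting step on a dict mapping edges (u, v) to integer weights.
    1. Add 1 to every edge not covered by the candidate vertex cover.
    2. Forgetting (assumed rule): if the mean weight exceeds `threshold`,
       scale every weight by `rho` and round down."""
    for (u, v) in weights:
        if u not in cover and v not in cover:
            weights[(u, v)] += 1
    mean_weight = sum(weights.values()) / len(weights)
    if mean_weight > threshold:
        for edge in weights:
            weights[edge] = math.floor(rho * weights[edge])
    return weights

# Hypothetical path graph 0-1-2-3 with candidate cover {1}: edge (2, 3) is uncovered.
w = {(0, 1): 1, (1, 2): 1, (2, 3): 1}
print(update_edge_weights(w, cover={1}, threshold=10, rho=0.5))
```

Without the forgetting step, weights accumulated early in the search would keep steering vertex selection long after they stopped being informative; decaying them lets recent uncovered edges dominate.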

