search space pruning
Recently Published Documents


TOTAL DOCUMENTS

34
(FIVE YEARS 8)

H-INDEX

5
(FIVE YEARS 2)

2022 ◽  
Vol 13 (1) ◽  
pp. 1-22
Author(s):  
M. Saqib Nawaz ◽  
Philippe Fournier-Viger ◽  
Unil Yun ◽  
Youxi Wu ◽  
Wei Song

High utility itemset mining (HUIM) is the task of finding all sets of items, purchased together, that generate a high profit in a transaction database. In the past, several algorithms have been developed to mine high utility itemsets (HUIs). However, most of them cannot properly handle the exponential search space for finding HUIs as the size of the database and the total number of items increase. Recently, evolutionary and heuristic algorithms were designed to mine HUIs, providing considerable performance improvements. However, they can still have long runtimes, and some may miss many HUIs. To address this problem, this article proposes two algorithms for HUIM based on Hill Climbing (HUIM-HC) and Simulated Annealing (HUIM-SA). Both algorithms transform the input database into a bitmap for efficient utility computation and search space pruning. To improve population diversity, HUIs discovered by evolution are used as target values for the next population, instead of carrying the current optimal values over to the next population. Through experiments on real-life datasets, it was found that the proposed algorithms are faster than state-of-the-art heuristic and evolutionary HUIM algorithms, that HUIM-SA discovers a similar set of HUIs, and that the runtime of HUIM-SA scales linearly with the number of iterations.
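
A minimal sketch of the general idea, combining a bitmap-indexed utility computation with simulated annealing over itemsets. The toy database, the single-bit-flip neighborhood, and the cooling schedule are illustrative assumptions, not the authors' HUIM-SA implementation.

```python
import math
import random

# Toy transaction database: each transaction maps items to their utility
# (e.g., quantity * unit profit). Illustrative data, not from the paper.
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 4, "c": 3},
    {"b": 6, "c": 2, "d": 8},
    {"a": 5, "d": 7},
]
ITEMS = sorted({i for t in TRANSACTIONS for i in t})

# Bitmap index: bit j of item_bitmap[i] is set iff transaction j contains item i.
item_bitmap = {
    i: sum(1 << j for j, t in enumerate(TRANSACTIONS) if i in t) for i in ITEMS
}

def utility(itemset):
    """Total utility of an itemset over transactions containing all its items."""
    if not itemset:
        return 0
    cover = -1  # all bits set
    for i in itemset:
        cover &= item_bitmap[i]  # bitmap intersection: supporting transactions
    return sum(
        sum(TRANSACTIONS[j][i] for i in itemset)
        for j in range(len(TRANSACTIONS))
        if cover >> j & 1
    )

def neighbor(itemset):
    """Flip membership of one random item to get a neighboring itemset."""
    s = set(itemset)
    s.symmetric_difference_update({random.choice(ITEMS)})
    return frozenset(s)

def anneal(iters=1000, t0=10.0, cooling=0.995):
    """Simulated annealing: accept worse itemsets with temperature-dependent
    probability, tracking the best itemset seen."""
    cur = frozenset(random.sample(ITEMS, 1))
    best, t = cur, t0
    for _ in range(iters):
        cand = neighbor(cur)
        delta = utility(cand) - utility(cur)
        if delta >= 0 or random.random() < math.exp(delta / t):
            cur = cand
        if utility(cur) > utility(best):
            best = cur
        t *= cooling
    return best, utility(best)

print(anneal())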


2021 ◽  
Author(s):  
Xinping Wang ◽  
Liangyu Chen ◽  
Tong Wang ◽  
Mingang Chen ◽  
Min Zhang

2021 ◽  
Vol 7 (3) ◽  
pp. 1-33
Author(s):  
Joachim Gudmundsson ◽  
Michael Horton ◽  
John Pfeifer ◽  
Martin P. Seybold

We present a scalable approach for range and k nearest neighbor queries under computationally expensive metrics, such as the continuous Fréchet distance on trajectory data. Based on clustering for metric indexes, we obtain a dynamic tree structure whose size is linear in the number of trajectories, regardless of the trajectories' individual sizes or the spatial dimension, which allows one to exploit the low "intrinsic dimensionality" of datasets for effective search space pruning. Since the distance computation is expensive, generic metric indexing methods are rendered impractical. We present strategies that (i) improve on known upper and lower bound computations, (ii) build cluster trees with no or very few distance calls, and (iii) search using bounds for metric pruning, interval orderings for reduction, and randomized pivoting for reporting the final results. We analyze the efficiency and effectiveness of our methods with extensive experiments on diverse synthetic and real-world datasets. The results show improvements over state-of-the-art methods for exact queries, and even further speedups are achieved for queries that may return approximate results. Surprisingly, the majority of exact nearest-neighbor queries on real datasets are answered without any distance computations.
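
A minimal sketch of the filter-and-refine pattern this relies on: a cheap lower bound skips the expensive metric whenever it already exceeds the query radius. The endpoint-gap bound and the discrete (rather than continuous) Fréchet distance used here are illustrative simplifications, not the paper's bound computations.

```python
from functools import lru_cache
from math import dist

def discrete_frechet(p, q):
    """Expensive metric: discrete Fréchet distance via dynamic programming."""
    @lru_cache(maxsize=None)
    def c(i, j):
        d = dist(p[i], q[j])
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
    return c(len(p) - 1, len(q) - 1)

def lower_bound(p, q):
    """Cheap lower bound: every coupling matches both start and end points,
    so their gaps never exceed the Fréchet distance."""
    return max(dist(p[0], q[0]), dist(p[-1], q[-1]))

def range_query(query, dataset, radius):
    """Report trajectories within `radius`, skipping the expensive metric
    whenever the cheap lower bound already exceeds the radius."""
    result, calls = [], 0
    for traj in dataset:
        if lower_bound(query, traj) > radius:
            continue  # pruned without an expensive distance computation
        calls += 1
        if discrete_frechet(query, traj) <= radius:
            result.append(traj)
    return result, calls

# Toy 2-D trajectories as lists of points; the far trajectory is pruned
# by the lower bound alone.
data = [[(0, 0), (1, 0), (2, 0)], [(0, 5), (1, 5)], [(0, 0.2), (2, 0.1)]]
hits, calls = range_query([(0, 0), (2, 0)], data, radius=0.5)
print(len(hits), "hits with", calls, "expensive calls")
```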


Author(s):  
Aman Abidi ◽  
Rui Zhou ◽  
Lu Chen ◽  
Chengfei Liu

Enumerating maximal bicliques in a bipartite graph is an important problem in data mining, with innumerable real-world applications across different domains such as web communities, bioinformatics, etc. Although substantial research has been conducted on this problem, surprisingly, we find that pivot-based search space pruning, which is quite effective in clique enumeration, has not been exploited in the biclique scenario. Therefore, in this paper, we explore pivot-based pruning for biclique enumeration. We propose an algorithm implementing the pivot-based pruning, powered by an effective index structure, the Containment Directed Acyclic Graph (CDAG). Meanwhile, the existing literature reports contradictory findings on the order of vertex selection in biclique enumeration. As such, we re-examine the problem and suggest an offline ordering of vertices that expedites the pivot pruning. We conduct an extensive performance study using real-world datasets from a wide range of domains. The experimental results demonstrate that our algorithm is more scalable and outperforms all existing algorithms across all datasets, achieving significant speedups over them.
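
To illustrate the pruning idea the paper transfers to bicliques, here is the classic setting it cites as effective: pivoted Bron-Kerbosch enumeration of maximal cliques. This is a standard-algorithm sketch on a toy graph, not the authors' CDAG-based biclique algorithm.

```python
def bron_kerbosch_pivot(R, P, X, adj, out):
    """Enumerate maximal cliques extending R, with candidates P, excluded X."""
    if not P and not X:
        out.append(set(R))
        return
    # Pivot: pick u maximizing |P ∩ N(u)|; only vertices outside N(u)
    # need to be branched on, which prunes the search tree.
    u = max(P | X, key=lambda v: len(P & adj[v]))
    for v in list(P - adj[u]):
        bron_kerbosch_pivot(R | {v}, P & adj[v], X & adj[v], adj, out)
        P.remove(v)
        X.add(v)

# Toy graph as an adjacency-set dict.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4}}
cliques = []
bron_kerbosch_pivot(set(), set(adj), set(), adj, cliques)
print(cliques)  # maximal cliques: {1, 2, 3}, {2, 3, 4}, {4, 5}
```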


Author(s):  
Adam Wieckowski ◽  
Jackie Ma ◽  
Heiko Schwarz ◽  
Detlev Marpe ◽  
Thomas Wiegand

2019 ◽  
Vol 252 ◽  
pp. 03005
Author(s):  
Radosław Klimek ◽  
Katarzyna Grobler-Dębska ◽  
Edyta Kucharska

The satisfiability problem (SAT) is one of the classical and most important problems of theoretical computer science, with a direct bearing on numerous practical cases. It is one of the most prominent problems in artificial intelligence and has important applications in many fields, such as hardware and software verification, test-case generation, AI planning, scheduling, and data structures that allow efficient implementation of search space pruning. In recent years, there has been huge development in SAT solvers, especially those based on conflict-driven clause learning (CDCL), for propositional logic formulas. The goal of this paper is to design and implement a simple but effective system for the random generation of long and complex logical formulas with a variety of difficulties encoded inside. The resulting logical formulas, i.e., problem instances, can be used for testing existing SAT solvers. The entire system is made widely available as a web application in a client-server architecture. The proposed system enables the generation of syntactically correct logical formulas with a random structure, encoded in a manner understandable to SAT solvers. Logical formulas can be presented in different formats. A number of parameters affect the form of the generated instances, their complexity, and their physical dimensions. A randomness factor can be applied to every generated formula. The developed application is easy to modify and open to further extensions. The final part of the paper describes example tests of solvers on logical formulas produced by the implemented generator.
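
A minimal sketch of the kind of instance such a generator emits: a uniform random k-CNF formula in the DIMACS format that SAT solvers read. The parameter names and the fixed clause/variable ratio are illustrative assumptions, not the paper's actual generator options.

```python
import random

def random_kcnf(n_vars, n_clauses, k=3, seed=None):
    """Emit a uniform random k-CNF formula as a DIMACS CNF string."""
    rng = random.Random(seed)
    lines = [f"p cnf {n_vars} {n_clauses}"]
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)  # k distinct variables
        lits = [v if rng.random() < 0.5 else -v for v in variables]
        lines.append(" ".join(map(str, lits)) + " 0")    # clauses end with 0
    return "\n".join(lines)

# Near the 3-SAT phase transition (roughly 4.26 clauses per variable),
# random instances tend to be hardest for solvers.
print(random_kcnf(n_vars=50, n_clauses=213, seed=42))
```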


2018 ◽  
Vol 62 ◽  
pp. 665-727 ◽  
Author(s):  
Thomas Eiter ◽  
Tobias Kaminski ◽  
Christoph Redl ◽  
Antonius Weinzierl

Answer Set Programming (ASP) is a well-known declarative problem solving approach based on nonmonotonic logic programs, which has been successfully applied to a wide range of applications in artificial intelligence and beyond. To address the needs of modern applications, HEX-programs were introduced as an extension of ASP with external atoms for accessing information outside programs via an API-style bi-directional interface mechanism. To evaluate such programs, conflict-driven learning algorithms for SAT and ASP solving have been extended to capture the semantics of external atoms. However, a drawback of the state-of-the-art approach is that external atoms are evaluated only under complete assignments (i.e., complete input to the external source), while in practice their values can often be determined from partial assignments alone (i.e., from incomplete input to the external source). This prevents early backtracking in case of conflicts and hinders more efficient evaluation of HEX-programs. We thus extend the notion of external atoms to allow for three-valued evaluation under partial assignments, while the two-valued semantics of the overall HEX-formalism remains unchanged. This paves the way for three enhancements: first, to evaluate external sources at any point during model search, which can trigger learning knowledge about the source behavior and/or early backtracking in the spirit of theory propagation in SAT modulo theories (SMT). Second, to optimize the knowledge learned in terms of so-called nogoods, which, roughly speaking, are impossible input-output configurations. Shrinking nogoods to their relevant input parts leads to more effective search space pruning. And third, to make a necessary minimality check of candidate answer sets more efficient by exploiting early external evaluation calls. As this check usually accounts for a large share of the total runtime, optimization is particularly important here. We further present an experimental evaluation of an implementation of a novel HEX-algorithm that incorporates these enhancements, using a benchmark suite. Our results demonstrate a clear efficiency gain over the state-of-the-art HEX-solver on the benchmarks, and provide insights regarding the most effective combinations of solver configurations.
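
A minimal sketch of the core mechanism, three-valued evaluation of an external atom under a partial assignment: the atom's value is reported as soon as the assigned inputs decide it, and as unknown otherwise. The external source here (a simple "some input atom is true" test) and the assignment encoding are illustrative assumptions, not the dlvhex API.

```python
from typing import Dict, Optional

def eval_external_or(inputs, assignment: Dict[str, Optional[bool]]) -> Optional[bool]:
    """Three-valued OR over input atoms: True/False if the partial
    assignment already decides the value, None (unknown) otherwise."""
    if any(assignment.get(a) is True for a in inputs):
        return True   # decided early: one input is already true
    if all(assignment.get(a) is False for a in inputs):
        return False  # decided early: all inputs are already false
    return None       # still unknown; cannot propagate yet

# Partial assignment: 'p' is true, 'q' is still unassigned.
partial = {"p": True, "q": None}
print(eval_external_or(["p", "q"], partial))  # True, before 'q' is assigned

# A solver can now propagate the external atom's value, or learn a nogood
# relating the relevant input part {'p': True} to the output, without
# waiting for a complete assignment.
```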

