Journal of Experimental Algorithmics

Engineering Practical Lempel-Ziv Tries

Journal of Experimental Algorithmics ◽

10.1145/3481638 ◽

2021 ◽

Vol 26 (1) ◽

pp. 1-47

Author(s):

Diego Arroyuelo ◽

Rodrigo Cánovas ◽

Johannes Fischer ◽

Dominik Köppl ◽

Marvin Löbel ◽

...

Keyword(s):

Factor Structure ◽

Data Structures ◽

Independent Interest ◽

New Techniques ◽

Memory Space ◽

Compressed Data Structures ◽

Regular Factor ◽

Factorization Algorithms ◽

Compressed Data

The Lempel-Ziv 78 (LZ78) and Lempel-Ziv-Welch (LZW) text factorizations are popular, not only for bare compression but also for building compressed data structures on top of them. Their regular factor structure makes them computable within space bounded by the compressed output size. In this article, we carry out the first thorough study of low-memory LZ78 and LZW text factorization algorithms, introducing more efficient alternatives to the classical methods, as well as new techniques that can run within less memory space than is necessary to hold the compressed file. Our results build on hash-based representations of tries that may be of independent interest.
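For orientation, the classical LZ78 factorization that the abstract refers to can be sketched with a trie stored in a hash table. This is only a minimal illustration of the textbook method, using a plain Python dict keyed by (node, character); it is not the compact hash-based trie representation studied in the article, and the function names are hypothetical.

```python
def lz78_factorize(text):
    """Return the LZ78 factorization as (referred_factor, char) pairs.

    Factor index 0 denotes the empty factor (the trie root).
    """
    trie = {}      # (parent_node, char) -> child_node: a hash-based trie
    factors = []
    node = 0       # current trie node; 0 is the root
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            node = child                          # keep extending the factor
        else:
            trie[(node, ch)] = len(factors) + 1   # new node id = new factor id
            factors.append((node, ch))
            node = 0                              # restart at the root
    if node != 0:
        # The text ended inside an existing factor: emit it without a new char.
        factors.append((node, ""))
    return factors


def lz78_decode(factors):
    """Rebuild the text from (referred_factor, char) pairs."""
    phrases = [""]
    for ref, ch in factors:
        phrases.append(phrases[ref] + ch)
    return "".join(phrases[1:])


if __name__ == "__main__":
    pairs = lz78_factorize("abaabab")
    print(pairs)
    print(lz78_decode(pairs))
```

Each factor extends an earlier factor by one character, which is why the trie has exactly one node per factor and its size is bounded by the number of factors.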

Faster Support Vector Machines

Journal of Experimental Algorithmics ◽

10.1145/3484730 ◽

2021 ◽

Vol 26 (1) ◽

pp. 1-21

Author(s):

Sebastian Schlag ◽

Matthias Schmitt ◽

Christian Schulz

Keyword(s):

Support Vector Machines ◽

Time Complexity ◽

Original Problem ◽

Label Propagation ◽

Support Vector ◽

Propagation Algorithm ◽

Svm Model ◽

Vector Machines ◽

Data Points ◽

Classification Quality

The time complexity of support vector machines (SVMs) prohibits training on huge datasets with millions of data points. Recently, multilevel approaches to train SVMs have been developed to allow for time-efficient training on huge datasets. While regular SVMs perform the entire training in one time-consuming optimization step, multilevel SVMs first build a hierarchy of problems decreasing in size that resemble the original problem and then train an SVM model for each hierarchy level, benefiting from the solved models of previous levels. We present a faster multilevel support vector machine that uses a label propagation algorithm to construct the problem hierarchy. Extensive experiments indicate that our approach is up to orders of magnitude faster than the previous fastest algorithm while having comparable classification quality. For example, one of our sequential solvers alone is on average a factor of 15 faster than the parallel ThunderSVM algorithm, while having similar classification quality.

The Model Counting Competition 2020

Journal of Experimental Algorithmics ◽

10.1145/3459080 ◽

2021 ◽

Vol 26 (1) ◽

pp. 1-26

Author(s):

Johannes K. Fichte ◽

Markus Hecher ◽

Florim Hamiti

Keyword(s):

Practical Problem ◽

Probabilistic Reasoning ◽

Modern Society ◽

Boolean Formula ◽

Counting Problem ◽

The Past ◽

Boolean Formulas ◽

Model Counting ◽

Weighted Model ◽

New Applications

Many computational problems in modern society amount to probabilistic reasoning, statistics, and combinatorics. A variety of these real-world questions can be solved by representing the question in (Boolean) formulas and associating the number of models of the formula directly with the answer to the question. Since there has been an increasing interest in practical problem solving for model counting over the past years, the Model Counting Competition was conceived in fall 2019. The competition aims to foster applications, identify new challenging benchmarks, and promote new solvers and improve established solvers for the model counting problem and versions thereof. We hope that the results can be a good indicator of the current feasibility of model counting and spark many new applications. In this article, we report on the details of the Model Counting Competition 2020, how it was carried out, and its results. The competition encompassed three versions of the model counting problem, which we evaluated in separate tracks. The first track featured the model counting problem, which asks for the number of models of a given Boolean formula. On the second track, we challenged developers to submit programs that solve the weighted model counting problem. The last track was dedicated to projected model counting. In total, we received a surprising number of nine solvers in 34 versions from eight groups.
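To make the first two tracks concrete, the following is a small brute-force sketch of model counting and weighted model counting for a formula in conjunctive normal form. It assumes DIMACS-style integer literals (a positive integer for a variable, a negative one for its negation) and a hypothetical `weight` mapping from literals to numbers; it enumerates all assignments and is meant only as a definition in code, not as a competitive solver.

```python
from itertools import product


def satisfies(bits, clauses):
    """True if the assignment `bits` (tuple of bools) satisfies every clause."""
    return all(
        any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def count_models(num_vars, clauses):
    """Number of satisfying assignments over variables 1..num_vars."""
    return sum(
        1
        for bits in product([False, True], repeat=num_vars)
        if satisfies(bits, clauses)
    )


def weighted_count(num_vars, clauses, weight):
    """Sum over all models of the product of the weights of their literals."""
    total = 0.0
    for bits in product([False, True], repeat=num_vars):
        if satisfies(bits, clauses):
            w = 1.0
            for var, value in enumerate(bits, start=1):
                w *= weight[var if value else -var]
            total += w
    return total


if __name__ == "__main__":
    # (x1 or x2) and (not x1 or x3)
    cnf = [[1, 2], [-1, 3]]
    print(count_models(3, cnf))
    uniform = {lit: 0.5 for v in (1, 2, 3) for lit in (v, -v)}
    print(weighted_count(3, cnf, uniform))
```

With every literal weighted 0.5, the weighted count is the probability that a uniformly random assignment satisfies the formula, which illustrates the link to probabilistic reasoning mentioned in the abstract.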

An Updated Experimental Evaluation of Graph Bipartization Methods

Journal of Experimental Algorithmics ◽

10.1145/3467968 ◽

2021 ◽

Vol 26 (1) ◽

pp. 1-24

Author(s):

Timothy D. Goodrich ◽

Eric Horton ◽

Blair D. Sullivan

Keyword(s):

Optimization Problems ◽

State Of The Art ◽

Vertex Cover ◽

Quantum Annealing ◽

Heuristic Solution ◽

Quadratic Unconstrained Binary Optimization ◽

Odd Cycle ◽

Iterative Compression ◽

Hard Problems ◽

Near Term

We experimentally evaluate the practical state-of-the-art in graph bipartization (Odd Cycle Transversal (OCT)), motivated by the need for good algorithms for embedding problems into near-term quantum computing hardware. We assemble a preprocessing suite of fast input reduction routines from the OCT and Vertex Cover (VC) literature and compare algorithm implementations using Quadratic Unconstrained Binary Optimization problems from the quantum literature. We also generate a corpus of frustrated cluster loop graphs, which have previously been used to benchmark quantum annealing hardware. The diversity of these graphs leads to harder OCT instances than in existing benchmarks. In addition to combinatorial branching algorithms for solving OCT directly, we study various reformulations into other NP-hard problems such as VC and Integer Linear Programming (ILP), enabling the use of solvers such as CPLEX. We find that for heuristic solutions with time constraints under a second, iterative compression routines jump-started with a heuristic solution perform best, after which point using a highly tuned solver like CPLEX is worthwhile. Results on exact solvers are split between using ILP formulations on CPLEX and solving VC formulations with a branch-and-reduce solver. We extend our results with a large corpus of synthetic graphs, establishing robustness and potential to generalize to other domain data. In total, over 8,000 graph instances are evaluated, compared to the previous canonical corpus of 100 graphs. Finally, we provide all code and data in an open source suite, including a Python API for accessing reduction routines and branching algorithms, along with scripts for fully replicating our results.
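As a reminder of what the OCT problem asks, the sketch below finds a smallest set of vertices whose removal leaves a bipartite graph by trying all subsets in order of increasing size. It is an exponential-time illustration of the problem definition only; it is not one of the evaluated algorithms (iterative compression, ILP, or VC reformulations), and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def is_bipartite(vertices, edges):
    """2-color the subgraph induced by `vertices` using breadth-first search."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        if u in adj and v in adj:      # ignore edges touching removed vertices
            adj[u].append(v)
            adj[v].append(u)
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False       # odd cycle found
    return True


def min_odd_cycle_transversal(vertices, edges):
    """Smallest vertex set whose removal makes the graph bipartite."""
    vertices = list(vertices)
    for k in range(len(vertices) + 1):
        for removed in combinations(vertices, k):
            rest = [v for v in vertices if v not in removed]
            if is_bipartite(rest, edges):
                return set(removed)


if __name__ == "__main__":
    k4 = [(a, b) for a, b in combinations(range(4), 2)]
    print(min_odd_cycle_transversal(range(4), k4))
```

The complete graph on four vertices needs two removals, since any three remaining vertices still form a triangle.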

Practical Wavelet Tree Construction

Journal of Experimental Algorithmics ◽

10.1145/3457197 ◽

2021 ◽

Vol 26 ◽

pp. 1-67

Author(s):

Patrick Dinklage ◽

Jonas Ellert ◽

Johannes Fischer ◽

Florian Kurpicz ◽

Marvin Löbel

Keyword(s):

Parallel Algorithms ◽

Shared Memory ◽

Distributed Memory ◽

Auxiliary Information ◽

Parallel Computers ◽

External Memory ◽

Sequential Algorithms ◽

Bottom Up ◽

Memory Efficiency ◽

Tree Construction

We present new sequential and parallel algorithms for wavelet tree construction based on a new bottom-up technique. This technique uses the structure of wavelet trees (the characters represented in a node are refined with increasing depth) in the opposite direction, by first computing the leaves (most refined) and then propagating this information upwards to the root of the tree. We first describe new sequential algorithms, both in RAM and external memory. Based on these results, we adapt these algorithms to parallel computers, where we address both shared memory and distributed memory settings. In practice, all our algorithms outperform previous ones in both time and memory efficiency, because we can compute all auxiliary information solely based on the information we obtained from computing the leaves. Most of our algorithms are also adapted to the wavelet matrix, a variant that is particularly suited for large alphabets.

Dynamic Windows Scheduling with Reallocation

Journal of Experimental Algorithmics ◽

10.1145/3462208 ◽

2021 ◽

Vol 26 ◽

pp. 1-19

Author(s):

Martín Farach-Colton ◽

Katia Leal ◽

Miguel A. Mosteiro ◽

Christopher Thraves Caro

Keyword(s):

Supply Chain ◽

Bin Packing ◽

Time Slot ◽

Peak Load ◽

Communication Channels ◽

Current Load ◽

Constant Amount ◽

Unit Fractions ◽

Consecutive Time ◽

Multiple Variants

We consider the Windows Scheduling (WS) problem, which is a restricted version of Unit-Fractions Bin Packing, and it is also called Inventory Replenishment in the context of Supply Chain. In brief, the WS problem is to schedule the use of communication channels to clients. Each client c_i is characterized by an active cycle and a window w_i. During the period of time that any given client c_i is active, there must be at least one transmission from c_i scheduled in any w_i consecutive time slots, but at most one transmission can be carried out in each channel per time slot. The goal is to minimize the number of channels used. We extend previous online models, where decisions are permanent, assuming that clients may be reallocated at some cost. We assume that such cost is a constant amount paid per reallocation. That is, we also aim to minimize the number of reallocations. We present three online reallocation algorithms for Windows Scheduling. We experimentally evaluate multiple variants of these protocols, showing that, in practice, all three achieve constant amortized reallocations with close to optimal channel usage. Our simulations also expose interesting tradeoffs between reallocations and channel usage. We introduce a new objective function for WS with reallocations that can also be applied to models where reallocations are not possible. We analyze this metric for one of the algorithms that, to the best of our knowledge, is the first online WS protocol with theoretical guarantees that applies to scenarios where clients may leave and the analysis is against current load rather than peak load. Using previous results, we also observe bounds on channel usage for one of the algorithms.
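The window constraint stated above can be checked mechanically. The sketch below validates a finite schedule under simplifying assumptions that are not part of the article: every client is active during the whole horizon, a schedule is given as one list per channel with a client identifier or `None` in each time slot, and the function name is hypothetical. The "at most one transmission per channel per slot" rule is enforced by the representation itself.

```python
def is_valid_schedule(schedule, windows):
    """Check that each client transmits in every run of w_i consecutive slots.

    schedule[ch][t] is the client transmitting on channel ch in slot t, or None.
    windows[client] is that client's window w_i.
    Assumes at least one channel and that all clients are active throughout.
    """
    horizon = len(schedule[0])
    for client, w in windows.items():
        slots = sorted(
            t
            for channel in schedule
            for t, c in enumerate(channel)
            if c == client
        )
        prev = -1                       # sentinel just before the first slot
        for t in slots + [horizon]:     # sentinel just after the last slot
            if t - prev > w:            # w or more empty slots in a row
                return False
            prev = t
    return True


if __name__ == "__main__":
    windows = {"a": 2, "b": 4, "c": 4}
    one_channel = [["a", "b", "a", "c", "a", "b", "a", "c"]]
    print(is_valid_schedule(one_channel, windows))
```

In this example the windows 2, 4, and 4 correspond to the unit fractions 1/2 + 1/4 + 1/4 = 1, so a single channel is fully used, which hints at the connection to Unit-Fractions Bin Packing.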

ELRUNA

Journal of Experimental Algorithmics ◽

10.1145/3450703 ◽

2021 ◽

Vol 26 ◽

pp. 1-32

Author(s):

Zirou Qiu ◽

Ruslan Shaydulin ◽

Xiaoyuan Liu ◽

Yuri Alexeev ◽

Christopher S. Henry ◽

...

Keyword(s):

Local Search ◽

Network Alignment ◽

Alignment Algorithm ◽

The Novel ◽

Alignment Problem ◽

Network Similarity ◽

Objective Value ◽

High Level ◽

Complex Phenomena ◽

Entire Network

Networks model a variety of complex phenomena across different domains. In many applications, one of the most essential tasks is to align two or more networks to infer the similarities between cross-network vertices and to discover potential node-level correspondence. In this article, we propose ELRUNA (elimination rule-based network alignment), a novel network alignment algorithm that relies exclusively on the underlying graph structure. Under the guidance of the elimination rules that we defined, ELRUNA computes the similarity between a pair of cross-network vertices iteratively by accumulating the similarities between their selected neighbors. The resulting cross-network similarity matrix is then used to infer a permutation matrix that encodes the final alignment of cross-network vertices. In addition to the novel alignment algorithm, we improve the performance of local search, a commonly used postprocessing step for solving the network alignment problem, by introducing a novel selection method RAWSEM (random-walk-based selection method) based on the propagation of vertices’ mismatching across the networks. The key idea is to pass on the initial levels of mismatching of vertices throughout the entire network in a random-walk fashion. Through extensive numerical experiments on real networks, we demonstrate that ELRUNA significantly outperforms the state-of-the-art alignment methods in terms of alignment accuracy under lower or comparable running time. Moreover, ELRUNA is robust to network perturbations such that it can maintain a close-to-optimal objective value under a high level of noise added to the original networks. Finally, the proposed RAWSEM can further improve the alignment quality with a smaller number of iterations compared with the naive local search method. Reproducibility: The source code and data are available at https://tinyurl.com/uwn35an.

Reverse-Safe Text Indexing

Journal of Experimental Algorithmics ◽

10.1145/3461698 ◽

2021 ◽

Vol 26 ◽

pp. 1-26

Author(s):

Giulia Bernardini ◽

Huiping Chen ◽

Gabriele Fici ◽

Grigorios Loukides ◽

Solon P. Pissis

Keyword(s):

Data Structure ◽

Pattern Matching ◽

Data Structures ◽

Matrix Multiplication ◽

Text Indexing ◽

Original Dataset ◽

Main Challenge ◽

The Matrix ◽

Preliminary Version ◽

Utility Loss

We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure (z-RSDS) that has size O(n) and answers decision and counting pattern matching queries of length at most d optimally, where d is maximal for any such z-RSDS. The construction algorithm takes O(n^ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n^ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We also show that plugging our method in data analysis applications gives insignificant or no data utility loss. Furthermore, we show how our technique can be extended to support applications under realistic adversary models. Finally, we show a z-RSDS for decision pattern matching queries, whose size can be sublinear in n. A preliminary version of this article appeared in ALENEX 2020.

HyperBench

Journal of Experimental Algorithmics ◽

10.1145/3440015 ◽

2021 ◽

Vol 26 ◽

pp. 1-40

Author(s):

Wolfgang Fischl ◽

Georg Gottlob ◽

Davide Mario Longo ◽

Reinhard Pichler

Keyword(s):

Constraint Satisfaction ◽

Decomposition Methods ◽

Constraint Satisfaction Problems ◽

Large Set ◽

Web Interface ◽

Conjunctive Queries ◽

Practical Algorithms ◽

New Infrastructure

To cope with the intractability of answering Conjunctive Queries (CQs) and solving Constraint Satisfaction Problems (CSPs), several notions of hypergraph decompositions have been proposed, giving rise to different notions of width, notably plain, generalized, and fractional hypertree width (hw, ghw, and fhw). Given the increasing interest in using such decomposition methods in practice, a publicly accessible repository of decomposition software, as well as a large set of benchmarks, and a web-accessible workbench for inserting, analyzing, and retrieving hypergraphs are called for. We address this need by providing (i) concrete implementations of hypergraph decompositions (including new practical algorithms), (ii) a new, comprehensive benchmark of hypergraphs stemming from disparate CQ and CSP collections, and (iii) HyperBench, our new web-interface for accessing the benchmark and the results of our analyses. In addition, we describe a number of actual experiments we carried out with this new infrastructure.

Quantum Annealing versus Digital Computing

Journal of Experimental Algorithmics ◽

10.1145/3459606 ◽

2021 ◽

Vol 26 ◽

pp. 1-30

Author(s):

Michael Jünger ◽

Elisabeth Lobe ◽

Petra Mutzel ◽

Gerhard Reinelt ◽

Franz Rendl ◽

...

Keyword(s):

Ising Models ◽

Branch And Cut ◽

Processing Unit ◽

Quantum Annealing ◽

Maximum Cut ◽

Quadratic Unconstrained Binary Optimization ◽

Cut Problems ◽

Programming Algorithms ◽

D Wave

Quantum annealing is getting increasing attention in combinatorial optimization. The quantum processing unit by D-Wave is constructed to approximately solve Ising models on so-called Chimera graphs. Ising models are equivalent to quadratic unconstrained binary optimization (QUBO) problems and maximum cut problems on the associated graphs. We have tailored branch-and-cut as well as semidefinite programming algorithms for solving Ising models for Chimera graphs to provable optimality and use the strength of these approaches for comparing our solution values to those obtained on the current quantum annealing machine, D-Wave 2000Q. This allows for the assessment of the quality of solutions produced by the D-Wave hardware. In addition, we also evaluate the performance of a heuristic by Selby. It has been a matter of discussion in the literature how well the D-Wave hardware performs at its native task, and our experiments shed some more light on this issue. In particular, we examine how reliably the D-Wave computer can deliver true optimum solutions and present some surprising results.
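The stated equivalence between Ising models and maximum cut can be checked numerically on a small instance. The sketch below assumes an Ising model without external fields, with energy H(s) = sum over edges (i, j) of J_ij * s_i * s_j for spins s_i in {-1, +1}, and uses the couplings J_ij as edge weights of the cut problem. Under that assumption H(s) = (sum of all J_ij) - 2 * cut(s), so minimizing the energy is the same as maximizing the cut. The function names and the tiny triangle instance are illustrative only and have nothing to do with Chimera graphs or the D-Wave hardware.

```python
from itertools import product


def ising_energy(J, spins):
    """H(s) = sum of J_ij * s_i * s_j over all coupled pairs (no fields)."""
    return sum(w * spins[i] * spins[j] for (i, j), w in J.items())


def cut_weight(J, spins):
    """Total weight of edges whose endpoints carry different spins."""
    return sum(w for (i, j), w in J.items() if spins[i] != spins[j])


def min_energy_and_max_cut(J, n):
    """Brute-force both optima and check the identity H = total - 2 * cut."""
    total = sum(J.values())
    best_energy, best_cut = None, None
    for spins in product((-1, 1), repeat=n):
        energy, cut = ising_energy(J, spins), cut_weight(J, spins)
        assert energy == total - 2 * cut
        if best_energy is None or energy < best_energy:
            best_energy = energy
        if best_cut is None or cut > best_cut:
            best_cut = cut
    return best_energy, best_cut


if __name__ == "__main__":
    triangle = {(0, 1): 1, (1, 2): 1, (0, 2): 1}   # integer couplings
    print(min_energy_and_max_cut(triangle, 3))
```

Integer couplings are used so that the identity holds exactly; with floating-point couplings the comparison would need a tolerance.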

Published By Association For Computing Machinery