Improved Strong Worst-case Upper Bounds for MDP Planning

Author(s):  
Anchit Gupta ◽  
Shivaram Kalyanakrishnan

The Markov Decision Problem (MDP) plays a central role in AI as an abstraction of sequential decision making. We contribute to the theoretical analysis of MDP PLANNING, which is the problem of computing an optimal policy for a given MDP. Specifically, we furnish improved STRONG WORST-CASE upper bounds on the running time of MDP planning. Strong bounds are those that depend only on the number of states n and the number of actions k in the specified MDP; they have no dependence on affiliated variables such as the discount factor and the number of bits needed to represent the MDP. Worst-case bounds apply to EVERY run of an algorithm; randomised algorithms can typically yield faster EXPECTED running times. While the special case of 2-action MDPs (that is, k = 2) has recently received some attention, bounds for general k have not been improved in several decades. Our contributions are to this general case. For k >= 3, the tightest strong upper bound shown to date for MDP planning belongs to a family of algorithms called Policy Iteration. This bound is only a polynomial improvement over a trivial bound of poly(n, k) k^{n} [Mansour and Singh, 1999]. In this paper, we generalise a contrasting algorithm called the Fibonacci Seesaw, and derive a bound of poly(n, k) k^{0.6834n}. The key construct we use is a template to map algorithms for the 2-action setting to the general setting. Interestingly, this idea can also be used to design Policy Iteration algorithms with a running time upper bound of poly(n, k) k^{0.7207n}. Both our results improve upon bounds that have stood for several decades.
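For readers unfamiliar with the Policy Iteration family referenced above, the following is a minimal sketch of classical Howard-style policy iteration on a generic tabular MDP. It is textbook background only, not the Fibonacci-Seesaw generalisation or the improved variants analysed in the paper, and the discounted formulation (transition tensor P, reward matrix R, discount gamma) is an assumption made purely for illustration.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.95):
    """Howard-style policy iteration for a tabular MDP.

    P: transition tensor of shape (n, k, n), P[s, a, s'] = Pr(s' | s, a)
    R: reward matrix of shape (n, k); gamma: discount factor in (0, 1).
    Returns a deterministic optimal policy and its value function.
    """
    n, k, _ = P.shape
    pi = np.zeros(n, dtype=int)            # start from an arbitrary policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[np.arange(n), pi]          # (n, n) rows chosen by the policy
        R_pi = R[np.arange(n), pi]          # (n,) rewards under the policy
        V = np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to Q(s, a).
        Q = R + gamma * P @ V               # (n, k) action values
        pi_new = Q.argmax(axis=1)
        if np.array_equal(pi_new, pi):      # no state changes its action
            return pi, V
        pi = pi_new
```

The strong worst-case bounds discussed in the paper concern how many improvement rounds loops of this kind can take in the worst case, independently of gamma and of the bit-size of P and R.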

Author(s):  
Evgeny Dantsin ◽  
Edward A. Hirsch

This chapter surveys the ideas and techniques behind satisfiability algorithms with the best currently known asymptotic upper bounds on worst-case running time. It also covers related structural-complexity topics such as Schaefer’s dichotomy theorem, reductions between various restricted cases of SAT, and the exponential-time hypothesis.
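As one concrete example of the algorithms such surveys cover, here is a sketch of Schöning's randomised local-search procedure for 3-SAT, whose expected worst-case running time is O(poly(n) * (4/3)^n) on satisfiable formulas. The clause encoding and the default number of restarts below are assumptions made for the sketch.

```python
import random

def schoening_3sat(clauses, n_vars, tries=10_000):
    """Schoening's random-walk algorithm for 3-SAT.

    clauses: list of clauses, each a list of non-zero ints; literal v means
    variable v is true, -v means variable v is false (variables 1..n_vars).
    In theory about (4/3)^n independent restarts suffice in expectation;
    'tries' is a small illustrative default.
    """
    for _ in range(tries):
        # Start each restart from a uniformly random assignment (index 0 unused).
        assignment = [random.choice([False, True]) for _ in range(n_vars + 1)]
        for _ in range(3 * n_vars):
            unsat = [c for c in clauses
                     if not any((lit > 0) == assignment[abs(lit)] for lit in c)]
            if not unsat:
                return assignment[1:]       # satisfying assignment found
            # Flip a uniformly random variable of a random unsatisfied clause.
            lit = random.choice(random.choice(unsat))
            assignment[abs(lit)] = not assignment[abs(lit)]
    return None                             # no assignment found within 'tries'
```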


Author(s):  
Nadia Labai ◽  
Magdalena Ortiz ◽  
Mantas Šimkus

Concrete domains, especially those that allow features to be compared with numeric values, have long been recognized as a very desirable extension of description logics (DLs), and significant effort has been invested in adding them to standard DLs while keeping the complexity of reasoning in check. For expressive DLs in the presence of general TBoxes, and for standard reasoning tasks like consistency, the most general decidability results are for the so-called ω-admissible domains, which are required to be dense. Supporting non-dense domains for features that range over the integers or natural numbers has remained largely open, despite often being singled out as a highly desirable extension. The decidability of some extensions of ALC with non-dense domains has been shown, but existing results rely on powerful machinery that does not yield any elementary bounds on the complexity of the problem. In this paper, we study an extension of ALC with a rich integer domain that allows for comparisons (between features, and between features and constants coded in unary), and prove that consistency can be decided using automata-theoretic techniques in single exponential time, and thus has no higher worst-case complexity than standard ALC. Our upper bounds apply to some extensions of DLs with concrete domains known from the literature, support general TBoxes, and allow for comparing values along paths of ordinary (not necessarily functional) roles.


Biometrika ◽  
2021 ◽  
Author(s):  
C Sherlock ◽  
A H Thiery

Abstract Most Markov chain Monte Carlo methods operate in discrete time and are reversible with respect to the target probability. Nevertheless, it is now understood that the use of nonreversible Markov chains can be beneficial in many contexts. In particular, the recently proposed bouncy particle sampler leverages a continuous-time, nonreversible Markov process and empirically shows state-of-the-art performance when used to explore certain probability densities; however, its implementation typically requires the computation of local upper bounds on the gradient of the log target density. We present the discrete bouncy particle sampler, a general algorithm based upon a guided random walk, a partial refreshment of direction, and a delayed-rejection step. We show that the bouncy particle sampler can be understood as a scaling limit of a special case of our algorithm. In contrast to the bouncy particle sampler, implementing the discrete bouncy particle sampler only requires pointwise evaluation of the target density and its gradient. We propose extensions of the basic algorithm for situations in which the exact gradient of the target density is not available. In a Gaussian setting, we establish a scaling limit for the radial process as the dimension increases to infinity. We leverage this result to obtain the theoretical efficiency of the discrete bouncy particle sampler as a function of the partial-refreshment parameter, which leads to a simple and robust tuning criterion. A further analysis in a more general setting suggests that this tuning criterion applies more generally. Theoretical and empirical efficiency curves are then compared for different targets and algorithm variations.
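A minimal sketch of the first two ingredients of such a sampler, a guided random walk whose direction is reversed on rejection together with an occasional refreshment of the direction, is given below. It omits the delayed-rejection "bounce" step and substitutes a full direction resample for the partial refreshment used in the paper, so it illustrates the flavour of the algorithm rather than the discrete bouncy particle sampler itself.

```python
import numpy as np

def guided_walk(log_target, x0, step=0.5, refresh_prob=0.1, n_iter=10_000, rng=None):
    """Guided random walk with occasional direction refreshment.

    Maintains a position x and a unit direction v; proposals move along v,
    a rejection reverses v, and with small probability v is resampled.
    This nonreversible scheme leaves the target distribution invariant.
    """
    rng = rng or np.random.default_rng()
    x = np.asarray(x0, dtype=float)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)                  # direction on the unit sphere
    samples = []
    for _ in range(n_iter):
        x_prop = x + step * v
        # Metropolis-style accept/reject on the lifted (position, direction) chain.
        if np.log(rng.uniform()) < log_target(x_prop) - log_target(x):
            x = x_prop                      # accept: keep moving forward
        else:
            v = -v                          # reject: reverse direction
        if rng.uniform() < refresh_prob:    # occasional refreshment of direction
            v = rng.standard_normal(x.shape)
            v /= np.linalg.norm(v)
        samples.append(x.copy())
    return np.array(samples)
```

Note that, as in the paper, only pointwise evaluations of the (log) target are needed here; the gradient enters only through the delayed-rejection step that this sketch leaves out.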


Energies ◽  
2021 ◽  
Vol 14 (13) ◽  
pp. 3914
Author(s):  
Thomas Huybrechts ◽  
Philippe Reiter ◽  
Siegfried Mercelis ◽  
Jeroen Famaey ◽  
Steven Latré ◽  
...  

Batteryless Internet-of-Things (IoT) devices need to schedule tasks on very limited energy budgets obtained from intermittent energy harvesting. An energy-aware scheduler allows the device to schedule tasks efficiently and avoid power loss during execution. To achieve this, we need insight into the Worst-Case Energy Consumption (WCEC) of each schedulable task on the device. Different methodologies exist to determine or approximate the energy consumption. However, these approaches are either computationally expensive and infeasible to perform on all types of devices, or not accurate enough to obtain safe upper bounds. We propose a hybrid methodology that combines machine-learning-based prediction on small code sections, called hybrid blocks, with static analysis that combines the predictions into a final upper-bound estimate of the WCEC. In this paper, we present our work on an automated testbench for the Code Behaviour Framework (COBRA) that measures and profiles the upper-bound energy consumption on the target device. Next, we use the upper-bound measurements from the testbench to train eight different regression models to predict these upper bounds. The results show promising estimates for three regression models that could potentially be used for the methodology with additional tuning and training.
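The following sketch illustrates the general shape of such a pipeline: fit a few regression models to measured per-block upper-bound energies, then combine block-level predictions with loop bounds from static analysis. The feature set, the two model choices, and the combination rule are placeholders and do not reflect the COBRA framework's actual interface or the eight models compared in the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder stand-ins for testbench measurements: per-block features
# (e.g. instruction counts, memory accesses) and measured upper-bound
# energies in microjoules. Real data would come from the COBRA testbench.
X = rng.random((500, 8))
y_upper = X @ rng.random(8) + 0.05 * rng.random(500)

# Fit and score candidate regressors that predict per-block upper bounds.
models = {"ridge": Ridge(alpha=1.0),
          "gradient_boosting": GradientBoostingRegressor(n_estimators=200)}
for name, model in models.items():
    mae = -cross_val_score(model, X, y_upper, cv=5,
                           scoring="neg_mean_absolute_error").mean()
    print(f"{name}: mean absolute error = {mae:.3f} uJ")

def path_wcec(block_predictions, loop_bounds):
    """Toy static-combination step: bound the energy of one program path by
    summing predicted per-block upper bounds weighted by loop bounds."""
    return sum(p * b for p, b in zip(block_predictions, loop_bounds))
```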


2009 ◽  
Vol 74 (2) ◽  
pp. 517-534 ◽  
Author(s):  
Antonín Kučera ◽  
Theodore A. Slaman

Abstract We show that there is a low T-upper bound for the class of K-trivial sets, namely those which are weak from the point of view of algorithmic randomness. This result is a special case of a more general characterization of ideals in T-degrees for which there is a low T-upper bound.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Víctor Uc-Cetina ◽  
Francisco Moo-Mena ◽  
Rafael Hernandez-Ucan

We propose a Markov decision process model for solving the Web service composition (WSC) problem. Iterative policy evaluation, value iteration, and policy iteration algorithms are used to experimentally validate our approach, with artificial and real data. The experimental results show the reliability of the model and the methods employed, with policy iteration being the best one in terms of the minimum number of iterations needed to estimate an optimal policy with the highest Quality of Service attributes. Our experimental work shows how a WSC problem involving a set of 100,000 individual Web services, in which a valid composition requires the selection of 1,000 services from the available set, can be solved in the worst case in less than 200 seconds, using an Intel Core i5 computer with 6 GB RAM. Moreover, a real WSC problem involving only 7 individual Web services requires less than 0.08 seconds, using the same computational power. Finally, a comparison with two popular reinforcement learning algorithms, Sarsa and Q-learning, shows that these algorithms require one to two orders of magnitude more time than policy iteration, iterative policy evaluation, and value iteration to handle WSC problems of the same complexity.
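As an illustration of the dynamic-programming solvers compared above, here is a minimal value-iteration routine for a generic tabular MDP. The WSC-specific encoding of services as states and actions is not reproduced; the array shapes and the discount factor are assumptions made for the sketch.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Plain value iteration for a tabular MDP.

    P: transition tensor of shape (n, k, n); R: reward matrix of shape (n, k).
    Returns a greedy policy and its value function once the Bellman updates
    change by less than 'tol'.
    """
    n, k, _ = P.shape
    V = np.zeros(n)
    while True:
        Q = R + gamma * P @ V               # expected return of each (state, action)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1), V_new
        V = V_new
```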


1990 ◽  
Vol 19 (335) ◽  
Author(s):  
Peter Bro Miltersen

We analyze the concept of malignness, which is the property of probability ensembles of making the average-case running time equal to the worst-case running time for a class of algorithms. We derive lower and upper bounds on the complexity of malign ensembles, which are tight for exponential-time algorithms, and which show that no polynomial-time computable malign ensemble exists for the class of superlinear algorithms. Furthermore, we show that no polynomial-time computable malign ensemble exists for any class of superlinear algorithms, unless every language in P has an expected polynomial-time constructor.


10.37236/7275 ◽  
2018 ◽  
Vol 25 (2) ◽  
Author(s):  
Jacob Fox ◽  
Lisa Sauermann

For a finite abelian group $G$, the Erdős-Ginzburg-Ziv constant $\mathfrak{s}(G)$ is the smallest $s$ such that every sequence of $s$ (not necessarily distinct) elements of $G$ has a zero-sum subsequence of length $\operatorname{exp}(G)$. For a prime $p$, let $r(\mathbb{F}_p^n)$ denote the size of the largest subset of $\mathbb{F}_p^n$ without a three-term arithmetic progression. Although similar methods have been used to study $\mathfrak{s}(G)$ and $r(\mathbb{F}_p^n)$, no direct connection between these quantities has previously been established. We give an upper bound for $\mathfrak{s}(G)$ in terms of $r(\mathbb{F}_p^n)$ for the prime divisors $p$ of $\operatorname{exp}(G)$. For the special case $G=\mathbb{F}_p^n$, we prove $\mathfrak{s}(\mathbb{F}_p^n)\leq 2p\cdot r(\mathbb{F}_p^n)$. Using the upper bounds for $r(\mathbb{F}_p^n)$ of Ellenberg and Gijswijt, this result improves the previously best known upper bounds for $\mathfrak{s}(\mathbb{F}_p^n)$ given by Naslund.
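For instance, instantiating the stated inequality at $p=3$ with the Ellenberg-Gijswijt cap-set bound $r(\mathbb{F}_3^n)=O(2.756^n)$ (constant rounded up) gives

$$\mathfrak{s}(\mathbb{F}_3^n)\;\leq\;2\cdot 3\cdot r(\mathbb{F}_3^n)\;=\;O(2.756^n),$$

so the growth rate of $\mathfrak{s}(\mathbb{F}_3^n)$ is bounded by that of the cap-set bound itself, up to the constant factor $2p=6$.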


2001 ◽  
Vol 8 (1) ◽  
Author(s):  
Gerth Stølting Brodal ◽  
Rolf Fagerberg ◽  
Christian N. S. Pedersen ◽  
Anna Östlin

We present tight upper and lower bounds for the problem of constructing evolutionary trees in the experiment model. We describe an algorithm which constructs an evolutionary tree of $n$ species in time $O(nd \log_d n)$ using at most $n\lceil d/2\rceil(\log_{2\lceil d/2\rceil-1} n + O(1))$ experiments for $d > 2$, and at most $n(\log n + O(1))$ experiments for $d = 2$, where $d$ is the degree of the tree. This improves the previous best upper bound by a factor $\Theta(\log d)$. For $d = 2$ the previously best algorithm with running time $O(n \log n)$ had a bound of $4n \log n$ on the number of experiments. By an explicit adversary argument, we show an $\Omega(nd \log_d n)$ lower bound, matching our upper bounds and improving the previous best lower bound by a factor $\Theta(\log_d n)$. Central to our algorithm is the construction and maintenance of separator trees of small height. We present how to maintain separator trees with height $\log n + O(1)$ under the insertion of new nodes in amortized time $O(\log n)$. Part of our dynamic algorithm is an algorithm for computing a centroid tree in optimal time $O(n)$.

Keywords: Evolutionary trees, Experiment model, Separator trees, Centroid tree, Lower bounds
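As background for the centroid-tree ingredient mentioned at the end of the abstract, here is the standard $O(n \log n)$ centroid decomposition of a tree; the paper achieves the optimal $O(n)$, which this sketch does not attempt, and the adjacency-list input format is an assumption.

```python
def centroid_decomposition(adj):
    """Standard centroid decomposition of an undirected tree.

    adj: adjacency lists, adj[u] = list of neighbours of node u (0-based).
    Returns parent[u] = parent of u in the centroid tree (-1 for the root).
    Written recursively for brevity; runs in O(n log n) time.
    """
    n = len(adj)
    size = [0] * n
    removed = [False] * n
    parent = [-1] * n

    def compute_sizes(u, p):
        size[u] = 1
        for w in adj[u]:
            if w != p and not removed[w]:
                compute_sizes(w, u)
                size[u] += size[w]

    def find_centroid(u, p, tree_size):
        # Walk towards the heavy subtree until no component exceeds half the size.
        for w in adj[u]:
            if w != p and not removed[w] and size[w] > tree_size // 2:
                return find_centroid(w, u, tree_size)
        return u

    def decompose(u, centroid_parent):
        compute_sizes(u, -1)
        c = find_centroid(u, -1, size[u])
        removed[c] = True
        parent[c] = centroid_parent
        for w in adj[c]:
            if not removed[w]:
                decompose(w, c)

    decompose(0, -1)
    return parent
```

For the three-node path with edges 0-1 and 1-2, `centroid_decomposition([[1], [0, 2], [1]])` returns `[1, -1, 1]`: node 1 is the centroid root and the two leaves hang below it.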

