Improved Strong Worst-case Upper Bounds for MDP Planning

Author(s):  
Anchit Gupta ◽  
Shivaram Kalyanakrishnan

The Markov Decision Problem (MDP) plays a central role in AI as an abstraction of sequential decision making. We contribute to the theoretical analysis of MDP PLANNING, which is the problem of computing an optimal policy for a given MDP. Specifically, we furnish improved STRONG WORST-CASE upper bounds on the running time of MDP planning. Strong bounds are those that depend only on the number of states n and the number of actions k in the specified MDP; they have no dependence on affiliated variables such as the discount factor and the number of bits needed to represent the MDP. Worst-case bounds apply to EVERY run of an algorithm; randomised algorithms can typically yield faster EXPECTED running times. While the special case of 2-action MDPs (that is, k = 2) has recently received some attention, bounds for general k have not been improved in several decades. Our contributions are to this general case. For k >= 3, the tightest strong upper bound shown to date for MDP planning belongs to a family of algorithms called Policy Iteration. This bound is only a polynomial improvement over a trivial bound of poly(n, k) k^{n} [Mansour and Singh, 1999]. In this paper, we generalise a contrasting algorithm called the Fibonacci Seesaw, and derive a bound of poly(n, k) k^{0.6834n}. The key construct we use is a template to map algorithms for the 2-action setting to the general setting. Interestingly, this idea can also be used to design Policy Iteration algorithms with a running time upper bound of poly(n, k) k^{0.7207n}. Both our results improve upon bounds that have stood for several decades.
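For readers unfamiliar with the Policy Iteration family referenced above, the following is a minimal sketch of classical Howard-style policy iteration on a generic tabular MDP. It is textbook background only, not the Fibonacci-Seesaw generalisation or the improved variants analysed in the paper, and the discounted formulation (transition tensor P, reward matrix R, discount gamma) is an assumption made purely for illustration.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.95):
    """Howard-style policy iteration for a tabular MDP.

    P: transition tensor of shape (n, k, n), P[s, a, s'] = Pr(s' | s, a)
    R: reward matrix of shape (n, k); gamma: discount factor in (0, 1).
    Returns a deterministic optimal policy and its value function.
    """
    n, k, _ = P.shape
    pi = np.zeros(n, dtype=int)            # start from an arbitrary policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[np.arange(n), pi]          # (n, n) rows chosen by the policy
        R_pi = R[np.arange(n), pi]          # (n,) rewards under the policy
        V = np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to Q(s, a).
        Q = R + gamma * P @ V               # (n, k) action values
        pi_new = Q.argmax(axis=1)
        if np.array_equal(pi_new, pi):      # no state changes its action
            return pi, V
        pi = pi_new
```

The strong worst-case bounds discussed in the paper concern how many improvement rounds loops of this kind can take in the worst case, independently of gamma and of the bit-size of P and R.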

Author(s):  
Evgeny Dantsin ◽  
Edward A. Hirsch

This chapter surveys the ideas and techniques behind satisfiability algorithms with the best currently known asymptotic upper bounds on worst-case running time. It also covers related structural-complexity topics such as Schaefer’s dichotomy theorem, reductions between various restricted cases of SAT, and the exponential-time hypothesis.
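As one concrete example of the algorithms such surveys cover, here is a sketch of Schöning's randomised local-search procedure for 3-SAT, whose expected worst-case running time is O(poly(n) * (4/3)^n) on satisfiable formulas. The clause encoding and the default number of restarts below are assumptions made for the sketch.

```python
import random

def schoening_3sat(clauses, n_vars, tries=10_000):
    """Schoening's random-walk algorithm for 3-SAT.

    clauses: list of clauses, each a list of non-zero ints; literal v means
    variable v is true, -v means variable v is false (variables 1..n_vars).
    In theory about (4/3)^n independent restarts suffice in expectation;
    'tries' is a small illustrative default.
    """
    for _ in range(tries):
        # Start each restart from a uniformly random assignment (index 0 unused).
        assignment = [random.choice([False, True]) for _ in range(n_vars + 1)]
        for _ in range(3 * n_vars):
            unsat = [c for c in clauses
                     if not any((lit > 0) == assignment[abs(lit)] for lit in c)]
            if not unsat:
                return assignment[1:]       # satisfying assignment found
            # Flip a uniformly random variable of a random unsatisfied clause.
            lit = random.choice(random.choice(unsat))
            assignment[abs(lit)] = not assignment[abs(lit)]
    return None                             # no assignment found within 'tries'
```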


Author(s):  
Nadia Labai ◽  
Magdalena Ortiz ◽  
Mantas Šimkus

Concrete domains, especially those that allow features to be compared with numeric values, have long been recognized as a very desirable extension of description logics (DLs), and significant effort has been invested in adding them to standard DLs while keeping the complexity of reasoning in check. For expressive DLs in the presence of general TBoxes, and for standard reasoning tasks like consistency, the most general decidability results are for the so-called ω-admissible domains, which are required to be dense. Supporting non-dense domains for features that range over the integers or natural numbers has remained largely open, despite often being singled out as a highly desirable extension. The decidability of some extensions of ALC with non-dense domains has been shown, but existing results rely on powerful machinery that does not yield any elementary bounds on the complexity of the problem. In this paper, we study an extension of ALC with a rich integer domain that allows for comparisons (between features, and between features and constants coded in unary), and prove that consistency can be decided using automata-theoretic techniques in single exponential time, and thus has no higher worst-case complexity than standard ALC. Our upper bounds apply to some extensions of DLs with concrete domains known from the literature, support general TBoxes, and allow for comparing values along paths of ordinary (not necessarily functional) roles.


Biometrika ◽  
2021 ◽  
Author(s):  
C Sherlock ◽  
A H Thiery

Abstract Most Markov chain Monte Carlo methods operate in discrete time and are reversible with respect to the target probability. Nevertheless, it is now understood that the use of nonreversible Markov chains can be beneficial in many contexts. In particular, the recently proposed bouncy particle sampler leverages a continuous-time, nonreversible Markov process and empirically shows state-of-the-art performance when used to explore certain probability densities; however, its implementation typically requires the computation of local upper bounds on the gradient of the log target density. We present the discrete bouncy particle sampler, a general algorithm based upon a guided random walk, a partial refreshment of direction, and a delayed-rejection step. We show that the bouncy particle sampler can be understood as a scaling limit of a special case of our algorithm. In contrast to the bouncy particle sampler, implementing the discrete bouncy particle sampler only requires pointwise evaluation of the target density and its gradient. We propose extensions of the basic algorithm for situations in which the exact gradient of the target density is not available. In a Gaussian setting, we establish a scaling limit for the radial process as the dimension increases to infinity. We leverage this result to obtain the theoretical efficiency of the discrete bouncy particle sampler as a function of the partial-refreshment parameter, which leads to a simple and robust tuning criterion. A further analysis in a more general setting suggests that this tuning criterion applies more generally. Theoretical and empirical efficiency curves are then compared for different targets and algorithm variations.
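A minimal sketch of the first two ingredients of such a sampler, a guided random walk whose direction is reversed on rejection together with an occasional refreshment of the direction, is given below. It omits the delayed-rejection "bounce" step and substitutes a full direction resample for the partial refreshment used in the paper, so it illustrates the flavour of the algorithm rather than the discrete bouncy particle sampler itself.

```python
import numpy as np

def guided_walk(log_target, x0, step=0.5, refresh_prob=0.1, n_iter=10_000, rng=None):
    """Guided random walk with occasional direction refreshment.

    Maintains a position x and a unit direction v; proposals move along v,
    a rejection reverses v, and with small probability v is resampled.
    This nonreversible scheme leaves the target distribution invariant.
    """
    rng = rng or np.random.default_rng()
    x = np.asarray(x0, dtype=float)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)                  # direction on the unit sphere
    samples = []
    for _ in range(n_iter):
        x_prop = x + step * v
        # Metropolis-style accept/reject on the lifted (position, direction) chain.
        if np.log(rng.uniform()) < log_target(x_prop) - log_target(x):
            x = x_prop                      # accept: keep moving forward
        else:
            v = -v                          # reject: reverse direction
        if rng.uniform() < refresh_prob:    # occasional refreshment of direction
            v = rng.standard_normal(x.shape)
            v /= np.linalg.norm(v)
        samples.append(x.copy())
    return np.array(samples)
```

Note that, as in the paper, only pointwise evaluations of the (log) target are needed here; the gradient enters only through the delayed-rejection step that this sketch leaves out.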


Energies ◽  
2021 ◽  
Vol 14 (13) ◽  
pp. 3914
Author(s):  
Thomas Huybrechts ◽  
Philippe Reiter ◽  
Siegfried Mercelis ◽  
Jeroen Famaey ◽  
Steven Latré ◽  
...  

Batteryless Internet-of-Things (IoT) devices need to schedule tasks on very limited energy budgets obtained from intermittent energy harvesting. An energy-aware scheduler allows the device to schedule tasks efficiently and avoid power loss during execution. To achieve this, we need insight into the Worst-Case Energy Consumption (WCEC) of each schedulable task on the device. Different methodologies exist to determine or approximate the energy consumption. However, these approaches are either computationally expensive and infeasible to perform on all types of devices, or not accurate enough to obtain safe upper bounds. We propose a hybrid methodology that combines machine-learning-based prediction on small code sections, called hybrid blocks, with static analysis that combines the predictions into a final upper-bound estimate of the WCEC. In this paper, we present our work on an automated testbench for the Code Behaviour Framework (COBRA) that measures and profiles the upper-bound energy consumption on the target device. Next, we use the upper-bound measurements from the testbench to train eight different regression models to predict these upper bounds. The results show promising estimates for three regression models that could potentially be used for the methodology with additional tuning and training.
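The following sketch illustrates the general shape of such a pipeline: fit a few regression models to measured per-block upper-bound energies, then combine block-level predictions with loop bounds from static analysis. The feature set, the two model choices, and the combination rule are placeholders and do not reflect the COBRA framework's actual interface or the eight models compared in the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder stand-ins for testbench measurements: per-block features
# (e.g. instruction counts, memory accesses) and measured upper-bound
# energies in microjoules. Real data would come from the COBRA testbench.
X = rng.random((500, 8))
y_upper = X @ rng.random(8) + 0.05 * rng.random(500)

# Fit and score candidate regressors that predict per-block upper bounds.
models = {"ridge": Ridge(alpha=1.0),
          "gradient_boosting": GradientBoostingRegressor(n_estimators=200)}
for name, model in models.items():
    mae = -cross_val_score(model, X, y_upper, cv=5,
                           scoring="neg_mean_absolute_error").mean()
    print(f"{name}: mean absolute error = {mae:.3f} uJ")

def path_wcec(block_predictions, loop_bounds):
    """Toy static-combination step: bound the energy of one program path by
    summing predicted per-block upper bounds weighted by loop bounds."""
    return sum(p * b for p, b in zip(block_predictions, loop_bounds))
```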


2009 ◽  
Vol 74 (2) ◽  
pp. 517-534 ◽  
Author(s):  
Antonín Kučera ◽  
Theodore A. Slaman

Abstract We show that there is a low T-upper bound for the class of K-trivial sets, namely those which are weak from the point of view of algorithmic randomness. This result is a special case of a more general characterization of ideals in T-degrees for which there is a low T-upper bound.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Víctor Uc-Cetina ◽  
Francisco Moo-Mena ◽  
Rafael Hernandez-Ucan

We propose a Markov decision process model for solving the Web service composition (WSC) problem. Iterative policy evaluation, value iteration, and policy iteration algorithms are used to experimentally validate our approach, with artificial and real data. The experimental results show the reliability of the model and the methods employed, with policy iteration being the best one in terms of the minimum number of iterations needed to estimate an optimal policy with the highest Quality of Service attributes. Our experimental work shows how a WSC problem involving a set of 100,000 individual Web services, in which a valid composition requires the selection of 1,000 services from the available set, can be solved in the worst case in less than 200 seconds, using an Intel Core i5 computer with 6 GB RAM. Moreover, a real WSC problem involving only 7 individual Web services requires less than 0.08 seconds, using the same computational power. Finally, a comparison with two popular reinforcement learning algorithms, Sarsa and Q-learning, shows that these algorithms require one to two orders of magnitude more time than policy iteration, iterative policy evaluation, and value iteration to handle WSC problems of the same complexity.
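As an illustration of the dynamic-programming solvers compared above, here is a minimal value-iteration routine for a generic tabular MDP. The WSC-specific encoding of services as states and actions is not reproduced; the array shapes and the discount factor are assumptions made for the sketch.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Plain value iteration for a tabular MDP.

    P: transition tensor of shape (n, k, n); R: reward matrix of shape (n, k).
    Returns a greedy policy and its value function once the Bellman updates
    change by less than 'tol'.
    """
    n, k, _ = P.shape
    V = np.zeros(n)
    while True:
        Q = R + gamma * P @ V               # expected return of each (state, action)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1), V_new
        V = V_new
```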


1990 ◽  
Vol 19 (335) ◽  
Author(s):  
Peter Bro Miltersen

We analyze the concept of malignness, which is the property of probability ensembles of making the average-case running time equal to the worst-case running time for a class of algorithms. We derive lower and upper bounds on the complexity of malign ensembles, which are tight for exponential-time algorithms, and which show that no polynomial-time computable malign ensemble exists for the class of superlinear algorithms. Furthermore, we show that no polynomial-time computable malign ensemble exists for any class of superlinear algorithms, unless every language in P has an expected polynomial-time constructor.


10.37236/7275 ◽  
2018 ◽  
Vol 25 (2) ◽  
Author(s):  
Jacob Fox ◽  
Lisa Sauermann

For a finite abelian group $G$, the Erdős-Ginzburg-Ziv constant $\mathfrak{s}(G)$ is the smallest $s$ such that every sequence of $s$ (not necessarily distinct) elements of $G$ has a zero-sum subsequence of length $\operatorname{exp}(G)$. For a prime $p$, let $r(\mathbb{F}_p^n)$ denote the size of the largest subset of $\mathbb{F}_p^n$ without a three-term arithmetic progression. Although similar methods have been used to study $\mathfrak{s}(G)$ and $r(\mathbb{F}_p^n)$, no direct connection between these quantities has previously been established. We give an upper bound for $\mathfrak{s}(G)$ in terms of $r(\mathbb{F}_p^n)$ for the prime divisors $p$ of $\operatorname{exp}(G)$. For the special case $G=\mathbb{F}_p^n$, we prove $\mathfrak{s}(\mathbb{F}_p^n)\leq 2p\cdot r(\mathbb{F}_p^n)$. Using the upper bounds for $r(\mathbb{F}_p^n)$ of Ellenberg and Gijswijt, this result improves the previously best known upper bounds for $\mathfrak{s}(\mathbb{F}_p^n)$ given by Naslund.
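For instance, instantiating the stated inequality at $p=3$ with the Ellenberg-Gijswijt cap-set bound $r(\mathbb{F}_3^n)=O(2.756^n)$ (constant rounded up) gives

$$\mathfrak{s}(\mathbb{F}_3^n)\;\leq\;2\cdot 3\cdot r(\mathbb{F}_3^n)\;=\;O(2.756^n),$$

so the growth rate of $\mathfrak{s}(\mathbb{F}_3^n)$ is bounded by that of the cap-set bound itself, up to the constant factor $2p=6$.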


2001 ◽  
Vol 8 (1) ◽  
Author(s):  
Gerth Stølting Brodal ◽  
Rolf Fagerberg ◽  
Christian N. S. Pedersen ◽  
Anna Östlin

We present tight upper and lower bounds for the problem of constructing evolutionary trees in the experiment model. We describe an algorithm which constructs an evolutionary tree of $n$ species in time $O(nd \log_d n)$ using at most $n\lceil d/2\rceil(\log_{2\lceil d/2\rceil-1} n + O(1))$ experiments for $d > 2$, and at most $n(\log n + O(1))$ experiments for $d = 2$, where $d$ is the degree of the tree. This improves the previous best upper bound by a factor $\Theta(\log d)$. For $d = 2$ the previously best algorithm with running time $O(n \log n)$ had a bound of $4n \log n$ on the number of experiments. By an explicit adversary argument, we show an $\Omega(nd \log_d n)$ lower bound, matching our upper bounds and improving the previous best lower bound by a factor $\Theta(\log_d n)$. Central to our algorithm is the construction and maintenance of separator trees of small height. We present how to maintain separator trees with height $\log n + O(1)$ under the insertion of new nodes in amortized time $O(\log n)$. Part of our dynamic algorithm is an algorithm for computing a centroid tree in optimal time $O(n)$.

Keywords: Evolutionary trees, Experiment model, Separator trees, Centroid tree, Lower bounds
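As background for the centroid-tree ingredient mentioned at the end of the abstract, here is the standard $O(n \log n)$ centroid decomposition of a tree; the paper achieves the optimal $O(n)$, which this sketch does not attempt, and the adjacency-list input format is an assumption.

```python
def centroid_decomposition(adj):
    """Standard centroid decomposition of an undirected tree.

    adj: adjacency lists, adj[u] = list of neighbours of node u (0-based).
    Returns parent[u] = parent of u in the centroid tree (-1 for the root).
    Written recursively for brevity; runs in O(n log n) time.
    """
    n = len(adj)
    size = [0] * n
    removed = [False] * n
    parent = [-1] * n

    def compute_sizes(u, p):
        size[u] = 1
        for w in adj[u]:
            if w != p and not removed[w]:
                compute_sizes(w, u)
                size[u] += size[w]

    def find_centroid(u, p, tree_size):
        # Walk towards the heavy subtree until no component exceeds half the size.
        for w in adj[u]:
            if w != p and not removed[w] and size[w] > tree_size // 2:
                return find_centroid(w, u, tree_size)
        return u

    def decompose(u, centroid_parent):
        compute_sizes(u, -1)
        c = find_centroid(u, -1, size[u])
        removed[c] = True
        parent[c] = centroid_parent
        for w in adj[c]:
            if not removed[w]:
                decompose(w, c)

    decompose(0, -1)
    return parent
```

For the three-node path with edges 0-1 and 1-2, `centroid_decomposition([[1], [0, 2], [1]])` returns `[1, -1, 1]`: node 1 is the centroid root and the two leaves hang below it.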

