Markovian Decision Process
Recently Published Documents


TOTAL DOCUMENTS: 24 (FIVE YEARS: 3)
H-INDEX: 7 (FIVE YEARS: 1)

2019 ◽ Vol 6 (4) ◽ pp. 1-11 ◽ Author(s): Mehmet Soysal, Mustafa Çimen, Mine Ömürgönülşen, Sedat Belbağ

This article concerns a green Time-Dependent Capacitated Vehicle Routing Problem (TDCVRP) encountered in urban distribution planning. The problem is formulated as a Markovian Decision Process, and a dynamic programming (DP) approach is used to solve it. The article presents a performance comparison of two recent heuristics for the green TDCVRP that explicitly account for time-dependent vehicle speeds and fuel consumption (emissions): the classical Restricted Dynamic Programming (RDP) algorithm, and a Simulation-Based RDP that combines weighted random sampling, the RDP heuristic, and simulation. The numerical experiments show that the Simulation-Based RDP heuristic can provide promising results within relatively short computational times compared to classical RDP for the green TDCVRP.
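As a rough illustration of the restricted-DP idea behind both heuristics, the sketch below runs stage-wise DP on a single-vehicle, time-dependent routing instance and keeps only the H cheapest partial states per stage. The `travel_time` callback, the single-vehicle simplification, and the omission of capacity and emission terms are assumptions for brevity, not the authors' implementation.

```python
import heapq

def restricted_dp(n_customers, travel_time, H=100):
    """Restricted Dynamic Programming (DP with a beam of width H) for a
    single-vehicle, time-dependent routing problem. Nodes 1..n_customers
    are customers, node 0 is the depot. travel_time(i, j, t) is a
    (hypothetical) time-dependent travel time from i to j departing at t."""
    # Stage 0: at the depot, no customers visited yet.
    states = {(frozenset(), 0): (0.0, [0])}
    for _ in range(n_customers):
        nxt = {}
        for (visited, last), (t, route) in states.items():
            for j in range(1, n_customers + 1):
                if j in visited:
                    continue
                t2 = t + travel_time(last, j, t)
                key = (visited | {j}, j)
                if key not in nxt or t2 < nxt[key][0]:
                    nxt[key] = (t2, route + [j])
        # Restriction step: keep only the H cheapest states of this stage.
        states = dict(heapq.nsmallest(H, nxt.items(), key=lambda kv: kv[1][0]))
    # Close the tour at the depot and return the cheapest completion.
    return min(((t + travel_time(last, 0, t), route + [0])
                for (_, last), (t, route) in states.items()),
               key=lambda x: x[0])
```

With H large enough the recursion reduces to exact DP over subsets; the simulation-based variant described in the abstract would, roughly, replace the deterministic restriction step with weighted random sampling of states and evaluate candidates by simulation.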


DYNA ◽ 2019 ◽ Vol 86 (208) ◽ pp. 92-101 ◽ Author(s): Juan Carlos Montoya, Natalia Gaviria

In this article, a joint Radio Access Network (RAN) selection strategy and a sub-channel resource allocation scheme are introduced for the downlink of a two-tier HetNet system in a co-channel deployment. For this scenario, a Semi-Markovian Decision Process (SMDP) based model is proposed to find an optimal policy for user association and channel allocation that maximizes the long-term expected reward. In the proposed strategy, sub-channel allocation is only considered for those new and handoff requests that can potentially be served by one of the small cells (SCs). In addition, to improve the overall usage of radio resources, a traffic offloading process is considered at session departures: to encourage offloading sessions from the macro cell (MC) to the SCs at departure times, two sub-channels are allocated in the SC for each offloaded session. Analytical results quantify and show the effectiveness of the proposed strategy.
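For readers unfamiliar with the SMDP machinery, the following minimal sketch shows discounted value iteration on a generic SMDP. The state/action spaces, rewards, sojourn times, and the exponential-sojourn discounting are placeholder assumptions, not the admission-control model of the paper.

```python
import numpy as np

def smdp_value_iteration(R, P, tau, beta=0.05, eps=1e-6):
    """Value iteration for a discounted semi-Markov decision process.
    R[s, a]     : expected lump reward for taking action a in state s
    P[s, a, s2] : transition probabilities to the next decision epoch
    tau[s, a]   : expected sojourn time until the next decision epoch
    beta        : continuous-time discount rate
    Assuming exponential sojourn times with mean tau, the effective
    one-step discount factor is gamma[s, a] = 1 / (1 + beta * tau[s, a])."""
    gamma = 1.0 / (1.0 + beta * tau)           # per-(state, action) discount
    V = np.zeros(R.shape[0])
    while True:
        Q = R + gamma * np.einsum('sat,t->sa', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < eps:
            return V_new, Q.argmax(axis=1)     # optimal value and policy
        V = V_new
```

In the paper's setting, states would encode session counts per cell and channel occupancy, and actions would be the admit/associate/offload decisions; the fixed point above is what "maximizing the long-term expected reward" computes.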


2017 ◽ Vol 13 (06) ◽ pp. 105 ◽ Author(s): Amjad Gawanmeh, Anas AlAzzam, Bobby Mathew

Blood cell separation microdevices are designed in biomedical engineering to separate particular cells, such as cancer cells, from blood. The movement of blood microparticles, especially cancer cells, in a continuous-flow microfluidic device is governed by several forces; as a result, understanding and guiding the movement of these microparticles is a challenging problem. These cells are subject to different types of forces arising from natural or external effects, including gravity, virtual mass, buoyancy, dielectrophoresis, and inertia. All of these must therefore be accounted for in any design or implementation of such a system. In this paper, we use formal analysis of a separation microdevice to model and verify microparticle movement and behavior at a high level of abstraction while considering the different types of forces. The dynamic behavior of a particle can be modeled as a Markovian decision process in order to predict the trajectory of microparticles. This model can be used to provide probabilistic analysis of particle movement in the microdevice under the effect of the different forces.
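To give a flavor of the probabilistic analysis, the sketch below fixes a control policy (which reduces the MDP to a Markov chain over discretized channel positions) and Monte Carlo-estimates the probability that a particle reaches a collection outlet. The 1-D five-cell channel and its transition probabilities are purely hypothetical; the paper itself uses formal (model-checking style) analysis rather than simulation.

```python
import random

def simulate_particle(transition, start, absorbing, max_steps=1000):
    """Monte Carlo sample of where a microparticle ends up when its
    discretized motion is modeled as a Markov chain. transition[s] is
    a list of (next_state, probability) pairs encoding the net effect
    of the competing forces (drag, dielectrophoresis, gravity, ...) on
    a particle in cell s of a discretized channel."""
    s = start
    for _ in range(max_steps):
        if s in absorbing:
            return s
        r, acc = random.random(), 0.0
        for s2, p in transition[s]:
            acc += p
            if r < acc:
                s = s2
                break
    return s

# Hypothetical 1-D channel: cells 0..4, cell 0 = waste outlet,
# cell 4 = collection outlet; the DEP force biases motion toward cell 4.
chain = {s: [(s - 1, 0.3), (s + 1, 0.7)] for s in range(1, 4)}
runs = [simulate_particle(chain, start=2, absorbing={0, 4})
        for _ in range(10_000)]
print("P(collected) =", runs.count(4) / len(runs))
```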


2015 ◽ Vol 2015 ◽ pp. 1-9 ◽ Author(s): Sungyong Choi, Kyungbae Park

We study a few dynamic risk-averse inventory models using additive utility functions, adding Markovian behavior of purchasing costs. Such Markovian purchasing costs can reflect market conditions in a global supply chain, such as fluctuations in exchange rates or the existence of product spot markets. We formulate the problem as finite- and infinite-horizon MDP (Markovian Decision Process) problems. For the finite-horizon models, we first prove (joint) concavity of the model for each state and obtain a (modified) base-stock optimal policy. Then we conduct comparative static analysis on the model parameters and derive monotone properties of the optimal solutions. For the infinite-horizon models, we show the existence of stationary base-stock optimal policies and the inheritance of the monotone properties proven for the finite-horizon models.
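A minimal sketch of the finite-horizon computation follows, assuming a discretized inventory level, i.i.d. demand, and linear holding/backlog costs (the paper's utility-based, risk-averse objective is richer). The unit purchasing cost evolves as a Markov chain, and an order-up-to level is extracted per period and cost state.

```python
import numpy as np

def base_stock_policy(T, c, P, demand, h=1.0, b=4.0, y_max=15):
    """Finite-horizon inventory DP with a Markov-modulated unit cost.
    c[k]    : unit purchasing cost in cost state k
    P[k, j] : transition probability of the cost state k -> j
    demand  : dict {d: prob} of i.i.d. one-period demand
    Returns S[t, k], the order-up-to (base-stock) level for period t
    and cost state k. Inventory x lives on the grid -y_max..y_max."""
    K = len(c)
    xs = np.arange(-y_max, y_max + 1)
    V = np.zeros((K, len(xs)))                  # terminal value = 0
    S = np.zeros((T, K), dtype=int)
    for t in range(T - 1, -1, -1):
        V_new = np.empty_like(V)
        for k in range(K):
            # G(y) = c_k*y + E[holding/backlog + continuation] at level y
            G = np.empty(y_max + 1)
            for y in range(y_max + 1):
                g = c[k] * y
                for d, pd in demand.items():
                    x2 = max(y - d, -y_max)     # post-demand inventory
                    g += pd * (h * max(x2, 0) + b * max(-x2, 0)
                               + P[k] @ V[:, x2 + y_max])
                G[y] = g
            S[t, k] = int(np.argmin(G))
            # Base-stock structure: order up to S[t, k] if below it, else
            # do nothing, so V_t(x) = -c_k*x + min_{y >= x} G(y).
            for i, x in enumerate(xs):
                V_new[k, i] = G[max(x, S[t, k])] - c[k] * x
        V = V_new
    return S
```

The monotone properties the authors derive would show up here as, e.g., S[t, k] decreasing in the purchasing cost c[k].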


2008 ◽ Vol 22 (4) ◽ pp. 587-602 ◽ Author(s): René Haijema, Jan van der Wal

This article presents a novel approach to the dynamic control of a signalized intersection. At the intersection, there are a number of arrival flows of cars, each having a single queue (lane). The set of all flows is partitioned into disjoint combinations of nonconflicting flows that receive green together. The dynamic control of the traffic lights is based on the numbers of cars waiting in the queues. The problem of when to switch (and which combination to serve next) is modeled as a Markovian decision process in discrete time. For large intersections (i.e., intersections with a large number of flows), the number of states becomes so large that it prohibits straightforward optimization using value iteration or policy iteration. Starting from an optimal (or nearly optimal) fixed-cycle strategy, a one-step policy improvement is proposed that is easy to compute and is shown to give a close-to-optimal strategy for the dynamic problem.
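The one-step improvement itself is generic MDP machinery. Below is a small sketch on an abstract average-cost model, with the fixed-cycle strategy standing in as the base policy; the matrices are placeholders, and the paper works with far larger queue-length state spaces than this brute-force version could handle.

```python
import numpy as np

def one_step_improvement(P0, c0, P, c, n_iter=100_000, tol=1e-10):
    """One-step policy improvement for an average-cost MDP.
    P0, c0 : transition matrix (S x S) and cost vector (S,) of the base
             policy (e.g., the fixed-cycle traffic-light strategy).
    P, c   : per-action dynamics P[a] (S x S) and costs c[a] (S,).
    First approximates the relative values v of the base policy by
    relative value iteration, then improves greedily with one lookahead."""
    v = np.zeros(len(c0))
    for _ in range(n_iter):                    # solve the Poisson equation
        v_new = c0 + P0 @ v
        v_new -= v_new[0]                      # normalize to pin down v
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    # Improvement step: one-step lookahead against the base policy's values.
    Q = np.stack([c[a] + P[a] @ v for a in range(len(P))])
    return Q.argmin(axis=0), v                 # improved policy, rel. values
```

The appeal of the approach in the paper is precisely that the relative values of the fixed-cycle policy can be obtained (nearly) in closed form per queue, so the improvement step never requires enumerating the full joint state space.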


2006 ◽ Vol 25 ◽ pp. 17-74 ◽ Author(s): S. Thiebaux, C. Gretton, J. Slaney, D. Price, F. Kabanza

A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences rather than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovian decision process (MDP) model. While the more tractable solution methods developed for MDPs do not directly apply in the presence of non-Markovian rewards, a number of solution methods for NMRDPs have been proposed in the literature. These all exploit a compact specification of the non-Markovian reward function in temporal logic, to automatically translate the NMRDP into an equivalent MDP which is solved using efficient MDP solution methods. This paper presents NMRDPP (Non-Markovian Reward Decision Process Planner), a software platform for the development and experimentation of methods for decision-theoretic planning with non-Markovian rewards. The current version of NMRDPP implements, under a single interface, a family of methods based on existing as well as new approaches which we describe in detail. These include dynamic programming, heuristic search, and structured methods. Using NMRDPP, we compare the methods and identify certain problem features that affect their performance. NMRDPP's treatment of non-Markovian rewards is inspired by the treatment of domain-specific search control knowledge in the TLPlan planner, which it incorporates as a special case. In the First International Probabilistic Planning Competition, NMRDPP was able to compete and perform well in both the domain-independent and hand-coded tracks, using search control knowledge in the latter.
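The core translation the platform implements can be sketched in a few lines: augment each NMRDP state with the mode of a finite automaton compiled from the temporal-logic reward formula, so that rewards become Markovian on the product space. All callback names below (`trans`, `auto_step`, `reward`, ...) are illustrative placeholders, not NMRDPP's API.

```python
def expand_nmrdp(init, actions, trans, auto_init, auto_step, reward):
    """Expand an NMRDP into an equivalent MDP over (state, mode) pairs.
    trans(s, a)      -> [(s2, prob)]   : NMRDP dynamics
    auto_step(q, s2) -> q2             : advance the history-tracking
                                         automaton on reaching s2
    reward(s2, q2)   -> float          : reward, Markovian on the product
    Returns {((s, q), a): [((s2, q2), prob, reward)]} for the reachable
    part of the product space, solvable by any standard MDP method."""
    mdp, frontier = {}, [(init, auto_init)]
    seen = {frontier[0]}
    while frontier:
        s, q = frontier.pop()
        for a in actions:
            outcomes = []
            for s2, p in trans(s, a):
                q2 = auto_step(q, s2)          # track the relevant history
                outcomes.append(((s2, q2), p, reward(s2, q2)))
                if (s2, q2) not in seen:
                    seen.add((s2, q2))
                    frontier.append((s2, q2))
            mdp[((s, q), a)] = outcomes
    return mdp
```

The methods the paper compares differ mainly in how and when this product is built: eagerly in full, lazily during heuristic search, or symbolically in structured form.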

