Markov decision
Recently Published Documents


TOTAL DOCUMENTS: 3577 (FIVE YEARS: 821)

H-INDEX: 69 (FIVE YEARS: 8)

2022, Vol. 5 (1)
Author(s): Takuya Isomura, Hideaki Shimazaki, Karl J. Friston

Abstract: This work considers a class of canonical neural networks comprising rate coding models, wherein neural activity and plasticity minimise a common cost function, and plasticity is modulated with a certain delay. We show that such neural networks implicitly perform active inference and learning to minimise the risk associated with future outcomes. Mathematical analyses demonstrate that this biological optimisation can be cast as maximisation of model evidence, or equivalently minimisation of variational free energy, under the well-known form of a partially observed Markov decision process model. This equivalence indicates that the delayed modulation of Hebbian plasticity, accompanied by adaptation of firing thresholds, is a sufficient neuronal substrate to attain Bayes optimal inference and control. We corroborated this proposition using numerical analyses of maze tasks. This theory offers a universal characterisation of canonical neural networks in terms of Bayesian belief updating and provides insight into the neuronal mechanisms underlying planning and adaptive behavioural control.
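To make the belief-updating machinery concrete, the sketch below performs one step of Bayesian posterior inference over hidden states in a discrete partially observed Markov decision process. The likelihood matrix A, transition matrix B, and the flat prior are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Toy discrete POMDP generative model (all values are assumptions).
A = np.array([[0.9, 0.2],   # p(o | s): rows index observations
              [0.1, 0.8]])
B = np.array([[0.7, 0.3],   # p(s' | s): rows index next states
              [0.3, 0.7]])

def update_belief(belief, observation):
    """One step of Bayesian belief updating: predict, then correct."""
    predicted = B @ belief                  # push belief through the dynamics
    posterior = A[observation] * predicted  # weight by observation likelihood
    return posterior / posterior.sum()      # normalise

belief = np.array([0.5, 0.5])               # flat prior over hidden states
belief = update_belief(belief, observation=0)
print(belief)                                # posterior after observing o=0
```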


Author(s): Qibin Zhou, Qinggang Su, Peng Xiong

Assisted download is an effective way to address the insufficient coverage of Wi-Fi access in VANETs. To tackle the low utilization of time-space resources within blind areas and the unbalanced distribution of download services in VANETs, this paper proposes an approximately globally optimal, WebGIS-based scheme for selecting vehicles to assist downloads. The scheme uses two-dimensional matrices to define the time-space resources and the vehicle-selection behavior, applies a Markov Decision Process to allocate time-space resources within the blind area, and exploits the communication features of VANETs to simplify the vehicle-selection behavior space, thereby reducing computational complexity. In addition, the scheme uses Euclidean and Manhattan distances as the basis for vehicle selection, so that the target vehicles can effectively increase the total amount of user downloads while keeping assisted download services balanced. Experimental results show that, owing to the wide access range and platform independence of WebGIS, the total download volume increases by more than 20% when download services are relatively balanced. Moreover, since WebGIS typically requires only a Web browser (occasionally with plug-ins) on the client side, the system cost is greatly reduced.
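The distance-based selection criterion can be illustrated in a few lines. The sketch below computes Euclidean and Manhattan distances and picks the vehicle closest to a blind-area centre; the coordinates and the combined scoring rule are assumptions for illustration, not the paper's matrices.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """Grid (city-block) distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def select_vehicle(candidates, blind_area_center):
    """Pick the candidate closest to the blind area, scoring with
    both metrics (this combination rule is an assumption)."""
    def score(v):
        return euclidean(v, blind_area_center) + manhattan(v, blind_area_center)
    return min(candidates, key=score)

vehicles = [(0.0, 3.0), (2.0, 2.0), (5.0, 1.0)]
print(select_vehicle(vehicles, blind_area_center=(1.0, 1.0)))
```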


Author(s): Bingxin Yao, Bin Wu, Siyun Wu, Yin Ji, Danggui Chen, ...

In this paper, an offloading algorithm based on a Markov Decision Process (MDP) is proposed to solve the multi-objective offloading decision problem in a Mobile Edge Computing (MEC) system. The distinguishing feature of the algorithm is that the offloading decision is made with an MDP. The state space of the MDP model accounts for the number of tasks in the task queue, the number of accessible edge clouds, and the Signal-to-Noise Ratio (SNR) of the wireless channel. The offloading delay and energy consumption define the value function of the MDP model, i.e., the objective function. To maximize the value function, the value iteration algorithm is used to obtain the optimal offloading policy. According to this policy, tasks of mobile terminals (MTs) are offloaded to the edge cloud or the central cloud, or executed locally. Simulation results show that the proposed algorithm effectively reduces the offloading delay and energy consumption.
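The value iteration step named in the abstract is standard; the sketch below runs it on a toy finite MDP. The transition tensor P and reward matrix R are placeholders rather than the paper's offloading model.

```python
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9

# P[a, s, s'] = transition probability; R[s, a] = immediate reward.
# Both are toy values chosen only to make the example run.
P = np.array([
    [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],  # action 0
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],  # action 1
])
R = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])

V = np.zeros(n_states)
for _ in range(1000):
    # One-step lookahead: Q(s, a) = R(s, a) + gamma * sum_s' P(s'|s,a) V(s')
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:   # stop at convergence
        break
    V = V_new

policy = Q.argmax(axis=1)                  # greedy policy w.r.t. Q
print(V, policy)
```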


2022, Vol. 12 (1)
Author(s): J. Aznar-Poveda, A.-J. García-Sánchez, E. Egea-López, J. García-Haro

Abstract: In vehicular communications, the increase in channel load caused by excessive periodic messages (beacons) must be controlled to ensure the proper operation of safety applications and driver-assistance systems. To date, most congestion control solutions involve including additional information in the payload of transmitted messages, which may jeopardize their operation when channel conditions are unfavorable and packets are lost. This study exploits the advantages of non-cooperative, distributed beaconing allocation, in which vehicles operate independently without requiring costly road infrastructure. In particular, we formulate the beaconing rate control problem as a Markov Decision Process and solve it using approximate reinforcement learning to carry out optimal actions. The results were compared with those of traditional solutions, revealing that our approach, called SSFA, keeps a certain fraction of the channel capacity available, which guarantees the delivery of emergency-related notifications, while converging faster than other proposals. Good performance was also obtained in terms of packet delivery and collision ratios.
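As a rough illustration of approximate reinforcement learning applied to beaconing rate control, the sketch below runs Q-learning with linear function approximation on a toy channel model. The features, candidate rates, target load, and reward shape are all assumptions; SSFA's actual formulation is given in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
actions = [1, 5, 10]            # candidate beacon rates in Hz (assumed)
n_features = 2                  # e.g. channel load and normalised rate
w = np.zeros((len(actions), n_features))   # one weight vector per action
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def features(channel_load, rate):
    return np.array([channel_load, rate / 10.0])

def reward(channel_load):
    # Penalise deviation from a target load that leaves headroom for
    # emergency messages (the 0.6 target is an assumption).
    return -abs(channel_load - 0.6)

state = features(channel_load=0.5, rate=5)
for _ in range(1000):
    # Epsilon-greedy action selection over approximate Q-values.
    if rng.random() < epsilon:
        a = int(rng.integers(len(actions)))
    else:
        a = int(np.argmax(w @ state))
    # Toy channel: load grows with the chosen beacon rate, plus noise.
    load = min(1.0, 0.1 * actions[a] + 0.1 * rng.random())
    next_state = features(load, actions[a])
    # Q-learning update on the linear approximation.
    td_target = reward(load) + gamma * np.max(w @ next_state)
    w[a] += alpha * (td_target - w[a] @ state) * state
    state = next_state

print(w)
```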


Author(s): Michael Hoffman, Eunhye Song, Michael Brundage, Soundar Kumara

Abstract: When maintenance resources in a manufacturing system are limited, a challenge arises in determining how to allocate these resources among multiple competing maintenance jobs. We formulate this problem as an online prioritization problem using a Markov decision process (MDP) to model the system behavior and Monte Carlo tree search (MCTS) to seek optimal maintenance actions in various states of the system. Further, we use Case-based Reasoning (CBR) to retain and reuse search experience gathered from MCTS to reduce the computational effort needed over time and to improve decision-making efficiency. We demonstrate that our proposed method results in increased system throughput when compared to existing methods of maintenance prioritization while also reducing the time needed to identify optimal maintenance actions as more experience is gathered. This is especially beneficial in manufacturing settings where maintenance decisions must be made quickly.
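A minimal sketch of the MCTS component follows, simplified to a single-ply (flat UCB) search over maintenance actions on a toy two-machine model. The dynamics, rewards, and exploration constant are assumptions; the paper's full method couples a deeper MCTS with an MDP model and CBR.

```python
import math, random

ACTIONS = ['repair_m1', 'repair_m2']

def step(state, action):
    """Toy dynamics: repairing a machine restores its health while
    the other machine degrades; reward is total system health."""
    h1, h2 = state
    if action == 'repair_m1':
        h1, h2 = 1.0, max(0.0, h2 - 0.2)
    else:
        h1, h2 = max(0.0, h1 - 0.2), 1.0
    return (h1, h2), h1 + h2

def rollout(state, depth=10):
    """Random-policy rollout used to estimate long-run value."""
    total = 0.0
    for _ in range(depth):
        state, r = step(state, random.choice(ACTIONS))
        total += r
    return total

def flat_mcts(root, iterations=2000, c=1.4):
    visits = {a: 0 for a in ACTIONS}
    value = {a: 0.0 for a in ACTIONS}
    for n in range(1, iterations + 1):
        # UCT rule: favour actions with high mean value or few visits.
        a = max(ACTIONS, key=lambda a: float('inf') if visits[a] == 0
                else value[a] / visits[a] + c * math.sqrt(math.log(n) / visits[a]))
        next_state, r = step(root, a)
        visits[a] += 1
        value[a] += r + rollout(next_state)
    return max(ACTIONS, key=lambda a: visits[a])  # most-visited action

print(flat_mcts((0.3, 0.9)))   # machine 1 is in worse shape
```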


Author(s): Hanna Kurniawati

Planning under uncertainty is critical to robotics. The partially observable Markov decision process (POMDP) is a mathematical framework for such planning problems. POMDPs are powerful because of their careful quantification of the nondeterministic effects of actions and the partial observability of the states. But for the same reason, they are notorious for their high computational complexity and have been deemed impractical for robotics. However, over the past two decades, the development of sampling-based approximate solvers has led to tremendous advances in POMDP-solving capabilities. Although these solvers do not generate the optimal solution, they can compute good POMDP solutions that significantly improve the robustness of robotics systems within reasonable computational resources, thereby making POMDPs practical for many realistic robotics problems. This article presents a review of POMDPs, emphasizing computational issues that have hindered their practicality in robotics and ideas in sampling-based solvers that have alleviated such difficulties, together with lessons learned from applying POMDPs to physical robots. Expected final online publication date for the Annual Review of Control, Robotics, and Autonomous Systems, Volume 5 is May 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
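Many of the sampling-based solvers the review discusses represent the belief with particles rather than exact distributions. The sketch below shows that idea on a toy two-state problem; the observation model and rejection-style update are illustrative assumptions, not any specific solver's implementation.

```python
import random

STATES = ['left', 'right']

def observe(state):
    """Noisy sensor: reports the true state with probability 0.85."""
    if random.random() < 0.85:
        return state
    return 'right' if state == 'left' else 'left'

def update_particles(particles, observation, n=1000):
    """Approximate Bayes update: keep particles whose simulated
    observation matches the real one, then resample back to size n."""
    survivors = [s for s in particles if observe(s) == observation]
    if not survivors:                       # particle-depletion fallback
        survivors = STATES[:]
    return [random.choice(survivors) for _ in range(n)]

particles = [random.choice(STATES) for _ in range(1000)]
for _ in range(5):
    particles = update_particles(particles, observation='left')

# Fraction of particles in 'left' approximates the belief.
print(particles.count('left') / len(particles))
```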

