Performance Evaluation of Tile Coding in Reinforcement Learning

Author(s):
Kenji Ota
Tomoko Ozeki
2021

Author(s):
Qi Zhang
Jiaqiao Hu

Many systems arising in applications from engineering design, manufacturing, and healthcare require the use of simulation optimization (SO) techniques to improve their performance. In “Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization,” Q. Zhang and J. Hu propose a randomized approach that integrates ideas from actor-critic reinforcement learning within a class of adaptive search algorithms for solving SO problems. The approach fully retains the previous simulation data and incorporates them into an approximation architecture to exploit knowledge of the objective function in searching for improved solutions. The authors provide a finite-time analysis for the method when only a single simulation observation is collected at each iteration. The method works well on a diverse set of benchmark problems and has the potential to yield good performance for complex problems using expensive simulation experiments for performance evaluation.


Author(s):  
Michael Robin Mitchley

Reinforcement learning is a machine learning framework in which an agent learns to perform a task by maximising the total reward it receives for selecting actions in each state. The policy mapping states to actions that the agent learns is represented either explicitly, or implicitly through a value function. It is common in reinforcement learning to discretise a continuous state space using tile coding or binary features. We prove an upper bound on the performance of discretisation for direct policy representation or value function approximation.
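The abstract does not specify a particular tile-coding construction; as a minimal illustrative sketch (not the scheme analysed in the paper), the following NumPy function implements standard tile coding: several slightly offset tilings each map a continuous state to exactly one active binary feature. The function name, grid sizes, and offsets are all assumptions for illustration.

```python
import numpy as np

def tile_code(state, low, high, n_tilings=4, n_tiles=8):
    """Map a continuous state to a binary feature vector using
    several offset tilings (standard tile coding)."""
    state = np.asarray(state, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    dims = state.size
    tiles_per_tiling = n_tiles ** dims
    features = np.zeros(n_tilings * tiles_per_tiling)
    scaled = (state - low) / (high - low)          # normalise to [0, 1]
    for t in range(n_tilings):
        offset = t / (n_tilings * n_tiles)         # shift each tiling slightly
        idx = np.floor((scaled + offset) * n_tiles).astype(int)
        idx = np.clip(idx, 0, n_tiles - 1)         # keep boundary states in range
        flat = np.ravel_multi_index(idx, (n_tiles,) * dims)
        features[t * tiles_per_tiling + flat] = 1.0
    return features

# one active tile per tiling, so exactly 4 ones in a 256-dim vector
phi = tile_code([0.3, -0.7], low=[-1, -1], high=[1, 1])
```

A linear value function over `phi` then discretises the continuous state space, which is exactly the representation whose performance the discretisation bound addresses.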


2021
Vol 2021, pp. 1-9
Author(s):  
Shitong Ye ◽  
Lijuan Xu ◽  
Xiaomin Li

This paper studies vehicular ad hoc (self-organizing) networks and, to address the routing problems that arise when no roadside auxiliary communication units are available to assist, proposes a vehicle-network routing algorithm based on deep reinforcement learning. To cope with the large number of vehicle nodes and the multiple performance evaluation indexes in a vehicular ad hoc network, the paper proposes a time prediction model for vehicle communication that reduces the probability of communication interruption, and develops a vehicle-network routing technique based on deep reinforcement learning. This technique can quickly select routing nodes and plan the optimal route according to the required performance evaluation indicators.
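The abstract does not give the network architecture or state encoding, so the following is only a loose tabular sketch of the underlying idea of learning next-hop selection with reinforcement learning. The topology, per-hop delays, and hyperparameters are all invented for illustration; a deep variant would replace the Q-table with a neural network.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical toy topology: adjacency list of vehicle nodes, node 4 is the destination
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: []}
# hypothetical per-hop delay (lower is better); reward is the negative delay
delay = {(0, 1): 2.0, (0, 2): 1.0, (1, 0): 2.0, (1, 2): 1.0, (1, 3): 3.0,
         (2, 0): 1.0, (2, 1): 1.0, (2, 3): 1.0, (3, 1): 3.0, (3, 2): 1.0,
         (3, 4): 1.0}

Q = {(s, a): 0.0 for s in neighbors for a in neighbors[s]}
alpha, gamma, eps = 0.2, 0.95, 0.2

for _ in range(3000):
    s = int(rng.integers(0, 4))                      # random non-destination start
    while s != 4:
        acts = neighbors[s]
        if rng.random() < eps:
            a = acts[rng.integers(len(acts))]        # explore
        else:
            a = max(acts, key=lambda n: Q[(s, n)])   # exploit
        r = -delay[(s, a)]                           # penalise slow hops
        nxt = max((Q[(a, n)] for n in neighbors[a]), default=0.0)
        Q[(s, a)] += alpha * (r + gamma * nxt - Q[(s, a)])
        s = a

# extract the greedy route from node 0 to the destination
route, s = [0], 0
while s != 4 and len(route) < 10:                    # cap in case Q has not converged
    s = max(neighbors[s], key=lambda n: Q[(s, n)])
    route.append(s)
```

After training, the greedy policy routes around the high-delay links, which is the "quickly select routing nodes" behaviour the paper pursues at much larger scale with a learned model.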


Author(s):  
Lei Le ◽  
Raksha Kumaraswamy ◽  
Martha White

A variety of representation learning approaches have been investigated for reinforcement learning; much less attention, however, has been given to investigating the utility of sparse coding. Outside of reinforcement learning, sparse coding representations have been widely used, with non-convex objectives that result in discriminative representations. In this work, we develop a supervised sparse coding objective for policy evaluation. Despite the non-convexity of this objective, we prove that all local minima are global minima, making the approach amenable to simple optimization strategies. We empirically show that it is key to use a supervised objective, rather than the more straightforward unsupervised sparse coding approach. We then compare the learned representations to a canonical fixed sparse representation, called tile-coding, demonstrating that the sparse coding representation outperforms a wide variety of tile-coding representations.
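The abstract does not reproduce the supervised sparse coding objective, so the following is only a toy sketch of the general idea under stated assumptions: sparse codes obtained by soft thresholding are trained jointly against a reconstruction term (unsupervised sparse coding) and a value-prediction term (the supervised part). All data, dimensions, and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(z, lam):
    """Proximal operator of the L1 norm; zeroes small entries, inducing sparsity."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# hypothetical toy data: states X and targets y standing in for value estimates
X = rng.normal(size=(200, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=200)

n, d = X.shape
k, lam, lr = 12, 0.1, 0.02
D = 0.3 * rng.normal(size=(k, d))   # dictionary: codes reconstruct X as H @ D
w = np.zeros(k)                      # value weights on the sparse codes

mse0 = np.mean(y ** 2)               # error of the initial (zero) predictor
for _ in range(500):
    H = soft_threshold(X @ D.T, lam)     # sparse codes for each state
    recon_err = H @ D - X                # unsupervised (reconstruction) term
    val_err = H @ w - y                  # supervised (value-prediction) term
    D -= lr * (H.T @ recon_err) / n      # joint gradient updates
    w -= lr * (H.T @ val_err) / n

mse = np.mean((soft_threshold(X @ D.T, lam) @ w - y) ** 2)
```

The supervised term is what couples the learned representation to policy evaluation; dropping it recovers the plain unsupervised sparse coding baseline that the experiments show is weaker.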

