Abstract. The exploration-exploitation dilemma is one of the fundamental problems in reinforcement learning and is widely regarded as mathematically intractable. In this paper we prove that the key to a tractable solution is counterintuitive: to explore without considering reward value. We redefine exploration as having no objective other than learning itself. Through theory and experiments we prove that this view leads to a perfect deterministic solution to the dilemma, based on the famous win-stay, lose-switch strategy from game theory. This solution rests on our conjecture that information and reward are equally valuable for survival. Besides offering a mathematical answer, this view is also more robust than traditional approaches, as it succeeds in the difficult conditions where rewards are sparse, deceptive, or non-stationary.
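To make the named strategy concrete, the following is a minimal sketch of a win-stay, lose-switch rule used to arbitrate between exploring and exploiting. The function name, the two-option framing, and the payoff boundary `eta` are illustrative assumptions for exposition, not the paper's exact algorithm.

```python
def wsls_policy(last_choice, last_payoff, eta=0.0):
    """Deterministic win-stay, lose-switch arbitration.

    last_choice: "exploit" (act for reward) or "explore" (act for information).
    last_payoff: the value that choice produced on the last step
                 (reward if exploiting, information gained if exploring).
    eta:         boundary separating a "win" from a "loss" (assumed here).
    """
    if last_payoff > eta:
        # Win: stay with whatever we did last.
        return last_choice
    # Loss: deterministically switch to the other option.
    return "explore" if last_choice == "exploit" else "exploit"


# Usage: exploiting paid off, so keep exploiting; once reward dries up,
# switch to pure information-seeking, and vice versa.
print(wsls_policy("exploit", last_payoff=1.0))  # exploit
print(wsls_policy("exploit", last_payoff=0.0))  # explore
print(wsls_policy("explore", last_payoff=0.0))  # exploit
```

Because the rule is deterministic, the agent never mixes reward and information into a single objective; it simply commits to one goal at a time and switches only on failure, which is the spirit of the arbitration described above.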