opponent modeling
Recently Published Documents

Total documents: 42 (last five years: 9)
H-index: 8 (last five years: 1)

2022 · Vol. 73 · pp. 277-327
Author(s): Samer Nashed, Shlomo Zilberstein

Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.
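To make the components discussed in the survey concrete, the sketch below shows a minimal, illustrative opponent-model interface: observations of the opponent are accumulated and then used to predict its next action. The class and method names are assumptions for illustration and are not taken from the survey.

```python
from collections import Counter, defaultdict


class TabularOpponentModel:
    """Minimal illustrative opponent model: per-state empirical action frequencies."""

    def __init__(self):
        # state -> Counter of observed opponent actions in that state
        self.counts = defaultdict(Counter)

    def observe(self, state, opponent_action):
        """Record one observed opponent action (the learning step)."""
        self.counts[state][opponent_action] += 1

    def predict(self, state):
        """Return a dict mapping opponent actions to estimated probabilities."""
        seen = self.counts[state]
        total = sum(seen.values())
        if total == 0:
            return {}  # no observations yet; the caller should fall back to a prior
        return {action: n / total for action, n in seen.items()}
```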


Games · 2021 · Vol. 12 (3) · pp. 70
Author(s): Erik Brockbank, Edward Vul

In simple dyadic games such as rock, paper, scissors (RPS), people exhibit peculiar sequential dependencies across repeated interactions with a stable opponent. These regularities seem to arise from a mutually adversarial process in which each player tries to outwit the other. What underlies this process, and what are its limits? Here, we offer a novel framework for formally describing and quantifying human adversarial reasoning in the rock, paper, scissors game. We show that this framework enables a precise characterization both of the complexity of the patterned behaviors that people exhibit themselves and of those they appear to exploit in others, which together allow for a quantitative understanding of human opponent modeling abilities. We apply these tools to an experiment in which people played 300 rounds of RPS in stable dyads. We find that although people exhibit very complex move dependencies, they cannot exploit these dependencies in their opponents, indicating a fundamental limitation in people's capacity for adversarial reasoning. Taken together, the results presented here show how the rock, paper, scissors game allows for a precise formalization of human adaptive reasoning abilities.
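As a concrete illustration of the kind of sequential dependency an opponent model could exploit in repeated RPS, the sketch below predicts the opponent's next move from first-order transition frequencies and plays the counter. The predictors analyzed in the paper are richer; this simplified example is an illustrative assumption, not the study's method.

```python
import random
from collections import Counter, defaultdict

# Maps each move to the move that beats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def exploit_transitions(opponent_moves):
    """Predict the opponent's next move from first-order transition counts and counter it."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(opponent_moves, opponent_moves[1:]):
        transitions[prev][nxt] += 1
    last = opponent_moves[-1] if opponent_moves else None
    if last is None or not transitions[last]:
        return random.choice(list(BEATS))   # no usable history: play uniformly at random
    predicted = transitions[last].most_common(1)[0][0]
    return BEATS[predicted]                 # play the move that beats the predicted move
```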


2021 · Vol. 11 (13) · pp. 6022
Author(s): Victor Sanchez-Anguix, Okan Tunalı, Reyhan Aydoğan, Vicente Julian

In the last few years, we have witnessed a growing body of literature on automated negotiation. Most negotiating agents are either purely self-interested, maximizing their own utility function, or assume a cooperative stance on the part of all parties involved in the negotiation. We argue that, while optimizing one's own utility function is essential, agents in a society should not ignore the opponent's utility in the final agreement, as accounting for it improves the agent's long-term prospects in the system. This article investigates whether it is possible to design a social agent (i.e., one that aims to optimize both sides' utility functions) that still performs efficiently in an agent society. Accordingly, we propose a social agent supported by a portfolio of strategies, a novel tit-for-tat concession mechanism, and a frequency-based opponent modeling mechanism, capable of adapting its behavior according to the opponent's behavior and the state of the negotiation. The results show that the proposed social agent not only performs well on social metrics such as the distance to the Nash bargaining point or the Kalai point, but also constitutes a pure and mixed equilibrium strategy in some realistic agent societies.
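As one possible reading of the frequency-based opponent modeling component, the sketch below scores how attractive a candidate bid is to the opponent from how often its issue values have appeared in the opponent's past offers. The exact weighting used by the proposed agent may differ; the function name and scoring rule are illustrative assumptions.

```python
from collections import Counter, defaultdict


def estimate_opponent_utility(bid, opponent_bids):
    """Score a candidate bid (dict issue -> value) by the relative frequency of its
    issue values in the opponent's past offers; higher scores suggest the opponent
    would find the bid more acceptable."""
    value_counts = defaultdict(Counter)
    for past_bid in opponent_bids:
        for issue, value in past_bid.items():
            value_counts[issue][value] += 1
    n = max(len(opponent_bids), 1)
    score = sum(value_counts[issue][value] / n for issue, value in bid.items())
    return score / max(len(bid), 1)   # normalize to [0, 1]
```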


Author(s): Ying Wen, Yaodong Yang, Jun Wang

Most multi-agent reinforcement learning (MARL) models assume perfectly rational agents, a property rarely met in real-world decision making owing to individuals' cognitive limitations and/or the intractability of the decision problem. In this paper, we introduce generalized recursive reasoning (GR2) as a novel framework for modeling agents with different hierarchical levels of rationality; the framework enables agents to exhibit varying levels of "thinking" ability, thereby allowing higher-level agents to best respond to various less sophisticated learners. We contribute both theoretically and empirically. On the theory side, we formulate the hierarchical GR2 framework through probabilistic graphical models and prove the existence of a perfect Bayesian equilibrium. Within GR2, we propose a practical actor-critic solver and, through Lyapunov analysis, demonstrate that it converges to a stationary point in two-player games. On the empirical side, we validate our findings on a variety of MARL benchmarks: we first illustrate the hierarchical thinking process on the Keynes Beauty Contest and then demonstrate significant improvements over state-of-the-art opponent modeling baselines on normal-form games and the cooperative navigation benchmark.
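GR2 models agents that reason about opponents who themselves reason at lower levels. A hard-max, level-k caricature of this recursion for a two-player normal-form game is sketched below; GR2 itself uses probabilistic (soft) responses within a graphical-model formulation, so this simplification is an illustrative assumption only.

```python
import numpy as np


def level_k_policy(payoff_self, payoff_opponent, k):
    """Return (own action distribution, assumed opponent distribution).

    payoff_self[i, j] / payoff_opponent[i, j] are the payoffs to self / opponent
    when self plays action i and the opponent plays action j.
    """
    n_self, n_opp = payoff_self.shape
    if k == 0:
        # Level 0: no strategic reasoning, play (and assume) uniform random.
        return np.full(n_self, 1.0 / n_self), np.full(n_opp, 1.0 / n_opp)
    # Model the opponent as a level-(k-1) reasoner of the same kind
    # (transpose swaps the role of "self" and "opponent").
    opp_policy, _ = level_k_policy(payoff_opponent.T, payoff_self.T, k - 1)
    expected = payoff_self @ opp_policy        # expected payoff of each own action
    best = np.zeros(n_self)
    best[np.argmax(expected)] = 1.0            # deterministic best response
    return best, opp_policy
```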


Author(s): Zheng Tian, Ying Wen, Zhichen Gong, Faiz Punakkath, Shihao Zou, ...

In the single-agent setting, reinforcement learning (RL) tasks can be cast as an inference problem by introducing a binary random variable o that stands for "optimality". In this paper, we redefine this binary random variable o in the multi-agent setting and formalize multi-agent reinforcement learning (MARL) as probabilistic inference. We derive a variational lower bound on the likelihood of achieving optimality and name it the Regularized Opponent Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel perspective on opponent modeling and show, both theoretically and empirically, how it can improve the performance of learning agents in cooperative games. To optimize ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of convergence. We then extend the exact algorithm to complex environments by proposing an approximate version, ROMMEO-AC. We evaluate the two algorithms on a challenging iterated matrix game and a differential game, respectively, and show that they outperform strong MARL baselines.
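The abstract names, but does not spell out, the ROMMEO bound. Schematically, a maximum-entropy objective with a regularized opponent model has the shape below, where π is the agent's policy conditioned on the modeled opponent action, ρ is the learned opponent model, and P is the observed opponent behavior; this is an illustrative sketch of the general form, not the paper's exact expression.

```latex
% Schematic form only (illustrative); see the paper for the exact bound.
J(\pi,\rho) \;=\; \mathbb{E}\left[\sum_{t} r_t
  \;+\; \mathcal{H}\big(\pi(a_t \mid s_t, a^{-}_t)\big)
  \;-\; D_{\mathrm{KL}}\big(\rho(a^{-}_t \mid s_t) \,\|\, P(a^{-}_t \mid s_t)\big)\right]
```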


Author(s): Victor Gallego, Roi Naveiro, David Rios Insua

In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward-generating process. However, in such non-stationary environments, Q-learning leads to suboptimal results (Busoniu, Babuska, and De Schutter 2010). Previous game-theoretic approaches to this problem have focused on modeling the whole multi-agent system as a game. Instead, we address the problem of prescribing decisions to a single agent (the supported decision maker, DM) against a potential threat model (the adversary). We augment the MDP to account for this threat, introducing Threatened Markov Decision Processes (TMDPs). Furthermore, we propose a level-k thinking scheme resulting in a new learning framework for dealing with TMDPs. We empirically test our framework, showing the benefits of opponent modeling.
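To illustrate the general idea of augmenting Q-learning with an explicit opponent model (rather than the paper's specific level-k scheme), the sketch below keeps a Q-table indexed by both the DM's and the adversary's actions and acts on the expectation under a frequency-based estimate of the adversary's behavior. All names, the Laplace smoothing, and the update rule are illustrative assumptions.

```python
# Q: defaultdict(float) keyed by (state, own_action, adversary_action);
# counts: defaultdict(Counter) recording the adversary's observed actions per state.
import random


def opponent_probs(counts, state, opp_actions):
    """Laplace-smoothed estimate of the adversary's action distribution in a state."""
    total = sum(counts[state].values())
    return {b: (counts[state][b] + 1) / (total + len(opp_actions)) for b in opp_actions}


def choose_action(Q, counts, state, actions, opp_actions, epsilon=0.1):
    """Epsilon-greedy choice of the action maximizing expected Q under the opponent model."""
    if random.random() < epsilon:
        return random.choice(actions)
    probs = opponent_probs(counts, state, opp_actions)
    return max(actions, key=lambda a: sum(probs[b] * Q[(state, a, b)] for b in opp_actions))


def q_update(Q, counts, state, a, b, reward, next_state, actions, opp_actions,
             alpha=0.1, gamma=0.95):
    """One tabular update of Q(state, a, b) after observing the adversary's action b."""
    counts[state][b] += 1
    probs = opponent_probs(counts, next_state, opp_actions)
    next_value = max(
        sum(probs[nb] * Q[(next_state, na, nb)] for nb in opp_actions) for na in actions
    )
    Q[(state, a, b)] += alpha * (reward + gamma * next_value - Q[(state, a, b)])
```

Here `actions` and `opp_actions` are the finite action sets of the DM and the adversary; with a fixed, fully predictable adversary this reduces to ordinary tabular Q-learning over an augmented table.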


Author(s): José Antonio Iglesias, Agapito Ledezma, Araceli Sanchis
