stochastic game
Recently Published Documents

TOTAL DOCUMENTS: 315 (FIVE YEARS: 83)
H-INDEX: 24 (FIVE YEARS: 6)

Sensors, 2022, Vol 22 (2), pp. 448
Author(s): Yumi Kim, Mincheol Paik, Bokyeong Kim, Haneul Ko, Seung-Yeon Kim

In a non-orthogonal multiple access (NOMA) environment, an Internet of Things (IoT) device achieves a high data rate by increasing its transmission power. However, excessively high transmission power can cause an energy outage at the device and degrade the signal-to-interference-plus-noise ratio of neighboring IoT devices. In this paper, we propose a neighbor-aware NOMA scheme (NA-NOMA) in which each IoT device decides, in a distributed manner at each time epoch, whether to transmit data to the base station and at what transmission power, taking into account its own energy level and the other devices' transmission powers. To maximize the aggregate data rate of the IoT devices while keeping the average energy outage probability acceptable, a constrained stochastic game model is formulated, and its solution is obtained using a best-response-dynamics-based algorithm. Evaluation results show that NA-NOMA can increase the average data rate by up to 22% compared with a probability-based scheme while providing a sufficiently low energy outage probability (e.g., 0.05).
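The distributed power-selection loop described above can be sketched with best-response dynamics on a toy model. The power levels, channel gains, noise floor, and linear energy penalty below are illustrative assumptions, not the paper's parameters.

```python
import math

# Toy best-response dynamics for a transmission-power game.
# All constants are illustrative assumptions, not the paper's model.
POWER_LEVELS = [0.0, 0.5, 1.0]   # 0.0 means "do not transmit"
GAIN = [1.0, 0.8, 0.6]           # assumed channel gains per device
NOISE = 0.1

def rate(i, powers):
    """Shannon-style rate of device i under an SINR model."""
    interference = sum(GAIN[j] * powers[j] for j in range(len(powers)) if j != i)
    sinr = GAIN[i] * powers[i] / (NOISE + interference)
    return math.log2(1.0 + sinr)

def utility(i, powers, energy_cost=0.3):
    """Rate minus a linear energy penalty (a stand-in for the energy constraint)."""
    return rate(i, powers) - energy_cost * powers[i]

def best_response_dynamics(n=3, rounds=50):
    """Iterate best responses; a fixed point is a pure-strategy Nash equilibrium."""
    powers = [0.5] * n
    for _ in range(rounds):
        changed = False
        for i in range(n):
            best = max(POWER_LEVELS,
                       key=lambda p: utility(i, powers[:i] + [p] + powers[i + 1:]))
            if best != powers[i]:
                powers[i] = best
                changed = True
        if not changed:
            break
    return powers
```

Sequential updates of this kind stop as soon as no device can improve its utility unilaterally, which is the equilibrium notion the constrained game model targets.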


2021, Vol 2021, pp. 1-13
Author(s): Zenan Wu, Liqin Tian, Yan Wang, Jianfei Xie, Yuquan Du, ...

Most existing stochastic game models of network attack and defense are based on the assumption of complete information, which limits their applicability. Starting from the actual modeling requirements of the attack-defense process, a network defense decision-making model combining an incomplete-information stochastic game with deep reinforcement learning is proposed. The model treats the incomplete information of the attacker and the defender as the defender's uncertainty about the attacker's type, and uses the Double Deep Q-Network algorithm to sidestep the difficulty of determining the network state transition probabilities, so that the network system can dynamically adjust its defense strategy. Finally, a simulation experiment was performed on the proposed model. The results show that, under the same experimental conditions, the proposed method converges faster than other methods when solving for the defense equilibrium strategy. The model fuses traditional methods with artificial intelligence techniques and provides a new research direction for applying artificial intelligence in the field of cyberspace security.
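The Double Deep Q-Network target that the model relies on decouples action selection (online network) from action evaluation (target network). A minimal sketch, with tabular Q-functions standing in for the paper's neural networks and hypothetical states and actions:

```python
# Double DQN target: the online Q-function picks the next action,
# the target Q-function evaluates it. Tabular dicts stand in for the
# neural networks; GAMMA is an illustrative discount factor.
GAMMA = 0.9

def double_dqn_target(reward, next_state, q_online, q_target, actions):
    # Online network selects the action...
    best_action = max(actions, key=lambda a: q_online[(next_state, a)])
    # ...target network evaluates it. This split reduces the value
    # overestimation of plain Q-learning's max over a single network.
    return reward + GAMMA * q_target[(next_state, best_action)]
```

Note that a plain DQN target would take the max of `q_target` directly; the decoupled form can yield a lower, less biased target when the two Q-functions disagree.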


Sensors, 2021, Vol 21 (24), pp. 8264
Author(s): Seung-Yeon Kim, Yi-Kang Kim

An edge computing system is a distributed computing framework that provides execution resources, such as computation and storage, for networked applications close to the end nodes. An unmanned aerial vehicle (UAV)-aided edge computing system can provide a flexible configuration for mobile ground nodes (MGNs). However, edge computing systems still require higher guaranteed reliability of computational task completion and more efficient energy management before they see widespread use. To address these problems, we propose an energy-efficient UAV-based edge computing system with energy harvesting capability. In this system, an MGN requests computing service from multiple UAVs, and geographically proximate UAVs decide in a distributed manner whether or not to perform the data processing. To minimize the energy consumption of the UAVs while maintaining a guaranteed level of reliability for task completion, we formulate a constrained stochastic game model for the proposed system and apply a best-response algorithm to obtain a multi-policy constrained Nash equilibrium. The results show that our system achieves a longer life cycle than an individual computing scheme while maintaining a sufficiently high probability of successfully completing computations.
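The distributed accept-or-decline decision among nearby UAVs can be sketched as a best-response loop on a toy single-task game. The per-UAV energy costs and the shared penalty for an unserved task (a crude proxy for the reliability constraint) are assumptions for illustration, not the paper's constrained formulation.

```python
# Toy distributed accept/decline game among UAVs for one offloaded task.
# Energy costs and the miss penalty are illustrative assumptions.
def uav_best_responses(energy_cost, miss_penalty=5.0, rounds=20):
    n = len(energy_cost)
    accept = [False] * n
    for _ in range(rounds):
        changed = False
        for i in range(n):
            others = any(accept[j] for j in range(n) if j != i)
            # Accepting costs energy; if nobody serves the task, every
            # UAV pays the shared reliability penalty.
            u_accept = -energy_cost[i]
            u_decline = 0.0 if others else -miss_penalty
            new = u_accept > u_decline
            if new != accept[i]:
                accept[i] = new
                changed = True
        if not changed:          # fixed point: no UAV wants to deviate
            break
    return accept
```

With sequential updates the loop settles on a profile where the task is served at minimal energy cost whenever serving it is cheaper than the penalty, mirroring the energy-versus-reliability trade-off above.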


2021, Vol 10, pp. 13-32
Author(s): Petro Kravets, Volodymyr Pasichnyk, Mykola Prodaniuk, ...

This paper proposes a new application of the stochastic game model to the problem of self-organization of a Hamiltonian cycle in a graph. To do this, game agents are placed at the vertices of an undirected graph; their pure strategies are the options of choosing one of the incident edges. A random selection of strategies by all agents forms a set of local paths beginning at each vertex of the graph. Current player payments are defined as loss functions that depend on the strategies of neighboring players controlling adjacent vertices. These functions are formed from a penalty for choosing opposing strategies and a penalty for strategies that reduce the length of the local path. Random selection of the players' pure strategies aims to minimize their average loss functions. Sequences of pure strategies are generated by a discrete distribution built on dynamic vectors of mixed strategies. The elements of the mixed-strategy vectors are the probabilities of choosing the corresponding pure strategies, which adaptively take current losses into account. The mixed-strategy vectors are formed by a Markov recurrent method, constructed using the gradient method of stochastic approximation. During the game, the method increases the probabilities of choosing those pure strategies that decrease the average loss functions. For the given methods of forming current payments, the result of the stochastic game is the formation of self-organization patterns in the form of cyclically oriented strategies of the game agents. Convergence of the recurrent method to collectively optimal solutions is ensured by observing the fundamental conditions of stochastic approximation. The game problem is extended to random graphs: vertices are assigned probabilities of recovery failures, which change the structure of the graph at each step of the game. Realizations of the random graph are adaptively taken into account when searching for Hamiltonian cycles; increasing the failure probability slows the convergence of the stochastic game. Computer simulation of the stochastic game yielded patterns of self-organization of the agents' strategies in the form of several local cycles or a global Hamiltonian cycle, depending on how the players' current losses are formed. The reliability of the experimental studies is confirmed by the repeatability of the self-organization patterns across different sequences of random variables. The results can be used in practice for game-based solving of NP-complete problems, for transport and communication problems, for building authentication protocols in distributed information systems, and for collective decision-making under uncertainty.
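The gradient-style stochastic-approximation step on a mixed-strategy vector can be sketched as follows. The step size, the multiplicative loss penalty, and the clip-and-renormalize simplex projection are illustrative choices, not the authors' exact recurrent method.

```python
# One stochastic-approximation update of a mixed strategy: after observing
# the loss of the pure strategy just played, push probability mass away
# from it in proportion to that loss, then project back onto the simplex.
# Step size and projection are illustrative assumptions.
def update_mixed_strategy(p, chosen, loss, step=0.1):
    q = list(p)
    # Penalize the chosen strategy proportionally to its observed loss.
    q[chosen] -= step * loss * q[chosen]
    # Project back onto the probability simplex (clip + renormalize).
    q = [max(x, 1e-6) for x in q]
    total = sum(q)
    return [x / total for x in q]
```

Repeated application with per-strategy losses shifts probability toward low-loss strategies, which is the mechanism behind the self-organization patterns described above.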


Author(s): Yu.V. Averboukh

The paper is concerned with approximating the value function of a zero-sum differential game with minimal cost, i.e., a differential game whose payoff functional is determined by minimizing some quantity along the trajectory, by the values of continuous-time stochastic games in which stopping is governed by one player. Notice that the value function of the auxiliary continuous-time stochastic game is described by the Isaacs–Bellman equation with additional inequality constraints. The Isaacs–Bellman equation is a parabolic PDE in the case of a stochastic differential game, and it takes the form of a system of ODEs in the case of a continuous-time Markov game. The approximation developed in the paper is based on the concept of the stochastic guide, first proposed by Krasovskii and Kotelnikova.


2021, pp. 102480
Author(s): Xiaohu Liu, Hengwei Zhang, Shuqin Dong, Yuchen Zhang

Author(s): Petr Tomášek, Karel Horák, Aditya Aradhye, Branislav Bošanský, Krishnendu Chatterjee

We study the two-player zero-sum extension of the partially observable stochastic shortest-path problem, in which one agent has only partial information about the environment. We formulate this problem as a partially observable stochastic game (POSG): given a set of target states and negative rewards for each transition, the player with imperfect information maximizes the expected undiscounted total reward until a target state is reached; the second player, who has perfect information, aims for the opposite. We base our formalism on POSGs with one-sided observability (OS-POSGs) and make the following contributions: (1) we introduce a novel heuristic search value iteration algorithm that iteratively solves depth-limited variants of the game, (2) we derive a bound on the depth guaranteeing arbitrary precision, (3) we propose a novel upper-bound estimation that allows early termination, and (4) we experimentally evaluate the algorithm on a pursuit-evasion game.
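Value-iteration-style algorithms for zero-sum stochastic games repeatedly solve a stage matrix game. As a minimal, fully observable stand-in (not the authors' OS-POSG algorithm), fictitious play approximates the value of a zero-sum matrix game; by Robinson's theorem, the empirical strategies converge to equilibrium in zero-sum games.

```python
# Fictitious play for a zero-sum matrix game (row player maximizes).
# Each player repeatedly best-responds to the opponent's empirical
# mixed strategy; the payoff under the empirical strategies converges
# to the value of the game. Iteration count is an illustrative choice.
def fictitious_play_value(payoff, iterations=20000):
    m, n = len(payoff), len(payoff[0])
    row_counts, col_counts = [0] * m, [0] * n
    row_counts[0] += 1           # arbitrary initial plays
    col_counts[0] += 1
    for _ in range(iterations):
        # Best responses against the opponent's empirical frequencies.
        row_br = max(range(m),
                     key=lambda i: sum(payoff[i][j] * col_counts[j] for j in range(n)))
        col_br = min(range(n),
                     key=lambda j: sum(payoff[i][j] * row_counts[i] for i in range(m)))
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    total_r, total_c = sum(row_counts), sum(col_counts)
    # Expected payoff under the product of empirical mixed strategies.
    return sum(payoff[i][j] * row_counts[i] * col_counts[j]
               for i in range(m) for j in range(n)) / (total_r * total_c)
```

For matching pennies, whose value is 0, the estimate approaches 0 as the empirical frequencies of both players approach the uniform mixed strategy.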

