A Markovian Mechanism of Proportional Resource Allocation in the Incentive Model as a Dynamic Stochastic Inverse Stackelberg Game

This paper considers resource allocation among producers (agents) when the Principal knows nothing about their cost functions, while the agents have Markovian awareness of the Principal's strategies. The model is a dynamic stochastic inverse Stackelberg game. We propose a Q-learning-based algorithm for solving this game. The associated Bellman equations involve functions of a single variable for both the Principal and the agents. The results are illustrated by numerical examples.
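
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the two ingredients it names: a proportional allocation rule and a tabular Q-learning update. The function names, the equal-split fallback for all-zero bids, and the table layout (`q[state][action]`) are illustrative assumptions, not the authors' formulation.

```python
def proportional_shares(bids, total=1.0):
    """Split `total` among agents in proportion to their non-negative bids."""
    bid_sum = sum(bids)
    if bid_sum == 0:
        # Assumption: fall back to an equal split when nobody bids.
        return [total / len(bids)] * len(bids)
    return [total * b / bid_sum for b in bids]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q(state, action)."""
    # Bootstrapped estimate of the best value reachable from the next state.
    best_next = max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


if __name__ == "__main__":
    print(proportional_shares([1, 3], total=8.0))
    q_table = [[0.0, 0.0], [0.0, 1.0]]  # hypothetical 2-state, 2-action table
    print(q_update(q_table, 0, 1, 1.0, 1, alpha=0.5, gamma=0.5))
```

In a mechanism of this kind, each agent would presumably keep its own table and treat its share from `proportional_shares` as part of the reward signal; how the paper couples the two is not stated in the abstract.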

Game-Based Resource Allocation Mechanism in B5G HetNets with Incomplete Information

Applied Sciences ◽

10.3390/app10051557 ◽

2020 ◽

Vol 10 (5) ◽

pp. 1557

Author(s):

Weijia Feng ◽

Xiaohui Li

Keyword(s):

Resource Allocation ◽

Incomplete Information ◽

Stackelberg Game ◽

Superior Performance ◽

Channel Gain ◽

Mobile Users ◽

Pricing Strategies ◽

Game Equilibrium ◽

Allocation Process ◽

Realistic Situation

Ultra-dense and highly heterogeneous network (HetNet) deployments make the allocation of limited wireless resources among ubiquitous Internet of Things (IoT) devices an unprecedented challenge in 5G and beyond (B5G) networks. The interactions between mobile users and HetNets remain to be analyzed: mobile users choose the best networks to access, and the HetNets adopt suitable methods for allocating their own network resources. Existing work typically assumes that mobile users and HetNets have complete information about each other. This is impractical in realistic situations, where important individual information is protected and not disclosed to others. This paper proposes a distributed pricing and resource allocation scheme based on a Stackelberg game with incomplete information. The proposed model is more practical because it addresses the difficulty of acquiring important information about either mobile users or HetNets during the resource allocation process. Because channel gain information is unknown, the follower game among users is modeled as an incomplete information game in which channel gain is the type of each player. Given the pricing strategies of the networks, users adjust their bandwidth requests to maximize their expected utility. Based on the sub-equilibrium obtained in the follower game, the networks then update their pricing strategies to be optimal. The existence and uniqueness of the Bayesian Nash equilibrium are proved. A probabilistic prediction method makes the incomplete information game feasible to solve, and a reverse deduction method is used to obtain the game equilibrium. Simulation results show the superior performance of the proposed method.
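
The leader-follower reasoning described above can be sketched as a small backward-induction computation. This is a toy illustration under stated assumptions, not the paper's model: a single network (leader) and a single user (follower), a logarithmic utility `gain * log(1 + b) - price * b`, a discrete distribution over the user's channel-gain type, and a grid search over prices are all hypothetical choices made for the example.

```python
def follower_demand(price, gain):
    """Best response of a user with utility gain*log(1 + b) - price*b."""
    # First-order condition gain / (1 + b) = price, clipped at zero demand.
    return max(gain / price - 1.0, 0.0)


def expected_revenue(price, types, unit_cost):
    """Leader's expected profit over the follower's type distribution."""
    return sum(prob * (price - unit_cost) * follower_demand(price, gain)
               for gain, prob in types)


def leader_price(types, unit_cost, grid):
    """Backward induction: pick the grid price that maximizes expected profit."""
    return max(grid, key=lambda p: expected_revenue(p, types, unit_cost))


if __name__ == "__main__":
    # Hypothetical prior: the channel gain is 2.0 or 6.0 with equal probability.
    types = [(2.0, 0.5), (6.0, 0.5)]
    grid = [0.5 + 0.01 * i for i in range(600)]
    p_star = leader_price(types, unit_cost=1.0, grid=grid)
    print(p_star, [follower_demand(p_star, g) for g, _ in types])
```

The leader never observes the realized gain; it optimizes against the expected best response, which is the sense in which the game is solved in reverse. With a single known type `g` and unit cost `c`, this toy model has the closed-form optimum `p = sqrt(g * c)`, which gives a quick sanity check on the grid search.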

A Q-learning based Resource Allocation for Downlink Non-Orthogonal Multiple Access Systems Considering QoS

IEEE Access ◽

10.1109/access.2021.3080283 ◽

2021 ◽

pp. 1-1

Author(s):

Qi Zhai ◽

Miodrag Bolic ◽

Yong Li ◽

Wei Cheng ◽

Chenxi Liu

Keyword(s):

Resource Allocation ◽

Multiple Access ◽

Q Learning

Priority-based Joint Resource Allocation with Deep Q-Learning for Heterogeneous NOMA Systems

IEEE Access ◽

10.1109/access.2021.3065314 ◽

2021 ◽

pp. 1-1

Author(s):

Sifat Rezwan ◽

Wooyeol Choi

Keyword(s):

Resource Allocation ◽

Q Learning ◽

Joint Resource Allocation

Dynamic Resource Allocation Based on Q-learning for VNE in Fiber-Wireless (FiWi) Access Network

Proceedings of the International Conference on Graphics and Signal Processing - ICGSP '17 ◽

10.1145/3121360.3121381 ◽

2017 ◽

Author(s):

QingHai Ou ◽

Honghao Zhao ◽

Yuepian Ye ◽

Xiaohui Yu ◽

Zhu Liu ◽

...

Keyword(s):

Resource Allocation ◽

Access Network ◽

Dynamic Resource Allocation ◽

Q Learning ◽

Dynamic Resource

Continuous Q-Learning Resource Allocation Network

ICANN 98 - Perspectives in Neural Computing ◽

10.1007/978-1-4471-1599-1_68 ◽

1998 ◽

pp. 455-460

Author(s):

W. Ilg ◽

K.-U. Scholl

Keyword(s):

Resource Allocation ◽

Learning Resource ◽

Q Learning

Resource Allocation Algorithms of Vehicle Networks with Stackelberg Game

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications ◽

10.1007/978-3-030-48513-9_18 ◽

2020 ◽

pp. 221-230

Author(s):

Ying Zhang ◽

Guang-Shun Li ◽

Jun-Hua Wu ◽

Jia-He Yan ◽

Xiao-Fei Sheng

Keyword(s):

Resource Allocation ◽

Stackelberg Game ◽

Vehicle Networks ◽

Allocation Algorithms

A Resource Allocation Algorithm for Ultra-Dense Networks Based on Deep Reinforcement Learning

International Journal of Computers Communications & Control ◽

10.15837/ijccc.2021.2.4189 ◽

2021 ◽

Vol 16 (2) ◽

Author(s):

Huashuai Zhang ◽

Tingmei Wang ◽

Haiwei Shen

Keyword(s):

Resource Allocation ◽

Reinforcement Learning ◽

Data Traffic ◽

Wireless Data ◽

Resource Allocation Algorithm ◽

Allocation Algorithm ◽

Q Learning ◽

Dense Networks ◽

Target Network ◽

Wireless Resource Allocation

Resource optimization in ultra-dense networks (UDNs) is critical to meeting users' huge demand for wireless data traffic. However, mainstream optimization algorithms suffer from problems such as poor optimization results and high computing load. This paper puts forward a wireless resource allocation algorithm based on deep reinforcement learning (DRL), which aims to maximize the total throughput of the entire network and transforms the resource allocation problem into a deep Q-learning process. The authors adopt the resource allocation strategy of the deep Q-network (DQN) and employ experience replay and a target network to overcome the instability and divergence of the results caused by the previous network state, and to address the overestimation of the Q value. Simulation results show that the proposed algorithm can maximize the total throughput of the network while making the network more energy-efficient and stable. The results support introducing DRL into research on UDN resource allocation.
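
The two stabilizing devices this abstract mentions, replaying stored transitions and bootstrapping from a separate target network, can be sketched without any deep learning library. The class and function names below are hypothetical, network parameters are represented as plain lists, and nothing here reproduces the authors' network architecture or reward design.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are evicted first."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.data.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.data), batch_size)

    def __len__(self):
        return len(self.data)


def td_target(reward, next_q_target, gamma, done):
    """Target r + gamma * max_a Q_target(s', a); no bootstrap at episode end."""
    return reward if done else reward + gamma * max(next_q_target)


def sync_target(online_weights, target_weights):
    """Hard update: copy the online parameters into the target network in place."""
    target_weights[:] = online_weights
```

In a training loop, `next_q_target` would come from the target network evaluated at the sampled next state, and `sync_target` would be called only every fixed number of steps so that the regression targets change slowly.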

Q-learning algorithm for resource allocation in WDMA-based optical wireless communication networks

10.23919/splitech52315.2021.9566383 ◽

2021 ◽

Author(s):

Abdelrahman S. Elgamal ◽

Osama Z. Alsulami ◽

Ahmad Adnan Qidan ◽

Taisir E.H. El-Gorashi ◽

Jaafar M. H. Elmirghani

Keyword(s):

Resource Allocation ◽

Wireless Communication ◽

Communication Networks ◽

Learning Algorithm ◽

Wireless Communication Networks ◽

Optical Wireless Communication ◽

Optical Wireless ◽

Q Learning

Dynamic SPICE-model of resource allocation in marketing networks

Contributions to Game Theory and Management ◽

10.21638/11701/spbu31.2020.02 ◽

2020 ◽

Vol 13 ◽

pp. 8-23

Author(s):

Movlatkhan T. Agieva ◽

Olga I. Gorbaneva

Keyword(s):

Resource Allocation ◽

Stackelberg Game ◽

Target Audience ◽

Opinion Leaders ◽

Stackelberg Equilibrium ◽

Common Interest ◽

Theoretic Model ◽

Goods And Services ◽

Spice Model ◽

Private Interests

We consider a dynamic Stackelberg game-theoretic model of the coordination of social and private interests (SPICE-model) of resource allocation in marketing networks. The dynamics of the controlled system describe an interaction of the members of a target audience (basic agents) that leads to a change in their opinions (the cost of buying the goods and services of firms competing in a market). The interaction of the firms (influence agents) is formalized as a differential game in strategic form. The payoff functional of each firm includes two terms: the total opinion of the basic agents, taking into account their marketing costs (a common interest of all firms), and the income from investments in a private activity. The latter income is described by a linear function. The firms exert their influence not on all basic agents but only on the members of strong subgroups of the influence digraph (opinion leaders). The opinion leaders determine the stable final opinions of all members of the target audience. A coordinating principal determines the firms' marketing budgets and maximizes the total opinion of the basic agents, taking into account the allocated resources. The Nash equilibrium in the game of influence agents and the Stackelberg equilibrium in the general hierarchical game between the principal and these agents are found. It is proved that the value of a basic agent's opinion is the same for all influence agents and the principal. It is also proved that the influence agents assign fewer resources to marketing efforts than the principal would like.

The Dynamics of Multiagent Q-Learning in Commodity Market Resource Allocation

Advances in Machine Learning II - Studies in Computational Intelligence ◽

10.1007/978-3-642-05179-1_15 ◽

2010 ◽

pp. 315-349

Author(s):

Eduardo R. Gomes ◽

Ryszard Kowalczyk

Keyword(s):

Resource Allocation ◽

Commodity Market ◽

Q Learning
