Joint Power and Bandwidth Allocation for UAV Backhaul Networks: A Hierarchical Learning Approach

Author(s):  
Tingting Yang ◽  
Kailing Yao ◽  
Youming Sun ◽  
Fei Song ◽  
Yang Yang ◽  
...  

Using unmanned aerial vehicles (UAVs) as relays is an effective way to extend coverage. It can also alleviate congestion and increase throughput, especially in UAV networks. However, because the energy of UAVs is limited and the resources in UAV networks are scarce, optimizing the network delay performance under these constraints deserves careful investigation. Moreover, different resources, e.g. power and bandwidth, are coupled, which makes the optimization more complex. This article investigates the problem of joint power and bandwidth allocation in UAV backhaul networks, considering both the delay performance and the resource utilization efficiency. Given the heterogeneous locations of different UAVs, we formulate the optimization problem as a Stackelberg game in which the relay UAV acts as the leader and the extended UAVs act as followers. Their utility functions take both the delay performance and the resource consumption into account. To capture the competitive relationship among followers, the sub-game is proved to be an exact potential game that admits Nash equilibria (NE). The existence of the Stackelberg equilibrium (SE) is then proved. We use a hierarchical learning algorithm (HLA) to find the best resource allocation strategies, which also reduces the computational complexity. Simulation results demonstrate the effectiveness of the proposed method.
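
The abstract above relies on the follower sub-game being an exact potential game, which guarantees that a pure Nash equilibrium exists and that unilateral improvements cannot cycle. The sketch below illustrates that property only; it is not the authors' model. It assumes a hypothetical channel-selection game in which each follower's cost is the number of players sharing its channel, and the names `cost`, `potential`, and `best_response_dynamics` are made up for illustration.

```python
def cost(player, choice, profile):
    """Hypothetical cost: how many players (including this one) use `choice`."""
    others = sum(1 for j, c in enumerate(profile) if j != player and c == choice)
    return others + 1


def potential(profile, n_channels):
    """Rosenthal potential: sum over channels of 1 + 2 + ... + load."""
    loads = [profile.count(c) for c in range(n_channels)]
    return sum(n * (n + 1) // 2 for n in loads)


def is_nash(profile, n_channels):
    """True if no player can lower its cost by switching channel alone."""
    return all(
        cost(i, profile[i], profile) <= cost(i, c, profile)
        for i in range(len(profile))
        for c in range(n_channels)
    )


def best_response_dynamics(n_players, n_channels, max_rounds=100):
    """Let players switch one at a time until nobody wants to move."""
    profile = [0] * n_players  # assumption: everyone starts on channel 0
    for _ in range(max_rounds):
        changed = False
        for i in range(n_players):
            best = min(range(n_channels), key=lambda c: cost(i, c, profile))
            # Move only on a strict improvement, which strictly lowers
            # the potential and therefore rules out cycles.
            if cost(i, best, profile) < cost(i, profile[i], profile):
                profile[i] = best
                changed = True
        if not changed:
            return profile
    raise RuntimeError("best-response dynamics did not settle")


if __name__ == "__main__":
    final = best_response_dynamics(6, 3)
    print(final, is_nash(final, 3), potential(final, 3))
```

Because every strict improvement lowers the potential by exactly the mover's cost reduction, the loop must stop after finitely many moves, and the stopping point is a Nash equilibrium of the toy game.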

2019 ◽  
Vol 9 (16) ◽  
pp. 3348 ◽  
Author(s):  
Zhibin Feng ◽  
Guochun Ren ◽  
Jin Chen ◽  
Chaohui Chen ◽  
Xiaoqin Yang ◽  
...  

In this paper, we study the joint relay selection and power control optimization problem in an anti-jamming relay communication system. Considering the hierarchical competitive relationship between a user and a jammer, we formulate the anti-jamming problem as a Stackelberg game. From a game-theoretic perspective, the user acts as the leader and first selects its relay and power strategy, while the jammer acts as the follower and then chooses its power strategy. Moreover, we prove the existence of a Stackelberg equilibrium. Based on the Q-learning algorithm and the multi-armed bandit method, a hierarchical joint optimization algorithm is proposed. Simulation results show the user’s strategy selection probability and the jammer’s regret. We compare the user’s and jammer’s utilities under the proposed algorithm with those under a random selection algorithm to verify the algorithm’s superiority. Moreover, the influence of feedback error and eavesdropping error on utility is analyzed.
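
The leader-follower learning structure described above can be sketched with two nested learners. This is a simplified stand-in and not the paper's algorithm: both players use epsilon-greedy bandit learners with sample-average value estimates, the follower adapts on a faster time scale than the leader, and the payoff tables `USER_PAYOFF` and `JAMMER_PAYOFF` are invented for illustration.

```python
import random

# Hypothetical payoffs: rows are user (leader) actions, columns are jammer
# (follower) actions. These numbers are assumptions, not values from the paper.
USER_PAYOFF = [[4, 1, 0], [5, 3, 1], [6, 2, 0]]
JAMMER_PAYOFF = [[1, 3, 2], [2, 1, 3], [0, 4, 1]]


def noisy(value, rng, scale=0.1):
    """Observed reward: true payoff plus bounded uniform noise."""
    return value + rng.uniform(-scale, scale)


def greedy(values):
    return max(range(len(values)), key=lambda i: values[i])


def choose(values, counts, rng, eps):
    """Try every arm once, then act epsilon-greedily."""
    untried = [a for a, n in enumerate(counts) if n == 0]
    if untried:
        return untried[0]
    if rng.random() < eps:
        return rng.randrange(len(values))
    return greedy(values)


def update(values, counts, action, reward):
    """Sample-average update of the value estimate for `action`."""
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]


def hierarchical_learning(episodes=50, inner_steps=10, eps=0.1, seed=0):
    rng = random.Random(seed)
    n_user, n_jam = len(USER_PAYOFF), len(USER_PAYOFF[0])
    q_user, c_user = [0.0] * n_user, [0] * n_user
    q_jam = [[0.0] * n_jam for _ in range(n_user)]
    c_jam = [[0] * n_jam for _ in range(n_user)]
    for _episode in range(episodes):
        a = choose(q_user, c_user, rng, eps)        # leader commits first
        for _step in range(inner_steps):            # follower learns its reply
            j = choose(q_jam[a], c_jam[a], rng, eps)
            update(q_jam[a], c_jam[a], j, noisy(JAMMER_PAYOFF[a][j], rng))
        reply = greedy(q_jam[a])                    # follower's learned reply
        update(q_user, c_user, a, noisy(USER_PAYOFF[a][reply], rng))
    leader_action = greedy(q_user)
    return leader_action, greedy(q_jam[leader_action])


if __name__ == "__main__":
    print(hierarchical_learning())
```

In this toy table the jammer's best replies to user actions 0, 1, and 2 are 1, 2, and 1, so the user's payoffs under best reply are 1, 1, and 2, and the learners settle on the pair (2, 1). The payoff gaps are much larger than the noise, which is what makes the outcome independent of the seed here.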


2021 ◽  
Vol 72 ◽  
pp. 507-531
Author(s):  
Georgios Birmpas ◽  
Jiarui Gan ◽  
Alexandros Hollender ◽  
Francisco J. Marmolejo-Cossío ◽  
Ninad Rajgopal ◽  
...  

Recent results have shown that algorithms for learning the optimal commitment in a Stackelberg game are susceptible to manipulation by the follower. These learning algorithms operate by querying the best responses of the follower, who consequently can deceive the algorithm by using fake best responses, typically by responding according to fake payoffs that are different from the actual ones. For this strategic behavior to be successful, the main challenge faced by the follower is to pinpoint the fake payoffs that would make the learning algorithm output a commitment that benefits them the most. While this problem has been considered before, the related literature has only focused on a simple setting where the follower can only choose from a finite set of payoff matrices, thus leaving the general version of the problem unanswered. In this paper, we fill this gap by showing that it is always possible for the follower to efficiently compute (near-)optimal fake payoffs, for various scenarios of learning interaction between the leader and the follower. Our results also establish an interesting connection between the follower’s deception and the leader’s maximin utility: through deception, the follower can induce almost any (fake) Stackelberg equilibrium if and only if the leader obtains at least their maximin utility in this equilibrium.


2021 ◽  
Author(s):  
Zikai Feng ◽  
Yuanyuan Wu ◽  
Mengxing Huang ◽  
Di Wu

To protect ground users from malicious jamming by an intelligent unmanned aerial vehicle (UAV) in downlink communications, this paper studies a new anti-UAV jamming strategy based on multi-agent deep reinforcement learning. In this method, ground users aim to learn the best mobility strategies to avoid the UAV's jamming. The problem is modeled as a Stackelberg game to describe the competitive interaction between the UAV jammer (leader) and the ground users (followers). To reduce the computational cost of finding the equilibrium of a complex game with a large state space, a hierarchical multi-agent proximal policy optimization (HMAPPO) algorithm is proposed. It decouples the hybrid game into several sub-Markov games and updates the actor and critic networks of the UAV jammer and the ground users at different time scales. Simulation results suggest that the HMAPPO-based anti-jamming strategy achieves comparable performance with lower time complexity than the benchmark strategies. The well-trained HMAPPO can obtain the optimal jamming strategy and the optimal anti-jamming strategies, which approximate the Stackelberg equilibrium (SE).


Energies ◽  
2019 ◽  
Vol 12 (2) ◽  
pp. 325 ◽  
Author(s):  
Shijun Chen ◽  
Huwei Chen ◽  
Shanhe Jiang

Electric vehicles (EVs) are designed to improve energy efficiency and reduce environmental pollution when they are widely and reasonably used in the transport system. However, owing to the characteristics of EV batteries, charging plays an important role in the application of EVs. Fortunately, with the help of advanced technologies, charging stations powered by smart grid operators (SGOs) can conveniently address this problem and supply charging services to EV users. In this paper, we consider EVs that are charged by charging station operators (CSOs) in heterogeneous networks (Hetnet), through which they can exchange information with each other. Considering the trading relationship among EV users, CSOs, and SGOs, we design their respective utility functions in Hetnet, taking demand uncertainty into account. To maximize profits, we formulate this charging problem as a four-stage Stackelberg game, through which the optimal strategy is studied and analyzed. In the Stackelberg game model, we theoretically prove and discuss the existence and uniqueness of the Stackelberg equilibrium (SE). The optimal solution of the optimization problem can be obtained with the proposed iterative algorithm. The simulation results show the performance of the strategy and confirm the efficiency of the model in Hetnet.


2020 ◽  
Vol 13 ◽  
pp. 8-23
Author(s):  
Movlatkhan T. Agieva ◽  
Olga I. Gorbaneva ◽  

We consider a dynamic Stackelberg game-theoretic model of the coordination of social and private interests (SPICE-model) of resource allocation in marketing networks. The dynamics of the controlled system describe an interaction of the members of a target audience (basic agents) that leads to a change of their opinions (cost of buying the goods and services of firms competing on a market). The interaction of the firms (influence agents) is formalized as a differential game in strategic form. The payoff functional of each firm includes two terms: the summary opinion of the basic agents with consideration of their marketing costs (a common interest of all firms), and the income from investments in a private activity. The latter income is described by a linear function. The firms exert their influence not on all basic agents but only on the members of strong subgroups of the influence digraph (opinion leaders). The opinion leaders determine the stable final opinions of all members of the target audience. A coordinating principal determines the firms' marketing budgets and maximizes the summary opinion of the basic agents with consideration of the allocated resources. The Nash equilibrium in the game of influence agents and the Stackelberg equilibrium in a general hierarchical game of the principal with them are found. It is proved that the value of opinion of a basic agent is the same for all influence agents and the principal. It is also proved that the influence agents assign fewer resources to the marketing efforts than the principal would like.


2019 ◽  
Vol 2019 ◽  
pp. 1-17 ◽  
Author(s):  
Kai Du ◽  
Zhen Wu

This paper is concerned with a new kind of Stackelberg differential game of mean-field backward stochastic differential equations (MF-BSDEs). By means of four Riccati equations (REs), the follower first solves a backward mean-field stochastic LQ optimal control problem and obtains the corresponding open-loop optimal control with a feedback representation. The leader then solves an optimization problem for a 1×2 mean-field forward-backward stochastic differential system. By virtue of some high-dimensional and complicated REs, we obtain the open-loop Stackelberg equilibrium, which admits a state feedback representation. Finally, as an application, a class of stochastic pension fund optimization problems, which can be viewed as a special case of our formulation, is studied, and the open-loop Stackelberg strategy is obtained.


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Yanxiang Jiang ◽  
Hui Ge ◽  
Mehdi Bennis ◽  
Fu-Chun Zheng ◽  
Xiaohu You

In this paper, power control in the uplink for two-tier small-cell networks is investigated. We formulate the power control problem as a Stackelberg game, where the macrocell user equipment (MUE) acts as the leader and the small-cell user equipment (SUE) acts as the follower. To reduce the cross-tier and co-tier interference and the power consumption of both the MUE and SUE, we propose optimizing not only the transmit rate but also the transmit power. The corresponding optimization problems are solved through a two-layer iteration. In the inner iteration, the SUEs compete with each other, and their optimal transmit powers are obtained through iterative computations. In the outer iteration, the optimal transmit power of the MUE is obtained in closed form from the transmit powers of the SUEs through proper mathematical manipulations. We prove the convergence of the proposed power control scheme, and we also theoretically show the existence and uniqueness of the Stackelberg equilibrium (SE) in the formulated Stackelberg game. The simulation results show that the proposed power control scheme provides considerable improvements, particularly for the MUE.
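
The two-layer iteration described above can be illustrated with a toy model. The sketch below is not the paper's utility or update rule. It assumes each transmitter maximizes a hypothetical utility ln(1 + g·p/I) − PRICE·p, where I is noise plus cross interference, which gives the closed-form update p = 1/PRICE − I/g clipped to [0, P_MAX]. All constants and function names are illustrative assumptions.

```python
NOISE = 0.1        # hypothetical receiver noise power
PRICE = 1.0        # hypothetical cost per unit of transmit power
P_MAX = 2.0        # hypothetical power cap
DIRECT_GAIN = 1.0  # hypothetical gain on each user's own link
CROSS_GAIN = 0.1   # hypothetical gain on every interfering link


def best_power(interference):
    """Maximizer of ln(1 + g*p/I) - PRICE*p over 0 <= p <= P_MAX."""
    p = 1.0 / PRICE - interference / DIRECT_GAIN
    return min(max(p, 0.0), P_MAX)


def interference_at(i, powers):
    """Noise plus interference from every transmitter other than i."""
    return NOISE + CROSS_GAIN * sum(p for j, p in enumerate(powers) if j != i)


def two_layer_power_control(n_sue=3, tol=1e-9, max_iter=1000):
    powers = [0.0] * (n_sue + 1)  # index 0 is the MUE, the rest are SUEs
    for _outer in range(max_iter):
        previous_mue = powers[0]
        # Inner layer: SUEs update in turn until their powers stop moving.
        for _inner in range(max_iter):
            old = powers[1:]
            for i in range(1, n_sue + 1):
                powers[i] = best_power(interference_at(i, powers))
            if max(abs(a - b) for a, b in zip(old, powers[1:])) < tol:
                break
        # Outer layer: the MUE applies its closed-form update.
        powers[0] = best_power(interference_at(0, powers))
        if abs(powers[0] - previous_mue) < tol:
            return powers
    raise RuntimeError("power control did not converge")


if __name__ == "__main__":
    print(two_layer_power_control())
```

With these symmetric constants the fixed point satisfies p = 0.9 − 0.3p, so every transmitter ends near 0.9/1.3. Convergence in this sketch relies on the cross gains being small relative to the direct gain, which makes each update a contraction.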


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6625
Author(s):  
Yang Wang ◽  
Yuankun Lin ◽  
Lingyu Chen ◽  
Jianghong Shi

As a key technology of intelligent transportation systems (ITS), vehicular ad hoc networks (VANETs) are promising for providing safety and infotainment services to drivers and passengers. To support different applications in traffic safety, traffic efficiency, autonomous driving and entertainment, it is important to investigate how to effectively deliver content in VANETs. Since it takes resources such as bandwidth and power for base stations (BSs) or roadside units (RSUs) to deliver content, the optimal pricing strategy for BSs and the optimal caching incentive scheme for RSUs need to be studied. In this paper, a framework of content delivery is proposed first, where each moving vehicle can obtain small-volume content files from either the nearest BS or the nearest RSU according to the competition among them. Then, the profit models for both BSs and RSUs are established based on stochastic geometry and point process theory. Next, a caching incentive scheme for RSUs based on a Stackelberg game is proposed, where both competing sides (i.e., BSs and RSUs) can maximize their own profits. In addition, a backward induction method is used to solve for the Stackelberg equilibrium. Finally, the simulation results demonstrate that BSs can obtain their optimal pricing strategy for maximizing profit and that RSUs can obtain the optimal caching scheme with the maximum profit during content delivery.
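
Backward induction, as mentioned in the abstract above, solves the follower's problem first and then substitutes that reply into the leader's problem. The sketch below shows the idea on a made-up pricing game, not on the paper's profit models: a leader sets a unit price, a follower picks a quantity q to maximize VALUE·ln(1 + q) − price·q, and the leader earns (price − COST)·q. All names and constants are hypothetical.

```python
import math

VALUE = 4.0  # hypothetical scale of the follower's benefit
COST = 1.0   # hypothetical unit cost paid by the leader


def follower_best_quantity(price):
    """Step 1: the follower's reply, from VALUE / (1 + q) = price."""
    return max(VALUE / price - 1.0, 0.0)


def leader_profit(price):
    """Step 2: leader profit once the follower's reply is substituted in."""
    return (price - COST) * follower_best_quantity(price)


def closed_form_price():
    """Stationary point of the substituted profit: sqrt(VALUE * COST)."""
    return math.sqrt(VALUE * COST)


def solve_by_backward_induction(step=0.001):
    """Search a price grid strictly between COST and VALUE."""
    n = int(round((VALUE - COST) / step))
    prices = [COST + k * step for k in range(1, n)]
    best = max(prices, key=leader_profit)
    return best, follower_best_quantity(best), leader_profit(best)


if __name__ == "__main__":
    print(solve_by_backward_induction(), closed_form_price())
```

For these numbers the substituted leader profit is 5 − price − 4/price, which peaks at price 2 with quantity 1 and profit 1, so the grid search and the closed form agree. The grid search is only a check; in this toy game the closed form alone would do.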

