Jamming-Resilient Wideband Cognitive Radios with Multi-Agent Reinforcement Learning

Author(s):  
Mohamed A. Aref ◽  
Sudharman K. Jayaweera

This article presents the design of a wideband autonomous cognitive radio (WACR) for anti-jamming and interference avoidance. The proposed system model allows multiple WACRs to operate simultaneously over the same spectrum range, producing a multi-agent environment. The objective of each radio is to predict and evade a dynamic jammer signal while also avoiding the transmissions of other WACRs. The proposed cognitive framework consists of two operations: sensing and transmission. Each operation is supported by its own Q-learning-based algorithm, but both experience the same RF environment. The simulation results indicate that the proposed cognitive anti-jamming technique has low computational complexity, significantly outperforms a non-cognitive sub-band selection policy, and remains sufficiently robust against the impact of sensing errors.
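The sub-band-selection idea can be illustrated with a minimal tabular Q-learning sketch; this is not the authors' algorithm, and the sweeping jammer, band count, and reward values below are all assumptions for illustration:

```python
import random

random.seed(0)
N_BANDS = 5                      # hypothetical number of sub-bands
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Q[state][action]: state = sub-band the jammer occupied last step,
# action = sub-band chosen for the next transmission
Q = [[0.0] * N_BANDS for _ in range(N_BANDS)]

def jammer_band(t):
    return t % N_BANDS           # toy sweeping jammer: one sub-band per step

def choose(state):
    if random.random() < EPS:    # epsilon-greedy exploration
        return random.randrange(N_BANDS)
    return max(range(N_BANDS), key=lambda a: Q[state][a])

state = jammer_band(0)
for t in range(1, 20001):
    action = choose(state)
    nxt = jammer_band(t)
    reward = 1.0 if action != nxt else -1.0      # jammed on collision
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][action])
    state = nxt

# the learned greedy policy should avoid the band the jammer sweeps into next
print(all(max(range(N_BANDS), key=lambda a: Q[s][a]) != (s + 1) % N_BANDS
          for s in range(N_BANDS)))   # -> True
```

In the paper's setting, sensing and transmission each run their own learner over the shared RF environment; the sketch collapses both into one table for brevity.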

Transport ◽  
2014 ◽  
Vol 29 (3) ◽  
pp. 296-306 ◽  
Author(s):  
Min Yang ◽  
Dounan Tang ◽  
Haoyang Ding ◽  
Wei Wang ◽  
Tianming Luo ◽  
...  

Staggered working hours have the potential to alleviate excessive demand on urban transport networks during the morning and afternoon peak hours and to influence the travel behavior of individuals by affecting their activity schedules and reducing their commuting times. This study proposes a multi-agent Q-learning algorithm for evaluating the influence of staggered working hours by simulating travelers’ time and location choices in their activity patterns. Interactions among multiple travelers were also considered. Various types of agents were identified based on real activity–travel data for a mid-sized city in China. Reward functions based on time and location information were constructed using origin–destination (OD) survey data to simulate individuals’ temporal and spatial choices simultaneously. Interactions among individuals were then described by introducing a road impedance function, creating a dynamic environment in which one traveler’s decisions influence the decisions of other travelers. Lastly, by applying the Q-learning algorithm, individuals’ activity–travel patterns under staggered working hours were simulated. Based on the simulation results, the effects of staggered working hours were evaluated on both a macroscopic level, at which the space–time distribution of traffic volume in the network was determined, and a microscopic level, at which the timing of individuals’ leisure activities and their daily household commuting costs were determined. Based on the simulation results and experimental tests, an optimal scheme for staggering working hours was developed.
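The road-impedance coupling can be illustrated with the standard BPR (Bureau of Public Roads) volume-delay function, a common choice for such impedance terms; the paper's exact functional form is not given here, and all numbers below are assumptions:

```python
def travel_time(volume, capacity, t0, alpha=0.15, beta=4):
    """BPR-style road impedance: travel time rises steeply as volume
    approaches capacity, so one traveler's departure-time choice changes
    the travel times experienced by everyone sharing the link."""
    return t0 * (1 + alpha * (volume / capacity) ** beta)

# toy coupling on one link with free-flow time 10 min and capacity 1500 veh/h
peak = travel_time(volume=1800, capacity=1500, t0=10.0)       # all offices start at 8:00
staggered = travel_time(volume=1200, capacity=1500, t0=10.0)  # a third shifted to 8:30
print(round(peak, 2), round(staggered, 2))   # -> 13.11 10.61
```

Embedding such a function in each agent's reward is what makes the environment dynamic: every traveler's Q-learning update depends on the volumes produced by the others' current choices.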


2012 ◽  
Vol 566 ◽  
pp. 572-579
Author(s):  
Abdolkarim Niazi ◽  
Norizah Redzuan ◽  
Raja Ishak Raja Hamzah ◽  
Sara Esfandiari

In this paper, a new algorithm based on case-based reasoning (CBR) and reinforcement learning (RL) is proposed to increase the convergence rate of RL algorithms. RL algorithms are very useful for solving a wide variety of decision problems when no model is available and correct decisions must be made in every state of the system, as in multi-agent systems, control systems, robotics, and tool condition monitoring. The proposed method investigates how to improve action selection in RL: a combined model using a case-based reasoning system and a new optimized function is used to select actions, which increases the convergence rate of Q-learning-based algorithms. The algorithm was applied to cooperative Markov games, one of the standard models of Markov-based multi-agent systems. Experimental results indicate that the proposed algorithm outperforms existing algorithms in both the speed and the accuracy of reaching the optimal policy.
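One simple way to combine a case base with Q-learning action selection can be sketched as follows; the dictionary case base, the exact retrieval rule, and all values are assumptions for illustration, not the paper's method:

```python
import random

def select_action(state, Q, case_base, n_actions, eps=0.1):
    """Case-base-first action selection: if this state matches a stored
    case, reuse the action that previously succeeded there; otherwise
    fall back to epsilon-greedy selection on the Q-values."""
    if state in case_base:
        return case_base[state]               # retrieval step of CBR
    if random.random() < eps:
        return random.randrange(n_actions)    # exploration
    return max(range(n_actions), key=lambda a: Q[state][a])

# toy usage: two states, two actions
Q = {0: [0.2, 0.9], 1: [0.5, 0.1]}
case_base = {1: 0}   # stored case: action 0 previously succeeded in state 1
print(select_action(0, Q, case_base, 2, eps=0.0))  # -> 1 (greedy on Q)
print(select_action(1, Q, case_base, 2))           # -> 0 (retrieved from the case base)
```

The intended effect is the one the abstract describes: retrieved cases bypass random exploration in familiar states, which is where the convergence-rate gain comes from.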


2011 ◽  
Vol 10 (03) ◽  
pp. 323-336 ◽  
Author(s):  
YE YE ◽  
NENG-GANG XIE ◽  
LIN-GANG WANG ◽  
LU WANG ◽  
YU-WAN CEN

The paper studies a multi-agent Parrondo's game with history dependence. With complex networks as the spatial carrier, the adaptation of cooperation and competition ("coopetition" for short) behaviors is analyzed, and the impact of degree-distribution heterogeneity on behavioral adaptation is investigated. The multi-agent Parrondo's game consists of a zero-sum game between individuals and a negative-sum game, dependent on the game history, between individuals and the environment. The zero-sum game gives rise to two behavioral patterns: cooperation and competition. The simulation results show that: (1) cooperation and competition in any form are adaptive behaviors. Coopetition diversifies the winning and losing states in the game history, which drives the population to develop in the beneficial direction; the positive average fitness of the population exhibits the counterintuitive, paradoxical feature of Parrondo's game; (2) for the cooperation pattern, the average fitness of the population is largest on a Barabási–Albert (BA) network, which is conducive to cooperation; (3) heterogeneity has a positive impact on cooperation.
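The underlying history-dependent Parrondo effect can be reproduced in single-agent form with the canonical parameters (win probabilities 9/10, 1/4, 1/4, 7/10 conditioned on the last two results, each reduced by a small bias ε); the paper's networked, multi-agent variant layers neighbour interactions on top of this and is not reproduced here:

```python
import random

# win probability of game B given (result two rounds ago, last result)
B_WIN = {(-1, -1): 0.9, (-1, 1): 0.25, (1, -1): 0.25, (1, 1): 0.7}

def play(game, history, rng, eps=0.003):
    """One round, returning +1 (win) or -1 (loss). Game A is a simple
    losing coin; game B's win probability depends on the previous two
    results (history-dependent Parrondo construction)."""
    p = (0.5 - eps) if game == "A" else (B_WIN[history] - eps)
    return 1 if rng.random() < p else -1

def run(strategy, rounds=500_000, seed=0):
    rng = random.Random(seed)
    capital, history = 0, (-1, -1)
    for _ in range(rounds):
        game = strategy if strategy in ("A", "B") else rng.choice("AB")
        result = play(game, history, rng)
        capital += result
        history = (history[1], result)
    return capital

a_only, b_only, mixed = run("A"), run("B"), run("mix")
print(a_only, b_only, mixed)   # both pure games drift down; the random mixture drifts up
```

Playing either losing game alone loses capital, while randomly alternating them wins; this is the "paradoxical feature" whose population-level analogue the paper studies on complex networks.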


2020 ◽  
Vol 17 (2) ◽  
pp. 647-664
Author(s):  
Yangyang Ge ◽  
Fei Zhu ◽  
Wei Huang ◽  
Peiyao Zhao ◽  
Quan Liu

Multi-agent systems have broad application in the real world, but their security performance is rarely considered. Reinforcement learning is one of the most important methods for solving multi-agent problems, and progress has been made in applying multi-agent reinforcement learning to robot systems, man-machine games, automation, and other areas. In these areas, however, an agent may fall into unsafe states, where it may find it difficult to bypass obstacles, to receive information from other agents, and so on. Ensuring the safety of a multi-agent system is of great importance in such areas, where dangerous states can be irreversible and cause great damage. To solve this safety problem, this paper introduces a multi-agent cooperation Q-learning algorithm based on constrained Markov games. In this method, safety constraints are added to the set of actions, and each agent, when interacting with the environment to search for optimal values, is restricted by the safety rules, so as to obtain an optimal policy that satisfies the security requirements. Since traditional multi-agent reinforcement learning algorithms are no longer suitable for the proposed model, a new solution is introduced for calculating the globally optimal state-action function that satisfies the safety constraints. The Lagrange multiplier method is used to determine the optimal action that can be performed in the current state, on the premise of linearized constraint functions and under the condition that the state-action function and the constraint function are both differentiable. This not only improves the efficiency and accuracy of the algorithm but also guarantees that the global optimal solution is obtained. Experiments verify the effectiveness of the algorithm.
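The Lagrange-multiplier idea for constrained action selection can be sketched in a few lines; the returns, costs, budget, and learning rate below are hypothetical, and this shows only the generic Lagrangian mechanism, not the paper's full multi-agent algorithm:

```python
def constrained_greedy(q_row, c_row, lam):
    """Greedy step on the Lagrangian Q(s,a) - lam * C(s,a): the multiplier
    lam trades expected return against the safety-constraint cost C."""
    return max(range(len(q_row)), key=lambda a: q_row[a] - lam * c_row[a])

def update_multiplier(lam, cost_incurred, budget, lr=0.05):
    """Dual ascent on lam: raise it when the incurred safety cost exceeds
    the budget, lower it (never below zero) when the constraint is slack."""
    return max(0.0, lam + lr * (cost_incurred - budget))

# hypothetical numbers: three actions in one state, safety budget 0.3
q_row = [1.0, 2.0, 1.5]     # estimated returns
c_row = [0.0, 1.0, 0.2]     # safety costs (e.g. risk of an irreversible state)
print(constrained_greedy(q_row, c_row, lam=0.0))  # -> 1: best return, ignores safety
print(constrained_greedy(q_row, c_row, lam=1.0))  # -> 2: best return within the budget
```

In a full algorithm the multiplier is adapted online: after each interaction, `update_multiplier` raises lam while the chosen actions overspend the safety budget, steering the greedy step toward policies that satisfy the constraints.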


Respuestas ◽  
2018 ◽  
Vol 23 (2) ◽  
pp. 53-61
Author(s):  
David Luviano Cruz ◽  
Francesco José García Luna ◽  
Luis Asunción Pérez Domínguez

This paper presents a hybrid control proposal for multi-agent systems that exploits the advantages of reinforcement learning and nonparametric functions. A modified version of the Q-learning algorithm provides training data for a kernel, which in turn supplies a suboptimal set of actions to be used by the agents. The proposed algorithm is experimentally tested on a path-generation task for mobile robots in an unknown environment.
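A minimal sketch of the kernel half of such a hybrid, under assumed details (a 1-D state space, a Gaussian kernel, and Nadaraya–Watson smoothing; the paper's kernel and state representation may differ): tabular Q-learning produces (state, action, value) samples, and the kernel interpolates Q for states the learner never visited.

```python
import math

def kernel_q(state, data, bandwidth=0.5):
    """Nadaraya-Watson smoother over Q-learning samples. `data` is a list
    of (state, action, q_value) triples collected during training; the
    kernel-weighted average estimates Q(state, action) for unseen states."""
    def k(d):
        return math.exp(-(d * d) / (2 * bandwidth ** 2))   # Gaussian kernel
    est = {}
    for s, a, qv in data:
        w = k(abs(state - s))
        num, den = est.get(a, (0.0, 0.0))
        est[a] = (num + w * qv, den + w)
    return {a: num / den for a, (num, den) in est.items()}

# hypothetical training output: Q-values at two visited states, two actions
data = [(0.0, 0, 1.0), (0.0, 1, 0.2), (1.0, 0, 0.1), (1.0, 1, 0.9)]
q_unseen = kernel_q(0.25, data)        # state 0.25 was never visited
best = max(q_unseen, key=q_unseen.get)
print(best)   # -> 0: nearest experience (state 0.0) dominates the estimate
```

The smoothing is what yields the "suboptimal set of actions" for the agents: it generalizes the discrete Q-table to the continuous states a mobile robot actually encounters.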


2012 ◽  
Vol 433-440 ◽  
pp. 6033-6037
Author(s):  
Xiao Ming Liu ◽  
Xiu Ying Wang

The movement characteristics of nearby traffic flow have an important influence on the expressway main line. A control method for expressway off-ramps based on Q-learning and extension control is established by analyzing the parameters of the off-ramp and the auxiliary road. First, the basic descriptions of the Q-learning algorithm and extension control are given and analyzed. Then, the reward function is obtained through extension control theory to judge the state of the traffic light. Simulation results for the queue lengths of the off-ramp and the auxiliary road show that the control method based on the Q-learning algorithm and extension control greatly reduces the off-ramp queue length, demonstrating the feasibility of the control strategy.
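A toy version of the signal-control loop, with heavy assumptions: three queue levels per approach, a reward that simply weights the off-ramp queue more (standing in for the paper's extension-control evaluation), and deterministic dynamics. On such a deterministic model, the tabular Q-learning backup with learning rate 1, swept over all transitions, reduces to value iteration:

```python
LEVELS, ACTIONS = 3, 2      # queue levels 0..2; action 0 = green for off-ramp
GAMMA = 0.8
Q = {}

def step(state, action):
    """Toy queue dynamics: the approach given green drains one vehicle,
    the other fills by one; the reward penalizes the off-ramp queue twice
    as heavily as the auxiliary-road queue (an assumed weighting)."""
    ramp, aux = state
    if action == 0:
        ramp, aux = max(0, ramp - 1), min(LEVELS - 1, aux + 1)
    else:
        ramp, aux = min(LEVELS - 1, ramp + 1), max(0, aux - 1)
    return (ramp, aux), -(2 * ramp + aux)

def q(s, a):
    return Q.get((s, a), 0.0)

# Q-learning backups swept over every simulated transition until convergence
for _ in range(200):
    for ramp in range(LEVELS):
        for aux in range(LEVELS):
            for a in range(ACTIONS):
                nxt, r = step((ramp, aux), a)
                Q[((ramp, aux), a)] = r + GAMMA * max(q(nxt, x) for x in range(ACTIONS))

# with the off-ramp full and the auxiliary road empty, green goes to the off-ramp
print(q((2, 0), 0) > q((2, 0), 1))   # -> True
```

The asymmetric reward is what produces the abstract's reported behaviour in miniature: the learned policy preferentially drains the off-ramp queue.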


2012 ◽  
Vol 150 ◽  
pp. 133-138
Author(s):  
Bin Bian ◽  
Shu Qin Liu ◽  
De Guang Li ◽  
Zhao Kui Wang

In this paper, the backstepping method is used to handle the nonlinear factors of magnetic bearing spindle systems, resulting in the design of a nonlinear robust controller that renders the system's equilibrium globally uniformly asymptotically stable. Meanwhile, the impact of model uncertainty is introduced into the design process, so that the system has a degree of robustness. Simulation results show that this method achieves a good control effect in nonlinear magnetic bearing spindle applications.
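The two-step backstepping construction can be sketched on a generic second-order strict-feedback system; this is the textbook derivation only, not the paper's magnetic-bearing model, whose nonlinearity f and uncertainty terms are specific to that system:

```latex
\text{System: } \dot{x}_1 = f(x_1) + x_2, \qquad \dot{x}_2 = u.
\text{Step 1 (virtual control): with } V_1 = \tfrac{1}{2}x_1^2,\ \text{choose }
  \alpha(x_1) = -f(x_1) - k_1 x_1,\quad z = x_2 - \alpha(x_1)
  \;\Rightarrow\; \dot{V}_1 = -k_1 x_1^2 + x_1 z.
\text{Step 2 (augmented Lyapunov function): } V_2 = V_1 + \tfrac{1}{2}z^2,\quad
  \dot{V}_2 = -k_1 x_1^2 + z\,(x_1 + u - \dot{\alpha});\ \text{choosing }
  u = \dot{\alpha} - x_1 - k_2 z
  \;\Rightarrow\; \dot{V}_2 = -k_1 x_1^2 - k_2 z^2 < 0 \text{ for } k_1, k_2 > 0,
```

which gives global uniform asymptotic stability of the origin. Robustness to model uncertainty is typically obtained by adding a damping or bounding term to u at Step 2.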


2014 ◽  
Vol 5 (1) ◽  
pp. 35-50
Author(s):  
Yudai Arai ◽  
Tomoko Kajiyama ◽  
Noritomo Ouchi

In light of the rapid growth of social networks around the world, this study analyses the impact of social networks on the diffusion of products and demonstrates an effective way to diffuse products in a society where social networks play an important role. We construct a consumer behaviour model by multi-agent simulation, taking the movie market as an example. After validating it using data from 13 US movies, we conduct simulations. Our simulation results show that the impact of social networks on diffusion differs according to customers’ expectations and evaluations of a movie. We also demonstrate effective weekly advertising budget allocations corresponding to different types of movies, and find that differences in weekly advertising budget allocation have a greater impact on diffusion as social networks grow. This paper provides firms’ managers with important suggestions for diffusion strategies that consider the impact of social networks.
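The interplay between advertising schedule and word of mouth can be seen in a much simpler toy model; the adoption probabilities, the word-of-mouth strength, and both budget schedules below are illustrative assumptions, not calibrated to the paper's US-movie data:

```python
import random

def simulate_weeks(ad_budget, n=20000, wom=0.1, seed=0):
    """Toy word-of-mouth diffusion: each week, every non-adopter adopts
    with probability (that week's advertising pull + a word-of-mouth pull
    proportional to the current adopter share). Returns weekly adopters."""
    rng = random.Random(seed)
    adopted, weekly = 0, []
    for pull in ad_budget:
        p = pull + wom * adopted / n
        new = sum(1 for _ in range(n - adopted) if rng.random() < p)
        adopted += new
        weekly.append(new)
    return weekly

# same total budget over eight weeks, different weekly allocation
front = simulate_weeks([0.06, 0.04, 0.02, 0.02, 0.01, 0.01, 0.0, 0.0])
flat = simulate_weeks([0.02] * 8)
print(sum(front) > sum(flat))   # -> True: early spend seeds word of mouth sooner
```

Even this crude model shows the paper's qualitative point: when social influence is present, the timing of an advertising budget matters, not just its total.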

