The Dynamics of Multiagent Q-Learning in Commodity Market Resource Allocation

Author(s):  
Eduardo R. Gomes ◽  
Ryszard Kowalczyk
IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Qi Zhai ◽  
Miodrag Bolic ◽  
Yong Li ◽  
Wei Cheng ◽  
Chenxi Liu

Author(s):  
Huashuai Zhang ◽  
Tingmei Wang ◽  
Haiwei Shen

The resource optimization of ultra-dense networks (UDNs) is critical to meeting users' huge demand for wireless data traffic, but mainstream optimization algorithms suffer from problems such as poor optimization performance and high computational load. This paper puts forward a wireless resource allocation algorithm based on deep reinforcement learning (DRL) that aims to maximize the total throughput of the entire network by transforming the resource allocation problem into a deep Q-learning process. To allocate resources in UDNs effectively, the DRL algorithm was introduced to improve the allocation efficiency of wireless resources; the authors adopted the resource allocation strategy of the deep Q-network (DQN), and employed experience replay and a target network to overcome the instability and divergence caused by correlated preceding network states and to mitigate the overestimation of Q-values. Simulation results show that the proposed algorithm maximizes the total throughput of the network while making the network more energy-efficient and stable. It is therefore meaningful to introduce DRL into research on UDN resource allocation.
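
As a rough illustration of the DQN ingredients described in this abstract (experience replay, a periodically synchronized target network, and an epsilon-greedy allocation policy), the following Python sketch shows the generic update loop. It is not the authors' implementation: the state dimension, the discrete action space of (sub-channel, power-level) assignments, the network sizes, and the throughput-based reward are all assumptions made for the example.

```python
# Minimal DQN sketch for UDN-style resource allocation (illustrative only).
# Assumptions not taken from the paper: state = vector of per-cell load/channel
# indicators, action = index of a discrete (sub-channel, power-level) assignment,
# reward = resulting total network throughput.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 32, 64            # assumed problem sizes
GAMMA, BATCH, SYNC_EVERY = 0.99, 64, 200

q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                      nn.Linear(128, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                           nn.Linear(128, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())   # target network starts as a copy
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                    # experience replay buffer
# after each environment step: replay.append((s, a, r, s_next, float(done)))

def act(state, eps):
    """Epsilon-greedy resource-allocation decision."""
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.as_tensor(state, dtype=torch.float32)).argmax())

def learn(step):
    """One DQN update: sample the replay buffer, bootstrap from the target net."""
    if len(replay) < BATCH:
        return
    batch = random.sample(replay, BATCH)
    s, a, r, s2, done = map(lambda x: torch.as_tensor(x, dtype=torch.float32),
                            zip(*batch))
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                        # frozen target stabilizes learning
        target = r + GAMMA * (1 - done) * target_net(s2).max(1).values
    loss = nn.functional.smooth_l1_loss(q, target)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % SYNC_EVERY == 0:                   # periodic target-network sync
        target_net.load_state_dict(q_net.state_dict())
```

Sampling past transitions uniformly from the buffer breaks the temporal correlation between consecutive network states, which is the instability issue the abstract refers to; the frozen target network keeps the bootstrap targets from chasing the online network's own estimates.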


Sensors ◽  
2019 ◽  
Vol 20 (1) ◽  
pp. 44 ◽  
Author(s):  
Yi-Han Xu ◽  
Jing-Wei Xie ◽  
Yang-Gang Zhang ◽  
Min Hua ◽  
Wen Zhou

Wireless body area networks (WBANs) have attracted great attention from both industry and academia as a promising technology for continuous monitoring of the physiological signals of the human body. As the sensors in WBANs are typically battery-driven and inconvenient to recharge, an energy-efficient resource allocation scheme is essential to prolong the lifetime of the networks while guaranteeing the stringent quality-of-service (QoS) requirements inherent to WBANs. As a possible solution to the energy efficiency problem, energy harvesting (EH) technology, which can draw energy from ambient sources, can potentially reduce the dependence on battery supply. Consequently, in this paper, we investigate the resource allocation problem for EH-powered WBANs (EH-WBANs). Our goal is to maximize the energy efficiency of the EH-WBANs with joint consideration of transmission mode, relay selection, allocated time slot, transmission power, and the energy constraint of each sensor. In view of the characteristics of EH-WBANs, we formulate the energy efficiency problem as a discrete-time and finite-state Markov decision process (DFMDP), in which allocation strategy decisions are made by a hub that does not have complete and global network information. Owing to the complexity of the problem, we propose a modified Q-learning (QL) algorithm to obtain the optimal allocation strategy. Numerical results validate the effectiveness of the proposed scheme as well as the low computational complexity of the proposed modified QL algorithm.
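
For readers unfamiliar with the tabular form underlying the modified Q-learning (QL) algorithm mentioned in this abstract, the following Python sketch shows a standard one-step Q-learning update on a discrete-time, finite-state problem. The state and action encodings (battery level and channel state; relay, slot, and power-level choices) and the energy-efficiency reward are illustrative assumptions; the authors' specific modifications are not reproduced here.

```python
# Tabular Q-learning sketch for a discrete-time, finite-state MDP (DFMDP).
# Illustrative only: the state (battery level, channel quality) and action
# (relay, time slot, power level) encodings below are assumptions, not the
# paper's, and the authors' modifications to Q-learning are not reproduced.
import random
from collections import defaultdict

ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1    # learning rate, discount, exploration rate

Q = defaultdict(float)                # Q[(state, action)] -> value estimate

def choose_action(state, actions):
    """Epsilon-greedy selection over the feasible allocation actions."""
    if random.random() < EPS:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """One-step Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in next_actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Example step (hypothetical values): state = (battery_level, channel_state),
# action = (relay_id, slot, power_level), reward = bits delivered per Joule.
s, acts = (3, 1), [(0, 0, 1), (1, 1, 2)]
a = choose_action(s, acts)
update(s, a, reward=4.2, next_state=(2, 1), next_actions=acts)
```

Because the hub lacks complete and global network information, it can only learn the value of each allocation from the rewards it observes over time, which is why a model-free method such as Q-learning fits the DFMDP formulation.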


2020 ◽  
Vol 14 (6) ◽  
pp. 1022-1027 ◽  
Author(s):  
Sanjay Bhardwaj ◽  
Rizki Rivai Ginanjar ◽  
Dong-Seong Kim
