Large-Scale Computation Offloading Using a Multi-Agent Reinforcement Learning in Heterogeneous Multi-access Edge Computing

Author(s):  
Zhen Gao ◽  
Lei Yang ◽  
Yu Dai

Information ◽
2021 ◽  
Vol 12 (9) ◽  
pp. 343
Author(s):  
Chunyang Hu ◽  
Jingchen Li ◽  
Haobin Shi ◽  
Bin Ning ◽  
Qiong Gu

Researchers have applied reinforcement learning to learn offloading strategies for multi-access edge computing systems. However, large-scale systems are ill-suited to conventional reinforcement learning because of their huge state spaces and sets of offloading actions. For this reason, this work adopts a centralized-training, decentralized-execution mechanism and designs a decentralized reinforcement learning model for multi-access edge computing systems. Considering a cloud server and several edge servers, we separate training from execution in the reinforcement learning model: execution happens on the edge devices, so the edge servers need no inter-server communication, while training takes place on the cloud server, which keeps transmission latency low. The method uses a deep deterministic policy gradient algorithm to optimize the offloading strategies. Simulation experiments show that our method learns an efficient offloading strategy for each edge device.
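To make the centralized-training, decentralized-execution idea concrete, below is a minimal PyTorch-style sketch: each edge device runs a lightweight deterministic actor locally, while a single critic trained in the cloud sees the joint state and joint action, as in DDPG-style methods. All names, sizes, and the simplified critic target (EdgeActor, CloudCritic, STATE_DIM, N_DEVICES, regression to observed returns) are illustrative assumptions, not the authors' implementation.

```python
# Sketch of centralized training / decentralized execution for DDPG-style
# offloading. Names, dimensions, and the simplified critic target are
# illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn

STATE_DIM = 8    # assumed per-device observation (queue length, CPU load, ...)
ACTION_DIM = 1   # assumed continuous offloading ratio in [0, 1]
N_DEVICES = 4    # assumed number of edge devices

class EdgeActor(nn.Module):
    """Runs on an edge device; maps its local observation to an offloading ratio."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Sigmoid())  # ratio in [0, 1]

    def forward(self, obs):
        return self.net(obs)

class CloudCritic(nn.Module):
    """Trained in the cloud; scores the joint state and joint action."""
    def __init__(self):
        super().__init__()
        joint_dim = N_DEVICES * (STATE_DIM + ACTION_DIM)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 128), nn.ReLU(),
            nn.Linear(128, 1))

    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

actors = [EdgeActor() for _ in range(N_DEVICES)]
critic = CloudCritic()
actor_opts = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in actors]
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def cloud_train_step(batch_obs, batch_act, batch_ret):
    """One centralized update; batch_obs is (B, N_DEVICES, STATE_DIM),
    batch_act is (B, N_DEVICES, ACTION_DIM), batch_ret is (B, 1)."""
    B = batch_obs.shape[0]
    joint_obs = batch_obs.reshape(B, -1)
    joint_act = batch_act.reshape(B, -1)

    # Critic: regress toward observed returns (a simplification of the
    # full DDPG target-network TD update).
    critic_loss = nn.functional.mse_loss(critic(joint_obs, joint_act), batch_ret)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Each actor ascends the centralized critic's value of its own action,
    # holding the other devices' actions fixed at the sampled batch values.
    for i, (actor, opt) in enumerate(zip(actors, actor_opts)):
        acts = [batch_act[:, j, :] for j in range(N_DEVICES)]
        acts[i] = actor(batch_obs[:, i, :])
        actor_loss = -critic(joint_obs, torch.cat(acts, dim=-1)).mean()
        opt.zero_grad(); actor_loss.backward(); opt.step()
```

Under such a split, only the per-device actors need to run at the edge at decision time, while the critic and the experience data stay in the cloud, which is what allows the edge servers to avoid communicating with one another during execution.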


2021 ◽  
Author(s):  
Abdeladim Sadiki ◽  
Jamal Bentahar ◽  
Rachida Dssouli ◽  
Abdeslam En-Nouaary

Multi-access Edge Computing (MEC) has recently emerged as a promising technology to serve the needs of mobile devices (MDs) in 5G and 6G cellular networks. By offloading tasks to high-performance servers installed at the edge of the wireless network, resource-limited MDs can cope with the proliferation of recent computationally intensive applications. In this paper, we study the computation offloading problem in a massive multiple-input multiple-output (MIMO)-based MEC system where the base stations are equipped with a large number of antennas. Our objective is to minimize the power consumption and offloading delay at the MDs under a stochastic system environment. To this end, we formulate the problem as a Markov Decision Process (MDP) and propose two Deep Reinforcement Learning (DRL) strategies to learn the optimal offloading policy without any prior knowledge of the environment dynamics. First, a Deep Q-Network (DQN) strategy is analyzed to cope with the explosion of the state space. Then, a more general Proximal Policy Optimization (PPO) strategy is introduced to overcome the restriction to discrete action spaces. Simulation results show that the proposed DRL-based strategies outperform the baseline and state-of-the-art algorithms. Moreover, our PPO algorithm exhibits more stable performance and more efficient offloading than the benchmark DQN strategy.
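As a rough illustration of the kind of MDP formulation described above, the following sketch defines a toy single-device offloading environment whose state captures the queue backlog, channel gain, and arriving task size, and whose reward is a negative weighted sum of power and delay. All dynamics, constants, and variable names are assumptions made for demonstration; they are not the paper's exact model.

```python
# Toy offloading MDP: state, action, and reward are illustrative assumptions.
import numpy as np

class OffloadingMDP:
    """Single-MD offloading environment: each step, decide what fraction of
    the arriving task to offload to the MEC server."""
    def __init__(self, w_power=0.5, w_delay=0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        self.w_power, self.w_delay = w_power, w_delay

    def reset(self):
        self.queue = 0.0                           # assumed backlog at the server, Mbits
        return self._obs()

    def _obs(self):
        self.channel = self.rng.rayleigh(1.0)      # stochastic channel gain
        self.task = self.rng.uniform(0.5, 2.0)     # arriving task size, Mbits
        return np.array([self.queue, self.channel, self.task], dtype=np.float32)

    def step(self, offload_ratio):
        """offload_ratio in [0, 1]: fraction of the task sent to the edge server."""
        offload = offload_ratio * self.task
        local = self.task - offload
        tx_delay = offload / (1.0 + self.channel)  # better channel -> faster uplink
        local_delay = local / 0.8                  # assumed local CPU rate
        power = 0.3 * local + 0.1 * offload        # local compute vs. transmit power
        delay = max(tx_delay, local_delay) + 0.1 * self.queue
        self.queue = max(0.0, self.queue + offload - 1.5)  # assumed server drain rate
        reward = -(self.w_power * power + self.w_delay * delay)
        return self._obs(), reward, False, {}

env = OffloadingMDP()
obs = env.reset()
obs, reward, done, info = env.step(offload_ratio=0.7)
```

On such an environment, a DQN agent would pick the offloading ratio from a small discrete grid (e.g. {0, 0.25, 0.5, 0.75, 1}), whereas PPO can output the continuous ratio directly, which is the distinction the abstract draws between the two strategies.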


2021 ◽  
Vol 20 (6) ◽  
pp. 1-33
Author(s):  
Kaustabha Ray ◽  
Ansuman Banerjee

Multi-Access Edge Computing (MEC) has emerged as a promising new paradigm that allows low-latency access to services deployed on edge servers, averting the network latencies often encountered in accessing cloud services. A key component of the MEC environment is the auto-scaling policy, which governs the management and scaling of the container instances of individual services deployed on MEC servers in response to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that efficiently adapts to traffic variations while ensuring adherence to service-specific latency requirements. We model the MEC environment as a Markov Decision Process (MDP) and demonstrate how latency requirements can be formally expressed in Linear Temporal Logic (LTL). The LTL specification guides the policy agent to automatically learn auto-scaling decisions that maximize the probability of satisfying the LTL formula. We introduce a quantitative reward mechanism based on the LTL formula to tailor service-specific latency requirements, and we prove that this reward mechanism ensures convergence of standard Safe-RL approaches. We present experimental results in practical scenarios on a test-bed setup with real-world benchmark applications to show the effectiveness of our approach in comparison with other state-of-the-art methods in the literature. Furthermore, we perform extensive simulated experiments to demonstrate the effectiveness of our approach in large-scale scenarios.
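The following toy sketch illustrates the general idea of turning a latency requirement of the form G(latency <= SLO) into a quantitative reward for an auto-scaling agent: the monitor rewards slack below the bound, penalizes violations, and charges for extra container instances. The SLO value, penalty, and cost constants are illustrative assumptions, not the paper's construction.

```python
# Toy LTL-derived reward monitor for auto-scaling; constants are assumptions.
LATENCY_SLO_MS = 50.0   # assumed service-specific requirement: G(latency <= 50 ms)

class LTLRewardMonitor:
    """Tracks satisfaction of the 'always latency <= SLO' formula and emits a
    reward that grows with the slack below the SLO."""
    def __init__(self, slo_ms=LATENCY_SLO_MS, violation_penalty=10.0):
        self.slo = slo_ms
        self.penalty = violation_penalty
        self.violated = False

    def reward(self, observed_latency_ms, num_containers, container_cost=0.1):
        if observed_latency_ms > self.slo:
            self.violated = True          # the G(...) formula is now falsified
            return -self.penalty
        # Quantitative margin: reward the slack below the SLO, minus a cost
        # term that discourages over-provisioning container instances.
        slack = (self.slo - observed_latency_ms) / self.slo
        return slack - container_cost * num_containers

monitor = LTLRewardMonitor()
r = monitor.reward(observed_latency_ms=42.0, num_containers=3)
```

A scaling agent trained against such a signal is pushed toward decisions that keep the probability of satisfying the latency formula high while using as few containers as possible.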

