Deep Reinforcement Learning-Based Workload Scheduling for Edge Computing

Abstract Edge computing is a new paradigm for providing cloud computing capacities at the edge of network near mobile users. It offers an effective solution to help mobile devices with computation-intensive and delay-sensitive tasks. However, the edge of network presents a dynamic environment with large number of devices, high mobility of the end user, heterogeneous applications and intermittent traffific. In such environment, edge computing always encounters workload scheduling problem of how to effificiently schedule incoming tasks from mobile devices to edge servers or cloud servers, which is a hard and online problem. In this work, we focus on the workload scheduling problem with the goal of balancing the workload, reducing the service time and minimizing the failed task rate. We proposed a reinforcement learning-based approach, which can learn from the previous actions and achieve best scheduling in the absence of a mathematical model of the environment. Simulation results show that our proposed approach achieves the best performance in aspects of service time, virtual machine utilization, and failed tasks rate compared with other approaches. Our reinforcement learning-based approach can provide an effificient solution to the workload scheduling problem in edge computing.

Download Full-text

Reinforcement Learning for Security-Aware Workflow Application Scheduling in Mobile Edge Computing

Security and Communication Networks ◽

10.1155/2021/5532410 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Binbin Huang ◽

Yuanyuan Xiang ◽

Dongjin Yu ◽

Jiaojiao Wang ◽

Zhongjin Li ◽

...

Keyword(s):

Reinforcement Learning ◽

Edge Computing ◽

Mobile Users ◽

Workflow Scheduling ◽

Scheduling Problem ◽

Mobile Edge Computing ◽

Malicious Attacks ◽

Computing Environment ◽

Risk Probability ◽

Workflow Application

Mobile edge computing as a novel computing paradigm brings remote cloud resource to the edge servers nearby mobile users. Within one-hop communication range of mobile users, a number of edge servers equipped with enormous computation and storage resources are deployed. Mobile users can offload their partial or all computation tasks of a workflow application to the edge servers, thereby significantly reducing the completion time of the workflow application. However, due to the open nature of mobile edge computing environment, these tasks, offloaded to the edge servers, are susceptible to be intentionally overheard or tampered by malicious attackers. In addition, the edge computing environment is dynamical and time-variant, which results in the fact that the existing quasistatic workflow application scheduling scheme cannot be applied to the workflow scheduling problem in dynamical mobile edge computing with malicious attacks. To address these two problems, this paper formulates the workflow scheduling problem with risk probability constraint in the dynamic edge computing environment with malicious attacks to be a Markov Decision Process (MDP). To solve this problem, this paper designs a reinforcement learning-based security-aware workflow scheduling (SAWS) scheme. To demonstrate the effectiveness of our proposed SAWS scheme, this paper compares SAWS with MSAWS, AWM, Greedy, and HEFT baseline algorithms in terms of different performance parameters including risk probability, security service, and risk coefficient. The extensive experiments results show that, compared with the four baseline algorithms in workflows of different scales, the SAWS strategy can achieve better execution efficiency while satisfying the risk probability constraints.

Download Full-text

D3PG: Dirichlet DDGP for Task Partitioning and Offloading with Constrained Hybrid Action Space in Mobile Edge Computing

10.36227/techrxiv.17203607 ◽

2021 ◽

Author(s):

Laha Ale ◽

Scott King ◽

Ning Zhang ◽

Abdul Sattar ◽

Janahan Skandaraniyam

Keyword(s):

Reinforcement Learning ◽

Dynamic Environment ◽

Edge Computing ◽

Action Space ◽

Computation Offloading ◽

Mobile Edge Computing ◽

Continuous Variables ◽

Task Partitioning ◽

Policy Gradient ◽

Service Latency

<div> Mobile Edge Computing (MEC) has been regarded as a promising paradigm to reduce service latency for data processing in Internet of Things, by provisioning computing resources at network edge. In this work, we jointly optimize the task partitioning and computational power allocation for computation offloading in a dynamic environment with multiple IoT devices and multiple edge servers. We formulate the problem as a Markov decision process with constrained hybrid action space, which cannot be well handled by existing deep reinforcement learning (DRL) algorithms. Therefore, we develop a novel Deep Reinforcement Learning called Dirichlet Deep Deterministic Policy Gradient (D3PG), which </div><div>is built on Deep Deterministic Policy Gradient (DDPG) to solve the problem. The developed model can learn to solve multi-objective optimization, including maximizing the number of tasks processed before expiration and minimizing the energy cost and service latency. More importantly, D3PG can effectively deal with constrained distribution-continuous hybrid action space, where the distribution variables are for the task partitioning and offloading, while the continuous variables are for computational frequency control. Moreover, the D3PG can address many similar issues in MEC and general reinforcement learning problems. Extensive simulation results show that the proposed D3PG outperforms the state-of-art methods.</div><div> Mobile Edge Computing (MEC) has been regarded as a promising paradigm to reduce service latency for data processing in Internet of Things, by provisioning computing resources at network edge. In this work, we jointly optimize the task partitioning and computational power allocation for computation offloading in a dynamic environment with multiple IoT devices and multiple edge servers. We formulate the problem as a Markov decision process with constrained hybrid action space, which cannot be well handled by existing deep reinforcement learning (DRL) algorithms. Therefore, we develop a novel Deep Reinforcement Learning called Dirichlet Deep Deterministic Policy Gradient (D3PG), which is built on Deep Deterministic Policy Gradient (DDPG) to solve the problem. The developed model can learn to solve multi-objective optimization, including maximizing the number of tasks processed before expiration and minimizing the energy cost and service latency. More importantly, D3PG can effectively deal with constrained distribution-continuous hybrid action space, where the distribution variables are for the task partitioning and offloading, while the continuous variables are for computational frequency control. Moreover, the D3PG can address many similar issues in MEC and general reinforcement learning problems. Extensive simulation results show that the proposed D3PG outperforms the state-of-art methods.</div>

Download Full-text

Deep Reinforcement Learning-Based Task Scheduling in IoT Edge Computing

Sensors ◽

10.3390/s21051666 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1666 ◽

Cited By ~ 1

Author(s):

Shuran Sheng ◽

Peng Chen ◽

Zhimin Chen ◽

Lenan Wu ◽

Yuxuan Yao

Keyword(s):

Reinforcement Learning ◽

Task Scheduling ◽

Virtual Machines ◽

Scheduling Algorithm ◽

Edge Computing ◽

Scheduling Problem ◽

Satisfaction Degree ◽

Markov Decision ◽

Time Scheduling ◽

Task Satisfaction

Edge computing (EC) has recently emerged as a promising paradigm that supports resource-hungry Internet of Things (IoT) applications with low latency services at the network edge. However, the limited capacity of computing resources at the edge server poses great challenges for scheduling application tasks. In this paper, a task scheduling problem is studied in the EC scenario, and multiple tasks are scheduled to virtual machines (VMs) configured at the edge server by maximizing the long-term task satisfaction degree (LTSD). The problem is formulated as a Markov decision process (MDP) for which the state, action, state transition, and reward are designed. We leverage deep reinforcement learning (DRL) to solve both time scheduling (i.e., the task execution order) and resource allocation (i.e., which VM the task is assigned to), considering the diversity of the tasks and the heterogeneity of available resources. A policy-based REINFORCE algorithm is proposed for the task scheduling problem, and a fully-connected neural network (FCN) is utilized to extract the features. Simulation results show that the proposed DRL-based task scheduling algorithm outperforms the existing methods in the literature in terms of the average task satisfaction degree and success ratio.

Download Full-text

D3PG: Dirichlet DDGP for Task Partitioning and Offloading with Constrained Hybrid Action Space in Mobile Edge Computing

10.36227/techrxiv.17203607.v1 ◽

2021 ◽

Author(s):

Laha Ale ◽

Scott King ◽

Ning Zhang ◽

Abdul Sattar ◽

Janahan Skandaraniyam

Keyword(s):

Reinforcement Learning ◽

Dynamic Environment ◽

Edge Computing ◽

Action Space ◽

Computation Offloading ◽

Mobile Edge Computing ◽

Continuous Variables ◽

Task Partitioning ◽

Policy Gradient ◽

Service Latency

<div> Mobile Edge Computing (MEC) has been regarded as a promising paradigm to reduce service latency for data processing in Internet of Things, by provisioning computing resources at network edge. In this work, we jointly optimize the task partitioning and computational power allocation for computation offloading in a dynamic environment with multiple IoT devices and multiple edge servers. We formulate the problem as a Markov decision process with constrained hybrid action space, which cannot be well handled by existing deep reinforcement learning (DRL) algorithms. Therefore, we develop a novel Deep Reinforcement Learning called Dirichlet Deep Deterministic Policy Gradient (D3PG), which </div><div>is built on Deep Deterministic Policy Gradient (DDPG) to solve the problem. The developed model can learn to solve multi-objective optimization, including maximizing the number of tasks processed before expiration and minimizing the energy cost and service latency. More importantly, D3PG can effectively deal with constrained distribution-continuous hybrid action space, where the distribution variables are for the task partitioning and offloading, while the continuous variables are for computational frequency control. Moreover, the D3PG can address many similar issues in MEC and general reinforcement learning problems. Extensive simulation results show that the proposed D3PG outperforms the state-of-art methods.</div><div> Mobile Edge Computing (MEC) has been regarded as a promising paradigm to reduce service latency for data processing in Internet of Things, by provisioning computing resources at network edge. In this work, we jointly optimize the task partitioning and computational power allocation for computation offloading in a dynamic environment with multiple IoT devices and multiple edge servers. We formulate the problem as a Markov decision process with constrained hybrid action space, which cannot be well handled by existing deep reinforcement learning (DRL) algorithms. Therefore, we develop a novel Deep Reinforcement Learning called Dirichlet Deep Deterministic Policy Gradient (D3PG), which is built on Deep Deterministic Policy Gradient (DDPG) to solve the problem. The developed model can learn to solve multi-objective optimization, including maximizing the number of tasks processed before expiration and minimizing the energy cost and service latency. More importantly, D3PG can effectively deal with constrained distribution-continuous hybrid action space, where the distribution variables are for the task partitioning and offloading, while the continuous variables are for computational frequency control. Moreover, the D3PG can address many similar issues in MEC and general reinforcement learning problems. Extensive simulation results show that the proposed D3PG outperforms the state-of-art methods.</div>

Download Full-text

Deep Reinforcement Learning-Empowered Resource Allocation for Mobile Edge Computing in Cellular V2X Networks

Sensors ◽

10.3390/s21020372 ◽

2021 ◽

Vol 21 (2) ◽

pp. 372

Author(s):

Dongji Li ◽

Shaoyi Xu ◽

Pengyu Li

Keyword(s):

Reinforcement Learning ◽

Energy Consumption ◽

Vehicular Networks ◽

Learning Algorithm ◽

Rapid Development ◽

Dynamic Environment ◽

Edge Computing ◽

Mobile Edge Computing ◽

Computing Paradigm ◽

Cloud Servers

With the rapid development of vehicular networks, vehicle-to-everything (V2X) communications have huge number of tasks to be calculated, which brings challenges to the scarce network resources. Cloud servers can alleviate the terrible situation regarding the lack of computing abilities of vehicular user equipment (VUE), but the limited resources, the dynamic environment of vehicles, and the long distances between the cloud servers and VUE induce some potential issues, such as extra communication delay and energy consumption. Fortunately, mobile edge computing (MEC), a promising computing paradigm, can ameliorate the above problems by enhancing the computing abilities of VUE through allocating the computational resources to VUE. In this paper, we propose a joint optimization algorithm based on a deep reinforcement learning algorithm named the double deep Q network (double DQN) to minimize the cost constituted of energy consumption, the latency of computation, and communication with the proper policy. The proposed algorithm is more suitable for dynamic scenarios and requires low-latency vehicular scenarios in the real world. Compared with other reinforcement learning algorithms, the algorithm we proposed algorithm improve the performance in terms of convergence, defined cost, and speed by around 30%, 15%, and 17%.

Download Full-text

Deep Reinforcement Learning Based Delay-Sensitive Task Scheduling and Resource Management Algorithm for Multi-User Mobile-Edge Computing Systems

Proceedings of the 2019 4th International Conference on Mathematics and Artificial Intelligence - ICMAI 2019 ◽

10.1145/3325730.3325731 ◽

2019 ◽

Author(s):

Hao Meng ◽

Daichong Chao ◽

Ru Huo ◽

Qianying Guo ◽

Xiaowei Li ◽

...

Keyword(s):

Reinforcement Learning ◽

Resource Management ◽

Task Scheduling ◽

Edge Computing ◽

Mobile Edge Computing ◽

Computing Systems ◽

Management Algorithm ◽

Delay Sensitive

Download Full-text

Delay-sensitive Task Scheduling with Deep Reinforcement Learning in Mobile-edge Computing Systems

Journal of Physics Conference Series ◽

10.1088/1742-6596/1229/1/012059 ◽

2019 ◽

Vol 1229 ◽

pp. 012059

Author(s):

Hao Meng ◽

Daichong Chao ◽

Qianying Guo ◽

Xiaowei Li

Keyword(s):

Reinforcement Learning ◽

Task Scheduling ◽

Edge Computing ◽

Mobile Edge Computing ◽

Computing Systems ◽

Delay Sensitive

Download Full-text

Horizontal Auto-Scaling for Multi-Access Edge Computing Using Safe Reinforcement Learning

ACM Transactions on Embedded Computing Systems ◽

10.1145/3475991 ◽

2021 ◽

Vol 20 (6) ◽

pp. 1-33

Author(s):

Kaustabha Ray ◽

Ansuman Banerjee

Keyword(s):

Reinforcement Learning ◽

Large Scale ◽

Edge Computing ◽

Cloud Services ◽

Test Bed ◽

New Paradigm ◽

Markov Decision ◽

Multi Access ◽

Reward Mechanism ◽

Auto Scaling

Multi-Access Edge Computing (MEC) has emerged as a promising new paradigm allowing low latency access to services deployed on edge servers to avert network latencies often encountered in accessing cloud services. A key component of the MEC environment is an auto-scaling policy which is used to decide the overall management and scaling of container instances corresponding to individual services deployed on MEC servers to cater to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that can efficiently adapt to traffic variations to ensure adherence to service specific latency requirements. We model the MEC environment using a Markov Decision Process (MDP). We demonstrate how latency requirements can be formally expressed in Linear Temporal Logic (LTL). The LTL specification acts as a guide to the policy agent to automatically learn auto-scaling decisions that maximize the probability of satisfying the LTL formula. We introduce a quantitative reward mechanism based on the LTL formula to tailor service specific latency requirements. We prove that our reward mechanism ensures convergence of standard Safe-RL approaches. We present experimental results in practical scenarios on a test-bed setup with real-world benchmark applications to show the effectiveness of our approach in comparison to other state-of-the-art methods in literature. Furthermore, we perform extensive simulated experiments to demonstrate the effectiveness of our approach in large scale scenarios.

Download Full-text

Task Migration Based on Reinforcement Learning in Vehicular Edge Computing

Wireless Communications and Mobile Computing ◽

10.1155/2021/9929318 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Sungwon Moon ◽

Jaesung Park ◽

Yujin Lim

Keyword(s):

Quality Of Service ◽

Reinforcement Learning ◽

Load Balancing ◽

Learning Algorithm ◽

High Mobility ◽

Edge Computing ◽

Task Completion ◽

Task Migration ◽

Delay Constraints

Multiaccess edge computing (MEC) has emerged as a promising technology for time-sensitive and computation-intensive tasks. With the high mobility of users, especially in a vehicular environment, computational task migration between vehicular edge computing servers (VECSs) has become one of the most critical challenges in guaranteeing quality of service (QoS) requirements. If the vehicle’s tasks unequally migrate to specific VECSs, the performance can degrade in terms of latency and quality of service. Therefore, in this study, we define a computational task migration problem for balancing the loads of VECSs and minimizing migration costs. To solve this problem, we adopt a reinforcement learning algorithm in a cooperative VECS group environment that can collaborate with VECSs in the group. The objective of this study is to optimize load balancing and migration cost while satisfying the delay constraints of the computation task of vehicles. Simulations are performed to evaluate the performance of the proposed algorithm. The results show that compared to other algorithms, the proposed algorithm achieves approximately 20–40% better load balancing and approximately 13–28% higher task completion rate within the delay constraints.

Download Full-text