service latency
Recently Published Documents

TOTAL DOCUMENTS: 23 (FIVE YEARS 12)
H-INDEX: 4 (FIVE YEARS 0)

2021
Author(s): Laha Ale ◽ Scott King ◽ Ning Zhang ◽ Abdul Sattar ◽ Janahan Skandaraniyam

Mobile Edge Computing (MEC) has been regarded as a promising paradigm to reduce service latency for data processing in the Internet of Things by provisioning computing resources at the network edge. In this work, we jointly optimize task partitioning and computational power allocation for computation offloading in a dynamic environment with multiple IoT devices and multiple edge servers. We formulate the problem as a Markov decision process with a constrained hybrid action space, which cannot be well handled by existing deep reinforcement learning (DRL) algorithms. Therefore, we develop a novel DRL algorithm called Dirichlet Deep Deterministic Policy Gradient (D3PG), built on Deep Deterministic Policy Gradient (DDPG), to solve the problem. The developed model can learn to solve multi-objective optimization problems, including maximizing the number of tasks processed before expiration and minimizing energy cost and service latency. More importantly, D3PG can effectively deal with a constrained distribution-continuous hybrid action space, where the distribution variables govern task partitioning and offloading while the continuous variables control computational frequency. Moreover, D3PG can address many similar issues in MEC and in general reinforcement learning problems. Extensive simulation results show that the proposed D3PG outperforms state-of-the-art methods.
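As the abstract describes, the distribution part of D3PG's hybrid action is naturally modeled with a Dirichlet distribution, which yields task-partitioning fractions that sum to one by construction, alongside a separate continuous output for frequency control. The following PyTorch sketch illustrates such an actor head; the network sizes, the names (HybridActor, n_servers), and the sigmoid-scaled frequency output are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HybridActor(nn.Module):
    """Sketch of a D3PG-style actor emitting a hybrid action:
    a Dirichlet-distributed partition over offloading targets
    plus a continuous CPU-frequency control (shapes assumed)."""

    def __init__(self, state_dim: int, n_servers: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Concentration parameters for the Dirichlet (one per target:
        # local execution + each edge server); softplus keeps them > 0.
        self.alpha_head = nn.Linear(hidden, n_servers + 1)
        # Continuous action: normalized computational frequency in (0, 1).
        self.freq_head = nn.Linear(hidden, 1)

    def forward(self, state: torch.Tensor):
        h = self.trunk(state)
        alpha = nn.functional.softplus(self.alpha_head(h)) + 1e-3
        # Sampling from a Dirichlet guarantees the partition sums to 1,
        # satisfying the simplex constraint without clipping or projection.
        partition = torch.distributions.Dirichlet(alpha).rsample()
        freq = torch.sigmoid(self.freq_head(h))
        return partition, freq

# Usage: sample one hybrid action for a 10-dim state and 3 edge servers.
actor = HybridActor(state_dim=10, n_servers=3)
partition, freq = actor(torch.randn(1, 10))
print(partition.sum(dim=-1), freq)  # partition sums to 1.0
```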




2021
Author(s): Danyang Zheng ◽ Gangxiang Shen ◽ Xiaojun Cao ◽ Biswanath Mukherjee

Emerging 5G technologies can significantly reduce end-to-end service latency for applications requiring strict quality of service (QoS). With network function virtualization (NFV), completing a client's request for such applications may require the client's data to pass sequentially through multiple service functions (SFs) for processing/analysis, which introduces additional processing delay. To reduce the processing delay of serially running SFs, network function parallelism (NFP), which allows multiple SFs to run in parallel, has been introduced. In this work, we study how to apply NFP to the SF chaining and embedding process so that the total latency, including processing and propagation delays, is jointly minimized. We introduce a novel augmented graph to capture the parallel relationship constraints among the required SFs. Considering these constraints, we formulate a novel problem called parallelism-aware service function chaining and embedding (PSFCE). For this problem, we propose a near-optimal maximum-parallel-block-gain-first (MPBG) optimization algorithm for the case where the computing resources at each physical node suffice to host the required SFs. When computing resources are limited, we propose a logarithm-approximate algorithm, called parallelism-aware SF deployment (PSFD), to jointly optimize processing and propagation delays. We conduct extensive simulations on multiple network scenarios to evaluate the performance of our schemes. We find that (i) MPBG is near-optimal, (ii) end-to-end service latency is dominated by processing delay in small networks and by propagation delay in large networks, and (iii) PSFD outperforms schemes directly extended from existing works in terms of end-to-end latency.
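The core quantity behind NFP's benefit is easy to state: a serial chain's processing delay is the sum of its SFs' delays, while a parallelized chain pays, for each parallel block, only the delay of its slowest member. The sketch below illustrates just this arithmetic; the block grouping and delay values are hypothetical and merely stand in for what algorithms like MPBG/PSFD would compute.

```python
from typing import List

def serial_delay(proc_delays: List[float]) -> float:
    """Processing delay when all SFs run one after another."""
    return sum(proc_delays)

def parallel_chain_delay(blocks: List[List[float]]) -> float:
    """Processing delay when the chain is partitioned into blocks
    whose SFs run in parallel: each block costs its slowest member."""
    return sum(max(block) for block in blocks)

# Hypothetical 4-SF chain (delays in ms): firewall, NAT, IDS, monitor.
delays = [2.0, 1.5, 4.0, 1.0]
print(serial_delay(delays))                             # 8.5 ms, fully serial
print(parallel_chain_delay([[2.0, 1.5], [4.0, 1.0]]))   # 6.0 ms: 2.0 + 4.0
```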




2021 ◽ Vol 11 (5) ◽ pp. 2177
Author(s): Zuo Xiang ◽ Patrick Seeling ◽ Frank H. P. Fitzek

With the increasing number of computer vision and object detection application scenarios, those requiring ultra-low service latency, e.g., autonomous and connected vehicles or smart city applications, have become increasingly prominent. The incorporation of machine learning through the application of trained models in these scenarios can pose a computational challenge. The softwarization of networks provides opportunities to incorporate computing into the network, increasing flexibility by distributing workloads through offloading from client and edge nodes over in-network nodes to servers. In this article, we present an example of splitting the inference component of the trained YOLOv2 machine learning model between client-, network-, and server-side processing to reduce the overall service latency. Assuming a client has 20% of the server's computational resources, we observe a more than 12-fold reduction in service latency with our service split compared to on-client processing, and a speed increase of more than 25% compared to performing everything on the server. Our approach is not only applicable to object detection but can also be applied to a broad variety of machine-learning-based applications and services.
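The split described above can be sketched generically: run the first layers of a sequential model on the client, ship the intermediate feature map, and finish on the server. The model, split point, and sizes below are illustrative assumptions (a small stand-in backbone rather than the article's YOLOv2 pipeline), chosen so the transmitted tensor is smaller than the raw image.

```python
import torch
import torch.nn as nn

# Hypothetical sequential backbone standing in for a detector such as
# YOLOv2; a real split point would be chosen from profiling data.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)

SPLIT = 6  # client runs layers [0, SPLIT), server runs the rest

def client_forward(x: torch.Tensor) -> torch.Tensor:
    """Client-side partial inference; the output feature map is what
    gets transmitted instead of the raw image."""
    with torch.no_grad():
        return backbone[:SPLIT](x)

def server_forward(features: torch.Tensor) -> torch.Tensor:
    """Server-side completion of the inference."""
    with torch.no_grad():
        return backbone[SPLIT:](features)

x = torch.randn(1, 3, 416, 416)   # 416x416 is a common YOLOv2 input size
feats = client_forward(x)         # smaller than x after two pooling stages
out = server_forward(feats)
print(x.numel(), feats.numel(), out.shape)
```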


2020 ◽ Vol 25 (6) ◽ pp. 2191-2205
Author(s): Mohammad Ali Khoshkholghi ◽ Michel Gokan Khan ◽ Kyoomars Alizadeh Noghani ◽ Javid Taheri ◽ Deval Bhamare ◽ ...

Network Function Virtualization (NFV) is an emerging technology that consolidates network functions onto high-volume storage, servers, and switches located anywhere in the network. Virtual Network Functions (VNFs) are chained together to provide specific network services, called Service Function Chains (SFCs). Depending on Quality of Service (QoS) requirements and on network features and states, SFCs are served by performing two tasks: VNF placement and link embedding on the substrate network. Reducing deployment cost is a desired objective for all service providers in cloud/edge environments, as it increases their profit from demanded services. However, increasing resource utilization to decrease deployment cost may increase service latency, and consequently increase SLA violations and decrease user satisfaction. To this end, we formulate a multi-objective optimization model for joint VNF placement and link embedding that reduces both deployment cost and service latency subject to a variety of constraints. We then solve the optimization problem using two heuristic-based algorithms that perform close to optimal for large-scale cloud/edge environments. Since the optimization model involves conflicting objectives, we also investigate Pareto-optimal solutions that balance the objectives as far as possible. The efficiency of the proposed algorithms is evaluated using both simulation and emulation. The evaluation results show that the proposed optimization approach succeeds in minimizing both cost and latency, with results within 5% of the optimal solution obtained by Gurobi.
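A common way to handle two conflicting objectives in such a formulation is weighted-sum scalarization, sweeping the weight to trace (part of) the Pareto front. The sketch below shows that generic idea on a toy placement evaluation; the candidate placements and their cost/latency figures are hypothetical, not the paper's model or its heuristics.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate placements: (deployment cost, service latency in ms).
candidates: Dict[str, Tuple[float, float]] = {
    "all-cloud": (10.0, 40.0),   # cheap but slow
    "all-edge":  (30.0, 12.0),   # fast but expensive
    "mixed":     (18.0, 20.0),   # in between
}

def scalarized(cost: float, latency: float, w: float) -> float:
    """Weighted sum of the two objectives; w in [0, 1] favors cost."""
    return w * cost + (1.0 - w) * latency

def pareto_front(points: Dict[str, Tuple[float, float]]) -> List[str]:
    """Placements not dominated in both cost and latency."""
    front = []
    for name, (c, l) in points.items():
        dominated = any(c2 <= c and l2 <= l and (c2, l2) != (c, l)
                        for c2, l2 in points.values())
        if not dominated:
            front.append(name)
    return front

# Sweep the weight to see which placement each trade-off selects.
for w in (0.2, 0.5, 0.8):
    best = min(candidates, key=lambda n: scalarized(*candidates[n], w))
    print(f"w={w}: {best}")   # all-edge, mixed, all-cloud respectively
print("Pareto front:", pareto_front(candidates))
```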


2020 ◽ Vol 2020 ◽ pp. 1-10
Author(s): Nanliang Shan ◽ Zecong Ye ◽ Xiaolong Cui

With the development of mobile edge computing (MEC), more and more intelligent services and applications based on deep neural networks are deployed on mobile devices to meet the diverse and personalized needs of users. Unfortunately, deploying and running inference with deep learning models on resource-constrained devices is challenging. The traditional cloud-based method runs the deep learning model on a cloud server; since a large amount of input data must be transmitted to the server over the WAN, it incurs a large service latency, which is unacceptable for most latency-sensitive and computation-intensive applications. In this paper, we propose Cogent, an execution framework that accelerates deep neural network inference through device-edge synergy. The Cogent framework operates in two stages: an automatic pruning and partition stage and a containerized deployment stage. Cogent uses reinforcement learning (RL) to automatically predict pruning and partition strategies from feedback on the hardware configuration and system conditions, so that the pruned and partitioned model better adapts to the system environment and the user's hardware. The model is then deployed in containers on the device and the edge server to accelerate inference. Experiments show that this learning-based, hardware-aware automatic pruning and partition scheme significantly reduces service latency and accelerates the overall inference process while maintaining accuracy, achieving speedups of up to 8.89× with an accuracy loss of no more than 7%.
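The partition half of such a scheme reduces to choosing the layer at which to cut the network so that on-device time, transfer time, and edge time are jointly minimized. Below is a minimal sketch of that search under assumed per-layer profiles; the numbers, and the exhaustive search standing in for Cogent's RL agent, are illustrative assumptions.

```python
# Hypothetical per-layer profile: (device ms, edge ms, output size in KB).
layers = [
    (12.0, 2.0, 300.0),   # conv1
    (20.0, 3.0, 120.0),   # conv2
    (25.0, 4.0, 30.0),    # conv3: much smaller output
    (8.0,  1.5, 4.0),     # fc: tiny output
]
RAW_INPUT_KB = 600.0        # assumed size of the raw input
BANDWIDTH_KB_PER_MS = 10.0  # assumed device-to-edge uplink

def end_to_end_latency(cut: int) -> float:
    """Latency if layers [0, cut) run on-device and [cut, n) on the edge.
    The tensor produced at the cut is what crosses the network."""
    device = sum(d for d, _, _ in layers[:cut])
    edge = sum(e for _, e, _ in layers[cut:])
    out_kb = RAW_INPUT_KB if cut == 0 else layers[cut - 1][2]
    transfer = 0.0 if cut == len(layers) else out_kb / BANDWIDTH_KB_PER_MS
    return device + transfer + edge

# Exhaustive search over cut points (an RL agent would learn this choice).
for cut in range(len(layers) + 1):
    print(cut, round(end_to_end_latency(cut), 1))
best = min(range(len(layers) + 1), key=end_to_end_latency)
print("best cut:", best)  # cut after conv2 wins for these numbers
```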

