Auto Scaling
Recently Published Documents

TOTAL DOCUMENTS: 310 (five years: 131)
H-INDEX: 20 (five years: 6)

2022 ◽  
Author(s):  
Anupama Mampage ◽  
Shanika Karunasekera ◽  
Rajkumar Buyya

Serverless computing has emerged as an attractive deployment option for cloud applications in recent times. The unique features of this computing model include rapid auto-scaling, strong isolation, fine-grained billing options, and access to a massive service ecosystem, with the platform autonomously handling resource management decisions. Owing to these characteristics, the model is also increasingly being explored for deployments in geographically distributed edge and fog computing networks. Effective management of computing resources has long attracted research attention, and the need to automate the entire process of resource provisioning, allocation, scheduling, monitoring, and scaling calls for a specialized focus on resource management under the serverless model. In this article, we identify the major aspects covering the broader concept of resource management in serverless environments and propose a taxonomy of elements which influence these aspects, encompassing characteristics of system design, workload attributes, and stakeholder expectations. We take a holistic view of serverless environments deployed across edge, fog, and cloud computing networks, and we analyse existing works on serverless resource management using this taxonomy. The article further identifies gaps in the literature and highlights future research directions for improving the capabilities of this computing model.
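To make the kind of decision such platforms automate concrete, here is a minimal sketch of a concurrency-based serverless autoscaler that sizes the instance pool from in-flight request counts; the target-concurrency rule and all names and thresholds are illustrative assumptions, not taken from the article:

```python
# Minimal sketch of a concurrency-based serverless autoscaler.
# TARGET_CONCURRENCY and the instance bounds are illustrative assumptions.
import math

TARGET_CONCURRENCY = 10      # desired in-flight requests per instance
MIN_INSTANCES, MAX_INSTANCES = 0, 100

def desired_instances(in_flight_requests: int) -> int:
    """Size the pool so each instance serves about TARGET_CONCURRENCY requests."""
    if in_flight_requests == 0:
        return MIN_INSTANCES                     # scale to zero when idle
    wanted = math.ceil(in_flight_requests / TARGET_CONCURRENCY)
    return max(MIN_INSTANCES, min(MAX_INSTANCES, wanted))

# Example: 42 in-flight requests -> 5 instances.
print(desired_instances(42))
```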


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8283
Author(s):  
Alejandro Llorens-Carrodeguas ◽  
Irian Leyva-Pupo ◽  
Cristina Cervelló-Pastor ◽  
Luis Piñeiro ◽  
Shuaib Siddiqui

This paper studies the problem of dynamic scaling and load balancing of transparent virtualized network functions (VNFs). It analyzes particularities of this problem, such as loop avoidance when performing scaling-out actions and bidirectional flow affinity. To address the problem, a software-defined networking (SDN)-based solution is implemented, consisting of two SDN controllers and two OpenFlow switches (OFSs). In this approach, the SDN controllers run the solution logic (i.e., the monitoring, scaling, and load-balancing modules), while the OFSs, following the SDN controllers' instructions, redirect traffic to and from the VNF clusters (i.e., the load-balancing strategy). Several experiments were conducted on a real testbed to validate the feasibility of the proposed solution. Connectivity tests showed not only that end-to-end (E2E) traffic was successfully carried through the VNF cluster, but also that the bidirectional flow affinity strategy performed well, since it could create flow rules in both switches simultaneously. Moreover, the selected CPU-based load-balancing method guaranteed an average imbalance below 10% while ensuring that new incoming traffic was redirected to the least loaded instance without requiring packet modification. Additionally, the designed monitoring function was able to detect failures in the set of active members in near real time and activate new instances in less than a minute. Likewise, the proposed auto-scaling module responded quickly to traffic changes. Our solution shows that using SDN controllers along with OFSs provides great flexibility for implementing different load-balancing, scaling, and monitoring strategies.
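As a rough illustration of the CPU-based strategy the abstract describes, the sketch below picks the least loaded VNF instance for new flows and computes an average-imbalance figure comparable to the 10% target; the instance names and the exact imbalance definition are our assumptions, not the paper's:

```python
# Sketch of CPU-based "least loaded" selection and an imbalance metric.
# Instance names and the imbalance definition are illustrative assumptions.

def pick_least_loaded(cpu_load: dict[str, float]) -> str:
    """Redirect new flows to the instance with the lowest CPU load."""
    return min(cpu_load, key=cpu_load.get)

def imbalance(cpu_load: dict[str, float]) -> float:
    """Average absolute deviation from the mean load, as a fraction of the mean."""
    loads = list(cpu_load.values())
    mean = sum(loads) / len(loads)
    return sum(abs(l - mean) for l in loads) / (len(loads) * mean)

# Example: new traffic goes to vnf-2; imbalance here is ~4.5%, under 10%.
load = {"vnf-1": 0.62, "vnf-2": 0.55, "vnf-3": 0.60}
print(pick_least_loaded(load), imbalance(load))
```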


2021 ◽  
Author(s):  
Giacomo Lanciano ◽  
Filippo Galli ◽  
Tommaso Cucinotta ◽  
Davide Bacciu ◽  
Andrea Passarella

2021 ◽  
Vol 20 (6) ◽  
pp. 1-33
Author(s):  
Kaustabha Ray ◽  
Ansuman Banerjee

Multi-Access Edge Computing (MEC) has emerged as a promising paradigm that allows low-latency access to services deployed on edge servers, averting the network latencies often encountered in accessing cloud services. A key component of the MEC environment is the auto-scaling policy, which decides the overall management and scaling of container instances corresponding to individual services deployed on MEC servers to cater to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that can efficiently adapt to traffic variations to ensure adherence to service-specific latency requirements. We model the MEC environment using a Markov Decision Process (MDP) and demonstrate how latency requirements can be formally expressed in Linear Temporal Logic (LTL). The LTL specification acts as a guide to the policy agent, which automatically learns auto-scaling decisions that maximize the probability of satisfying the LTL formula. We introduce a quantitative reward mechanism based on the LTL formula to tailor service-specific latency requirements, and we prove that this reward mechanism ensures convergence of standard Safe-RL approaches. We present experimental results in practical scenarios on a testbed setup with real-world benchmark applications to show the effectiveness of our approach in comparison to other state-of-the-art methods in the literature. Furthermore, we perform extensive simulated experiments to demonstrate the effectiveness of our approach in large-scale scenarios.
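As a hedged illustration of how a latency bound can steer learning, the sketch below derives a per-step reward from a requirement of the form G(latency <= L_MAX) and plugs it into a standard tabular Q-learning update. This is our simplification: the constants, the action set, and the reward shaping are assumptions, not the paper's exact mechanism.

```python
# Sketch: turning a latency bound G(latency <= L_MAX) into an RL reward.
# L_MAX, the action set, and the shaping are illustrative assumptions.
L_MAX = 0.2             # latency bound (seconds) from a hypothetical LTL spec
ACTIONS = (-1, 0, +1)   # remove / keep / add one container instance

def reward(latency: float) -> float:
    # Positive while the bound holds, increasingly negative as it is violated.
    return 1.0 if latency <= L_MAX else -(latency - L_MAX) / L_MAX

def q_update(Q, state, action, r, next_state, alpha=0.1, gamma=0.95):
    """Standard tabular Q-learning update driven by the latency-derived reward."""
    best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (r + gamma * best_next - old)
```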


Author(s):  
Zhen Ni ◽  
Cuidi Wei ◽  
Timothy Wood ◽  
Nakjung Choi

2021 ◽  
Author(s):  
Van-Giang Nguyen ◽  
Karl-Johan Grinnemo ◽  
Javid Taheri ◽  
Johan Forsman ◽  
Thang Le Duc ◽  
...  

2021 ◽  
Vol 15 (3) ◽  
pp. 1-27
Author(s):  
Mikael Sabuhi ◽  
Nima Mahmoudi ◽  
Hamzeh Khazaei

Control theory has proven to be a practical approach to the design and implementation of controllers: thanks to its strong mathematical background, it does not inherit the problems of non-control-theoretic controllers. State-of-the-art auto-scaling controllers suffer from one or more of the following limitations: (1) lack of a reliable performance model; (2) use of a performance model with low scalability, tractability, or fidelity; (3) being application- or architecture-specific, leading to low extendability; and (4) no guarantees on their efficiency. Consequently, in this article, we strive to mitigate these problems by leveraging an adaptive controller composed of a neural network as the performance model and a Proportional-Integral-Derivative (PID) controller as the scaling engine. More specifically, we design, implement, and analyze different flavours of these adaptive and non-adaptive controllers, and we compare and contrast them against each other to find the most suitable one for managing containerized cloud software systems at runtime. The controller's objective is to keep the response time of the controlled software system within a pre-defined range, meeting the service-level agreements while provisioning resources efficiently.
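A minimal sketch of a PID scaling engine in the spirit described above follows; the gains, setpoint, and actuation are illustrative assumptions, not the paper's values. In the paper's design, a neural-network performance model would supply the expected response time, which this sketch simply takes as a measured input.

```python
# Minimal PID scaling engine; gains and setpoint are illustrative assumptions.

class PIDScaler:
    def __init__(self, setpoint_ms: float, kp: float, ki: float, kd: float):
        self.setpoint = setpoint_ms
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, response_time_ms: float, dt: float = 1.0) -> float:
        """Return a signed adjustment to the number of container replicas."""
        error = response_time_ms - self.setpoint   # positive => too slow
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: response time above the 200 ms setpoint yields a scale-out signal.
scaler = PIDScaler(setpoint_ms=200, kp=0.05, ki=0.01, kd=0.0)
print(scaler.step(response_time_ms=260))  # > 0: add replicas
```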


2021 ◽  
Vol 6 (2) ◽  
pp. 170-182
Author(s):  
Derdus Kenga ◽  
Vincent Omwenga ◽  
Patrick Ogao

The main cause of energy wastage in cloud data centres is the low level of server utilization, a consequence of allocating more resources than applications require. For instance, in Infrastructure as a Service (IaaS) public clouds, cloud service providers (CSPs) deliver computing resources in the form of virtual machine (VM) templates that cloud users have to choose from, and inexperienced users often pick VMs that are bigger than their applications need. To address the problem of inefficient resource utilization, existing approaches focus on VM allocation and migration, which only leads to physical machine (PM)-level optimization; other approaches use horizontal auto-scaling, which is not a viable solution in the case of an IaaS public cloud. In this paper, we propose an approach that customizes the size of a user's VM to match the resource requirements of its application workloads, based on an analysis of real backend traces collected from a VM in a production data centre. In this approach, a VM is given fixed-size resources that match the application's workload demands, and any demand exceeding the fixed allocation is predicted and handled through vertical VM auto-scaling, so that energy consumption by PMs is reduced through efficient resource utilization. Experimental results obtained from a simulation on CloudSim Plus using the GWA-T-13 Materna real backend traces show that data centre energy consumption can be reduced via efficient resource utilization.
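The vertical auto-scaling step the abstract describes might look like the sketch below: resize the VM only when predicted demand exceeds its right-sized baseline allocation. The moving-average predictor and the headroom factor are our assumptions, not the paper's model.

```python
# Sketch of vertical VM auto-scaling around a right-sized baseline.
# The moving-average predictor and headroom factor are assumptions.
from collections import deque

class VerticalScaler:
    def __init__(self, fixed_mips: float, window: int = 12, headroom: float = 1.1):
        self.fixed_mips = fixed_mips        # right-sized baseline allocation
        self.history = deque(maxlen=window) # recent CPU demand samples
        self.headroom = headroom

    def observe(self, cpu_demand_mips: float) -> float:
        """Record demand; return the VM size to use for the next interval."""
        self.history.append(cpu_demand_mips)
        predicted = sum(self.history) / len(self.history)
        if predicted > self.fixed_mips:
            return predicted * self.headroom    # scale up vertically
        return self.fixed_mips                  # stay at the efficient baseline
```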

