Design for Run-Time Monitor on Cloud Computing

Cloud computing is an innovative computing paradigm that can potentially bridge the gap between increasing computing demands in computer-aided engineering (CAE) applications and limited scalability, flexibility, and agility in traditional computing paradigms. In light of the benefits of cloud computing, high-performance computing (HPC) in the cloud has the potential to enable users not only to accelerate computationally expensive CAE simulations (e.g., finite element analysis), but also to reduce costs by utilizing on-demand and scalable cloud computing resources. The objective of this research is to evaluate the performance of running a large finite element simulation in a public cloud. Specifically, an experiment is performed to identify individual and interactive effects of several factors (e.g., CPU core count, memory size, solver computational rate, and input/output rate) on run time using statistical methods. The experimental results show that the performance of HPC in the cloud is sufficient for a large finite element analysis, and that run time can be optimized by properly selecting a configuration of CPU, memory, and interconnect.

MiCADO-Edge: Towards an Application-level Orchestrator for the Cloud-to-Edge Computing Continuum

Journal of Grid Computing ◽ 10.1007/s10723-021-09589-5 ◽ 2021 ◽ Vol 19 (4)

Author(s): Amjad Ullah ◽ Huseyin Dagdeviren ◽ Resmi C. Ariyattu ◽ James DesLauriers ◽ Tamas Kiss ◽ ...

Keyword(s): Cloud Computing ◽ Data Analysis ◽ Video Processing ◽ Time Management ◽ Realistic Case ◽ Healthcare Data ◽ Computing Environments ◽ Management Policies ◽ Run Time ◽ Monitoring Information

Automated deployment and run-time management of microservices-based applications in cloud computing environments is relatively well studied, with several mature solutions. However, managing such applications and tasks in the cloud-to-edge continuum is far from trivial, with no robust, production-level solutions currently available. This paper presents our first attempt to extend an application-level cloud orchestration framework called MiCADO to utilise edge and fog worker nodes. The paper illustrates how MiCADO-Edge can automatically deploy complex sets of interconnected microservices in such multi-layered cloud-to-edge environments. Additionally, it shows how monitoring information can be collected from such services and how complex, user-defined run-time management policies can be enforced on application components running at any layer of the architecture. The implemented solution is demonstrated and evaluated using two realistic case studies from the areas of video processing and secure healthcare data analysis.

Model-Driven Automated Error Recovery in Cloud Computing

Model-Driven Domain Analysis and Software Development ◽ 10.4018/978-1-61692-874-2.ch007 ◽ 2011 ◽ pp. 136-155 ◽ Cited By ~ 3

Author(s): Yu Sun ◽ Jules White ◽ Jeff Gray ◽ Aniruddha Gokhale

Keyword(s): Cloud Computing ◽ Error Detection ◽ Error Recovery ◽ Inference Engine ◽ Innovative Approach ◽ Time Model ◽ Model Driven ◽ Recovery Pattern ◽ On Demand ◽ Run Time

Cloud computing provides a platform that enables users to utilize computation, storage, and other computing resources on demand. As the number of running nodes in the cloud increases, the potential points of failure and the complexity of recovering from error states grow correspondingly. Using the traditional cloud administrative interface to manually detect and recover from errors is tedious, time-consuming, and error-prone. This chapter presents an innovative approach to automate cloud error detection and recovery based on a run-time model that monitors and manages the running nodes in a cloud. When administrators identify and correct errors in the model, an inference engine is used to identify the specific state pattern in the model to which they were reacting, and to record their recovery actions. An error detection and recovery pattern can be generated from the inference and applied automatically whenever the same error occurs again.

Model-Driven Automated Error Recovery in Cloud Computing

Grid and Cloud Computing ◽ 10.4018/978-1-4666-0879-5.ch308 ◽ 2012 ◽ pp. 680-700

Author(s): Yu Sun ◽ Jules White ◽ Jeff Gray ◽ Aniruddha Gokhale

Keyword(s): Cloud Computing ◽ Error Detection ◽ Error Recovery ◽ Inference Engine ◽ Innovative Approach ◽ Time Model ◽ Model Driven ◽ Recovery Pattern ◽ On Demand ◽ Run Time

Cloud computing provides a platform that enables users to utilize computation, storage, and other computing resources on demand. As the number of running nodes in the cloud increases, the potential points of failure and the complexity of recovering from error states grow correspondingly. Using the traditional cloud administrative interface to manually detect and recover from errors is tedious, time-consuming, and error-prone. This chapter presents an innovative approach to automate cloud error detection and recovery based on a run-time model that monitors and manages the running nodes in a cloud. When administrators identify and correct errors in the model, an inference engine is used to identify the specific state pattern in the model to which they were reacting, and to record their recovery actions. An error detection and recovery pattern can be generated from the inference and applied automatically whenever the same error occurs again.
