A Survey of Scheduling and Management Techniques for Data-Intensive Application Workflows

Enterprise Resource Planning ◽ 10.4018/978-1-4666-4153-2.ch066 ◽ 2013 ◽ pp. 1170-1190

Author(s): Suraj Pandey ◽ Rajkumar Buyya

Keyword(s): Performance Optimization ◽ Workflow Management ◽ Huge Amount ◽ Distributed Resources ◽ Grid Systems ◽ Data Intensive ◽ Network Bandwidth ◽ Comprehensive Survey ◽ Management Techniques ◽ Data Intensive Application

This chapter presents a comprehensive survey of algorithms, techniques, and frameworks used for scheduling and managing data-intensive application workflows. Many complex scientific experiments are expressed as workflows for structured, repeatable, controlled, scalable, and automated execution. This chapter focuses on workflows whose tasks process huge amounts of data, typically ranging from hundreds of megabytes to petabytes. Scientists already use Grid systems that schedule these workflows onto globally distributed resources to optimize various objectives: minimizing the total makespan of the workflow, minimizing the cost and usage of network bandwidth, minimizing the cost of computation and storage, meeting the application's deadline, and so forth. This chapter lists and describes the techniques used in each of these systems for processing huge amounts of data. A survey of workflow management techniques is useful for understanding how Grid systems work, and it provides insights into performance optimization of scientific applications dealing with data-intensive workloads.

Enhancing the use of Object Based Contents Addressable Storage

Computing Trendz - The Journal of Emerging Trends in Information Technology ◽ 10.21844/cttjetit.v6i1.6697 ◽ 2016 ◽ Vol 6 (1)

Author(s): Simab Hasan Rizvi

Keyword(s): File System ◽ Application Performance ◽ Shared Storage ◽ Data Intensive ◽ Network Bandwidth ◽ Storage Performance ◽ Efficient Management ◽ Object Based ◽ Data Volume ◽ Potential Benefits

In today's age of Tetra Scale computing, applications have become more data intensive than ever. The increased data volume from applications that now tackle larger and larger problems has fuelled the need for efficient management of this data. In this paper, a technique for managing large volumes of data, called Content Addressable Storage or CAS, is evaluated. The evaluation focuses on the benefits and demerits of using CAS: i) improved application performance via lockless and lightweight synchronization of access to shared storage data, ii) improved cache performance, iii) increased storage capacity, and iv) increased network bandwidth. The presented design of a CAS-based file store significantly improves storage performance and provides lightweight, lockless, user-defined consistency semantics. As a result, this file system shows a 28% increase in read bandwidth and a 13% increase in write bandwidth over a popular file system in common use. The paper also estimates the potential benefits of using CAS for a virtual machine. The study also explains mobility application for active use and public deployment.

A Method of Optimizing Network Topology Structure Combining Viterbi Algorithm and Bayesian Algorithm

Wireless Communications and Mobile Computing ◽ 10.1155/2021/5513349 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-12

Author(s): Xiaoxiao Shi

Keyword(s): Performance Optimization ◽ Viterbi Algorithm ◽ Flow Distribution ◽ Bottom Layer ◽ Bayesian Algorithm ◽ Recommendation Algorithm ◽ Average Value ◽ Network Bandwidth ◽ 5G Network ◽ Overall Performance

With the Internet entering all walks of life, its development and expanding usage demand better performance, especially for 5G networks that adopt the NAS networking mode. Some network bandwidth cannot fully support current network demand, which causes network fluctuations and other problems. This paper proposes a method for optimizing the topology of the bottom layer of the communication network, with outage performance close to that of the optimal routing scheme. Specifically, paths in areas with poor network conditions are first optimized using the Viterbi algorithm. Then, the network element nodes on the path are optimized using a Bayesian recommendation algorithm to achieve reasonable flow distribution. Dual planning with an improved Viterbi algorithm is used to plan the main and standby paths, and a Bayesian recommendation algorithm based on the average value is then used to optimize the network elements. The method thus realizes overall performance optimization efficiently.

Recovery Mutual Scheduling: A Decentralized Approach for Fault Recovery Mechanism in the Grid Computing

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.573.571 ◽ 2014 ◽ Vol 573 ◽ pp. 571-575

Author(s): M.H. Anandbabu ◽ B. Palanichelvam ◽ R. Suganya

Keyword(s): Grid Computing ◽ Grid Service ◽ Fault Recovery ◽ Distributed Resources ◽ Grid Systems ◽ Recovery Mechanism

Grid computing is a major research area in which distributed resources are used. In scheduling, the biggest challenge is to obtain an optimal solution for the jobs submitted to the grid. For large subtasks that require time-intensive computation, this paper introduces a new fault recovery mechanism into grid systems, along with a thorough study of grid services. We propose a new algorithm that takes these factors into account. In the proposed algorithm, Recovery Mutual Scheduling, a catalog is used that is responsible for saving its state periodically. Consequently, the throughput of the system is increased with the decentralized approach.

A Comprehensive Survey on Data-Intensive Computing and MapReduce Paradigm in Cloud Computing Environments

Informatics and Communication Technologies for Societal Development ◽ 10.1007/978-81-322-1916-3_9 ◽ 2014 ◽ pp. 85-93

Author(s): Girish Neelakanta Iyer ◽ Salaja Silas

Keyword(s): Cloud Computing ◽ Data Intensive Computing ◽ Data Intensive ◽ Comprehensive Survey ◽ Computing Environments ◽ Mapreduce Paradigm

Leveraging the Power of the Grid with Opal

Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine, and Healthcare ◽ 10.4018/978-1-60566-374-6.ch028 ◽ 2011 ◽ pp. 552-576

Author(s): Sriram Krishnan ◽ Luca Clementi ◽ Zhaohui Ding ◽ Wilfred Li

Keyword(s): Web Services ◽ User Interfaces ◽ Data Transfer ◽ Command Line ◽ Distributed Resources ◽ Data Staging ◽ Grid Systems ◽ Single Sign On ◽ Basic Set ◽ Application Programming

Grid systems provide mechanisms for single sign-on and uniform APIs for job submission and data transfer, allowing distributed resources to be coupled seamlessly. However, new users face a daunting barrier to entry due to the high cost of deployment and maintenance. They are often required to learn complex concepts related to Grid infrastructures (credential management, scheduling systems, data staging, etc.). Most scientific users want to run their applications with minimal changes and get results faster, without having to know much about how the resources are used. Hence, a higher level of abstraction must be provided for the underlying infrastructure to be used effectively. For this purpose, we have developed the Opal toolkit for exposing applications on Grid resources as simple Web services. Opal provides a basic set of Application Programming Interfaces (APIs) that allows users to execute their deployed applications, query job status, and retrieve results. Opal also provides a mechanism to define command-line arguments and dynamically generates user interfaces for the Web services. In addition, Opal services can be hooked up to a metascheduler such as CSF4 to leverage a distributed set of resources, and can be accessed via a multitude of interfaces such as Web browsers, rich desktop environments, workflow tools, and command-line clients.
