Improved Fair Scheduling Algorithm for Hadoop Clustering

Traditional storage methods are ill-suited to very large volumes of data, because processing that data at later stages becomes tedious. Hadoop is therefore now widely used to store and process large amounts of data. The volume of data generated has grown sharply in recent years, particularly in the last two. Hadoop is a framework that stores and processes data efficiently: it processes data in parallel, and its fault tolerance protects against failures and data loss. Job scheduling is an important process in Hadoop MapReduce. Hadoop comes with three schedulers: FIFO (first in, first out), Fair, and Capacity. Schedulers are now a pluggable component of the Hadoop MapReduce framework. This paper discusses the native job scheduling algorithms in Hadoop. The fair scheduling algorithm is analysed in terms of its response time, throughput, and performance, and its advantages and drawbacks are discussed. An improved fair scheduling algorithm with a new strategy is proposed. Response time, throughput, and performance are compared between the naive and improved fair scheduling algorithms. The improved fair scheduling algorithm is intended for workloads that mix jobs with long and short processing times.
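
As a rough illustration of why fair sharing helps when long and short jobs are mixed, the following sketch compares completion times under FIFO and under an idealised equal-share policy. It is not Hadoop's Fair Scheduler, nor the algorithm proposed in the paper: the function names, the single unit of capacity, and the assumption that all jobs arrive at time zero are simplifications introduced here.

```python
def fifo_completion_times(durations):
    """Run jobs one after another in submission order (assumed FIFO model)."""
    clock = 0
    finished = []
    for d in durations:
        clock += d
        finished.append(clock)
    return finished


def fair_completion_times(durations):
    """Split one unit of capacity equally among all unfinished jobs.

    Idealised processor sharing; all jobs are assumed to arrive at time 0.
    """
    remaining = dict(enumerate(durations))
    finished = [0] * len(durations)
    clock = 0
    while remaining:
        n = len(remaining)
        # Work each active job receives before the smallest one finishes.
        work = min(remaining.values())
        # With n jobs sharing capacity, that work takes n times as long.
        clock += work * n
        for i in list(remaining):
            remaining[i] -= work
            if remaining[i] <= 0:
                finished[i] = clock
                del remaining[i]
    return finished


def mean(values):
    return sum(values) / len(values)
```

With one long job submitted ahead of two short ones (durations 10, 1, 1), FIFO completes them at 10, 11, and 12, while equal sharing completes the short jobs at 3 and the long one at 12, so the mean response time falls from 11 to 6 in this toy case.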

ROA-CONS: Raccoon Optimization for Job Scheduling

Symmetry ◽

10.3390/sym13122270 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2270

Author(s):

Sina Zangbari Koohi ◽

Nor Asilah Wati Abdul Hamid ◽

Mohamed Othman ◽

Gafurjan Ibragimov

Keyword(s):

User Satisfaction ◽

High Performance ◽

Job Scheduling ◽

Response Times ◽

Scheduling Algorithm ◽

Desktop Computer ◽

Network Heterogeneity ◽

Parallel Job Scheduling ◽

Scheduling Method ◽

And Performance

High-performance computing (HPC) combines thousands of processing units to deliver higher computational performance than a typical desktop computer or workstation, in order to solve large problems in science, engineering, or business. The scheduling of these machines has an important impact on their performance. HPC job scheduling aims to develop an operational strategy that utilises resources efficiently and avoids delays. An optimised schedule results in greater efficiency of the parallel machine. Process and network heterogeneity is a further difficulty for the scheduling algorithm, and user fairness is another problem for parallel job scheduling. One of the open issues in this field is providing a balanced schedule that enhances both efficiency and user fairness. This paper proposes ROA-CONS, a new job scheduling method that combines an updated conservative backfilling approach with further optimisation by the raccoon optimisation algorithm. The algorithm also proposes a selection technique that combines job waiting and response time optimisation with user fairness. It contributes to the development of a symmetrical schedule that increases user satisfaction and performance. Simulation is used to assess the effectiveness of the proposed method in comparison with other well-known job scheduling algorithms. The results demonstrate that the proposed strategy offers improved schedules that reduce the overall system’s job waiting and response times.
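
For readers unfamiliar with conservative backfilling, the sketch below shows the plain textbook idea: every job, in queue order, is reserved the earliest start time at which enough processors are free for its whole runtime, so a later job may start early only if it delays no existing reservation. This is not the updated variant or the raccoon optimisation step described in the abstract; the function names, the `(processors, runtime)` job format, exact runtime estimates, and all jobs being queued at time zero are assumptions made here.

```python
def used_at(reservations, t):
    """Processors held by reservations that are active at time t."""
    return sum(p for start, end, p in reservations if start <= t < end)


def fits(reservations, total, start, runtime, procs):
    """True if `procs` processors are free throughout [start, start + runtime)."""
    end = start + runtime
    # Usage can only rise where another reservation starts, so checking the
    # candidate start plus those points covers the whole interval.
    checkpoints = [start] + [s for s, _, _ in reservations if start < s < end]
    return all(used_at(reservations, t) + procs <= total for t in checkpoints)


def conservative_backfill(jobs, total):
    """Reserve each job, in queue order, at its earliest feasible start time.

    `jobs` is a list of (processors, runtime) pairs, each needing at most
    `total` processors. Returns (start, end, processors) per job.
    """
    reservations = []
    for procs, runtime in jobs:
        # A job can only become feasible at time 0 or when a reservation ends.
        candidates = sorted({0} | {end for _, end, _ in reservations})
        start = next(t for t in candidates
                     if fits(reservations, total, t, runtime, procs))
        reservations.append((start, start + runtime, procs))
    return reservations
```

On a 4-processor machine with jobs (2, 4), (4, 2), (2, 3), (2, 6), the third job starts at time 0 alongside the first, ahead of the second job's reservation at time 4 and without delaying it.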

Balanced Job Scheduling Based on Ant Algorithm for Grid Network

International Journal of Grid and High Performance Computing ◽

10.4018/jghpc.2010092803 ◽

2010 ◽

Vol 2 (1) ◽

pp. 34-50 ◽

Cited By ~ 1

Author(s):

Nikolaos Preve

Keyword(s):

Grid Computing ◽

Ant Colony Optimization ◽

Job Scheduling ◽

Scheduling Algorithm ◽

Scheduling Algorithms ◽

Experimental Results ◽

Ant Algorithm ◽

Entire System ◽

Grid Environment ◽

Grid Network

Job scheduling in grid computing is a very important problem. To utilize grids efficiently, we need a good job scheduling algorithm to assign jobs to resources in grids. The main scope of this article is to propose a new Ant Colony Optimization (ACO) algorithm for balanced job scheduling in the Grid environment. To achieve the above goal, we will indicate a way to balance the entire system load while minimizing the makespan of a given set of jobs. Based on the experimental results, the proposed algorithm confidently demonstrates its practicability and competitiveness compared with other job scheduling algorithms.
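
The two objectives named in the abstract, makespan and balanced system load, can be made concrete with a small sketch. The assignment rule used here is a simple greedy baseline (longest job first, onto the least-loaded resource), not the paper's ACO algorithm; the function names and the assumption of identical resources are introduced here for illustration.

```python
import heapq


def greedy_schedule(job_lengths, num_resources):
    """Assign each job, longest first, to the currently least-loaded resource."""
    heap = [(0, r) for r in range(num_resources)]  # (current load, resource id)
    heapq.heapify(heap)
    assignment = {r: [] for r in range(num_resources)}
    for length in sorted(job_lengths, reverse=True):
        load, r = heapq.heappop(heap)
        assignment[r].append(length)
        heapq.heappush(heap, (load + length, r))
    return assignment


def makespan(assignment):
    """Finish time of the busiest resource."""
    return max(sum(jobs) for jobs in assignment.values())


def imbalance(assignment):
    """Gap between the most and least loaded resources (0 means balanced)."""
    loads = [sum(jobs) for jobs in assignment.values()]
    return max(loads) - min(loads)
```

For job lengths 7, 5, 4, 3, 1 on two resources, this baseline yields loads of 10 and 10, so the makespan is 10 and the imbalance is 0. A metaheuristic such as ACO would search over assignments using measures like these as its objective.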

A job scheduling algorithm for delay and performance optimization in fog computing

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.5581 ◽

2019 ◽

Vol 32 (7) ◽

Cited By ~ 2

Author(s):

Bushra Jamil ◽

Mohammad Shojafar ◽

Israr Ahmed ◽

Atta Ullah ◽

Kashif Munir ◽

...

Keyword(s):

Performance Optimization ◽

Job Scheduling ◽

Scheduling Algorithm ◽

Fog Computing ◽

And Performance ◽

Job Scheduling Algorithm

A hybrid approach for scheduling applications in cloud computing environment

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v10i2.pp1387-1397 ◽

2020 ◽

Vol 10 (2) ◽

pp. 1387

Author(s):

Ahmed Subhi Abdalkafor ◽

Khattab M. Ali Alheeti

Keyword(s):

Cloud Computing ◽

Job Scheduling ◽

Positive Impact ◽

Scheduling Algorithm ◽

Hybrid Approach ◽

Scheduling Algorithms ◽

Vital Role ◽

Experimental Result ◽

Computing Performance ◽

Cloud Applications

Cloud computing plays an important role in daily life. It has a direct and positive impact on sharing and updating data, knowledge, storage, and scientific resources between regions. Cloud computing performance depends heavily on the job scheduling algorithms used to manage queue waiting in modern scientific applications, and researchers regard cloud computing as a popular platform for new developments. These scheduling algorithms help in designing efficient queue lists in the cloud and play a vital role in reducing the time jobs wait for processing. This paper proposes a novel job scheduling algorithm to enhance cloud computing performance and reduce the delay of jobs waiting in the queue. The proposed algorithm tries to avoid some significant challenges that hold back the development of cloud computing applications. It is a smart scheduling technique intended to improve processing performance in cloud applications. Experimental results show that the proposed job scheduling algorithm achieves a marked improvement, with a reduction in waiting time for jobs in the queue list.

A Pruning-Based Disk Scheduling Algorithm for Heterogeneous I/O Workloads

The Scientific World Journal ◽

10.1155/2014/940850 ◽

2014 ◽

Vol 2014 ◽

pp. 1-17

Author(s):

Taeseok Kim ◽

Hyokyung Bahn ◽

Youjip Won

Keyword(s):

Response Time ◽

Scheduling Algorithm ◽

Scheduling Algorithms ◽

Disk Scheduling ◽

Best Effort ◽

QoS Guarantees ◽

Time Minimization ◽

On Line ◽

Manageable Level ◽

Deadline Constraints

In heterogeneous I/O workload environments, disk scheduling algorithms should support different QoS (Quality-of-Service) for each I/O request. For example, the algorithm should meet the deadlines of real-time requests and at the same time provide reasonable response time for best-effort requests. This paper presents a novel disk scheduling algorithm called G-SCAN (Grouping-SCAN) for handling heterogeneous I/O workloads. To find a schedule that satisfies the deadline constraints and seek time minimization simultaneously, G-SCAN maintains a series of candidate schedules and expands the schedules whenever a new request arrives. Maintaining these candidate schedules requires excessive spatial and temporal overhead, but G-SCAN reduces the overhead to a manageable level via pruning the state space using two heuristics. One is grouping that clusters adjacent best-effort requests into a single scheduling unit and the other is the branch-and-bound strategy that cuts off inefficient or impractical schedules. Experiments with various synthetic and real-world I/O workloads show that G-SCAN outperforms existing disk scheduling algorithms significantly in terms of the average response time, throughput, and QoS-guarantees for heterogeneous I/O workloads. We also show that the overhead of G-SCAN is reasonable for on-line execution.
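
The grouping heuristic mentioned above, which clusters adjacent best-effort requests into a single scheduling unit, might look roughly like the sketch below. The abstract does not give G-SCAN's actual adjacency criterion, so the `max_gap` threshold, the use of plain integer disk positions, and the function name are assumptions made here for illustration.

```python
def group_adjacent(positions, max_gap):
    """Cluster request positions so neighbours within `max_gap` share a group.

    Each returned group could then be treated as one unit when candidate
    schedules are expanded, which shrinks the number of schedules to track.
    """
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] <= max_gap:
            groups[-1].append(pos)  # close enough to the previous request
        else:
            groups.append([pos])    # start a new scheduling unit
    return groups
```

For example, positions 100, 12, 10, 105, 300, 14 with a gap of 5 collapse from six requests into three units: [10, 12, 14], [100, 105], and [300].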

Preemptive Job Scheduling Policy for Distributively-Owned Workstation Clusters

Parallel Processing Letters ◽

10.1142/s0129626404001866 ◽

2004 ◽

Vol 14 (02) ◽

pp. 255-270 ◽

Cited By ~ 2

Author(s):

Jemal H. Abawajy

Keyword(s):

Parallel Processing ◽

Response Time ◽

Cluster Computing ◽

Job Scheduling ◽

Response Times ◽

Scheduling Algorithm ◽

Cost Effective ◽

Opportunistic Scheduling ◽

Workstation Clusters ◽

Scheduling Policy

Cluster computing has come to prominence as a cost-effective parallel processing tool for solving many complex computational problems. In this paper, we propose a new timesharing opportunistic scheduling policy to support remote batch job executions over networked clusters, to be used in conjunction with the Condor Up-Down scheduling algorithm. We show that timesharing approaches can be used in an opportunistic setting to improve both mean job slowdowns and mean response times with little or no throughput reduction. We also show that the proposed algorithm achieves significant improvement in job response time and slowdown compared to existing approaches and some recently proposed approaches.

Research on Information Processing with Delay Fair Scheduling Algorithm Based on Resource Situation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1046.363 ◽

2014 ◽

Vol 1046 ◽

pp. 363-366 ◽

Cited By ~ 1

Author(s):

Xue Ying Jiang ◽

Yang Yang ◽

Di Jing

Keyword(s):

Information Processing ◽

Job Scheduling ◽

Scheduling Algorithm ◽

Experimental Results ◽

Original Algorithm ◽

Fair Scheduling ◽

Key Technologies ◽

Hadoop Platform ◽

Run Time ◽

Resource Situation

The job scheduling algorithm is one of the key technologies of the Hadoop platform. This paper proposes a delay fair scheduling algorithm based on the resource situation. The algorithm periodically obtains the usage status of cluster resources to determine the wait timeout, which makes the timeout dynamic. Experimental results show that, compared with the original algorithm, the Delay Fair Scheduling Algorithm Based on the Resource Situation reduces job run time to some extent and improves the throughput of the Hadoop platform.

Balanced Job Scheduling Based on Ant Algorithm for Grid Network

Evolving Developments in Grid and Cloud Computing ◽

10.4018/978-1-4666-0056-0.ch002 ◽

2012 ◽

pp. 13-30

Author(s):

Nikolaos Preve

Keyword(s):

Grid Computing ◽

Ant Colony Optimization ◽

Job Scheduling ◽

Scheduling Algorithm ◽

Scheduling Algorithms ◽

Experimental Results ◽

Ant Algorithm ◽

Entire System ◽

System Load ◽

Grid Network

Job scheduling in grid computing is a very important problem. To utilize grids efficiently, we need a good job scheduling algorithm to assign jobs to resources in grids. The main scope of this paper is to propose a new Ant Colony Optimization (ACO) algorithm for balanced job scheduling in the Grid environment. To achieve the above goal, we will indicate a way to balance the entire system load while minimizing the makespan of a given set of jobs. Based on the experimental results, the proposed algorithm confidently demonstrates its practicability and competitiveness compared with other job scheduling algorithms.

Balanced Job Scheduling Based on Ant Algorithm for Grid Network

Grid and Cloud Computing ◽

10.4018/978-1-4666-0879-5.ch506 ◽

2012 ◽

pp. 1114-1131

Author(s):

Nikolaos Preve

Keyword(s):

Grid Computing ◽

Ant Colony Optimization ◽

Job Scheduling ◽

Scheduling Algorithm ◽

Scheduling Algorithms ◽

Experimental Results ◽

Ant Algorithm ◽

Entire System ◽

System Load ◽

Grid Network

Job scheduling in grid computing is a very important problem. To utilize grids efficiently, we need a good job scheduling algorithm to assign jobs to resources in grids. The main scope of this paper is to propose a new Ant Colony Optimization (ACO) algorithm for balanced job scheduling in the Grid environment. To achieve the above goal, we will indicate a way to balance the entire system load while minimizing the makespan of a given set of jobs. Based on the experimental results, the proposed algorithm confidently demonstrates its practicability and competitiveness compared with other job scheduling algorithms.

Performance Improvement of DAG-Aware Task Scheduling Algorithms with Efficient Cache Management in Spark

Electronics ◽

10.3390/electronics10161874 ◽

2021 ◽

Vol 10 (16) ◽

pp. 1874

Author(s):

Yao Zhao ◽

Jian Dong ◽

Hongwei Liu ◽

Jin Wu ◽

Yanxin Liu

Keyword(s):

Task Scheduling ◽

Scheduling Algorithm ◽

Poor Performance ◽

Scheduling Algorithms ◽

Vital Role ◽

Management Policy ◽

Cache Management ◽

Performance Improvements ◽

Data Parallel ◽

And Performance

Directed acyclic graph (DAG)-aware task scheduling algorithms have been studied extensively in recent years, and these algorithms have achieved significant performance improvements in data-parallel analytic platforms. However, current DAG-aware task scheduling algorithms, among which HEFT and GRAPHENE are notable, pay little attention to the cache management policy, which plays a vital role in in-memory data-parallel systems such as Spark. Cache management policies that are designed for Spark exhibit poor performance in DAG-aware task scheduling algorithms, which leads to cache misses and performance degradation. In this study, we propose a new cache management policy known as Long-Running Stage Set First (LSF), which makes full use of the task dependencies to optimize the cache management performance in DAG-aware scheduling algorithms. LSF calculates the caching and prefetching priorities of resilient distributed datasets according to their unprocessed workloads and significance in parallel scheduling, which are key factors in DAG-aware scheduling algorithms. Moreover, we present a cache-aware task scheduling algorithm based on LSF to reduce the resource fragmentation in computing. Experiments demonstrate that, compared to DAG-aware scheduling algorithms with LRU and MRD, the same algorithms with LSF improve the job completion time (JCT) by up to 42% and 30%, respectively. The proposed cache-aware scheduling algorithm also exhibits about a 12% reduction in the average job completion time compared to GRAPHENE with LSF.
