Improved Fair Scheduling Algorithm for Hadoop Clustering

2017 ◽  
Vol 10 (1) ◽  
pp. 194-200 ◽  
Author(s):  
Sneha Sneha ◽  
Shoney Sebastian

Storing such huge amounts of data in the traditional way is inconvenient, because processing the data at later stages is very tedious. Hadoop is therefore now used to store and process large amounts of data. Statistics on data generated in recent years show that the volume produced in the last two years is very high. Hadoop is a good framework for storing and processing data efficiently. It processes data in parallel, and its fault tolerance protects against failures and data loss. Job scheduling is an important process in Hadoop MapReduce. Hadoop comes with three types of schedulers: FIFO (first in, first out), Fair, and Capacity. The schedulers are now a pluggable component of the Hadoop MapReduce framework. This paper discusses the native job scheduling algorithms in Hadoop. The fair scheduling algorithm is analysed with respect to its response time, throughput, and performance, and its advantages and drawbacks are discussed. An improved fair scheduling algorithm with a new strategy is proposed. Response time, throughput, and performance are compared between naive fair scheduling and improved fair scheduling. The improved fair scheduling algorithm is intended for cases where there are jobs with both long and short processing times.
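As background for the fair scheduling idea mentioned in this abstract, the following is a minimal sketch of max-min fair sharing, the general principle behind fair schedulers. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function name `max_min_fair_share`, the slot-based demand model, and the tolerance `1e-9` are assumptions made for illustration.

```python
def max_min_fair_share(demands, capacity):
    """Split `capacity` slots among jobs by max-min fairness.

    demands: dict mapping job name -> number of slots the job wants.
    Returns a dict mapping job name -> slots granted (floats).
    Hypothetical helper for illustration; not a Hadoop API.
    """
    allocation = {job: 0.0 for job in demands}
    remaining = float(capacity)
    # Jobs whose demand has not yet been fully met.
    unsatisfied = {job for job, want in demands.items() if want > 0}
    while unsatisfied and remaining > 1e-9:
        # Offer every unsatisfied job an equal share of what is left.
        share = remaining / len(unsatisfied)
        for job in sorted(unsatisfied):
            grant = min(share, demands[job] - allocation[job])
            allocation[job] += grant
            remaining -= grant
        # Capacity left unused by small jobs is redistributed next round.
        unsatisfied = {
            job for job in unsatisfied
            if demands[job] - allocation[job] > 1e-9
        }
    return allocation


shares = max_min_fair_share({"a": 2, "b": 4, "c": 10}, capacity=12)
```

In this example the two small jobs receive their full demand and the large job receives whatever capacity remains, which is why jobs with short processing times are not starved by long ones under fair sharing.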

Symmetry ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2270
Author(s):  
Sina Zangbari Koohi ◽  
Nor Asilah Wati Abdul Hamid ◽  
Mohamed Othman ◽  
Gafurjan Ibragimov

High-performance computing (HPC) combines thousands of processing units to deliver higher-performance computation than a typical desktop computer or workstation, in order to solve large problems in science, engineering, or business. The scheduling of these machines has an important impact on their performance. HPC job scheduling aims to develop an operational strategy that utilises resources efficiently and avoids delays. An optimised schedule results in greater efficiency of the parallel machine. Process and network heterogeneity is a further difficulty for the scheduling algorithm, and user fairness is another problem in parallel job scheduling. One of the issues in this field of study is providing a balanced schedule that enhances both efficiency and user fairness. This paper proposes ROA-CONS, a new job scheduling method that combines an updated conservative backfilling approach with further optimisation by the raccoon optimisation algorithm. The method also proposes a selection technique that combines job waiting and response time optimisation with user fairness. It contributes to the development of a symmetrical schedule that increases user satisfaction and performance. Simulations assess the effectiveness of the proposed method in comparison with other well-known job scheduling algorithms. The results demonstrate that the proposed strategy produces improved schedules that reduce the overall system's job waiting and response times.


2010 ◽  
Vol 2 (1) ◽  
pp. 34-50 ◽  
Author(s):  
Nikolaos Preve

Job scheduling in grid computing is a very important problem. To utilize grids efficiently, we need a good job scheduling algorithm to assign jobs to resources in grids. The main scope of this article is to propose a new Ant Colony Optimization (ACO) algorithm for balanced job scheduling in the Grid environment. To achieve the above goal, we will indicate a way to balance the entire system load while minimizing the makespan of a given set of jobs. Based on the experimental results, the proposed algorithm confidently demonstrates its practicability and competitiveness compared with other job scheduling algorithms.


Author(s):  
Bushra Jamil ◽  
Mohammad Shojafar ◽  
Israr Ahmed ◽  
Atta Ullah ◽  
Kashif Munir ◽  
...  

Author(s):  
Ahmed Subhi Abdalkafor ◽  
Khattab M. Ali Alheeti

Cloud computing plays an important role in daily life. It has a direct and positive impact on sharing and updating data, knowledge, storage, and scientific resources across regions. Cloud computing performance depends heavily on the job scheduling algorithms used to manage waiting queues in modern scientific applications. Researchers consider cloud computing a popular platform for new enforcements. These scheduling algorithms help design efficient queue lists in the cloud, and they play a vital role in reducing the time jobs wait for processing. A novel job scheduling algorithm is proposed in this paper to enhance the performance of cloud computing and reduce the delay of jobs waiting in the queue. The proposed algorithm tries to avoid some significant challenges that hold back the development of cloud computing applications. To this end, a smart scheduling technique is proposed to improve processing performance in cloud applications. Our experimental results show that the proposed scheme achieves outstanding improvement rates, with a reduction in the waiting time of jobs in the queue list.


2014 ◽  
Vol 2014 ◽  
pp. 1-17
Author(s):  
Taeseok Kim ◽  
Hyokyung Bahn ◽  
Youjip Won

In heterogeneous I/O workload environments, disk scheduling algorithms should support different QoS (Quality-of-Service) for each I/O request. For example, the algorithm should meet the deadlines of real-time requests and at the same time provide reasonable response time for best-effort requests. This paper presents a novel disk scheduling algorithm called G-SCAN (Grouping-SCAN) for handling heterogeneous I/O workloads. To find a schedule that satisfies the deadline constraints and seek time minimization simultaneously, G-SCAN maintains a series of candidate schedules and expands the schedules whenever a new request arrives. Maintaining these candidate schedules requires excessive spatial and temporal overhead, but G-SCAN reduces the overhead to a manageable level via pruning the state space using two heuristics. One is grouping that clusters adjacent best-effort requests into a single scheduling unit and the other is the branch-and-bound strategy that cuts off inefficient or impractical schedules. Experiments with various synthetic and real-world I/O workloads show that G-SCAN outperforms existing disk scheduling algorithms significantly in terms of the average response time, throughput, and QoS-guarantees for heterogeneous I/O workloads. We also show that the overhead of G-SCAN is reasonable for on-line execution.
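The grouping heuristic described in this abstract, which clusters adjacent best-effort requests into a single scheduling unit, can be sketched roughly as follows. The abstract does not define "adjacent", so the distance threshold `max_gap`, the use of plain integer block positions, and the function name `group_adjacent` are all assumptions for illustration, not G-SCAN's actual rule.

```python
def group_adjacent(positions, max_gap):
    """Cluster request positions whose neighbours lie within `max_gap`.

    positions: iterable of integer disk positions of best-effort requests.
    max_gap:   hypothetical threshold deciding when two requests count as
               adjacent (the paper's real criterion is not stated here).
    Returns a list of groups, each a sorted list of positions.
    """
    groups = []
    for pos in sorted(positions):
        # Extend the current group while the gap to its last member is small.
        if groups and pos - groups[-1][-1] <= max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


units = group_adjacent([10, 12, 50, 11, 53, 200], max_gap=5)
```

Treating each group as one unit means a scheduler has fewer items to order, which is the kind of state-space reduction the abstract attributes to grouping.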


2004 ◽  
Vol 14 (02) ◽  
pp. 255-270 ◽  
Author(s):  
JEMAL H. ABAWAJY

Cluster computing has come to prominence as a cost-effective parallel processing tool for solving many complex computational problems. In this paper, we propose a new timesharing opportunistic scheduling policy to support remote batch job executions over networked clusters to be used in conjunction with the Condor Up-Down scheduling algorithm. We show that timesharing approaches can be used in an opportunistic setting to improve both mean job slowdowns and mean response times with little or no throughput reduction. We also show that the proposed algorithm achieves significant improvement in job response time and slowdown as compared to existing approaches and some recently proposed new approaches.


2014 ◽  
Vol 1046 ◽  
pp. 363-366 ◽  
Author(s):  
Xue Ying Jiang ◽  
Yang Yang ◽  
Di Jing

The job scheduling algorithm is one of the key technologies of the Hadoop platform. This paper proposes a delay fair scheduling algorithm based on the resource situation. The algorithm periodically obtains the usage status of cluster resources to determine the wait timeout, which makes it dynamic. Experimental results show that, compared with the original algorithm, the Delay Fair Scheduling Algorithm Based on the Resource Situation reduces job run time to some extent and improves the throughput of the Hadoop platform.
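The role of the wait timeout in delay scheduling can be illustrated with a minimal sketch. It shows only the generic decision a delay scheduler makes when a slot frees up. How the paper derives the timeout from cluster resource status is not stated in the abstract, so `wait_timeout` is simply passed in. The function name `delay_schedule` and the returned labels are hypothetical.

```python
def delay_schedule(job_has_local_task, waited_seconds, wait_timeout):
    """Decide what to do with a job when a slot becomes free on a node.

    job_has_local_task: True if the job has a task whose input data is
                        stored on the node offering the slot.
    waited_seconds:     how long the job has been skipped so far.
    wait_timeout:       maximum time to hold out for a data-local slot
                        (assumed to be supplied by the caller).
    """
    if job_has_local_task:
        # Data locality is available, so run the task here.
        return "launch_local"
    if waited_seconds >= wait_timeout:
        # The job has waited long enough; accept a non-local slot.
        return "launch_non_local"
    # Keep waiting and let the next job in line use this slot.
    return "skip"


decision = delay_schedule(False, waited_seconds=3.0, wait_timeout=5.0)
```

A dynamic scheme in the spirit of the abstract would recompute `wait_timeout` periodically, while the decision logic itself stays the same.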


2012 ◽  
pp. 1114-1131
Author(s):  
Nikolaos Preve

Job scheduling in grid computing is a very important problem. To utilize grids efficiently, we need a good job scheduling algorithm to assign jobs to resources in grids. The main scope of this paper is to propose a new Ant Colony Optimization (ACO) algorithm for balanced job scheduling in the Grid environment. To achieve the above goal, we will indicate a way to balance the entire system load while minimizing the makespan of a given set of jobs. Based on the experimental results, the proposed algorithm confidently demonstrates its practicability and competitiveness compared with other job scheduling algorithms.


Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1874
Author(s):  
Yao Zhao ◽  
Jian Dong ◽  
Hongwei Liu ◽  
Jin Wu ◽  
Yanxin Liu

Directed acyclic graph (DAG)-aware task scheduling algorithms have been studied extensively in recent years, and these algorithms have achieved significant performance improvements in data-parallel analytic platforms. However, current DAG-aware task scheduling algorithms, among which HEFT and GRAPHENE are notable, pay little attention to the cache management policy, which plays a vital role in in-memory data-parallel systems such as Spark. Cache management policies that are designed for Spark exhibit poor performance in DAG-aware task-scheduling algorithms, which leads to cache misses and performance degradation. In this study, we propose a new cache management policy known as Long-Running Stage Set First (LSF), which makes full use of the task dependencies to optimize the cache management performance in DAG-aware scheduling algorithms. LSF calculates the caching and prefetching priorities of resilient distributed datasets according to their unprocessed workloads and significance in parallel scheduling, which are key factors in DAG-aware scheduling algorithms. Moreover, we present a cache-aware task scheduling algorithm based on LSF to reduce the resource fragmentation in computing. Experiments demonstrate that, compared to DAG-aware scheduling algorithms with LRU and MRD, the same algorithms with LSF improve the JCT by up to 42% and 30%, respectively. The proposed cache-aware scheduling algorithm also exhibits about 12% reduction in the average job completion time compared to GRAPHENE with LSF.

