Survey on Map Reduce Scheduling Algorithms in Hadoop Heterogeneous Environments

Purpose – This paper presents a solution that enables organizations to monitor and analyse the performance of their business processes by means of Big Data technology. Business process improvement can drastically influence the profit of corporations and help them remain viable. However, traditional Business Intelligence systems are not sufficient to meet today's business needs. They are normally business domain-specific and have not been sufficiently process-aware to support process improvement activities, especially on large and complex supply chains, where this entails integrating, monitoring and analysing a vast amount of dispersed, unstructured event logs produced in a variety of heterogeneous environments. This paper tackles this variability by devising different Big-Data-based approaches that aim to gain visibility into process performance.

Design/methodology/approach – The authors present a cloud-based solution that leverages Big Data (BD) technology to provide essential insights into business process improvement. The proposed solution is aimed at measuring and improving overall business performance, especially in very large and complex cross-organisational business processes, where this type of visibility is hard to achieve across heterogeneous systems.

Findings – Three different BD approaches have been undertaken based on Hadoop and HBase. First, we introduced a map-reduce approach that is suitable for batch processing and is highly scalable. Second, we described an alternative solution that integrates the proposed system with Impala. This approach offers significant improvements over map-reduce, as it focuses on performing real-time queries over HBase. Finally, the use of secondary indexes has also been proposed to enable immediate access to event instances for correlation, at the cost of high storage duplication and synchronization issues. This approach has produced remarkable results in two real functional environments presented in the paper.

Originality/value – The value of the contribution lies in the comparison and integration of software packages towards an integrated solution intended for adoption by industry. In addition, the authors illustrate the deployment of the architecture in two different settings.

Comparative Analysis of Map Reduce Scheduling Algorithms

Journal of Advanced Research in Dynamical and Control Systems ◽

10.5373/jardcs/v11sp10/20192773 ◽

2019 ◽

Vol 11 (10-SPECIAL ISSUE) ◽

pp. 20-31

Author(s):

Sonia Sharma ◽

Dr. Parag Jain

Keyword(s):

Comparative Analysis ◽

Scheduling Algorithms ◽

Map Reduce

Evaluation Performance of Task Scheduling Algorithms in Heterogeneous Environments

International Journal of Computer Applications ◽

10.5120/ijca2016908968 ◽

2016 ◽

Vol 138 (8) ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Hadi Yazdanpanah ◽

Amin Shouraki ◽

Najmeh Jamali

Keyword(s):

Task Scheduling ◽

Scheduling Algorithms ◽

Heterogeneous Environments ◽

Evaluation Performance

Performance of Partitioned Homogeneous Multiprocessor Real-Time Scheduling Algorithms in Heterogeneous Environments

2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS) ◽

10.1109/bigdatasecurity-hpsc-ids.2016.64 ◽

2016 ◽

Author(s):

Andrew Burke

Keyword(s):

Real Time ◽

Scheduling Algorithms ◽

Heterogeneous Environments ◽

Real Time Scheduling ◽

Time Scheduling

Evaluating map reduce tasks scheduling algorithms over cloud computing infrastructure

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.3595 ◽

2015 ◽

Vol 27 (18) ◽

pp. 5686-5699 ◽

Cited By ~ 10

Author(s):

Qutaibah Althebyan ◽

Yaser Jararweh ◽

Qussai Yaseen ◽

Omar AlQudah ◽

Mahmoud Al-Ayyoub

Keyword(s):

Cloud Computing ◽

Scheduling Algorithms ◽

Map Reduce ◽

Tasks Scheduling ◽

Computing Infrastructure

A brief review of scheduling algorithms of Map Reduce model using Hadoop

International Journal of Engineering Trends and Technology ◽

10.14445/22315381/ijett-v45p209 ◽

2017 ◽

Vol 45 (1) ◽

pp. 37-42

Author(s):

Adhishtha Tyagi ◽

Sonia Sharma

Keyword(s):

Scheduling Algorithms ◽

Map Reduce

Evolutionary Inheritance in Workflow Scheduling Algorithms within Dynamically Changing Heterogeneous Environments

Proceedings of the International Conference on Evolutionary Computation Theory and Applications ◽

10.5220/0005035201600168 ◽

2014 ◽

Cited By ~ 1

Author(s):

Nikolay Butakov ◽

Denis Nasonov ◽

Alexander Boukhanovsky

Keyword(s):

Scheduling Algorithms ◽

Workflow Scheduling ◽

Heterogeneous Environments

Improved fair Scheduling Algorithm for Hadoop Clustering

Oriental journal of computer science and technology ◽

10.13005/ojcst/10.01.26 ◽

2017 ◽

Vol 10 (1) ◽

pp. 194-200 ◽

Cited By ~ 1

Author(s):

Sneha Sneha ◽

Shoney Sebastian

Keyword(s):

Response Time ◽

Job Scheduling ◽

Scheduling Algorithm ◽

Scheduling Algorithms ◽

Map Reduce ◽

Process Data ◽

Strategy Analysis ◽

Fair Scheduling ◽

New Strategy ◽

And Performance

Traditional ways of storing such huge amounts of data are not convenient, because processing the data at later stages is very tedious. Hadoop is therefore now used to store and process large amounts of data. Statistics on data generation show that the volume produced in the last two years is very high. Hadoop is a good framework for storing and processing data efficiently: it processes data in parallel, and its fault tolerance protects against failures and data loss. Job scheduling is an important process in Hadoop Map Reduce. Hadoop comes with three schedulers: FIFO (first in, first out), Fair and Capacity. Schedulers are now a pluggable component of the Hadoop Map Reduce framework. This paper discusses the native job scheduling algorithms in Hadoop. The fair scheduling algorithm is analysed with respect to its response time, throughput and performance, and its advantages and drawbacks are discussed. An improved fair scheduling algorithm based on a new strategy is proposed. Response time, throughput and performance are compared for the naive and the improved fair scheduling algorithms. The improved fair scheduling algorithm is intended for workloads that mix jobs with long and short processing times.
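The difference between FIFO and fair sharing on a workload that mixes long and short jobs can be illustrated with a toy model. The sketch below is not Hadoop's Fair Scheduler and not the improved algorithm proposed in the paper; it assumes a single unit of cluster capacity, all jobs submitted at time 0, and idealized equal sharing among unfinished jobs. The function names are hypothetical.

```python
def fifo_completion_times(durations):
    """Run jobs one at a time, in submission order (assumed FIFO model)."""
    finished, clock = [], 0.0
    for duration in durations:
        clock += duration
        finished.append(clock)
    return finished


def fair_completion_times(durations):
    """Split capacity equally among all unfinished jobs (idealized fair sharing)."""
    remaining = {job: float(d) for job, d in enumerate(durations)}
    finished = [0.0] * len(durations)
    clock = 0.0
    while remaining:
        active = len(remaining)
        # The job with the least remaining work is the next one to finish.
        shortest = min(remaining.values())
        # Each job progresses at rate 1/active, so this step takes shortest * active.
        clock += shortest * active
        for job in list(remaining):
            remaining[job] -= shortest
            if remaining[job] <= 1e-12:
                finished[job] = clock
                del remaining[job]
    return finished


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    jobs = [10, 1, 1]  # one long job submitted ahead of two short ones
    print("FIFO:", fifo_completion_times(jobs), "mean", mean(fifo_completion_times(jobs)))
    print("Fair:", fair_completion_times(jobs), "mean", mean(fair_completion_times(jobs)))
```

In this toy workload the short jobs no longer wait behind the long one under fair sharing, so mean completion time drops, while the long job itself finishes slightly later. Real clusters have many slots, staggered arrivals and data locality constraints that this model ignores.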
