Survey on Map Reduce Scheduling Algorithms in Hadoop Heterogeneous Environments

Author(s):  
R. Nirmalan ◽  
K. Gokulakrishnan
2015 ◽  
Vol 22 (4) ◽  
pp. 215-228 ◽  
Author(s):  
Alejandro Vera-Baquero ◽  
Ricardo Colomo Palacios ◽  
Vladimir Stantchev ◽  
Owen Molloy

Purpose – This paper aims to present a solution that enables organizations to monitor and analyse the performance of their business processes by means of Big Data technology. Business process improvement can drastically influence the profit of corporations and helps them to remain viable. However, traditional Business Intelligence systems are not sufficient to meet today's business needs. They are normally business domain-specific and have not been sufficiently process-aware to support process improvement activities, especially on large and complex supply chains, where this entails integrating, monitoring and analysing a vast amount of dispersed, unstructured event logs produced in a variety of heterogeneous environments. This paper tackles this variability by devising different Big-Data-based approaches that aim to gain visibility into process performance. Design/methodology/approach – The authors present a cloud-based solution that leverages Big Data (BD) technology to provide essential insights into business process improvement. The proposed solution is aimed at measuring and improving overall business performance, especially in very large and complex cross-organisational business processes, where this type of visibility is hard to achieve across heterogeneous systems. Findings – Three different BD approaches have been undertaken based on Hadoop and HBase. First, we introduced a map-reduce approach that is suitable for batch processing and offers very high scalability. Second, we described an alternative solution that integrates the proposed system with Impala. This approach offers significant improvements over map-reduce, as it is focused on performing real-time queries over HBase. Finally, the use of secondary indexes has also been proposed with the aim of enabling immediate access to event instances for correlation, at the cost of high storage duplication and synchronization issues. This approach has produced remarkable results in two real functional environments presented in the paper. Originality/value – The value of the contribution lies in the comparison and integration of software packages towards an integrated solution intended for adoption by industry. In addition, the authors illustrate the deployment of the architecture in two different settings.


2015 ◽  
Vol 27 (18) ◽  
pp. 5686-5699 ◽  
Author(s):  
Qutaibah Althebyan ◽  
Yaser Jararweh ◽  
Qussai Yaseen ◽  
Omar AlQudah ◽  
Mahmoud Al-Ayyoub

2017 ◽  
Vol 10 (1) ◽  
pp. 194-200 ◽  
Author(s):  
Sneha Sneha ◽  
Shoney Sebastian

Traditional ways of storing huge amounts of data are not convenient, because processing that data at later stages is very tedious. Hadoop is therefore now used to store and process large amounts of data. Statistics on data generation in recent years show that the volume produced in the last two years is very high. Hadoop is a good framework for storing and processing data efficiently. It processes data in parallel, and its fault tolerance protects against failure and data loss. Job scheduling is an important process in Hadoop Map Reduce. Hadoop comes with three types of schedulers, namely FIFO (first in, first out), Fair and Capacity. The schedulers are now a pluggable component in the Hadoop Map Reduce framework. This paper discusses the native job scheduling algorithms in Hadoop. The fair scheduling algorithm is analysed with respect to its response time, throughput and performance, and its advantages and drawbacks are discussed. An improvised fair scheduling algorithm is proposed with a new strategy. Response time, throughput and performance are compared between naive fair scheduling and improvised fair scheduling. The improvised fair scheduling algorithm is intended for cases where jobs with both long and short processing times are present.
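The response-time difference between FIFO and fair sharing mentioned in the abstract above can be illustrated with a small idealised model. The sketch below is not the Hadoop Fair Scheduler and not the paper's improvised algorithm. It assumes a single unit of cluster capacity, that all jobs arrive at time zero, and that fair sharing splits capacity equally among unfinished jobs. The function names and the example job lengths are hypothetical.

```python
def fifo_completion_times(durations):
    """Completion time of each job when jobs run one at a time in arrival order."""
    t = 0.0
    done = []
    for d in durations:
        t += d
        done.append(t)
    return done


def fair_completion_times(durations):
    """Completion time of each job when all unfinished jobs share capacity equally.

    Idealised model (assumption): with n unfinished jobs, each one progresses
    at rate 1/n until the shortest remaining job finishes.
    """
    remaining = {i: float(d) for i, d in enumerate(durations)}
    done = [0.0] * len(durations)
    t = 0.0
    while remaining:
        n = len(remaining)
        shortest = min(remaining.values())
        # Every active job needs `shortest` more work; at rate 1/n this takes shortest * n.
        t += shortest * n
        for i in list(remaining):
            remaining[i] -= shortest
            if remaining[i] <= 1e-12:
                done[i] = t
                del remaining[i]
    return done


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    jobs = [10, 1, 1]  # one long job submitted ahead of two short ones (hypothetical)
    print("FIFO mean response time:", mean(fifo_completion_times(jobs)))
    print("Fair mean response time:", mean(fair_completion_times(jobs)))
```

In this toy workload the short jobs no longer wait behind the long one under fair sharing, so the mean response time drops, while the long job finishes no later than it would at the end of a FIFO queue.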


2014 ◽  
Vol E97.B (7) ◽  
pp. 1474-1482 ◽  
Author(s):  
Takayoshi IWATA ◽  
Hiroyuki MIYAZAKI ◽  
Fumiyuki ADACHI
