Job Scheduling in Big Data-A Survey

Abstract: Scheduling for Big Data applications has become an active research area in the last three years. The Hadoop framework has become one of the most popular and widely used frameworks for distributed data processing. Hadoop is also open source software that allows users to utilize hardware effectively. The scheduling algorithms for the MapReduce model in Hadoop vary in design and behavior, and address issues such as data locality, resource awareness, energy, and time. This paper gives an outline of job scheduling, a classification of schedulers, and a comparison of existing algorithms with their advantages, drawbacks, and limitations. It also discusses various tools and frameworks used for monitoring and ways to improve MapReduce performance. The paper aims to help beginners and researchers understand the scheduling mechanisms used in Big Data.
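To make the data locality issue mentioned in the abstract concrete, the following is a minimal sketch of a locality-preferring task pick. It is not taken from the surveyed paper and is not Hadoop's scheduler code. The names `Task`, `block_hosts`, and `pick_task` are hypothetical, and the policy (prefer the first queued task whose input block is stored on the free node, otherwise fall back to FIFO order) is an assumed simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    # Hypothetical task record: a name plus the nodes that hold a replica
    # of the task's input block.
    name: str
    block_hosts: frozenset


def pick_task(queue: List[Task], free_node: str) -> Optional[Task]:
    """Pick a task for a node that has just freed a slot.

    Assumed policy: take the first queued task whose input data is local
    to free_node; if none is local, take the head of the queue (plain FIFO).
    """
    for index, task in enumerate(queue):
        if free_node in task.block_hosts:
            return queue.pop(index)  # data-local task found
    return queue.pop(0) if queue else None  # no local task: FIFO fallback


if __name__ == "__main__":
    pending = [
        Task("t1", frozenset({"node-a"})),
        Task("t2", frozenset({"node-b"})),
        Task("t3", frozenset({"node-c"})),
    ]
    print(pick_task(pending, "node-b").name)  # local match skips ahead of t1
    print(pick_task(pending, "node-z").name)  # no local data, head of queue
```

The trade-off in this sketch is the usual one: running a task where its data lives avoids a network transfer, but letting local tasks jump the queue can delay the task at the head.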

A survey on job scheduling algorithms in Big data processing

2015 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT) ◽

10.1109/icecct.2015.7226035 ◽

2015 ◽

Cited By ~ 13

Author(s):

Jyoti V Gautam ◽

Harshadkumar B Prajapati ◽

Vipul K Dabhi ◽

Sanjay Chaudhary

Keyword(s):

Big Data ◽

Data Processing ◽

Job Scheduling ◽

Scheduling Algorithms ◽

Big Data Processing

A Survey on Big Data Management and Job Scheduling

International Journal of Computer Applications ◽

10.5120/ijca2015907161 ◽

2015 ◽

Vol 130 (13) ◽

pp. 41-49 ◽

Cited By ~ 1

Author(s):

Sreedhar C. ◽

N. Kasiviswanath ◽

P. Chenna

Keyword(s):

Big Data ◽

Data Management ◽

Job Scheduling

Improved heuristic job scheduling method to enhance throughput for big data analytics

Tsinghua Science & Technology ◽

10.26599/tst.2020.9010047 ◽

2022 ◽

Vol 27 (2) ◽

pp. 344-357

Author(s):

Zhiyao Hu ◽

Dongsheng Li

Keyword(s):

Big Data ◽

Data Analytics ◽

Job Scheduling ◽

Big Data Analytics ◽

Scheduling Method

An Investigation Study on QoS and Traffic Aware Job Scheduling Techniques with Big Data

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v5i11.17 ◽

2017 ◽

Vol 5 (11) ◽

pp. 1-7

Author(s):

C.R. Durga Devi ◽

R. Manicka Chezian

Keyword(s):

Big Data ◽

Job Scheduling

Big Data Hadoop MapReduce Job Scheduling: A Short Survey

Advances in Intelligent Systems and Computing - Information Systems Design and Intelligent Applications ◽

10.1007/978-981-13-3329-3_33 ◽

2018 ◽

pp. 349-365 ◽

Cited By ~ 1

Author(s):

N. Deshai ◽

B. V. D. S. Sekhar ◽

S. Venkataramana ◽

K. Srinivas ◽

G. P. S. Varma

Keyword(s):

Big Data ◽

Job Scheduling ◽

Hadoop Mapreduce ◽

Short Survey

Development of Multiple Big Data Analytics Platforms with Rapid Response

Scientific Programming ◽

10.1155/2017/6972461 ◽

2017 ◽

Vol 2017 ◽

pp. 1-13 ◽

Cited By ~ 3

Author(s):

Bao Rong Chang ◽

Yun-Da Lee ◽

Po-Hao Liao

Keyword(s):

Big Data ◽

Business Intelligence ◽

Data Analytics ◽

High Performance ◽

Job Scheduling ◽

Big Data Analytics ◽

Data Retrieval ◽

System Throughput ◽

Data Platform ◽

R Programming

The crucial problem in integrating multiple platforms is how to adapt to each platform's computing features so as to execute assignments most efficiently and obtain the best outcome. This paper introduces two new approaches to big data platforms, RHhadoop and SparkR, and integrates them into a high-performance, multi-platform big data analytics system that forms part of business intelligence (BI) and carries out rapid data retrieval and analytics with R programming. The paper aims to optimize job scheduling using the MSHEFT algorithm and to implement optimized platform selection based on computing features, in order to improve system throughput significantly. In addition, users simply issue R commands rather than running Java or Scala programs to perform data retrieval and analytics on the proposed platforms. According to the performance index calculated for the various methods, optimized platform selection significantly reduces the execution time for data retrieval and analytics, and scheduling optimization further increases system efficiency.

Highly Parallel Map Reduce Process and Efficient Job Scheduling Methodologies of Big Data Systems.

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a3903.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 3394-3397

Keyword(s):

Big Data ◽

Job Scheduling ◽

Data System ◽

Map Reduce ◽

Data Systems ◽

Processing Strategy ◽

Highly Efficient ◽

Critical Task ◽

Big Data Systems

This paper studies various job scheduling methodologies used in big data systems. MapReduce is a highly efficient distributed job processing strategy for big data systems. Job scheduling is a critical task in any big data system because the volume of jobs to be processed is tremendous. This study goes over the MapReduce process in detail. It also reviews various job scheduling methodologies and compares them.
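As an illustration of the MapReduce process this abstract refers to, the following is a minimal single-process sketch of the map, shuffle, and reduce phases using word count. It is not the paper's code and does not use Hadoop. All function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical, and a real system would distribute each phase across many nodes.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_phase(records: Iterable[str],
              mapper: Callable[[str], Iterable[Tuple[str, int]]]
              ) -> Iterator[Tuple[str, int]]:
    # Apply the mapper to every input record and emit (key, value) pairs.
    for record in records:
        yield from mapper(record)


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Group all emitted values by key, as the shuffle step does
    # between the map and reduce phases.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]],
                 reducer: Callable[[str, List[int]], int]) -> Dict[str, int]:
    # Collapse each key's list of values into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line: str) -> Iterator[Tuple[str, int]]:
    # Emit (word, 1) for every whitespace-separated word.
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(key: str, values: List[int]) -> int:
    return sum(values)


def run_job(records, mapper, reducer) -> Dict[str, int]:
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


if __name__ == "__main__":
    lines = ["big data job scheduling", "job scheduling in big data systems"]
    print(run_job(lines, word_count_mapper, word_count_reducer))
```

Because each mapper call depends only on its own record and each reducer call only on its own key, both phases can be split into independent tasks, which is what a job scheduler then has to place on cluster nodes.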
