Latency-aware Straggler Mitigation Strategy in Hadoop MapReduce Framework: A Review

Processing huge and complex data to obtain useful information is challenging, even though several big data processing frameworks have been proposed and further enhanced. One of the prominent big data processing frameworks is MapReduce, whose main concept relies on distributed and parallel processing. However, the MapReduce framework suffers serious performance degradation due to the slow execution of certain tasks, called stragglers. Failing to handle stragglers causes delays and increases the overall job execution time. Several straggler reduction techniques have been proposed to improve MapReduce performance. This study provides a comprehensive and qualitative review of the existing straggler mitigation solutions. In addition, a taxonomy of the available straggler mitigation solutions is presented. Critical research issues and future research directions are identified and discussed to guide researchers and scholars.


Big Data Analytics in Cloud Computing

Advances in Computer and Electrical Engineering - Novel Practices and Trends in Grid and Cloud Computing ◽

10.4018/978-1-5225-9023-1.ch018 ◽

2019 ◽

pp. 325-341

Author(s):

Rajganesh Nagarajan ◽

Ramkumar Thirunavukarasu

Keyword(s):

Cloud Computing ◽

Big Data ◽

Data Processing ◽

Data Visualization ◽

Data Analytics ◽

Big Data Analytics ◽

Future Research ◽

Big Data Processing ◽

Investment Cost

In this chapter, the authors consider the different categories of data that are processed by big data analytics tools. The challenges of big data processing are identified, and a solution based on cloud computing is highlighted. Because cloud computing is widely advocated for its pay-per-use concept, data processing tools can be deployed effectively within the cloud and reduce the investment cost. The chapter also covers big data platforms, tools, and applications, along with the concept of data visualization. Finally, the applications of data analytics are discussed for future research.


On Chinese E-Government Development in Big Data Era

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.644-650.5575 ◽

2014 ◽

Vol 644-650 ◽

pp. 5575-5579

Author(s):

Yu Lou ◽

Guo Hui Zhan ◽

Dong Liang Zhang ◽

Zhong Hua Xia ◽

Bang Fan Liu

Keyword(s):

Big Data ◽

Data Processing ◽

Main Theme ◽

Complex Data ◽

Big Data Processing ◽

Processing Mode ◽

Effective Path ◽

Government Development

Big Data is a new concept in the field of e-government, so the knowledge and content of Big Data need to be explored in depth. In the era of big data, e-government will face enormous challenges and tests, and handling complex data will become the main theme of e-government development. This paper analyzes the big data processing mode and, on this basis, examines the effect of the big data era on e-government development and an effective path for applying the big data mode to the development of e-government.


Integrating the Split/Analyze/Meta-Analyze (SAM) Approach and a Multilevel Framework to Advance Big Data Research in Psychology

Zeitschrift für Psychologie ◽

10.1027/2151-2604/a000345 ◽

2018 ◽

Vol 226 (4) ◽

pp. 274-283 ◽

Cited By ~ 1

Author(s):

Yucheng Eason Zhang ◽

Siqi Liu ◽

Shan Xu ◽

Miles M. Yang ◽

Jian Zhang

Keyword(s):

Big Data ◽

Computer Science ◽

Analytical Techniques ◽

Future Research ◽

Firm Level ◽

Research Directions ◽

Research Issues ◽

Country Level ◽

Multilevel Research ◽

Future Research Directions

Abstract. Though big data research has undergone dramatic developments in recent decades, it has mainly been applied in disciplines such as computer science and business. Psychology research that applies big data to examine research issues in psychology is largely lacking. One of the major challenges regarding the use of big data in psychology is that many researchers in the field may not have sufficient knowledge of big data analytical techniques that are rooted in computer science. This paper integrates the split/analyze/meta-analyze (SAM) approach and a multilevel framework to illustrate how to use the SAM approach to address multilevel research questions with big data. Specifically, we first introduce the SAM approach and then illustrate how to implement this to integrate two big datasets at the firm level and country level. Finally, we discuss theoretical and practical implications, proposing future research directions for psychology scholars.


Mobile Agent Based MapReduce Framework for Big Data Processing

Advances in Intelligent Systems and Computing - Big Data Analytics ◽

10.1007/978-981-10-6620-7_37 ◽

2017 ◽

pp. 391-402

Author(s):

Umesh Kumar ◽

Sapna Gambhir

Keyword(s):

Big Data ◽

Data Processing ◽

Mobile Agent ◽

Mapreduce Framework ◽

Big Data Processing ◽

Agent Based


Straggler handling approaches in mapreduce framework: a comparative study

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i1.pp375-382 ◽

2021 ◽

Vol 11 (1) ◽

pp. 375

Author(s):

Anwar H. Katrawi ◽

Rosni Abdullah ◽

Mohammed Anbar ◽

Ibrahim AlShourbaji ◽

Ammar Kamal Abasi

Keyword(s):

Big Data ◽

Data Processing ◽

Database Systems ◽

Mapreduce Framework ◽

Huge Amount ◽

Accurate Identification ◽

Big Data Processing ◽

Identification Methods ◽

Major Bottleneck ◽

Different Sources

The proliferation of information technology produces a huge amount of data, called big data, that cannot be processed by traditional database systems. These various types of data come from different sources. However, stragglers are a major bottleneck in big data processing, and hence the early detection and accurate identification of stragglers can have an important impact on the performance of big data processing. This work assesses five straggler identification methods: Hadoop native scheduler, LATE Scheduler, Mantri, MonTool, and Dolly. The performance of these techniques was evaluated on three benchmarks: Sort, Grep, and WordCount. The results show that the LATE Scheduler performs best and would be an efficient choice for straggler identification.
