Straggler handling approaches in MapReduce framework: a comparative study

Author(s):  
Anwar H. Katrawi ◽  
Rosni Abdullah ◽  
Mohammed Anbar ◽  
Ibrahim AlShourbaji ◽  
Ammar Kamal Abasi

The proliferation of information technology produces a huge amount of data, called big data, that cannot be processed by traditional database systems. These various types of data come from different sources. However, stragglers are a major bottleneck in big data processing, and hence the early detection and accurate identification of stragglers can have an important impact on the performance of big data processing. This work aims to assess five straggler identification methods: Hadoop native scheduler, LATE Scheduler, Mantri, MonTool, and Dolly. The performance of these techniques was evaluated on three benchmarks: Sort, Grep, and WordCount. The results show that the LATE Scheduler performs the best and would be an efficient choice for straggler identification.
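To make the idea of straggler identification concrete, the following is a minimal sketch of a progress-rate heuristic in the spirit of the LATE Scheduler, which ranks running tasks by their estimated time to completion. It is not the implementation evaluated in the study: the `Task` fields, the function names, and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    task_id: str
    progress: float   # fraction of work completed, in [0, 1] (hypothetical field)
    elapsed_s: float  # seconds since the task started (hypothetical field)


def estimated_time_left(task: Task) -> float:
    """Estimate remaining seconds as (1 - progress) / progress_rate."""
    if task.progress <= 0.0 or task.elapsed_s <= 0.0:
        # No measurable progress yet: treat the remaining time as unbounded.
        # This is a simplification chosen for the sketch.
        return float("inf")
    progress_rate = task.progress / task.elapsed_s
    return (1.0 - task.progress) / progress_rate


def pick_speculation_candidate(tasks: List[Task]) -> Optional[Task]:
    """Return the unfinished task with the longest estimated time left."""
    running = [t for t in tasks if t.progress < 1.0]
    if not running:
        return None
    return max(running, key=estimated_time_left)


# Illustrative (made-up) task states.
tasks = [
    Task("map-a", progress=0.9, elapsed_s=90.0),
    Task("map-b", progress=0.2, elapsed_s=100.0),
    Task("map-c", progress=0.5, elapsed_s=50.0),
]
candidate = pick_speculation_candidate(tasks)
print(candidate.task_id, round(estimated_time_left(candidate)))
```

In this sample, `map-b` has been running the longest but has made the least progress, so it is the one a scheduler of this kind would consider for speculative re-execution.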

2022 ◽  
pp. 1162-1191
Author(s):  
Dinesh Chander ◽  
Hari Singh ◽  
Abhinav Kirti Gupta

Data processing has become an important field in today's big data-dominated world. Data is being generated at a tremendous pace from different sources. The nature of data has changed from batch data to streaming data, and consequently, data processing methodologies have also changed. Traditional SQL is no longer capable of dealing with this big data. This chapter describes the nature of data and the various tools, techniques, and technologies used to handle this big data. The chapter also describes the need to shift big data to the cloud and the challenges of big data processing in the cloud, the migration from data processing to data analytics, the tools used in data analytics, and the issues and challenges in data processing and analytics. The chapter then touches on an important application area of streaming data, sentiment analysis, and explores it through some test case demonstrations and results.


Author(s):  
Saifuzzafar Jaweed Ahmed

Big data has become a very important part of all industries and organizational sectors nowadays. Sectors such as energy, banking, retail, hardware, and networking all generate a huge amount of unstructured data, which is processed and analyzed accurately in a structured form. The structured data can then reveal very useful information for business growth. Big data helps in obtaining useful data from unstructured or heterogeneous data by analyzing it. Big data was initially defined by the volume of a data set. Big data sets are generally huge, measuring tens of terabytes and sometimes crossing the threshold of petabytes. Today, big data falls under three categories: structured, unstructured, and semi-structured. The size of big data is growing at a fast pace, from terabytes to exabytes of data. Big data also requires techniques that help to integrate and process a huge amount of heterogeneous data. Data analysis, which is a big data process, has applications in various areas such as business processing, disease prevention, and cybersecurity. Big data has three major issues: data storage, data management, and information retrieval. Big data processing requires a particular setup of hardware and virtual machines to derive results. The processing is carried out in parallel to obtain results as quickly as possible. These days, big data processing techniques include text mining and sentiment analysis. Text analytics is a very large field comprising several techniques, models, and methods for the automatic and quantitative analysis of textual data. The purpose of this paper is to show how text analysis and sentiment analysis process unstructured data, and how these techniques extract meaningful information and thus make it available to the various data mining (statistical and machine learning) algorithms.


Author(s):  
Jayashree K. ◽  
Abirami R. ◽  
Rajeswari P.

The successful development of big data and the internet of things (IoT) is increasingly influencing all areas of technology and business. The rapid increase in the number of devices connected to the IoT, from which an enormous amount of data is collected, indicates how big data is related to the IoT. Since a huge amount of data is obtained from different sources, analyzing this data involves a great deal of processing at every level to extract knowledge for the decision-making process. Managing big data in a continuously expanding network leads to a few issues related to data collection, data processing, analytics, and security. To address these issues, certain solutions using a big data approach in IoT are examined. Combining these two areas provides several opportunities to develop new systems and identify advanced techniques to solve challenges in big data and IoT.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Geetanjali Rathee ◽  
Adel Khelifi ◽  
Razi Iqbal

Automated techniques enabled by Artificial Neural Networks (ANN), the Internet of Things (IoT), and cloud-based services affect the real-time analysis and processing of information in a variety of applications. In addition, multihoming is a type of network that combines various types of networks into a single environment while managing a huge amount of data. Nowadays, big data processing and monitoring in multihoming networks receive little attention with regard to reducing security risk and improving efficiency while processing or monitoring information. The use of IoT- and AI-integrated systems in multihoming big data may be beneficial in various respects. Although multihoming security issues and their analysis have been well studied by various scientists and researchers, not much attention has been paid to big data security processing in multihoming, especially using automated techniques and systems. The aim of this paper is to propose an IoT-based artificial network to process and compute big data while ensuring secure communication in a multihoming network using the Bayesian Rule (BR) and Levenberg-Marquardt (LM) algorithms. Further, the efficiency and effect of an AI-assisted mechanism on multihoming information processing are evaluated over various parameters such as classification accuracy, classification time, specificity, sensitivity, ROC, and F-measure.


2021 ◽  
Vol 2 (2) ◽  
pp. 53-60
Author(s):  
Ajibade Lukuman Saheed ◽  
Abu Bakar Kamalrulnizam ◽  
Ahmed Aliyu ◽  
Tasneem Darwish

Processing huge and complex data to obtain useful information is challenging, even though several big data processing frameworks have been proposed and further enhanced. One of the prominent big data processing frameworks is MapReduce. The main concept of the MapReduce framework relies on distributed and parallel processing. However, the MapReduce framework faces serious performance degradation due to the slow execution of certain tasks, called stragglers. Failing to handle stragglers causes delays and increases the overall job execution time. Several straggler reduction techniques have been proposed to improve MapReduce performance. This study provides a comprehensive and qualitative review of the existing straggler mitigation solutions. In addition, a taxonomy of the available straggler mitigation solutions is presented. Critical research issues and future research directions are identified and discussed to guide researchers and scholars.
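The effect of a straggler on job execution time can be illustrated with a small sketch. It assumes a simplified model in which all tasks of a phase run in parallel, so the phase finishes only when its slowest task does. The median-based threshold, the `factor` value, and the task durations below are hypothetical choices for illustration and are not taken from any of the surveyed techniques.

```python
from statistics import median
from typing import Dict, List


def phase_completion_time(durations: Dict[str, float]) -> float:
    """With fully parallel tasks, the phase ends when the slowest task ends."""
    return max(durations.values())


def flag_stragglers(durations: Dict[str, float], factor: float = 1.5) -> List[str]:
    """Flag tasks whose duration exceeds `factor` times the median duration.

    The median-based rule and the default factor are illustrative assumptions.
    """
    threshold = factor * median(durations.values())
    return sorted(tid for tid, d in durations.items() if d > threshold)


# Made-up task durations in seconds; "m4" is the slow task.
durations = {"m1": 10.0, "m2": 11.0, "m3": 12.0, "m4": 40.0}

stragglers = flag_stragglers(durations)
healthy = {tid: d for tid, d in durations.items() if tid not in stragglers}

print("stragglers:", stragglers)
print("phase time with straggler:", phase_completion_time(durations))
print("phase time of remaining tasks:", phase_completion_time(healthy))
```

In this sample, a single slow task stretches the phase from 12 seconds to 40 seconds, which is the kind of delay that straggler mitigation techniques aim to reduce.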

