Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System

Author(s):  
Arwan A. Khoiruddin ◽  
Nordin Zakaria ◽  
Hitham Seddig Alhussian

Author(s):
Pankaj Dadheech ◽  
Dinesh Goyal ◽  
Sumit Srivastava ◽  
Ankit Kumar

Spatial queries are frequently used in Hadoop for large-scale data processing. However, the vast size of spatial information makes it difficult to process spatial queries efficiently, which motivates the use of the Hadoop system for processing such Big Data. We use Boolean queries and geometric Boolean spatial data for query optimization on the Hadoop system. In this paper, we present a lightweight and adaptable spatial index for big data that can be processed within Hadoop frameworks. Results demonstrate the efficiency and effectiveness of our spatial indexing scheme for various spatial queries.
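The abstract does not spell out how a Boolean spatial query is evaluated, so the following is a minimal, map-only Hadoop sketch of the general idea: filtering records whose points satisfy a Boolean combination of window predicates. The record layout (id,x,y), the class name BooleanSpatialFilter, and the two query windows are illustrative assumptions, not the authors' actual index or optimizer.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Map-only job: emits records whose point satisfies the Boolean
    // combination (inside window A) AND NOT (inside window B).
    public class BooleanSpatialFilter extends Mapper<Object, Text, Text, NullWritable> {

        private static boolean inWindow(double x, double y,
                                        double x1, double y1, double x2, double y2) {
            return x >= x1 && x <= x2 && y >= y1 && y <= y2;
        }

        @Override
        protected void map(Object key, Text value, Context ctx)
                throws IOException, InterruptedException {
            // Assumed record layout: id,x,y
            String[] f = value.toString().split(",");
            if (f.length < 3) return;              // skip malformed lines
            double x, y;
            try {
                x = Double.parseDouble(f[1]);
                y = Double.parseDouble(f[2]);
            } catch (NumberFormatException e) {
                return;                            // skip non-numeric lines
            }
            // Boolean predicate: A AND NOT B (window coordinates are illustrative)
            if (inWindow(x, y, 0, 0, 100, 100) && !inWindow(x, y, 40, 40, 60, 60)) {
                ctx.write(value, NullWritable.get());
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "boolean spatial filter");
            job.setJarByClass(BooleanSpatialFilter.class);
            job.setMapperClass(BooleanSpatialFilter.class);
            job.setNumReduceTasks(0);              // map-only: no shuffle needed
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

A real spatial index would prune input splits before the map stage rather than scanning every record; the sketch only shows how Boolean predicates compose inside a mapper.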


Author(s):  
Abou_el_ela Abdou Hussein

Day by day, advances in web technologies have led to tremendous growth in the volume of data generated daily. This mountain of huge, widely spread data sets leads to the phenomenon called big data: collections of massive, heterogeneous, unstructured, and complex data. The big data life cycle can be represented as collecting (capturing), storing, distributing, manipulating, interpreting, analyzing, investigating, and visualizing the data. Traditional techniques such as Relational Database Management Systems (RDBMS) cannot handle big data because of their inherent limitations, so advances in computing architecture are required to handle both the storage requirements and the heavy processing needed to analyze huge volumes and varieties of data economically. Among the many technologies for manipulating big data, Hadoop, an open-source distributed data-processing framework, is one of the most prominent and well-known solutions. Apache Hadoop is based on the Google File System and the MapReduce programming paradigm. In this paper we survey big data characteristics, starting from the first three V's, which researchers have extended over time to more than fifty-six V's, and compare the literature to arrive at the best representation and the most precise clarification of all the big data V characteristics. We highlight the challenges facing big data processing and show how to overcome them using Hadoop, applying it to big data sets as a solution for various problems in a distributed, cloud-based environment. The paper focuses on the different components of Hadoop, such as Hive, Pig, and HBase, and gives a thorough description of Hadoop's pros and cons, along with improvements that address Hadoop's problems through a proposed cost-efficient scheduler algorithm for heterogeneous Hadoop systems.
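The cost-efficient scheduler algorithm itself is only named in the abstract, so the sketch below illustrates just the general shape of cost-aware scheduling on a heterogeneous cluster: each task is greedily assigned to the node with the lowest estimated finish time. The node names, per-task costs, and the greedy rule are assumptions for illustration, not the paper's proposed algorithm.

    import java.util.PriorityQueue;

    // Greedy cost-aware placement: each task goes to the node with the
    // smallest estimated finish time, given heterogeneous per-task costs.
    public class GreedyHeterogeneousScheduler {

        static final class Node implements Comparable<Node> {
            final String name;
            final double secondsPerTask;   // heterogeneity: node speed
            double busyUntil = 0.0;        // accumulated finish time

            Node(String name, double secondsPerTask) {
                this.name = name;
                this.secondsPerTask = secondsPerTask;
            }

            double finishTimeIfAssigned() { return busyUntil + secondsPerTask; }

            @Override
            public int compareTo(Node o) {
                return Double.compare(finishTimeIfAssigned(), o.finishTimeIfAssigned());
            }
        }

        public static void main(String[] args) {
            PriorityQueue<Node> nodes = new PriorityQueue<>();
            nodes.add(new Node("fast-node", 1.0));   // illustrative costs
            nodes.add(new Node("mid-node", 2.0));
            nodes.add(new Node("slow-node", 4.0));

            int tasks = 10;
            for (int t = 0; t < tasks; t++) {
                Node best = nodes.poll();            // cheapest estimated finish time
                best.busyUntil = best.finishTimeIfAssigned();
                System.out.printf("task %d -> %s (done at %.1fs)%n",
                                  t, best.name, best.busyUntil);
                nodes.add(best);                     // re-insert with updated load
            }
        }
    }

The greedy rule naturally shifts more tasks onto faster nodes, which is the intuition behind cost-efficient scheduling on heterogeneous Hadoop clusters.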


2018 ◽  
Vol 70 (1) ◽  
pp. 13-25
Author(s):  
Tomohiro Matsuno ◽  
Bijoy Chand Chatterjee ◽  
Nattapong Kitsuwan ◽  
Eiji Oki ◽  
Malathi Veeraraghavan ◽  
...  

Author(s):  
Manisha K. Gupta ◽  
Md. Nadeem Akhtar Hasid ◽  
Sourav Dhar ◽  
H. S. Mruthyunjaya
Keyword(s):  
Big Data ◽  

Information ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 222 ◽  
Author(s):  
Sungchul Lee ◽  
Ju-Yeon Jo ◽  
Yoohwan Kim

Background: Hadoop has become the base framework for big data systems via the simple concept that moving computation is cheaper than moving data. Hadoop increases data locality in the Hadoop Distributed File System (HDFS) to improve system performance: network traffic among nodes is reduced by running tasks on the machines that already hold their data. Previous research increased data locality in one of the MapReduce stages to improve Hadoop performance, but there has been no mathematical performance model for data locality in Hadoop. Methods: This study develops a Hadoop performance analysis model with data locality that covers the entire MapReduce process. The paper explains the data locality concept in the map and shuffle stages, and shows how to apply the model to improve Hadoop performance by establishing deep data locality. Results: The research validated deep data locality as a way to increase Hadoop performance via three tests: a simulation-based test, a cloud test, and a physical test. In these tests, the authors improved Hadoop performance by over 34% using deep data locality. Conclusions: Deep data locality improves Hadoop performance by reducing data movement in HDFS.
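The paper's mathematical performance model is not reproduced in the abstract; as a rough back-of-the-envelope stand-in, the sketch below estimates the expected fraction of data-local map tasks under random task placement and the cross-network transfer that non-local tasks incur. The r/n locality estimate, cluster size, and block counts are assumptions for illustration, and the model ignores the shuffle stage that deep data locality also targets.

    // Back-of-the-envelope model (an assumption, not the paper's formal model):
    // with n nodes, replication factor r, and random task placement, the chance
    // that a map task runs on a node holding a replica of its block is ~ r/n.
    // Non-local tasks pull their block over the network.
    public class DataLocalityModel {
        public static void main(String[] args) {
            int nodes = 20;            // illustrative cluster size
            int replication = 3;       // HDFS default replication factor
            long blocks = 10_000;      // number of input blocks
            long blockMB = 128;        // HDFS default block size in MB

            double localRatio = Math.min(1.0, (double) replication / nodes);
            double remoteMB = (1.0 - localRatio) * blocks * blockMB;

            System.out.printf("expected data-local map tasks: %.1f%%%n", 100 * localRatio);
            System.out.printf("expected cross-network transfer: %.0f MB%n", remoteMB);
        }
    }

Raising the local ratio drives the remote-transfer term toward zero, which is consistent with the direction of the 34% improvement the authors report.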

