Researching a Distributed Computing Automation Platform for Big Data Processing

Author(s): Nadezhda Bahareva, Yury Ushakov, Margarita Ushakova, Denis Parfenov, Leonid Legashev, ...


Author(s): Ankit Shah, Mamta C. Padole

Big Data processing and analysis require tremendous processing capability. Distributed computing brings many commodity systems under a common platform to answer this need. Apache Hadoop is among the most suitable sets of tools for Big Data storage, processing, and analysis, but it is found to be inefficient on heterogeneous clusters whose nodes have different processing capabilities. In this research, we propose the Saksham model, which optimizes processing time through efficient use of node processing capability and file management. To achieve better performance, the Saksham model exploits two vital aspects of heterogeneous distributed computing: an effective block rearrangement policy and the use of node processing capability. The results demonstrate that the proposed model achieves better job execution time and improves data locality.
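
The abstract gives no implementation details, so the following is only a minimal Python sketch of what a capability-proportional block rearrangement policy could look like; the node names, capability scores, and the `rearrange_blocks` helper are hypothetical illustrations, not the paper's actual Saksham code.

```python
# Hypothetical sketch: assign HDFS-style blocks to nodes in proportion to a
# node capability score, so faster nodes hold (and locally process) more data.
# Node names and capability scores below are made up for illustration.

def rearrange_blocks(blocks, nodes):
    """blocks: list of block ids; nodes: dict of node name -> capability score."""
    total = sum(nodes.values())
    placement = {name: [] for name in nodes}
    # Quota proportional to capability (an illustrative heuristic, not the
    # paper's exact policy).
    quotas = {name: round(len(blocks) * cap / total) for name, cap in nodes.items()}
    it = iter(blocks)
    for name, quota in quotas.items():
        for _ in range(quota):
            block = next(it, None)
            if block is None:
                return placement
            placement[name].append(block)
    # Rounding may leave a few blocks unassigned; give them to the fastest node.
    placement[max(nodes, key=nodes.get)].extend(it)
    return placement

if __name__ == "__main__":
    blocks = [f"blk_{i}" for i in range(10)]
    nodes = {"node-a": 4.0, "node-b": 2.0, "node-c": 1.0}
    print(rearrange_blocks(blocks, nodes))
```

A placement like this also serves data locality: a node that holds more blocks can run more map tasks against local data instead of pulling blocks over the network.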


The development of information technology, distributed computing, hardware, wireless communication, and intelligent technology in the heterogeneous Internet of Things (IoT) field has grown to overcome the limitations of cloud computing in Big Data processing. Computation over wireless-communication-based distributed computing faces challenges in offloading decisions and data delay across heterogeneous IoT devices, and optimizing caching, data computation, and load maintenance across different edge clouds remains a challenging task for effective Big Data processing in heterogeneous IoT. This paper therefore presents a novel Optimized and Sorted Positional Index List (OSPIL) approach to Big Data processing that significantly reduces and optimizes delay, I/O cost, CPU usage, and memory overhead. In this approach, a sorted index is built over the data attributes (the data consist of different attributes), arranged in ascending order. Data processing proceeds in two phases: in Phase 1, the depths of all the sorted lists are scanned and the data to be processed are scheduled; in Phase 2, the sorted lists are explored and results are emitted in sequential order through a hash table. Experimental results on real-world wireless communication data sets show that the proposed approach significantly optimizes delay, I/O cost, CPU usage, and memory overhead.
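
The abstract describes the two phases only at a high level; the following is a minimal Python sketch of the idea, with hypothetical names (`build_positional_index`, `ospil_query`) and a simple equality-predicate query standing in for the paper's workload. It is not the authors' exact OSPIL algorithm.

```python
# Illustrative sketch of a sorted positional index list query. Each attribute
# value maps to the ascending list of record positions; a two-phase pass
# schedules the shortest lists first and merges matches through a hash table.

from collections import defaultdict

def build_positional_index(records, attribute):
    """Map each attribute value to the ascending list of record positions."""
    index = defaultdict(list)
    for pos, rec in enumerate(records):
        index[rec[attribute]].append(pos)  # positions arrive in ascending order
    return index

def ospil_query(records, predicates):
    """predicates: dict of attribute -> required value (hypothetical workload)."""
    indexes = {attr: build_positional_index(records, attr) for attr in predicates}
    # Phase 1: scan the depth of every sorted list and schedule processing
    # from the shallowest (shortest) list to the deepest.
    lists = sorted((indexes[a].get(v, []) for a, v in predicates.items()), key=len)
    if not lists or not lists[0]:
        return []  # some predicate matches nothing, so the result is empty
    # Phase 2: explore the sorted lists, counting matches in a hash table,
    # and emit qualifying positions in sequential order.
    hits = defaultdict(int)
    for positions in lists:
        for pos in positions:
            hits[pos] += 1
    return sorted(pos for pos, n in hits.items() if n == len(lists))

if __name__ == "__main__":
    data = [{"rssi": -60, "ch": 1}, {"rssi": -70, "ch": 6}, {"rssi": -60, "ch": 6}]
    print(ospil_query(data, {"rssi": -60, "ch": 6}))  # -> [2]
```

Scheduling the shortest list first keeps the hash table small, which is consistent with the memory-overhead reduction the abstract claims.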


2019, Vol. 12 (1), pp. 42
Author(s): Andrey I. Vlasov, Konstantin A. Muraviev, Alexandra A. Prudius, Demid A. Uzenkov
