Researching a Distributed Computing Automation Platform for Big Data Processing

Author(s): Nadezhda Bahareva, Yury Ushakov, Margarita Ushakova, Denis Parfenov, Leonid Legashev, ...


Author(s): Ankit Shah, Mamta C. Padole

Big Data processing and analysis require tremendous processing capability. Distributed computing brings many commodity systems under a common platform to answer this need. Apache Hadoop is among the most suitable sets of tools for Big Data storage, processing, and analysis, but it is found to be inefficient on heterogeneous clusters whose nodes have different processing capabilities. In this research, we propose the Saksham model, which optimizes processing time through efficient use of node processing capability and file management. To achieve better performance, the Saksham model exploits two vital aspects of heterogeneous distributed computing: an effective block rearrangement policy and the use of node processing capability. The results demonstrate that the proposed model achieves better job execution time and improves data locality.
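
The abstract gives no implementation details, so the following is only a minimal Python sketch of what a capability-proportional block rearrangement policy could look like; the node names, capability scores, and the `rearrange_blocks` helper are hypothetical illustrations, not the paper's actual Saksham code.

```python
# Hypothetical sketch: assign HDFS-style blocks to nodes in proportion to a
# node capability score, so faster nodes hold (and locally process) more data.
# Node names and capability scores below are made up for illustration.

def rearrange_blocks(blocks, nodes):
    """blocks: list of block ids; nodes: dict of node name -> capability score."""
    total = sum(nodes.values())
    placement = {name: [] for name in nodes}
    # Quota proportional to capability (an illustrative heuristic, not the
    # paper's exact policy).
    quotas = {name: round(len(blocks) * cap / total) for name, cap in nodes.items()}
    it = iter(blocks)
    for name, quota in quotas.items():
        for _ in range(quota):
            block = next(it, None)
            if block is None:
                return placement
            placement[name].append(block)
    # Rounding may leave a few blocks unassigned; give them to the fastest node.
    placement[max(nodes, key=nodes.get)].extend(it)
    return placement

if __name__ == "__main__":
    blocks = [f"blk_{i}" for i in range(10)]
    nodes = {"node-a": 4.0, "node-b": 2.0, "node-c": 1.0}
    print(rearrange_blocks(blocks, nodes))
```

A placement like this also serves data locality: a node that holds more blocks can run more map tasks against local data instead of pulling blocks over the network.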


The development of information technology, distributed computing, hardware, wireless communication, and intelligent technology in the heterogeneous Internet of Things (IoT) field has grown to overcome the limitations of cloud computing in Big Data processing. Computation over wireless-communication-based distributed computing faces challenges in offloading decisions and data delay across heterogeneous IoT devices, and optimizing caching, data computation, and load maintenance across different edge clouds remains a challenging task for effective Big Data processing in heterogeneous IoT. This paper therefore presents a novel Optimized and Sorted Positional Index List (OSPIL) approach to Big Data processing that significantly reduces and optimizes delay, I/O cost, CPU usage, and memory overhead. In this approach, a sorted index is built over the data attributes (the data consist of different attributes), arranged in ascending order. Data processing proceeds in two phases: in Phase 1, the depths of all the sorted lists are scanned and the data to be processed are scheduled; in Phase 2, the sorted lists are explored and results are emitted in sequential order through a hash table. Experimental results on real-world wireless communication data sets show that the proposed approach significantly optimizes delay, I/O cost, CPU usage, and memory overhead.
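
The abstract describes the two phases only at a high level; the following is a minimal Python sketch of the idea, with hypothetical names (`build_positional_index`, `ospil_query`) and a simple equality-predicate query standing in for the paper's workload. It is not the authors' exact OSPIL algorithm.

```python
# Illustrative sketch of a sorted positional index list query. Each attribute
# value maps to the ascending list of record positions; a two-phase pass
# schedules the shortest lists first and merges matches through a hash table.

from collections import defaultdict

def build_positional_index(records, attribute):
    """Map each attribute value to the ascending list of record positions."""
    index = defaultdict(list)
    for pos, rec in enumerate(records):
        index[rec[attribute]].append(pos)  # positions arrive in ascending order
    return index

def ospil_query(records, predicates):
    """predicates: dict of attribute -> required value (hypothetical workload)."""
    indexes = {attr: build_positional_index(records, attr) for attr in predicates}
    # Phase 1: scan the depth of every sorted list and schedule processing
    # from the shallowest (shortest) list to the deepest.
    lists = sorted((indexes[a].get(v, []) for a, v in predicates.items()), key=len)
    if not lists or not lists[0]:
        return []  # some predicate matches nothing, so the result is empty
    # Phase 2: explore the sorted lists, counting matches in a hash table,
    # and emit qualifying positions in sequential order.
    hits = defaultdict(int)
    for positions in lists:
        for pos in positions:
            hits[pos] += 1
    return sorted(pos for pos, n in hits.items() if n == len(lists))

if __name__ == "__main__":
    data = [{"rssi": -60, "ch": 1}, {"rssi": -70, "ch": 6}, {"rssi": -60, "ch": 6}]
    print(ospil_query(data, {"rssi": -60, "ch": 6}))  # -> [2]
```

Scheduling the shortest list first keeps the hash table small, which is consistent with the memory-overhead reduction the abstract claims.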


2019, Vol. 12 (1), pp. 42
Author(s): Andrey I. Vlasov, Konstantin A. Muraviev, Alexandra A. Prudius, Demid A. Uzenkov
