scholarly journals Optimizing Hadoop Performance for Big Data Analytics in Smart Grid

2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Mukhtaj Khan ◽  
Zhengwen Huang ◽  
Maozhen Li ◽  
Gareth A. Taylor ◽  
Phillip M. Ashton ◽  
...  

The rapid deployment of Phasor Measurement Units (PMUs) in power systems globally is leading to Big Data challenges. New high performance computing techniques are now required to process an ever increasing volume of data from PMUs. To that extent the Hadoop framework, an open source implementation of the MapReduce computing model, is gaining momentum for Big Data analytics in smart grid applications. However, Hadoop has over 190 configuration parameters, which can have a significant impact on the performance of the Hadoop framework. This paper presents an Enhanced Parallel Detrended Fluctuation Analysis (EPDFA) algorithm for scalable analytics on massive volumes of PMU data. The novel EPDFA algorithm builds on an enhanced Hadoop platform whose configuration parameters are optimized by Gene Expression Programming. Experimental results show that the EPDFA is 29 times faster than the sequential DFA in processing PMU data and 1.87 times faster than a parallel DFA, which utilizes the default Hadoop configuration settings.

Author(s):  
Lidong Wang

Visualization with graphs is popular in the data analysis of Information Technology (IT) networks or computer networks. An IT network is often modelled as a graph with hosts being nodes and traffic being flows on many edges. General visualization methods are introduced in this paper. Applications and technology progress of visualization in IT network analysis and big data in IT network visualization are presented. The challenges of visualization and Big Data analytics in IT network visualization are also discussed. Big Data analytics with High Performance Computing (HPC) techniques, especially Graphics Processing Units (GPUs) helps accelerate IT network analysis and visualization.


Big Data ◽  
2016 ◽  
pp. 1555-1581
Author(s):  
Gueyoung Jung ◽  
Tridib Mukherjee

In the modern information era, the amount of data has exploded. Current trends further indicate exponential growth of data in the future. This prevalent humungous amount of data—referred to as big data—has given rise to the problem of finding the “needle in the haystack” (i.e., extracting meaningful information from big data). Many researchers and practitioners are focusing on big data analytics to address the problem. One of the major issues in this regard is the computation requirement of big data analytics. In recent years, the proliferation of many loosely coupled distributed computing infrastructures (e.g., modern public, private, and hybrid clouds, high performance computing clusters, and grids) have enabled high computing capability to be offered for large-scale computation. This has allowed the execution of the big data analytics to gather pace in recent years across organizations and enterprises. However, even with the high computing capability, it is a big challenge to efficiently extract valuable information from vast astronomical data. Hence, we require unforeseen scalability of performance to deal with the execution of big data analytics. A big question in this regard is how to maximally leverage the high computing capabilities from the aforementioned loosely coupled distributed infrastructure to ensure fast and accurate execution of big data analytics. In this regard, this chapter focuses on synchronous parallelization of big data analytics over a distributed system environment to optimize performance.


IEEE Access ◽  
2020 ◽  
pp. 1-1
Author(s):  
Dabeeruddin Syed ◽  
Ameema Zainab ◽  
Shady S. Refaat ◽  
Haitham Abu-Rub ◽  
Othmane Bouhali

2019 ◽  
Vol 35 (4) ◽  
pp. 893-903 ◽  
Author(s):  
Seemu Sharma ◽  
Seema Bawa

Abstract Cultural data and information on the web are continuously increasing, evolving, and reshaping in the form of big data due to globalization, digitization, and its vast exploration, with common people realizing the importance of ancient values. Therefore, before it becomes unwieldy and too complex to manage, its integration in the form of big data repositories is essential. This article analyzes the complexity of the growing cultural data and presents a Cultural Big Data Repository as an efficient way to store and retrieve cultural big data. The repository is highly scalable and provides integrated high-performance methods for big data analytics in cultural heritage. Experimental results demonstrate that the proposed repository outperforms in terms of space as well as storage and retrieval time of Cultural Big Data.


Sign in / Sign up

Export Citation Format

Share Document