Optimizing Hadoop Performance for Big Data Analytics in Smart Grid

The rapid deployment of Phasor Measurement Units (PMUs) in power systems globally is leading to Big Data challenges. New high-performance computing techniques are now required to process an ever-increasing volume of data from PMUs. To that end, the Hadoop framework, an open-source implementation of the MapReduce computing model, is gaining momentum for Big Data analytics in smart grid applications. However, Hadoop has over 190 configuration parameters, which can have a significant impact on its performance. This paper presents an Enhanced Parallel Detrended Fluctuation Analysis (EPDFA) algorithm for scalable analytics on massive volumes of PMU data. The novel EPDFA algorithm builds on an enhanced Hadoop platform whose configuration parameters are optimized by Gene Expression Programming. Experimental results show that EPDFA is 29 times faster than the sequential DFA in processing PMU data and 1.87 times faster than a parallel DFA that uses the default Hadoop configuration settings.
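For readers unfamiliar with the sequential DFA that serves as the baseline here, the following is a minimal sketch of the textbook first-order method in plain Python. It is not the paper's EPDFA implementation: the function names (`linear_fit`, `fluctuation`, `dfa_exponent`), the window sizes, and the synthetic white-noise input are all assumptions made for illustration. Because each window is detrended independently of the others, the per-window work is the kind of computation that could plausibly be distributed across map tasks, though how EPDFA actually partitions it is not stated in the abstract.

```python
import math
import random


def linear_fit(xs, ys):
    """Return the least-squares (slope, intercept) of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fluctuation(profile, window):
    """RMS residual after removing a linear trend from each full window."""
    xs = list(range(window))
    squared = 0.0
    count = 0
    # Only full, non-overlapping windows are used; a trailing partial window is dropped.
    for start in range(0, len(profile) - window + 1, window):
        segment = profile[start:start + window]
        slope, intercept = linear_fit(xs, segment)
        squared += sum((y - (slope * x + intercept)) ** 2
                       for x, y in zip(xs, segment))
        count += window
    return math.sqrt(squared / count)


def dfa_exponent(signal, windows):
    """Estimate the DFA scaling exponent as the slope of log F(n) versus log n."""
    mean = sum(signal) / len(signal)
    # Integrate the mean-removed signal to obtain the profile.
    profile = []
    total = 0.0
    for value in signal:
        total += value - mean
        profile.append(total)
    log_n = [math.log(w) for w in windows]
    log_f = [math.log(fluctuation(profile, w)) for w in windows]
    alpha, _ = linear_fit(log_n, log_f)
    return alpha


# Synthetic stand-in for a PMU signal: uncorrelated Gaussian noise (an assumption).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
alpha = dfa_exponent(noise, [16, 32, 64, 128, 256])
print(f"estimated scaling exponent: {alpha:.2f}")
```

For uncorrelated noise the estimated exponent should come out near 0.5; persistent, long-range-correlated signals give larger values.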

Smart Grid using Big Data Analytics

10.1002/9781118716779 ◽

2017 ◽

Cited By ~ 12

Author(s):

Robert C. Qiu ◽

Paul Antonik

Keyword(s):

Big Data ◽

Smart Grid ◽

Data Analytics ◽

Big Data Analytics

High performance deep learning techniques for big data analytics

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.5032 ◽

2018 ◽

Vol 30 (23) ◽

pp. e5032

Author(s):

Maozhen Li

Keyword(s):

Big Data ◽

Deep Learning ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Learning Techniques

Big Data and IT Network Data Visualization

International Journal of Mathematical Engineering and Management Sciences ◽

10.33889/ijmems.2018.3.1-002 ◽

2018 ◽

Vol 3 (1) ◽

pp. 9-16 ◽

Cited By ~ 3

Author(s):

Lidong Wang

Keyword(s):

Big Data ◽

Network Analysis ◽

Graphics Processing Units ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Network Visualization ◽

Network Data ◽

Graphics Processing ◽

Performance Computing

Visualization with graphs is popular in the data analysis of Information Technology (IT) networks, or computer networks. An IT network is often modelled as a graph in which hosts are nodes and traffic flows along the edges. This paper introduces general visualization methods and presents applications and technological progress in visualization for IT network analysis, as well as the role of big data in IT network visualization. The challenges of visualization and Big Data analytics in IT network visualization are also discussed. Big Data analytics with High Performance Computing (HPC) techniques, especially Graphics Processing Units (GPUs), helps accelerate IT network analysis and visualization.
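The graph model described in this abstract (hosts as nodes, traffic as flows on edges) can be sketched as a small weighted adjacency map. This is only an illustrative sketch: the flow records, IP addresses, and the helper names `build_traffic_graph` and `host_volume` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def build_traffic_graph(flows):
    """Aggregate (source, destination, bytes) records into a weighted directed graph."""
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst, num_bytes in flows:
        # Repeated flows between the same pair of hosts accumulate on one edge.
        graph[src][dst] += num_bytes
    return graph


def host_volume(graph):
    """Total bytes sent plus received per host, e.g. for sizing nodes in a plot."""
    volume = defaultdict(int)
    for src, neighbours in graph.items():
        for dst, num_bytes in neighbours.items():
            volume[src] += num_bytes
            volume[dst] += num_bytes
    return dict(volume)


# Hypothetical flow records: (source host, destination host, bytes transferred).
flows = [
    ("10.0.0.1", "10.0.0.2", 1200),
    ("10.0.0.1", "10.0.0.2", 800),
    ("10.0.0.2", "10.0.0.3", 500),
    ("10.0.0.3", "10.0.0.1", 300),
]
graph = build_traffic_graph(flows)
volume = host_volume(graph)
for host in sorted(volume):
    print(host, volume[host])
```

Edge weights and per-host totals of this kind are the typical inputs to a layout or rendering step, where they might drive edge thickness and node size.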

Effect of Filtering in Big Data Analytics for Load Forecasting in Smart Grid

Communications in Computer and Information Science - Machine Learning, Image Processing, Network Security and Data Sciences ◽

10.1007/978-981-15-6315-7_10 ◽

2020 ◽

pp. 125-134

Author(s):

Sneha Rai ◽

Mala De

Keyword(s):

Big Data ◽

Smart Grid ◽

Data Analytics ◽

Load Forecasting ◽

Big Data Analytics

The Role of Big Data Analytics in Smart Grid Management

Emerging Research in Data Engineering Systems and Computer Communications - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-15-0135-7_38 ◽

2020 ◽

pp. 403-412

Author(s):

Bhawna Dhupia ◽

M. Usha Rani ◽

Abdalla Alameen

Keyword(s):

Big Data ◽

Smart Grid ◽

Data Analytics ◽

Big Data Analytics ◽

Grid Management

Synchronizing Execution of Big Data in Distributed and Parallelized Environments

Big Data ◽

10.4018/978-1-4666-9840-6.ch071 ◽

2016 ◽

pp. 1555-1581

Author(s):

Gueyoung Jung ◽

Tridib Mukherjee

Keyword(s):

Big Data ◽

Distributed System ◽

Data Analytics ◽

High Performance ◽

Large Scale ◽

Big Data Analytics ◽

Loosely Coupled ◽

Current Trends ◽

Distributed Computing Infrastructures ◽

Performance Computing

In the modern information era, the amount of data has exploded, and current trends indicate further exponential growth. This enormous amount of data, referred to as big data, has given rise to the problem of finding the “needle in the haystack” (i.e., extracting meaningful information from big data). Many researchers and practitioners are focusing on big data analytics to address the problem. One of the major issues in this regard is the computational requirement of big data analytics. In recent years, the proliferation of loosely coupled distributed computing infrastructures (e.g., modern public, private, and hybrid clouds, high performance computing clusters, and grids) has made high computing capability available for large-scale computation. This has allowed the execution of big data analytics to gather pace across organizations and enterprises. However, even with high computing capability, it remains a big challenge to efficiently extract valuable information from vast amounts of data. Hence, unprecedented scalability of performance is required to execute big data analytics. A key question is how to make maximal use of the computing capabilities of these loosely coupled distributed infrastructures to ensure fast and accurate execution of big data analytics. This chapter focuses on synchronous parallelization of big data analytics over a distributed system environment to optimize performance.

Approaches of enhancing interoperations among high performance computing and big data analytics via augmentation

Cluster Computing ◽

10.1007/s10586-019-02960-y ◽

2019 ◽

Vol 23 (2) ◽

pp. 953-988 ◽

Cited By ~ 3

Author(s):

Ajeet Ram Pathak ◽

Manjusha Pandey ◽

Siddharth S. Rautaray

Keyword(s):

Big Data ◽

High Performance Computing ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Performance Computing

Smart Grid Big Data Analytics: Survey of Technologies, Techniques, and Applications

IEEE Access ◽

10.1109/access.2020.3041178 ◽

2020 ◽

pp. 1-1

Author(s):

Dabeeruddin Syed ◽

Ameema Zainab ◽

Shady S. Refaat ◽

Haitham Abu-Rub ◽

Othmane Bouhali

Keyword(s):

Big Data ◽

Smart Grid ◽

Data Analytics ◽

Big Data Analytics

Leveraging Big Data Analytics Utilizing Hadoop Framework in Sports Science

Smart Computational Strategies: Theoretical and Practical Aspects ◽

10.1007/978-981-13-6295-8_22 ◽

2019 ◽

pp. 259-272

Author(s):

Gagandeep Jagdev ◽

Sarabjeet Kaur

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Sports Science ◽

Hadoop Framework

CBDR: An efficient storage repository for cultural big data

Digital Scholarship in the Humanities ◽

10.1093/llc/fqz083 ◽

2019 ◽

Vol 35 (4) ◽

pp. 893-903 ◽

Cited By ~ 1

Author(s):

Seemu Sharma ◽

Seema Bawa

Keyword(s):

Big Data ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Data Repository ◽

Common People ◽

Data Repositories ◽

Storage And Retrieval ◽

Efficient Storage ◽

Cultural Data

Cultural data and information on the web are continuously increasing, evolving, and reshaping in the form of big data due to globalization, digitization, and widespread exploration, with common people realizing the importance of ancient values. Therefore, before this data becomes unwieldy and too complex to manage, its integration in the form of big data repositories is essential. This article analyzes the complexity of the growing cultural data and presents a Cultural Big Data Repository as an efficient way to store and retrieve cultural big data. The repository is highly scalable and provides integrated high-performance methods for big data analytics in cultural heritage. Experimental results demonstrate that the proposed repository compares favorably in terms of storage space as well as storage and retrieval time of cultural big data.
