Cloud4NFICA-Nearness Factor-Based Incremental Clustering Algorithm Using Microsoft Azure for the Analysis of Intelligent Meter Data

2022 ◽  
pp. 423-442
Author(s):  
Archana Yashodip Chaudhari ◽  
Preeti Mulay

Intelligent electricity meters (IEMs) form a key infrastructure necessary for the growth of smart grids. IEMs generate a considerable amount of electricity data incrementally. However, on an influx of new data, traditional clustering methods re-cluster all of the data from scratch. Incremental clustering is an essential way to solve the problem of clustering dynamic data. Given the volume of IEM data and the number of data types involved, incremental clustering is highly complex. Microsoft Azure provides the processing power necessary to handle incremental clustering analytics. The proposed Cloud4NFICA is a scalable platform for a nearness factor-based incremental clustering algorithm. This research uses a real dataset of Irish households collected by IEMs, along with related socioeconomic data. Cloud4NFICA is incremental in nature and hence accommodates the influx of new data. Cloud4NFICA was designed as an infrastructure as a service. The study shows that the developed system performs well in terms of scalability.
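The idea of absorbing new readings without re-clustering from scratch can be illustrated with a minimal sketch. The abstract does not define the nearness factor, so this example swaps in a plain Euclidean distance threshold (leader-style incremental clustering); the class name `IncrementalClusterer`, the `threshold` parameter, and the sample points are all hypothetical and are not taken from Cloud4NFICA.

```python
import math


class IncrementalClusterer:
    """Leader-style incremental clustering (illustrative, not Cloud4NFICA).

    Assumption: Euclidean distance to the nearest centroid stands in for
    the paper's nearness factor, which the abstract does not define.
    """

    def __init__(self, threshold):
        self.threshold = threshold  # maximum distance for joining a cluster
        self.centroids = []         # one running mean per cluster
        self.counts = []            # number of points absorbed per cluster

    def add(self, point):
        """Assign one new point and return its cluster index."""
        best, best_dist = None, math.inf
        for i, centroid in enumerate(self.centroids):
            d = math.dist(point, centroid)
            if d < best_dist:
                best, best_dist = i, d

        if best is not None and best_dist <= self.threshold:
            # Update the running mean in place; older points are not revisited.
            n = self.counts[best] + 1
            self.centroids[best] = [
                m + (x - m) / n for m, x in zip(self.centroids[best], point)
            ]
            self.counts[best] = n
            return best

        # No cluster is near enough: the point starts a new cluster.
        self.centroids.append(list(point))
        self.counts.append(1)
        return len(self.centroids) - 1


model = IncrementalClusterer(threshold=1.0)
labels = [model.add(p) for p in [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (0.2, 0.1)]]
```

Each call to `add` touches only the existing centroids, so the cost of handling a new point depends on the number of clusters rather than on the amount of data already seen.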

2020 ◽  
Vol 10 (2) ◽  
pp. 21-39
Author(s):  
Archana Yashodip Chaudhari ◽  
Preeti Mulay

Intelligent electricity meters (IEMs) form a key infrastructure necessary for the growth of smart grids. IEMs generate a considerable amount of electricity data incrementally. However, on an influx of new data, traditional clustering methods re-cluster all of the data from scratch. Incremental clustering is an essential way to solve the problem of clustering dynamic data. Given the volume of IEM data and the number of data types involved, incremental clustering is highly complex. Microsoft Azure provides the processing power necessary to handle incremental clustering analytics. The proposed Cloud4NFICA is a scalable platform for a nearness factor-based incremental clustering algorithm. This research uses a real dataset of Irish households collected by IEMs, along with related socioeconomic data. Cloud4NFICA is incremental in nature and hence accommodates the influx of new data. Cloud4NFICA was designed as an infrastructure as a service. The study shows that the developed system performs well in terms of scalability.


2021 ◽  
Vol 15 ◽  
pp. 14-18
Author(s):  
Arun Pratap Singh Kushwah ◽  
Shailesh Jaloree ◽  
Ramjeevan Singh Thakur

Clustering is a data mining approach that helps find the underlying hidden structure in a dataset. K-means is a clustering method that uses distance functions to measure the similarity or dissimilarity between instances. DBSCAN is a clustering algorithm that discovers clusters of arbitrary shape and size in large volumes of data using a spatial density method. Both are classical methods for efficient clustering, but they underperform when the data in a database is updated frequently, so incremental (gradual) clustering approaches are preferred in this setting. This paper introduces an incremental clustering approach using K-means and DBSCAN to handle new data that is added dynamically to the database at intervals.


2019 ◽  
Vol 1 (1) ◽  
pp. 31-39
Author(s):  
Ilham Safitra Damanik ◽  
Sundari Retno Andani ◽  
Dedi Sehendro

Milk is an important part of the diet for meeting nutritional needs and is consumed by both children and adults. Indonesia has many producers of fresh milk, but their output is not sufficient to meet national demand. Data mining is a field of computer science that is widely used in research, and one of its techniques is clustering. Clustering is a method of grouping data, and it works better when a large amount of data is available. The data used are provincial data for Indonesia from 2000 to 2017, obtained from the Central Statistics Agency. The study clusters the provinces into 2 groups of milk producers: high-producing and low-producing regions. From 27 records of fresh milk production in Indonesia, two provinces fall in the high-level cluster, namely West Java and East Java. The other 25 fall in the low-level cluster, which includes 7 provinces that were not part of the K-Means Clustering Algorithm calculation.
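Splitting regions into a high and a low group can be sketched with a one-dimensional K-means where k = 2. This is a minimal illustration, not the study's procedure: the function name `two_means_1d`, the choice of the minimum and maximum as starting centers, and the production figures are all assumptions, and the numbers are made up rather than taken from the Central Statistics Agency.

```python
def two_means_1d(values, iterations=100):
    """Lloyd's K-means for k = 2 on one-dimensional data.

    Assumption: the centers start at the smallest and largest value.
    Returns (low_center, high_center, labels).
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        # Assignment step: each value goes to the nearer center.
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Update step: each center moves to the mean of its group.
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # converged: centers no longer move
        lo, hi = new_lo, new_hi
    labels = ["low" if abs(v - lo) <= abs(v - hi) else "high" for v in values]
    return lo, hi, labels


# Made-up production figures for six hypothetical regions.
production = [5, 7, 6, 300, 280, 4]
low_center, high_center, groups = two_means_1d(production)
```

With data this lopsided, the two large values form the high group and the rest form the low group, which mirrors the kind of two-cluster result the abstract reports.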


Author(s):  
Yuancheng Li ◽  
Yaqi Cui ◽  
Xiaolong Zhang

Background: Advanced Metering Infrastructure (AMI) for the smart grid is growing rapidly, which results in exponential growth of the data collected and transmitted by devices. Clustering this data can give the electricity company a better understanding of the personalized and differentiated needs of its users. Objective: The existing clustering algorithms for processing this data generally have some problems, such as insufficient data utilization, high computational complexity and low accuracy of behavior recognition. Methods: In order to improve clustering accuracy, this paper proposes a new clustering method based on the electrical behavior of the user. Starting with an analysis of user load characteristics, user electricity data samples were constructed. The daily load characteristic curve was extracted using an improved extreme learning machine clustering algorithm and effective index criteria. Moreover, clustering analysis was carried out for different users from industrial, commercial and residential areas. The improved extreme learning machine algorithm, also called Unsupervised Extreme Learning Machine (US-ELM), is an extension and improvement of the original Extreme Learning Machine (ELM) that performs the unsupervised clustering task on the basis of the original ELM. Results: Experiments were run on four different data sets using MATLAB, and the results were compared with other commonly used clustering algorithms. The experimental results show that the US-ELM algorithm has higher accuracy in processing power data. Conclusion: The unsupervised ELM algorithm can greatly reduce time consumption and improve the effectiveness of clustering.


Author(s):  
Ana Belén Ramos-Guajardo

Abstract: A new clustering method for random intervals that are measured in the same units over the same group of individuals is provided. It takes into account the similarity degree between the expected values of the random intervals that can be analyzed by means of a two-sample similarity bootstrap test. Thus, the expectations of each pair of random intervals are compared through that test and a p-value matrix is finally obtained. The suggested clustering algorithm considers such a matrix where each p-value can be seen at the same time as a kind of similarity between the random intervals. The algorithm is iterative and includes an objective stopping criterion that leads to statistically similar clusters that are different from each other. Some simulations to show the empirical performance of the proposal are developed and the approach is applied to two real-life situations.


2013 ◽  
Vol 321-324 ◽  
pp. 1939-1942
Author(s):  
Lei Gu

The locality sensitive k-means clustering method was presented recently. Although this approach can improve clustering accuracy, it often produces unstable clustering results because random samples are used as the initial centers. In this paper, an initialization method based on core clusters is used for locality sensitive k-means clustering. The core clusters are formed by constructing the σ-neighborhood graph, and their centers are taken as the initial centers of the locality sensitive k-means clustering. To investigate the effectiveness of the approach, several experiments are conducted on three datasets. Experimental results show that the proposed method improves clustering performance compared with the previous locality sensitive k-means clustering.
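A deterministic initialization of this kind can be sketched as follows. The abstract does not say how core clusters are extracted from the σ-neighborhood graph, so this example assumes they are the k largest connected components of a graph that links any two points at most σ apart; the function name `core_cluster_centers` and the sample points are hypothetical.

```python
import math


def core_cluster_centers(points, sigma, k):
    """Return k initial centers derived from a sigma-neighborhood graph.

    Assumption: a core cluster is a connected component of the graph in
    which two points are linked when their distance is at most sigma,
    and the k largest components are kept.
    """
    n = len(points)
    neighbors = [
        [j for j in range(n)
         if j != i and math.dist(points[i], points[j]) <= sigma]
        for i in range(n)
    ]

    # Depth-first search to collect connected components.
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in neighbors[i]:
                if not seen[j]:
                    seen[j] = True
                    stack.append(j)
        components.append(component)

    # Keep the k largest components and use their means as centers.
    components.sort(key=len, reverse=True)
    dim = len(points[0])
    return [
        [sum(points[i][d] for i in comp) / len(comp) for d in range(dim)]
        for comp in components[:k]
    ]


sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
centers = core_cluster_centers(sample, sigma=1.5, k=2)
```

Because no random sampling is involved, repeated runs on the same data give the same starting centers, which is the stability property the abstract is after. The isolated point at (50, 50) forms its own component and is ignored when k = 2.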

