Big Data Clustering Analysis Algorithm for Internet of Things Based on K-Means

Zhanqiu Yu

doi:10.4018/ijdst.2019010101

Big Data Clustering Analysis Algorithm for Internet of Things Based on K-Means

International Journal of Distributed Systems and Technologies ◽

10.4018/ijdst.2019010101 ◽

2019 ◽

Vol 10 (1) ◽

pp. 1-12 ◽

Cited By ~ 2

Author(s):

Zhanqiu Yu

Keyword(s):

Big Data ◽

Internet Of Things ◽

Clustering Analysis ◽

Data Clustering ◽

Clustering Algorithm ◽

Prototype System ◽

Point Selection ◽

Logistics System ◽

Relational Schema ◽

Analysis Algorithm

To explore the application of the Internet of Things (IoT) to logistics systems, an IoT big data clustering analysis algorithm based on K-means was discussed. First, drawing on complex event relation and processing technology, IoT big data processing was transformed into the extraction and analysis of complex relational schemas, which supports reducing the processing complexity of IoT big data. The traditional K-means algorithm was optimized and improved to fit the demands of large-scale RFID data networks, and K-means cluster analysis was implemented on a Hadoop cloud cluster platform. In addition, building on the traditional clustering algorithm, a center point selection technique suitable for RFID IoT data clustering was selected. The results showed that clustering efficiency improved to some extent. Finally, a prototype RFID IoT clustering analysis system was designed and implemented, which further tested the feasibility of the approach.
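The abstract does not say which center point selection technique was used. As a minimal sketch of what choosing initial K-means centers can look like, the example below uses farthest-first (maximin) seeding, a generic heuristic swapped in for illustration and not the paper's method. The function name `farthest_first_centers` and the sample points are hypothetical.

```python
from math import dist


def farthest_first_centers(points, k):
    """Pick k initial centers by farthest-first (maximin) seeding.

    Illustrative assumption only: the abstract does not describe the
    actual selection rule used for RFID IoT data.
    """
    # Start from the first point; any deterministic starting rule would do.
    centers = [points[0]]
    while len(centers) < k:
        # Choose the point whose nearest already-chosen center is farthest away.
        next_center = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(next_center)
    return centers


# Hypothetical 2-D readings forming three loose groups.
sample = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
initial_centers = farthest_first_centers(sample, 3)
```

Spreading the initial centers apart in this way tends to reduce the number of K-means iterations compared with arbitrary starting points, although it is sensitive to outliers.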

A Research Roadmap of Big Data Clustering Algorithms for Future Internet of Things

International Journal of Organizational and Collective Intelligence ◽

10.4018/ijoci.2019040102 ◽

2019 ◽

Vol 9 (2) ◽

pp. 16-30 ◽

Cited By ~ 1

Author(s):

Hind Bangui ◽

Mouzhi Ge ◽

Barbora Buhnova

Keyword(s):

Big Data ◽

Internet Of Things ◽

Mobile Networks ◽

Data Clustering ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Future Internet ◽

Research Challenges ◽

Initial Stage ◽

Big Data Technologies

Due to the massive increase in data across Internet of Things (IoT) domains such as healthcare IoT and Smart City IoT, Big Data technologies have emerged as critical analytics tools for analyzing IoT data. Among the Big Data technologies, data clustering is one of the essential approaches to processing IoT data. However, how to select a suitable clustering algorithm for IoT data is still unclear. Furthermore, since Big Data technologies are still in their initial stage in different IoT domains, it is valuable to propose and structure the research challenges between Big Data and IoT. Therefore, this article starts by reviewing and comparing the data clustering algorithms that can be applied to IoT datasets, and then extends the discussion to broader IoT contexts such as IoT dynamics and IoT mobile networks. Finally, this article identifies a set of research challenges that form a research roadmap for Big Data research in IoT domains. The proposed research roadmap aims at bridging the research gaps between Big Data and various IoT contexts.

Analysis of Fuzzy Clustering for the Adoption in Data Mining

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.2047 ◽

2014 ◽

Vol 989-994 ◽

pp. 2047-2050

Author(s):

Ying Jie Wang

Keyword(s):

Machine Learning ◽

Data Mining ◽

Mathematical Method ◽

Big Data ◽

Fuzzy Clustering ◽

Clustering Analysis ◽

Data Clustering ◽

Clustering Algorithm ◽

Clustering Methods ◽

Fuzzy Clustering Methods

Data mining is the general methodology for retrieving useful information from big data. Clustering analysis is a mathematical method of classification for unsupervised machine learning, and it can be adopted for data classification in data mining. This paper combines the clustering process with a fuzzy approach and then derives a special clustering algorithm based on the fast fuzzy c-means (FFCM) method. In summary, the paper illustrates the adoption of a series of fuzzy clustering methods in data mining. These methods improve the computational efficiency of learning, as their convergence speed is fast. The methodology of this paper is significantly meaningful for information retrieval from big data.
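The abstract gives no details of the FFCM variant, so the sketch below shows only the standard fuzzy c-means update rules that such methods build on: each point receives a degree of membership in every cluster, and each center is the membership-weighted mean of the points. The function names, the fuzzifier default `m=2.0`, and the sample data are assumptions for illustration.

```python
from math import dist


def update_memberships(points, centers, m=2.0):
    """Return rows u[j][i]: degree to which point j belongs to cluster i."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        distances = [dist(p, c) for c in centers]
        if 0.0 in distances:
            # The point coincides with a center: give it full membership there.
            hit = distances.index(0.0)
            row = [1.0 if i == hit else 0.0 for i in range(len(centers))]
        else:
            row = [1.0 / sum((d_i / d_k) ** power for d_k in distances)
                   for d_i in distances]
        memberships.append(row)
    return memberships


def update_centers(points, memberships, n_clusters, m=2.0):
    """Recompute each center as the mean of the points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for i in range(n_clusters):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, len(centers), m)
    return centers, update_memberships(points, centers, m)


# Hypothetical data: two small groups far apart.
data = [(0, 0), (0, 1), (10, 10), (10, 11)]
final_centers, final_u = fuzzy_c_means(data, [(0, 0), (10, 10)])
```

Unlike hard K-means, every row of `final_u` sums to 1 and expresses partial membership, which is what makes the method "fuzzy".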

A Wavelet Analysis-Based Big Data Spectral Clustering Algorithm for Electric Internet of Things

Journal of Physics Conference Series ◽

10.1088/1742-6596/1627/1/012007 ◽

2020 ◽

Vol 1627 ◽

pp. 012007

Author(s):

Hao Zhang ◽

Xin Liu ◽

Donglan Liu ◽

Hao Yu

Keyword(s):

Big Data ◽

Internet Of Things ◽

Wavelet Analysis ◽

Spectral Clustering ◽

Clustering Algorithm ◽

Spectral Clustering Algorithm

Clustering Analysis based Power Security Big Data Aggregation in Ubiquitous Power Internet of Things

2020 12th IEEE PES Asia-Pacific Power and Energy Engineering Conference (APPEEC) ◽

10.1109/appeec48164.2020.9220667 ◽

2020 ◽

Author(s):

Guofei Guan ◽

Xinyuan Hu ◽

Yan Xu ◽

Qiqi Luan ◽

Chunpeng Li ◽

...

Keyword(s):

Big Data ◽

Internet Of Things ◽

Data Aggregation ◽

Clustering Analysis

A Succinct Distributive Big Data Clustering Algorithm Based on Local-Remote Coordination

2015 IEEE International Conference on Systems, Man, and Cybernetics ◽

10.1109/smc.2015.322 ◽

2015 ◽

Author(s):

Chao Ma ◽

Xun Liang ◽

Yuefeng Ma

Keyword(s):

Big Data ◽

Data Clustering ◽

Clustering Algorithm

K-Anonymity Algorithm Based on CLIQUE for Green Manufacturing

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.312.714 ◽

2013 ◽

Vol 312 ◽

pp. 714-718

Author(s):

Zi Qi Zhao ◽

Xiao Jun Ye ◽

Chun Ping Li

Keyword(s):

Data Processing ◽

Processing Speed ◽

Clustering Analysis ◽

Time Complexity ◽

Clustering Algorithm ◽

Green Manufacturing ◽

Multidimensional Data ◽

Clustering Method ◽

Analysis Algorithm ◽

Clique Algorithm

Multidimensional clustering analysis algorithms are a class of cell-based clustering methods with fast processing speed and high time efficiency, of which CLIQUE is the main representative. Using the time-efficient CLIQUE clustering algorithm, a multidimensional k-anonymity algorithm, KLIQUE, can be realized. Because it is based on CLIQUE, the KLIQUE algorithm retains the time-complexity characteristics of CLIQUE and can exploit CLIQUE's advantage in processing large amounts of multidimensional data.

An Enhanced K-Means Clustering Algorithm for Pattern Discovery in Big Data Analysis of 3-Phase Electrical Quantities

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.44.26854 ◽

2018 ◽

Vol 7 (4.44) ◽

pp. 8

Author(s):

Dikpride Despa ◽

Gigih Forda Nama

Keyword(s):

Data Mining ◽

Big Data ◽

Internet Of Things ◽

Power Distribution ◽

Distribution System ◽

Clustering Algorithm ◽

Measurement Data ◽

Power Distribution System ◽

Multiple Sensors ◽

Industry Standard

The Unila Internet of Things Research Group (UIRG) developed an online monitoring system for power distribution based on Internet of Things (IoT) technology at the Department of Electrical Engineering, University of Lampung (Unila). The system has been running for several months and monitors the electrical quantities of the 3-phase main distribution panel of the H-building. The measurement system involves multiple sensors, such as current sensors and voltage sensors; the measurement data are stored in a database server and displayed in real time through a web-based application. The main objective of this research was to capture, analyze, and identify knowledge patterns in the electrical quantity measurements, using the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework, to help stakeholders continuously improve the quality of electricity services. The initial research was limited to a total of 770847 electrical quantity records saved in the database system from 1 September to 31 October 2018. The dataset consists of 21 electrical quantity attributes, such as voltage, current, power factor, energy consumption, and frequency, for the 3-phase main panel control of the H-building. RapidMiner, a leading knowledge discovery application, was used to analyze the big data, and the K-means clustering algorithm was implemented to identify data patterns. Based on a 5-cluster analysis, the results indicated that the 3-phase load was unbalanced and that Phase-0 was the most utilized phase.

Big Data Clustering Algorithm Based on Computer Cloud Platform

10.1007/978-3-030-89511-2_32 ◽

2021 ◽

pp. 254-262

Author(s):

Xiaoyun Gong

Keyword(s):

Big Data ◽

Data Clustering ◽

Clustering Algorithm ◽

Cloud Platform

Research on Clustering Analysis of Big Data

Advanced Engineering Forum ◽

10.4028/www.scientific.net/aef.6-7.82 ◽

2012 ◽

Vol 6-7 ◽

pp. 82-87 ◽

Cited By ~ 2

Author(s):

Yuan Ming Yuan ◽

Chan Le Wu

Keyword(s):

Big Data ◽

Clustering Analysis ◽

Clustering Algorithm ◽

Mapreduce Framework ◽

Traditional Technologies

The volume of Big Data is too large to be processed with traditional clustering analysis technologies: processing takes too long, and problems of computability arise. After analyzing the k-means clustering algorithm, a new algorithm was proposed. The parallelizable part of k-means was identified, and the algorithm was improved by redesigning its flow with the MapReduce framework, which solved the problems mentioned above. Experiments show that the new algorithm is feasible and effective.
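The part of k-means that parallelizes naturally is the assignment of each point to its nearest centroid, which can be treated as a map step, followed by a per-centroid averaging that acts as a reduce step. The sketch below imitates that flow in a single Python process; it is a hedged illustration of the general idea, not the paper's algorithm, and it does not use Hadoop or any real MapReduce API. All function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def map_assign(point, centroids):
    """Map step: emit (index of nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def reduce_mean(pairs):
    """Reduce step: sum the points and counts for one centroid, return the mean."""
    total = None
    count = 0
    for point, n in pairs:
        total = list(point) if total is None else [a + b for a, b in zip(total, point)]
        count += n
    return tuple(v / count for v in total)


def kmeans_mapreduce(points, centroids, iterations=10):
    """Run k-means as repeated map (assign) and reduce (average) rounds."""
    for _ in range(iterations):
        groups = defaultdict(list)
        # In a real cluster the map calls would run in parallel on data splits.
        for p in points:
            key, value = map_assign(p, centroids)
            groups[key].append(value)
        # A centroid that received no points keeps its previous position.
        centroids = [
            reduce_mean(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))
        ]
    return centroids


# Hypothetical data: two groups, with starting centroids placed on one point each.
result = kmeans_mapreduce(
    [(0, 0), (0, 2), (10, 10), (10, 12)],
    [(0.0, 0.0), (10.0, 10.0)],
)
```

Emitting `(point, 1)` pairs rather than bare points is what would allow partial sums to be combined on each worker before the final reduce, which keeps the data shuffled between nodes small.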

Big Data Clustering with Kernel k-Means: Resources, Time and Performance

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213018600060 ◽

2018 ◽

Vol 27 (04) ◽

pp. 1860006

Author(s):

Nikolaos Tsapanos ◽

Anastasios Tefas ◽

Nikolaos Nikolaidis ◽

Ioannis Pitas

Keyword(s):

Big Data ◽

Data Clustering ◽

Clustering Algorithm ◽

Learning Task ◽

Related Data ◽

Clustering Problem ◽

Processing Power ◽

Trade Offs ◽

Separable Kernel ◽

And Performance

Data clustering is an unsupervised learning task that has found many applications in various scientific fields. The goal is to find subgroups of closely related data samples (clusters) in a set of unlabeled data. A classic clustering algorithm is the so-called k-Means. It is very popular; however, it is unable to handle cases in which the clusters are not linearly separable. Kernel k-Means is a state-of-the-art clustering algorithm that employs the kernel trick to perform clustering in a higher-dimensional space, thus overcoming the limitations of classic k-Means regarding the non-linear separability of the input data. With respect to the challenges of Big Data research, a field that has established itself in the last few years and involves performing tasks on extremely large amounts of data, several adaptations of Kernel k-Means have been proposed, each of which has different requirements in processing power and running time, while also incurring different trade-offs in performance. In this paper, we present several issues and techniques involving the usage of Kernel k-Means for Big Data clustering and how the combination of each component in a clustering framework fares in terms of resources, time and performance. We use experimental results to evaluate several combinations and provide a recommendation on how to approach a Big Data clustering problem.
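The kernel trick mentioned above means the feature-space distance from a point to a cluster mean can be computed from kernel values alone, without ever forming the high-dimensional vectors. The sketch below is a basic, non-scalable Kernel k-Means using that identity with an RBF kernel; it is an illustration under stated assumptions, not one of the Big Data adaptations the paper evaluates. The function names, the `gamma` value, and the sample points are hypothetical.

```python
from math import exp


def rbf_kernel_matrix(points, gamma=1.0):
    """Full n-by-n RBF kernel matrix: K[i][j] = exp(-gamma * ||xi - xj||^2)."""
    def k(a, b):
        return exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))
    return [[k(a, b) for b in points] for a in points]


def kernel_kmeans(K, labels, n_clusters, iterations=20):
    """Reassign points using feature-space distances derived from K only."""
    n = len(K)
    labels = list(labels)
    for _ in range(iterations):
        members = [[j for j in range(n) if labels[j] == c] for c in range(n_clusters)]
        # The within-cluster term is shared by all points, so compute it once.
        within = [
            sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                # ||phi(x_i) - mean_c||^2 expressed through kernel values.
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Hypothetical data: two groups, with point 2 deliberately mislabeled at the start.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
K = rbf_kernel_matrix(pts, gamma=0.1)
final_labels = kernel_kmeans(K, [0, 0, 1, 1, 1, 1], 2)
```

Storing the full kernel matrix costs memory quadratic in the number of samples, which is the resource bottleneck that motivates the Big Data adaptations discussed in the abstract.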
