Big Data Clustering Analysis Algorithm for Internet of Things Based on K-Means

Author(s):  
Zhanqiu Yu

To explore the application of Internet of Things (IoT) logistics systems, an IoT big data clustering analysis algorithm based on K-means is discussed. First, drawing on complex event relations and processing technology, big data processing in the IoT is recast as the extraction and analysis of complex relational schemas, which helps reduce the processing complexity of IoT big data. The traditional K-means algorithm is then optimized and improved to meet the demands of large-scale RFID data networks, and K-means cluster analysis is implemented on a Hadoop cloud cluster platform. In addition, building on the traditional clustering algorithm, a center point selection technique suitable for clustering RFID IoT data is adopted. The results show that clustering efficiency is improved to some extent. Finally, a prototype RFID IoT clustering analysis system is designed and implemented, which further tests the feasibility of the approach.
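
The abstract does not say which center point selection technique was used, so the following is only a minimal sketch of the general idea: standard K-means (Lloyd's algorithm) with farthest-first seeding swapped in as an illustrative way to choose initial centers. The function names `farthest_first_centers` and `kmeans` are hypothetical, and no Hadoop or RFID specifics are modeled.

```python
import math


def farthest_first_centers(points, k):
    """Illustrative seeding (an assumption, not the paper's method):
    start from the first point, then repeatedly add the point that is
    farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        next_center = max(
            points, key=lambda p: min(math.dist(p, c) for c in centers)
        )
        centers.append(next_center)
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iteration: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every point.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centers, labels
```

Spreading the initial centers apart tends to reduce the number of iterations compared with purely random seeding, which is one plausible route to the efficiency gain the abstract reports.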

Author(s):  
Hind Bangui ◽  
Mouzhi Ge ◽  
Barbora Buhnova

Due to the massive growth of data in Internet of Things (IoT) domains such as healthcare IoT and Smart City IoT, Big Data technologies have emerged as critical tools for analyzing IoT data. Among these technologies, data clustering is one of the essential approaches to processing IoT data. However, how to select a suitable clustering algorithm for IoT data is still unclear. Furthermore, since Big Data technologies are still at an early stage in many IoT domains, it is valuable to identify and structure the research challenges between Big Data and IoT. This article therefore starts by reviewing and comparing the data clustering algorithms that can be applied to IoT datasets, and then extends the discussion to broader IoT contexts such as IoT dynamics and IoT mobile networks. Finally, the article identifies a set of research challenges that form a research roadmap for Big Data research in IoT domains. The proposed roadmap aims to bridge the research gaps between Big Data and various IoT contexts.


2014 ◽  
Vol 989-994 ◽  
pp. 2047-2050
Author(s):  
Ying Jie Wang

Data mining is the general methodology for retrieving useful information from big data. Clustering analysis is a mathematical classification method used in unsupervised machine learning, and it can be adopted for data classification in data mining. This paper applies a fuzzy approach to the clustering process and derives a specialized clustering algorithm based on the fast fuzzy c-means (FFCM) method. In summary, the paper illustrates the adoption of a series of fuzzy clustering methods in data mining. These methods improve computational efficiency because they converge quickly. The methodology presented in this paper is significant for information retrieval from big data.
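
The abstract does not describe the FFCM variant itself, so the sketch below shows only the standard fuzzy c-means iteration that such methods build on. Each point receives a degree of membership in every cluster, and centers are membership-weighted means. The function name `fuzzy_c_means`, the explicit `init_centers` argument, and the default fuzzifier `m=2.0` are assumptions made for illustration.

```python
import math


def fuzzy_c_means(points, init_centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy c-means (not the paper's FFCM variant).

    Returns (centers, memberships), where memberships[j][i] is the degree
    to which point j belongs to cluster i. The memberships are those
    computed just before the final center update.
    """
    centers = [tuple(c) for c in init_centers]
    dims = len(points[0])
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_ji = 1 / sum_k (d_ji / d_jk) ** (2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point sits exactly on a center: give it full
                # membership there (shared if several centers coincide).
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [
                    1.0 / sum((d_i / d_k) ** power for d_k in dists)
                    for d_i in dists
                ]
            memberships.append(row)
        # Center update: mean of the points weighted by u ** m.
        new_centers = []
        for i, old in enumerate(centers):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # no support: keep the old center
                continue
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

Each membership row sums to 1, so a point near a cluster boundary can be split between clusters instead of being forced into one, which is the main difference from hard K-means.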


2013 ◽  
Vol 312 ◽  
pp. 714-718
Author(s):  
Zi Qi Zhao ◽  
Xiao Jun Ye ◽  
Chun Ping Li

Multidimensional clustering analysis algorithms are a class of cell-based clustering methods characterized by fast processing and good time efficiency, with CLIQUE as the main representative. Building on the time-efficient CLIQUE clustering algorithm, a multidimensional k-anonymity algorithm, KLIQUE, can be realized. Because KLIQUE is based on CLIQUE, it retains CLIQUE's time-complexity characteristics and can exploit CLIQUE's advantage in processing large volumes of multidimensional data.


2018 ◽  
Vol 7 (4.44) ◽  
pp. 8
Author(s):  
Dikpride Despa ◽  
Gigih Forda Nama

The Unila Internet of Things Research Group (UIRG) developed an online monitoring system for power distribution, based on Internet of Things (IoT) technology, at the Department of Electrical Engineering, University of Lampung (Unila). The system has been running for several months and monitors the electrical quantities of the 3-phase main distribution panel of H-building. The measurement system involves multiple sensors, such as current sensors and voltage sensors; the measurement data are stored in a database server and displayed in real time through a web-based application. The main objective of this research was to capture, analyze, and identify knowledge patterns in the electrical quantity measurements, using the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework, to help stakeholders continuously improve the quality of electricity services. The initial research was limited to a total of 770847 electrical quantity records saved in the database system from 1 September to 31 October 2018. The dataset consists of 21 electrical quantity attributes, such as voltage, current, power factor, energy consumption, and frequency, measured at the H-building 3-phase main control panel. RapidMiner, a leading knowledge discovery application, was used to analyze the big data, and the K-means clustering algorithm was applied to identify patterns in the data. Based on an analysis with a total of 5 clusters, the results indicated that the 3-phase load was unbalanced and that Phase-0 was the most utilized phase.


2012 ◽  
Vol 6-7 ◽  
pp. 82-87 ◽  
Author(s):  
Yuan Ming Yuan ◽  
Chan Le Wu

The volume of Big Data is too large to be processed with traditional clustering analysis technologies: processing takes too long, and computability problems arise. After analyzing the k-means clustering algorithm, a new algorithm is proposed. The parallelizable part of k-means is identified, and the algorithm is improved by redesigning its flow on the MapReduce framework, which solves the problems mentioned above. Experiments show that the new algorithm is feasible and effective.
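
The abstract gives no detail on the redesigned flow, so the following is only a sketch of the common way one k-means iteration is expressed in MapReduce terms, simulated in a single process with no Hadoop involved. The map step assigns each point to its nearest center independently, which is the parallelizable part, and the reduce step averages the points per center. The names `map_phase`, `reduce_phase`, and `kmeans_mapreduce`, and the list-of-partitions input, are hypothetical.

```python
import math


def map_phase(points, centers):
    """Map: for each point emit (index of nearest center, (point, 1)).
    Each point is handled independently, so input splits can be
    processed in parallel."""
    for p in points:
        idx = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        yield idx, (p, 1)


def reduce_phase(pairs, centers):
    """Reduce: sum the points and counts per center index, then average.
    Centers that received no points keep their previous position."""
    sums, counts = {}, {}
    for idx, (p, n) in pairs:
        if idx in sums:
            sums[idx] = [s + x for s, x in zip(sums[idx], p)]
            counts[idx] += n
        else:
            sums[idx] = list(p)
            counts[idx] = n
    new_centers = list(centers)
    for idx, total in sums.items():
        new_centers[idx] = tuple(s / counts[idx] for s in total)
    return new_centers


def kmeans_mapreduce(partitions, centers, max_iter=20):
    """Driver loop: one simulated map/reduce round per k-means iteration.
    `partitions` stands in for the input splits of a distributed job."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        pairs = [
            pair for part in partitions for pair in map_phase(part, centers)
        ]
        new_centers = reduce_phase(pairs, centers)
        if new_centers == centers:
            break  # centers stopped moving
        centers = new_centers
    return centers
```

In a real job, a combiner would typically pre-sum the `(point, 1)` pairs on each node so that only one partial sum per center crosses the network in each round.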


2018 ◽  
Vol 27 (04) ◽  
pp. 1860006
Author(s):  
Nikolaos Tsapanos ◽  
Anastasios Tefas ◽  
Nikolaos Nikolaidis ◽  
Ioannis Pitas

Data clustering is an unsupervised learning task that has found many applications in various scientific fields. The goal is to find subgroups of closely related data samples (clusters) in a set of unlabeled data. A classic clustering algorithm is k-Means. It is very popular; however, it cannot handle cases in which the clusters are not linearly separable. Kernel k-Means is a state-of-the-art clustering algorithm that employs the kernel trick to perform clustering in a higher-dimensional space, thus overcoming the limitations of classic k-Means regarding the non-linear separability of the input data. With respect to the challenges of Big Data research, a field that has established itself in the last few years and involves performing tasks on extremely large amounts of data, several adaptations of Kernel k-Means have been proposed, each with different requirements in processing power and running time and different trade-offs in performance. In this paper, we present several issues and techniques involving the use of Kernel k-Means for Big Data clustering, and we examine how each combination of components in a clustering framework fares in terms of resources, time, and performance. We use experimental results to evaluate several combinations and provide a recommendation on how to approach a Big Data clustering problem.
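
As a rough illustration of the kernel trick mentioned in this abstract, the sketch below runs basic Kernel k-Means on a precomputed kernel matrix K. The squared feature-space distance from point i to the mean of cluster C is K[i][i] - (2/|C|) * sum_j K[i][j] + (1/|C|^2) * sum_{j,l} K[j][l], with j and l ranging over C, so no explicit feature map is needed. This is the textbook formulation, not any of the Big Data adaptations the paper evaluates; the names `rbf_kernel_matrix` and `kernel_kmeans`, the RBF kernel choice, and the explicit `init_labels` argument are assumptions.

```python
import math


def rbf_kernel_matrix(points, gamma=1.0):
    """Full n x n RBF kernel matrix: K[a][b] = exp(-gamma * ||a - b||^2)."""
    return [
        [math.exp(-gamma * math.dist(a, b) ** 2) for b in points]
        for a in points
    ]


def kernel_kmeans(K, k, init_labels, max_iter=50):
    """Basic Kernel k-Means on a precomputed kernel matrix K."""
    n = len(K)
    labels = list(init_labels)
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Third term of the distance: average kernel value inside each cluster.
        within = [
            sum(K[a][b] for a in members for b in members) / len(members) ** 2
            if members else 0.0
            for members in clusters
        ]
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], math.inf
            for c, members in enumerate(clusters):
                if not members:
                    continue  # an empty cluster cannot attract points
                cross = sum(K[i][j] for j in members) / len(members)
                dist = K[i][i] - 2.0 * cross + within[c]
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels
```

The full kernel matrix needs memory and time quadratic in the number of samples, which is the kind of cost that motivates the Big Data adaptations discussed in the abstract.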

