Effective Clustering Analysis Based on New Designed Clustering Validity Index and Revised K-Means Algorithm for Big Data

Author(s):  
Erzhou Zhu ◽  
Peng Wen ◽  
Binbin Zhu ◽  
Feng Liu ◽  
Futian Wang ◽  
...  
2015 ◽  
Vol 23 (3) ◽  
pp. 701-718 ◽  
Author(s):  
Chih-Hung Wu ◽  
Chen-Sen Ouyang ◽  
Li-Wen Chen ◽  
Li-Wei Lu

2016 ◽  
Vol 2016 ◽  
pp. 1-12 ◽  
Author(s):  
Min Ren ◽  
Peiyu Liu ◽  
Zhihao Wang ◽  
Jing Yi

To address the shortcoming that the fuzzy c-means algorithm (FCM) needs to know the number of clusters in advance, this paper proposed a new self-adaptive method to determine the optimal number of clusters. First, a density-based algorithm was put forward. Based on the characteristics of the dataset, the algorithm automatically determined the maximum possible number of clusters instead of using the empirical rule √n, and it obtained the optimal initial cluster centroids, mitigating the limitation of FCM that randomly selected cluster centroids lead the convergence result to a local minimum. Second, by introducing a penalty function, this paper proposed a new fuzzy clustering validity index based on fuzzy compactness and separation. The penalty ensured that, as the number of clusters approached the number of objects in the dataset, the value of the clustering validity index did not monotonically decrease toward zero, which would otherwise cause the optimal number of clusters to lose robustness and its decision function. Then, based on these studies, a self-adaptive FCM algorithm was put forward to estimate the optimal number of clusters by an iterative trial-and-error process. Finally, experiments on the UCI, KDD Cup 1999, and synthetic datasets showed that the method not only effectively determined the optimal number of clusters but also reduced the number of FCM iterations while producing a stable clustering result.
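The trial-and-error search described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the formula of their penalized index or their density-based initialization, so the sketch substitutes the classical Xie-Beni index (fuzzy compactness divided by centroid separation, lower is better) and a deterministic farthest-point seeding. The function names, the upper bound of floor(sqrt(n)) candidate clusters, and the toy data are all assumptions made for illustration.

```python
import math

def memberships(points, centers, m=2.0):
    """FCM membership update: u_ij is proportional to d_ij^(-2/(m-1)); rows sum to 1."""
    rows = []
    for p in points:
        w = [max(math.dist(p, v), 1e-12) ** (-2.0 / (m - 1.0)) for v in centers]
        total = sum(w)
        rows.append([x / total for x in w])
    return rows

def farthest_point_init(points, c):
    """Deterministic seeding; an assumed stand-in for the paper's density-based step."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(math.dist(p, v) for v in centers)))
    return centers

def fcm(points, c, m=2.0, iters=200, tol=1e-9):
    """Plain fuzzy c-means; returns (centroids, membership matrix)."""
    centers = farthest_point_init(points, c)
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for j in range(c):
            wts = [row[j] ** m for row in u]  # centroid weights are u_ij^m
            total = sum(wts)
            new_centers.append(tuple(
                sum(w * p[k] for w, p in zip(wts, points)) / total for k in range(dim)))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index (a substitute for the paper's index): lower is better."""
    compact = sum(u[i][j] ** m * math.dist(p, v) ** 2
                  for i, p in enumerate(points) for j, v in enumerate(centers))
    sep = min(math.dist(a, b) ** 2
              for i, a in enumerate(centers) for b in centers[i + 1:])
    return math.inf if sep == 0 else compact / (len(points) * sep)

def choose_c(points, m=2.0):
    """Try c = 2 .. floor(sqrt(n)) (assumed bound) and keep the lowest index value."""
    c_max = max(2, math.isqrt(len(points)))
    scores = {}
    for c in range(2, c_max + 1):
        centers, u = fcm(points, c, m)
        scores[c] = xie_beni(points, centers, u, m)
    return min(scores, key=scores.get), scores

# Toy data: three 3x3 grids of points around well-separated centers.
offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (5, 9)] for dx, dy in offsets]
best_c, scores = choose_c(points)
print(best_c, scores)
```

On data this cleanly separated, the index is lowest at the true number of groups. Real datasets are where the choice of index and of initialization, the two contributions the abstract claims, would matter.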


2019 ◽  
Vol 13 (5) ◽  
pp. 833-841 ◽  
Author(s):  
Ahmad Khan ◽  
Zia ur Rehman ◽  
Muhammad Arfan Jaffar ◽  
Javid Ullah ◽  
Ahmad Din ◽  
...  

2014 ◽  
Vol 951 ◽  
pp. 231-234
Author(s):  
Hong Bo Zhou ◽  
Jun Tao Gao

The K-means clustering algorithm clusters datasets according to a given number of clusters k. However, k cannot be determined beforehand. A new clustering validity index was designed from the standpoint of sample geometry. Based on this index, a new method for determining the optimal number of clusters in the K-means clustering algorithm was proposed.
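As a rough illustration of choosing k with a validity index, the sketch below runs K-means for each candidate k and keeps the k with the best score. The abstract does not define its geometry-based index, so the mean silhouette coefficient (higher is better) is used here as a named substitute. The farthest-point seeding, the `k_max` parameter, and the toy data are likewise assumptions for illustration only.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster goes empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)  # mean intra-cluster distance
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != l)  # nearest other cluster
        total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, k_max):
    """Score every k in 2..k_max and return the best k with all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores

# Toy data: three 3x3 grids of points around well-separated centers.
offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (5, 9)] for dx, dy in offsets]
best_k, scores = choose_k(points, k_max=5)
print(best_k, scores)
```

Any index with a clear optimum at the natural number of groups could be swapped in for `silhouette` without changing the search loop.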


2019 ◽  
Vol 79 (45-46) ◽  
pp. 33417-33430 ◽  
Author(s):  
Ye Zhao ◽  
Yanrong Guo ◽  
Rui Sun ◽  
Zhengqiong Liu ◽  
Dan Guo
