U-control Chart-Based Differential Evolution Clustering for Determining the Number of Clusters in k-Means

Author(s):  
Carlos Rondón ◽  
Ivon Romero-Pérez ◽  
Jesús García Guiliany ◽  
Ernesto Steffens Sanabria
2019 ◽  
Author(s):  
Ahmad Ilham

Determining the number of clusters for k-Means is one of the most popular problems among data mining researchers, because it is difficult to obtain information about the data a priori; as a result, the clustering can be suboptimal and can quickly become trapped in local minima. Automatic clustering methods based on evolutionary computation (EC) can solve this k-Means problem. The automatic clustering differential evolution (ACDE) method is one of the most popular EC methods because it can handle high-dimensional data and improve k-Means performance with low cluster validity values. However, the k activation threshold in ACDE still depends on user judgment, so the process of determining the number of k-Means clusters is not yet efficient. In this study, ACDE is improved using the u-control chart (UCC) method, which has been shown to be efficient for solving k-Means problems automatically. The proposed method is evaluated on benchmark data sets, including synthetic data and real data (iris, glass, wine, vowel, ruspini) from the UCI Machine Learning Repository, using the Davies-Bouldin index (DBI) and the cosine similarity measure (CS) as evaluation measures. The results indicate that the UCC method successfully improves the k-Means method, with lowest objective values of DBI and CS of 0.470 and 0.577, respectively. The method with the lowest DBI and CS values is considered the best. The proposed method performs well compared with other current methods such as genetic clustering for unknown k (GCUK), dynamic clustering PSO (DCPSO), and the automatic clustering approach based on differential evolution combined with k-Means for crisp clustering (ACDE) in almost all DBI and CS evaluations. It can be concluded that the UCC method is able to correct the weakness of the ACDE method in determining the number of k-Means clusters by automatically determining the k activation threshold.
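The abstract above ranks clusterings by the Davies-Bouldin index, where a lower value is better. As a rough illustration of that criterion only (not of the UCC or ACDE methods themselves), the following minimal sketch computes the standard DBI in pure Python. The function names `centroid` and `davies_bouldin` and the toy data are assumptions made for this example, and the sketch assumes no two cluster centroids coincide.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(points, labels):
    """Standard Davies-Bouldin index; lower values mean better separation.

    Hypothetical helper for illustration. Assumes at least two clusters
    and that no two centroids coincide (otherwise division by zero).
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    cents = {lab: centroid(ps) for lab, ps in clusters.items()}
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = {
        lab: sum(math.dist(p, cents[lab]) for p in ps) / len(ps)
        for lab, ps in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst-case similarity ratio of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Toy data (made up): two well-separated pairs of points.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # mixes the two groups

dbi_good = davies_bouldin(points, good_labels)
dbi_bad = davies_bouldin(points, bad_labels)
```

On this toy data the sensible partition yields a much smaller index than the mixed one, which is the sense in which the lowest DBI marks the preferred clustering.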


2019 ◽  
Vol 12 (4) ◽  
pp. 306-316
Author(s):  
Ahmad Ilham ◽  
Romi Wahono ◽  
Catur Supriyanto ◽  
Adi Wijaya ◽  
...  

Author(s):  
Jesús Silva ◽  
Omar Bonerge Pineda Lezama ◽  
Noel Varela ◽  
Jesús García Guiliany ◽  
Ernesto Steffens Sanabria ◽  
...  

2019 ◽  
Vol 2019 ◽  
pp. 1-16
Author(s):  
D. Pham-Toan ◽  
T. Vo-Van ◽  
A. T. Pham-Chau ◽  
T. Nguyen-Trang ◽  
D. Ho-Kieu

This paper proposes an evolutionary-computing-based automatic partitional clustering of probability density functions, the so-called binary adaptive elitist differential evolution for clustering of probability density functions (baeDE-CDFs). Herein, k-medoids-based representative probability density functions (PDFs) are preferred to k-means-based ones for their ability to limit the effect of outliers. Moreover, treating the clustering problem as an evolutionary optimization problem permits determining the number of clusters “on the run”. Notably, applying the adaptive elitist differential evolution (aeDE) algorithm with a binary chromosome representation not only decreases the computational burden remarkably, but also increases the quality of the solution significantly. Multiple numerical examples are designed and examined to verify the proposed algorithm’s performance, and the numerical results are evaluated using numerous criteria to give a comprehensive conclusion. Comparisons with other algorithms in the literature show that the proposed algorithm achieves outstanding performance in both solution quality and computational time, in a statistically significant way.


2012 ◽  
Author(s):  
Orawan Watchanupaporn ◽  
Worasait Suwannik

1990 ◽  
Vol 29 (03) ◽  
pp. 200-204 ◽  
Author(s):  
J. A. Koziol

A basic problem of cluster analysis is the determination or selection of the number of clusters evinced in any set of data. We address this issue with multinomial data using Akaike’s information criterion and demonstrate its utility in identifying an appropriate number of clusters of tumor types with similar profiles of cell surface antigens.
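To make the idea of selecting a number of clusters with Akaike’s information criterion concrete, here is a minimal sketch for multinomial count data. It is not taken from the paper: it assumes that all rows assigned to the same cluster share one probability vector, counts (categories − 1) free parameters per cluster, and drops the multinomial coefficient because it is the same for every candidate partition. The function name `multinomial_aic` and the count table are invented for illustration.

```python
import math


def multinomial_aic(count_rows, assignment):
    """AIC = 2 * (number of parameters) - 2 * log-likelihood.

    Hypothetical helper: rows with the same cluster label are assumed to
    share one multinomial probability vector, estimated from pooled counts.
    The multinomial coefficient is omitted since it does not depend on the
    partition being scored.
    """
    n_cat = len(count_rows[0])
    pooled = {}
    for row, label in zip(count_rows, assignment):
        acc = pooled.setdefault(label, [0] * n_cat)
        for j, x in enumerate(row):
            acc[j] += x

    loglik = 0.0
    for totals in pooled.values():
        n = sum(totals)
        for x in totals:
            if x > 0:  # a zero count contributes nothing to the sum
                loglik += x * math.log(x / n)

    n_params = len(pooled) * (n_cat - 1)
    return 2 * n_params - 2 * loglik


# Made-up counts: four samples, two categories each.
rows = [[90, 10], [88, 12], [10, 90], [12, 88]]

aic_one = multinomial_aic(rows, [0, 0, 0, 0])   # everything in one cluster
aic_two = multinomial_aic(rows, [0, 0, 1, 1])   # two clusters of similar rows
aic_four = multinomial_aic(rows, [0, 1, 2, 3])  # every row its own cluster

best = min(
    [("one", aic_one), ("two", aic_two), ("four", aic_four)],
    key=lambda kv: kv[1],
)[0]
```

In this toy case the two-cluster partition has the smallest AIC: merging all rows fits poorly, while giving every row its own cluster adds parameters without a matching gain in likelihood.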

