Multi-Objective Genetic Algorithm for Robust Clustering with Unknown Number of Clusters

In this paper, a multi-objective genetic algorithm for data clustering based on the robust fuzzy least trimmed squares estimator is presented. The proposed clustering methodology addresses two critical issues in unsupervised data clustering: the ability to produce meaningful partitions of noisy data, and the requirement that the number of clusters be known a priori. The multi-objective genetic algorithm-driven clustering technique optimizes the number of clusters as well as the cluster assignments and cluster prototypes. A two-parameter, mapped, fixed-point coding scheme is used to represent the assignment of data to the true retained set and the noisy trimmed set, and the optimal number of clusters in the retained set. A three-objective criterion is used as the minimization functional for the multi-objective genetic algorithm. Results on well-known data sets from the literature suggest that the proposed methodology is superior to conventional fuzzy clustering algorithms that assume a known value for the optimal number of clusters.
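
The split into a retained set and a trimmed set can be illustrated with a small sketch. This is not the paper's method: it is a crisp (non-fuzzy) simplification of a least trimmed squares criterion, with no genetic algorithm and no three-objective functional. The function name `trimmed_squares_objective`, the `retain_fraction` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def trimmed_squares_objective(points, prototypes, retain_fraction):
    """Crisp trimmed-squares cost (illustrative assumption, not the paper's
    fuzzy estimator): sum the h smallest squared distances from each point
    to its nearest prototype and report which points were kept or trimmed.
    """
    # Squared distance from each point to its closest prototype.
    nearest_sq = [min(squared_distance(p, c) for c in prototypes)
                  for p in points]
    # h is the size of the retained set; the rest is treated as noise.
    h = max(1, int(retain_fraction * len(points)))
    order = sorted(range(len(points)), key=lambda i: nearest_sq[i])
    retained, trimmed = order[:h], order[h:]
    cost = sum(nearest_sq[i] for i in retained)
    return cost, retained, trimmed


# Two tight groups plus one distant point acting as noise.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (50.0, 50.0)]
prototypes = [(0.0, 0.5), (10.0, 10.5)]
cost, retained, trimmed = trimmed_squares_objective(points, prototypes, 0.8)
print(cost, retained, trimmed)
```

With 80% of the points retained, the distant point lands in the trimmed set and does not contribute to the cost, which is the behaviour that makes a trimmed criterion less sensitive to noise than an ordinary sum of squares.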


Data clustering and imputing using a two-level multi-objective genetic algorithm (GA): A case study of maintenance cost data for tunnel fans

Cogent Engineering ◽

10.1080/23311916.2018.1513304 ◽

2018 ◽

Vol 5 (1) ◽

pp. 1513304 ◽

Cited By ~ 2

Author(s):

Yamur K. Aldouri ◽

Hassan Al-Chalabi ◽

Liangwei Zhang ◽

Qiang Shawn Cheng

Keyword(s):

Genetic Algorithm ◽

Data Clustering ◽

Cost Data ◽

Maintenance Cost ◽

Multi Objective ◽

Multi Objective Genetic Algorithm


A Dimensionality reduced Text data clustering with prediction of optimal number of clusters

International Journal of Applied Research on Information Technology and Computing ◽

10.5958/j.0975-8070.2.2.010 ◽

2011 ◽

Vol 2 (2) ◽

pp. 41 ◽

Cited By ~ 3

Author(s):

M. Ramakrishna Murty ◽

JVR Murthy ◽

Prasad Reddy ◽

Suresh Chandra Satapathy

Keyword(s):

Data Clustering ◽

Optimal Number ◽

Text Data ◽

Number Of Clusters ◽

Optimal Number Of Clusters


A multi-objective genetic algorithm with fuzzy c-means for automatic data clustering

Applied Soft Computing ◽

10.1016/j.asoc.2014.08.036 ◽

2014 ◽

Vol 24 ◽

pp. 679-691 ◽

Cited By ~ 57

Author(s):

Siripen Wikaisuksakul

Keyword(s):

Genetic Algorithm ◽

Data Clustering ◽

Automatic Data ◽

Multi Objective ◽

Fuzzy C Means ◽

Multi Objective Genetic Algorithm


Estimating the Optimal Number of Clusters in Categorical Data Clustering by Silhouette Coefficient

Communications in Computer and Information Science - Knowledge and Systems Sciences ◽

10.1007/978-981-15-1209-4_1 ◽

2019 ◽

pp. 1-17 ◽

Cited By ~ 3

Author(s):

Duy-Tai Dinh ◽

Tsutomu Fujinami ◽

Van-Nam Huynh

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Optimal Number ◽

Number Of Clusters ◽

Silhouette Coefficient ◽

Categorical Data Clustering ◽

Optimal Number Of Clusters


Data clustering with mixed features by multi objective genetic algorithm

2012 12th International Conference on Hybrid Intelligent Systems (HIS) ◽

10.1109/his.2012.6421357 ◽

2012 ◽

Cited By ~ 6

Author(s):

Dipankar Dutta ◽

Paramartha Dutta ◽

Jaya Sil

Keyword(s):

Genetic Algorithm ◽

Data Clustering ◽

Multi Objective ◽

Multi Objective Genetic Algorithm ◽

Mixed Features


A multi-objective genetic algorithm with fuzzy relational clustering for automatic data clustering

2015 2nd International Conference on Electrical Information and Communication Technologies (EICT) ◽

10.1109/eict.2015.7391928 ◽

2015 ◽

Author(s):

Animesh Kundu ◽

Animesh Kumar Paul ◽

Pintu Chandra Shill ◽

Kazuyuki Murase

Keyword(s):

Genetic Algorithm ◽

Data Clustering ◽

Automatic Data ◽

Multi Objective ◽

Relational Clustering ◽

Multi Objective Genetic Algorithm


A genetic algorithm using Calinski-Harabasz index for automatic clustering problem

Revista Brasileira de Computação Aplicada ◽

10.5335/rbca.v12i3.11117 ◽

2020 ◽

Vol 12 (3) ◽

pp. 97-106

Author(s):

Suzane Pereira Lima ◽

Marcelo Dib Cruz

Keyword(s):

Genetic Algorithm ◽

Clustering Algorithms ◽

Optimal Number ◽

Cluster Validity ◽

Number Of Clusters ◽

Correct Number ◽

Clustering Problem ◽

Automatic Clustering ◽

Optimal Number Of Clusters

Data clustering is a technique that aims to partition a dataset into clusters according to the similarity of its elements. In clustering algorithms, it is usually assumed that the number of clusters is known. Unfortunately, the optimal number of clusters is unknown for many applications. This kind of problem is called Automatic Clustering. There are several cluster validity indexes for evaluating solutions, and it is known that the quality of a result is influenced by the chosen function. Based on this, this article describes a genetic algorithm for solving the automatic clustering problem that uses the Calinski-Harabasz index as its evaluation function. Comparisons of the results with other algorithms in the literature are also presented. In a first analysis, equivalent or higher fitness values are found in at least 58% of cases for each comparison. Our algorithm also finds the correct number of clusters, or a value close to it, in 33 cases out of 48. In another comparison, some fitness values are lower, even with the correct number of clusters, but graphically the partitioning is adequate. Thus, our proposal is justified, and improvements can be studied for cases where the correct number of clusters is not found.
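
As a rough illustration of how the Calinski-Harabasz index can rank candidate partitions with different numbers of clusters, the sketch below computes the standard index, (B / (k - 1)) / (W / (n - k)), where B is the between-cluster and W the within-cluster sum of squares. It covers only the evaluation step. The genetic algorithm from the article is not reproduced, and the function name `calinski_harabasz`, the sample points, and the two candidate labelings are assumptions made for the example.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index of a labeled partition (higher is better)."""
    n, dim = len(points), len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("index is defined only for 2 <= k < n")

    def mean(group):
        return tuple(sum(p[d] for p in group) / len(group) for d in range(dim))

    def sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    # Between-cluster dispersion: centroid spread weighted by cluster size.
    between = sum(len(g) * sq(mean(g), overall) for g in clusters.values())
    # Within-cluster dispersion: spread of points around their own centroid.
    within = sum(sq(p, mean(g)) for g in clusters.values() for p in g)
    if within == 0:
        return float("inf")  # every point coincides with its centroid
    return (between / (k - 1)) / (within / (n - k))


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
candidates = {
    2: [0, 0, 0, 1, 1, 1],  # the two visually obvious groups
    3: [0, 0, 1, 2, 2, 2],  # same data with one group split further
}
scores = {k: calinski_harabasz(points, labs) for k, labs in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In an automatic clustering setting, a score of this kind would serve as the fitness of each candidate solution, so partitions with different cluster counts can be compared directly.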
