Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm, Minimum Spanning Tree, and Hierarchical Clustering in an Applied Study

2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Saeedeh Pourahmad ◽  
Atefeh Basirat ◽  
Amir Rahimi ◽  
Marziyeh Doostfatemeh

Random selection of initial centroids (centers) for clusters is a fundamental defect of the K-means clustering algorithm, as the algorithm’s performance depends on the initial centroids and may end up in a local optimum. Various hybrid methods have been introduced to resolve this defect. Because no comparative studies have examined these methods across various aspects, the present paper compared three hybrid methods with the K-means clustering algorithm, based on a genetic algorithm, a minimum spanning tree, and hierarchical clustering. Although these three hybrid methods have received more attention in previous research, few studies have compared their results. Hence, seven quantitative datasets with different characteristics in terms of sample size, number of features, and number of classes were used in the present study. Eleven external and internal evaluation indices were also considered for comparing the methods. The results indicated that the hybrid methods achieved a higher convergence rate in obtaining the final solution than the ordinary K-means method. Furthermore, the hybrid method with hierarchical clustering converged to the optimal solution in fewer iterations than the other two hybrid methods. However, the hybrid methods with minimum spanning trees and genetic algorithms were not always, or even often, more effective than the ordinary K-means method. Therefore, despite their computational complexity, these three hybrid methods did not lead to much improvement over the K-means method. A simulation study is still required to compare the methods and complete the conclusion.
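
The idea of seeding K-means from a hierarchical clustering can be illustrated with a minimal sketch. The abstract does not specify how the hybrid was built, so everything here is an assumption for illustration: the centroid-linkage merge rule, the helper names `hierarchical_seeds` and `kmeans`, and the synthetic three-blob data.

```python
import math
import random

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def hierarchical_seeds(points, k):
    """Agglomerative clustering down to k groups (centroid linkage is an
    assumption, not the paper's stated choice); returns the group means."""
    groups = [[p] for p in points]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = math.dist(mean(groups[i]), mean(groups[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]  # merge the two closest groups
        del groups[j]
    return [mean(g) for g in groups]

def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, iterations used)."""
    centroids = list(initial_centroids)
    labels = None
    for it in range(1, max_iter + 1):
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            return centroids, labels, it
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean(members)
    return centroids, labels, max_iter

# Synthetic, well-separated blobs (illustrative data, not from the paper).
rng = random.Random(0)
blob_centers = [(0, 0), (10, 10), (0, 10)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in blob_centers
    for _ in range(15)
]

seeds = hierarchical_seeds(points, 3)
final_centroids, labels, iters = kmeans(points, seeds)
print("iterations with hierarchical seeds:", iters)
```

On data this cleanly separated the seeds already sit near the final centroids, so Lloyd's loop has little left to do; on real data, as the abstract notes, the gain over ordinary K-means may be small.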

2011 ◽  
Vol 20 (01) ◽  
pp. 139-177 ◽  
Author(s):  
YAN ZHOU ◽  
OLEKSANDR GRYGORASH ◽  
THOMAS F. HAIN

We propose two Euclidean minimum spanning tree based clustering algorithms: one k-constrained and the other unconstrained. Our k-constrained clustering algorithm produces a k-partition of a set of points for any given k. The algorithm constructs a minimum spanning tree of a set of representative points and removes edges that satisfy a predefined criterion. The process is repeated until k clusters are produced. Our unconstrained clustering algorithm partitions a point set into a group of clusters by maximally reducing the overall standard deviation of the edges in the Euclidean minimum spanning tree constructed from a given point set, without prescribing the number of clusters. We present our experimental results comparing our proposed algorithms with k-means, X-means, CURE, Chameleon, and the Expectation-Maximization (EM) algorithm on both artificial data and benchmark data from the UCI repository. We also apply our algorithms to image color clustering and compare them with the standard minimum spanning tree clustering algorithm as well as CURE, Chameleon, and X-means.
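
For orientation, the following sketch shows the standard minimum spanning tree clustering that the abstract uses as a baseline, not the authors' proposed algorithms. The abstract leaves the edge-removal criterion unspecified, so this example assumes the classic rule of dropping the k-1 heaviest MST edges; the function names `prim_mst` and `mst_clusters` and the sample points are hypothetical.

```python
import math

def prim_mst(points):
    """Prim's algorithm on the complete Euclidean graph, O(n^2).
    Returns the n-1 MST edges as (weight, u, v) tuples."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge weight into the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def mst_clusters(points, k):
    """Keep the n-k lightest MST edges (i.e. drop the k-1 heaviest) and
    label each point by its connected component via union-find."""
    kept = sorted(prim_mst(points))[: len(points) - k]
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)
    return [find(i) for i in range(len(points))]

# Three visually obvious groups (illustrative data only).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (21, 0)]
labels = mst_clusters(points, 3)
print(labels)
```

Removing the heaviest edges works well when clusters are separated by gaps larger than any within-cluster spacing; it is sensitive to outliers, which is one reason more refined removal criteria are studied.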


2008 ◽  
Vol 25 (04) ◽  
pp. 575-589 ◽  
Author(s):  
ALOK SINGH ◽  
ANURAG SINGH BAGHEL

Given an undirected, connected, weighted graph, the leaf-constrained minimum spanning tree (LCMST) problem seeks a spanning tree of smallest weight among all spanning trees of the graph that contain at least l leaves. In this paper we propose two new metaheuristic approaches for the LCMST problem. One is an ant-colony optimization (ACO) algorithm, whereas the other is a tabu search based algorithm. Similar to a previously proposed genetic algorithm, these metaheuristic approaches also use the subset coding that represents a leaf-constrained spanning tree by the set of its interior vertices. Our new approaches perform well in comparison with the two best heuristics reported in the literature for the problem: the subset-coded genetic algorithm and a greedy heuristic.


2021 ◽  
Vol 87 (2) ◽  
pp. 273-298
Author(s):  
Roberto Todeschini ◽  
◽  
Cecile Valsecchi

Minimum Spanning Tree (MST) is a well-known clustering algorithm that provides a graphical tree representation of the objects in a data set by exploiting local information to link each pair of similar objects. The a-posteriori analysis of this tree in terms of nodes and edges provides the basis to derive simple classifiers, namely semi-supervised classification approaches based on the minimum spanning tree approach. In this work, we propose different metrics to evaluate the MST's ability to group objects of the same a-priori known classes. The classification capability of the proposed approach, using 13 different distance measures, was compared with that of classical supervised classification approaches such as N-Nearest Neighbour (N3), Binned Nearest Neighbour (BNN), Partial Least Squares Discriminant Analysis (PLS-DA), K-Nearest Neighbour (KNN), exponentially weighted K-Nearest Neighbour (wKNN) and Support Vector Machine with radial functions (SVMRBF) on 31 data sets. The proposed approach proved competitive with, and comparable to, the considered classical supervised classification methods. Finally, we analysed the role of the 13 different measures in terms of performance and percentage of not-assigned objects.


2016 ◽  
Vol 62 (4) ◽  
pp. 379-388 ◽  
Author(s):  
Iwona Dolińska ◽  
Mariusz Jakubowski ◽  
Antoni Masiukiewicz ◽  
Grzegorz Rządkowski ◽  
Kamil Piórczyński

Channel assignment in the 2.4 GHz band of the 802.11 standard is still an important issue, as many 2.4 GHz devices are in use. This band offers only three non-overlapping channels, so in crowded environments users can suffer from high interference levels. In this paper, a greedy algorithm inspired by Prim’s algorithm for finding minimum spanning trees (MSTs) in undirected graphs is considered for channel assignment in this type of network. The proposed solution, tested on example network distributions, achieves results close to those of the exhaustive approach and is, in many cases, several orders of magnitude faster.

