An Analysis of the Application of Simplified Silhouette to the Evaluation of k-means Clustering Validity

Author(s):  
Fei Wang ◽  
Hector-Hugo Franco-Penya ◽  
John D. Kelleher ◽  
John Pugh ◽  
Robert Ross
2010 ◽  
Vol 30 (6) ◽  
pp. 1527-1529
Author(s):  
Ge ZHANG ◽  
Ying-jie LEI ◽  
Xing-long ZHAI ◽  
Hong-jing ZHAO

2016 ◽  
Vol 16 (6) ◽  
pp. 27-42 ◽  
Author(s):  
Minghan Yang ◽  
Xuedong Gao ◽  
Ling Li

Abstract: Although the Clustering Algorithm Based on Sparse Feature Vector (CABOSFV) and its related algorithms are efficient for clustering high-dimensional sparse data, they have several shortcomings. Subjective parameter designation and sensitivity to the order in which the data are processed can increase the time complexity and degrade the quality of the clustering. This paper proposes a parameter adjustment method for Bidirectional CABOSFV. By optimizing the Parameter Vector (PV) and the Parameter Selection Vector (PSV) with clustering validity as the objective function, an improved Bidirectional CABOSFV algorithm using simulated annealing is obtained, which removes the need to determine the initial parameters. Experiments on UCI data sets show that the proposed algorithm, which can perform multi-adjustment clustering, achieves higher accuracy than single-adjustment clustering, along with a decreased time complexity through iterations.
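The idea of tuning clustering parameters by simulated annealing against a validity objective can be illustrated with a minimal, generic sketch. It is not the Bidirectional CABOSFV algorithm described in the abstract: the function names, the cooling schedule, and the toy objective (a stand-in for a clustering validity score) are all assumptions made for illustration.

```python
import math
import random


def simulated_annealing(objective, initial, neighbor, steps=500,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Maximize objective(params) by simulated annealing (generic sketch)."""
    rng = random.Random(seed)
    current, current_score = initial, objective(initial)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = neighbor(current, rng)
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


# Hypothetical stand-in for a clustering validity score that peaks
# when the tuned threshold equals 0.3.
def toy_validity(threshold):
    return -(threshold - 0.3) ** 2


# Perturb the threshold slightly and clamp it to [0, 1].
def perturb(threshold, rng):
    return min(1.0, max(0.0, threshold + rng.gauss(0.0, 0.1)))


best_threshold, best_validity = simulated_annealing(toy_validity, 0.9, perturb)
```

In a real setting, `toy_validity` would be replaced by a function that runs the clustering with the candidate parameters and returns the chosen validity index, so the search needs no hand-picked initial parameter beyond a starting point.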


2015 ◽  
Vol 47 (2) ◽  
pp. 329-354 ◽  
Author(s):  
Pablo A. Jaskowiak ◽  
Davoud Moulavi ◽  
Antonio C. S. Furtado ◽  
Ricardo J. G. B. Campello ◽  
Arthur Zimek ◽  
...  

Author(s):  
Nicola Fanizzi ◽  
Claudia d’Amato ◽  
Floriana Esposito

We present a method based on clustering techniques to detect possible novel concepts or concept drift in a Description Logics knowledge base. The method exploits a semi-distance measure defined for individuals, based on a finite number of dimensions corresponding to a committee of discriminating features (concept descriptions). A maximally discriminating group of features is obtained with a randomized optimization method. In the algorithm, candidate clusterings are represented as variable-length sets of medoids (w.r.t. the given metric). The number of clusters is not required as a parameter: the method finds an optimal choice by means of evolutionary operators and a suitable fitness function. Experiments demonstrate the feasibility of the method and its effectiveness in terms of clustering validity indices. With a supervised learning phase, each cluster can be assigned a refined or newly constructed intensional definition expressed in the adopted language.
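The representation described above, a clustering encoded as a variable-length set of medoids and scored by a fitness function, can be sketched as follows. The abstract does not specify the fitness function, so the one used here (within-cluster distance plus a penalty per medoid) is purely an assumption, as are the function names and the toy one-dimensional data.

```python
def assign_to_medoids(items, medoids, distance):
    """Group each item under its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for item in items:
        nearest = min(medoids, key=lambda m: distance(item, m))
        clusters[nearest].append(item)
    return clusters


def fitness(items, medoids, distance, penalty=1.0):
    """Hypothetical fitness: higher is better.

    Rewards compact clusters and penalizes each extra medoid, so that
    the number of clusters can be chosen by the search itself.
    """
    clusters = assign_to_medoids(items, medoids, distance)
    within = sum(distance(x, m)
                 for m, members in clusters.items()
                 for x in members)
    return 1.0 / (1.0 + within + penalty * len(medoids))


# Toy data: two well-separated groups on a line, with absolute
# difference standing in for the semi-distance between individuals.
items = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]


def abs_distance(a, b):
    return abs(a - b)


one_medoid = fitness(items, (1.0,), abs_distance)
two_medoids = fitness(items, (1.0, 5.0), abs_distance)
all_medoids = fitness(items, tuple(items), abs_distance)
```

With this assumed fitness, the two-medoid candidate scores higher than both the single-medoid and the one-medoid-per-item candidates. An evolutionary search would mutate candidates by adding, dropping, or swapping medoids and keep those with higher fitness.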

