A new Kernelized Fuzzy Possibilistic C-Means for high dimensional data clustering based on kernel-induced distance measure

Author(s):  
B. Shanmugapriya ◽  
M. Punithavalli
Information ◽  
2019 ◽  
Vol 10 (6) ◽  
pp. 208 ◽  
Author(s):  
Wenting Wang ◽  
Yulin He ◽  
Liheng Ma ◽  
Joshua Zhexue Huang

In this paper, we propose a latent feature group learning (LFGL) algorithm to discover the feature grouping structures and subspace clusters in high-dimensional data. The feature grouping structures, which are learned in an analytical way, can enhance the accuracy and efficiency of high-dimensional data clustering. In the LFGL algorithm, a Darwinian evolutionary process is used to explore the optimal feature grouping structures, which are coded as chromosomes in a genetic algorithm. The feature grouping weighting k-means algorithm serves as the fitness function that evaluates the chromosomes, i.e., the feature grouping structures, in each generation of evolution. To better handle the diverse densities of clusters in high-dimensional data, the original feature grouping weighting k-means is revised to use a mass-based dissimilarity measure rather than the Euclidean distance, and the feature weights are optimized as a nonnegative matrix factorization problem under an orthogonality constraint on the feature weight matrix. The genetic operations of mutation and crossover generate the new chromosomes for the next generation. In comparison with well-known clustering algorithms, the LFGL algorithm produced encouraging experimental results on real-world datasets, which demonstrated its better performance when clustering high-dimensional data.
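
The evolutionary loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the chromosome encoding (one group index per feature), the truncation selection with elitism, and all names (`evolve`, `toy_fitness`, and so on) are assumptions. In particular, `toy_fitness` is only a stand-in for the feature grouping weighting k-means fitness that the paper uses.

```python
import random

def crossover(a, b, rng):
    """One-point crossover of two feature-to-group assignments."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(chrom, n_groups, rate, rng):
    """Reassign each feature to a random group with probability `rate`."""
    return [rng.randrange(n_groups) if rng.random() < rate else g for g in chrom]

def evolve(fitness, n_features, n_groups, pop_size=20, generations=30,
           rate=0.1, seed=0):
    """Evolve feature groupings; returns (best chromosome, best-fitness history)."""
    rng = random.Random(seed)
    pop = [[rng.randrange(n_groups) for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[:pop_size // 2]      # truncation selection (assumption)
        children = [pop[0]]                # elitism: the best survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), n_groups, rate, rng))
        pop = children
    return max(pop, key=fitness), history

def toy_fitness(chrom, target=(0, 0, 0, 1, 1, 1)):
    """Stand-in fitness: count feature pairs grouped consistently with `target`.

    Hypothetical; the paper evaluates chromosomes with a k-means variant instead.
    """
    n = len(target)
    return sum((chrom[i] == chrom[j]) == (target[i] == target[j])
               for i in range(n) for j in range(i + 1, n))

best, history = evolve(toy_fitness, n_features=6, n_groups=2)
print(best, history[-1])
```

Because the best chromosome is copied unchanged into each new generation, the recorded best fitness never decreases. Replacing `toy_fitness` with a clustering-quality score would bring the sketch closer to the scheme described above.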


Author(s):  
Parul Agarwal ◽  
Shikha Mehta

Subspace clustering approaches cluster high-dimensional data in different subspaces, that is, they group the data using different relevant subsets of dimensions. This technique has become very effective because distance measures become ineffective in a high-dimensional space. This chapter presents SUBSPACE_DE, a novel evolutionary approach to bottom-up subspace clustering that is scalable to high-dimensional data. SUBSPACE_DE uses a self-adaptive DBSCAN algorithm to cluster the data instances of each attribute and of the maximal subspaces. The self-adaptive DBSCAN algorithm accepts its input from a differential evolution algorithm. The proposed SUBSPACE_DE algorithm is tested on 14 datasets, both real and synthetic, and compared with 11 existing subspace clustering algorithms using evaluation metrics such as F1_Measure and accuracy. In the performance analysis, the proposed algorithm ranks considerably better on success rate ratio ranking for both accuracy and F1_Measure. SUBSPACE_DE also shows potential scalability on high-dimensional datasets.
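
The remark that distance measures lose their usefulness in high-dimensional spaces can be illustrated with a small experiment. The sketch below is not part of SUBSPACE_DE; it assumes uniformly random points in the unit hypercube and a hypothetical helper `relative_contrast`, which measures how much the farthest point differs from the nearest one relative to the nearest distance.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """Return (max - min) / min of Euclidean distances from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.sqrt(sum((q - p) ** 2 for q, p in zip(query, point))))
    return (max(dists) - min(dists)) / min(dists)

# Compare a low-dimensional and a high-dimensional setting.
for dim in (2, 20, 500):
    print(dim, round(relative_contrast(dim), 3))
```

As the dimension grows, the nearest and farthest points end up at nearly the same distance from the query, so the contrast shrinks. This is one motivation for clustering within subsets of dimensions rather than in the full space.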


Author(s):  
Dharmveer Singh Rajput ◽  
Pramod Kumar Singh ◽  
Mahua Bhattacharya

Algorithms ◽  
2020 ◽  
Vol 13 (5) ◽  
pp. 109 ◽  
Author(s):  
Marian B. Gorzałczany ◽  
Filip Rudziński

In this paper, we briefly present several modifications and generalizations of the concept of self-organizing neural networks, usually referred to as self-organizing maps (SOMs), to illustrate their advantages in applications that range from high-dimensional data visualization to complex data clustering. Starting from conventional SOMs, we discuss Growing SOMs (GSOMs), Growing Grid Networks (GGNs), the Incremental Grid Growing (IGG) approach, and the Growing Neural Gas (GNG) method, as well as our two original solutions, i.e., Generalized SOMs with 1-Dimensional Neighborhood (GeSOMs with 1DN, also referred to as Dynamic SOMs (DSOMs)) and Generalized SOMs with Tree-Like Structures (GeSOMs with T-LSs). They are characterized in terms of (i) the modification mechanisms used, (ii) the range of network modifications introduced, (iii) the structure regularity, and (iv) the data-visualization/data-clustering effectiveness. The performance of particular solutions is illustrated and compared by means of selected data sets. We also show that the proposed original solutions, i.e., GeSOMs with 1DN (DSOMs) and GeSOMs with T-LSs, outperform alternative approaches in various complex clustering tasks by providing up to a 20% increase in clustering accuracy. The contribution of this work is threefold. First, algorithm-oriented original computer implementations of particular SOM generalizations are developed. Second, their detailed simulation results are presented and discussed. Third, the advantages of our earlier-mentioned original solutions are demonstrated.
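
For orientation, the conventional SOM that these generalizations start from can be sketched as below. This is a minimal illustration with a fixed 1-dimensional chain of units, not the authors' GeSOM implementation; the linear decay schedules, the Gaussian neighborhood, and all names and default values are assumptions. The growing and tree-like mechanisms discussed in the paper are not modeled.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))

def train_som(data, n_units=5, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a conventional SOM whose units form a fixed 1-D chain."""
    rng = random.Random(seed)
    # Initialize each unit at a randomly chosen data sample.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac                   # linearly decaying learning rate
        sigma = max(sigma0 * frac, 0.1)   # shrinking neighborhood radius
        for x in rng.sample(data, len(data)):
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood measured along the chain of units.
                h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Two small, well-separated groups of 2-D points (made-up demo data).
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
weights = train_som(data)
print([best_matching_unit(weights, x) for x in data])
```

Each update moves a unit part of the way toward the presented sample, so the units stay within the range of the data. The fixed number of units and the fixed chain topology are the constraints that growing variants relax.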

