Adaptive fuzzy exponent cluster ensemble system based feature selection and spectral clustering

Author(s):  
Abdelkarim Ben Ayed ◽  
Mohamed Ben Halima ◽  
Adel M. Alimi
2020 ◽  
Vol 52 (3) ◽  
pp. 1745-1763
Author(s):  
Tong Liu ◽  
Gaven Martin

Author(s):  
Bahareh Khozaei ◽  
Mahdi Eftekhari

In this paper, two novel approaches for unsupervised feature selection are proposed based on the spectral clustering. In the first proposed method, spectral clustering is employed over the features and the center of clusters is selected as well as their nearest-neighbors. These features have a minimum similarity (redundancy) between themselves since they belong to different clusters. Next, samples of data sets are clustered employing spectral clustering so that to the samples of each cluster a specific pseudo-label is assigned. After that according to the obtained pseudo-labels, the information gain of the features is computed that secures the maximum relevancy. Finally, the intersection of the selected features in the two previous steps is determined that simultaneously guarantees both the maximum relevancy and minimum redundancy. Our second proposed approach is very similar to the first one whose only but significant difference with the first method is that it selects one feature from each cluster and sorts all the features in terms of their relevancy. Then, by appending the selected features to a sorted list and ignoring them for the next step, the algorithm continues with the remaining features until all the features to be appended into the sorted list. Both of our proposed methods are compared with state-of-the-art methods and the obtained results confirm the performance of our proposed approaches especially the second one.


2017 ◽  
Vol 2017 ◽  
pp. 1-14
Author(s):  
Wenfen Liu ◽  
Mao Ye ◽  
Jianghong Wei ◽  
Xuexian Hu

Constrained spectral clustering (CSC) method can greatly improve the clustering accuracy with the incorporation of constraint information into spectral clustering and thus has been paid academic attention widely. In this paper, we propose a fast CSC algorithm via encoding landmark-based graph construction into a new CSC model and applying random sampling to decrease the data size after spectral embedding. Compared with the original model, the new algorithm has the similar results with the increase of its model size asymptotically; compared with the most efficient CSC algorithm known, the new algorithm runs faster and has a wider range of suitable data sets. Meanwhile, a scalable semisupervised cluster ensemble algorithm is also proposed via the combination of our fast CSC algorithm and dimensionality reduction with random projection in the process of spectral ensemble clustering. We demonstrate by presenting theoretical analysis and empirical results that the new cluster ensemble algorithm has advantages in terms of efficiency and effectiveness. Furthermore, the approximate preservation of random projection in clustering accuracy proved in the stage of consensus clustering is also suitable for the weighted k-means clustering and thus gives the theoretical guarantee to this special kind of k-means clustering where each point has its corresponding weight.


2020 ◽  
Vol 52 (3) ◽  
pp. 1913-1925
Author(s):  
Qimin Luo ◽  
Guoqiu Wen ◽  
Leyuan Zhang ◽  
Mengmeng Zhan

2018 ◽  
Vol 8 (1) ◽  
pp. 218-227 ◽  
Author(s):  
Adrian-Gabriel Chifu ◽  
Florentina Hristea

Abstract Whether or not word sense disambiguation (WSD) can improve information retrieval (IR) results represents a topic that has been intensely debated over the years, with many inconclusive or contradictory conclusions. The most rarely used type of WSD for this task is the unsupervised one, although it has been proven to be beneficial at a large scale. Our study builds on existing research and tries to improve the most recent unsupervised method which is based on spectral clustering. It investigates the possible benefits of “helping” spectral clustering through feature selection when it performs sense discrimination for IR. Results obtained so far, involving large data collections, encourage us to point out the importance of feature selection even in the case of this advanced, state of the art clustering technique that is known for performing its own feature weighting. By suggesting an improvement of what we consider the most promising approach to usage of WSD in IR, and by commenting on its possible extensions, we state that WSD still holds a promise for IR and hope to stimulate continuation of this line of research, perhaps at an even more successful level.


Sign in / Sign up

Export Citation Format

Share Document