A Bi-directional Fuzzy C-Means Clustering Ensemble Algorithm Considering Local Information

Author(s):  
Chunhua Ren ◽  
Linfu Sun

Abstract: The classic Fuzzy C-means (FCM) algorithm has limited clustering performance and is prone to misclassification of border points. This study offers a bi-directional FCM clustering ensemble approach that takes local information into account (LI_BIFCM) to overcome these challenges and increase clustering quality. First, various membership matrices are created by running FCM multiple times with randomized initial cluster centers, and a vertical ensemble is performed using the maximum membership principle. Second, after each execution of FCM, multiple local membership matrices of the sample points are created using multiple K-nearest neighbors, and a horizontal ensemble is performed. Multiple horizontal ensembles can be created from multiple FCM clustering runs. Finally, the final clustering results are obtained by combining the vertical and horizontal clustering ensembles. Twelve data sets were chosen for testing from both synthetic and real data sources. In the experiments, LI_BIFCM outperformed four traditional clustering algorithms and three clustering ensemble algorithms. Furthermore, the final clustering results have a weak correlation with the bi-directional cluster ensemble parameters, indicating that the suggested technique is robust.
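The maximum membership principle mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify how the membership matrices from different runs are combined, so the function below is one plausible reading rather than the LI_BIFCM procedure: it assumes the cluster columns are already aligned across runs and keeps the element-wise maximum membership before assigning labels. The names `max_membership_labels` and `vertical_ensemble` are hypothetical.

```python
import numpy as np

def max_membership_labels(U):
    """Maximum membership principle: each sample goes to its highest-membership cluster."""
    return U.argmax(axis=1)

def vertical_ensemble(membership_matrices):
    """Combine several (n_samples, n_clusters) membership matrices into hard labels.

    ASSUMPTION (not from the abstract): cluster columns are already aligned
    across runs, and the ensemble keeps the largest membership seen per
    sample and cluster.
    """
    stacked = np.stack(membership_matrices)  # shape: (n_runs, n_samples, n_clusters)
    combined = stacked.max(axis=0)           # element-wise maximum over runs
    return max_membership_labels(combined)

# Two toy membership matrices, standing in for two FCM runs.
U1 = np.array([[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]])
U2 = np.array([[0.8, 0.2], [0.30, 0.70], [0.1, 0.9]])
labels = vertical_ensemble([U1, U2])
```

In practice the columns of independent FCM runs are not aligned automatically, so a label-matching step would be needed before any such combination.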

2012 ◽  
Vol 2012 ◽  
pp. 1-6 ◽  
Author(s):  
Seyed Mohsen Zabihi ◽  
Mohammad-R Akbarzadeh-T

Clustering involves grouping data points together according to some measure of similarity. Clustering is one of the most significant unsupervised learning problems and does not need any labeled data. There are many clustering algorithms, among which fuzzy c-means (FCM) is one of the most popular approaches. FCM has an objective function based on Euclidean distance. Some improved versions of FCM with rather different objective functions have been proposed in recent years. Generalized Improved Fuzzy Partitions FCM (GIFP-FCM) is one of them; it uses a norm distance measure and competitive learning and outperforms the previous algorithms in this field. In this paper, we present a novel FCM clustering method with improved fuzzy partitions that utilizes shadowed sets and tries to improve GIFP-FCM on noisy data sets. It enhances the efficiency of GIFP-FCM and improves the clustering results by correctly eliminating most outliers during the clustering steps. We name the novel fuzzy clustering method shadowed set-based GIFP-FCM (SGIFP-FCM). Several experiments on vessel segmentation in retinal images of the DRIVE database illustrate the efficiency of the proposed method.
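For reference, the baseline FCM with a Euclidean-distance objective can be sketched as below. This is the standard textbook algorithm, not GIFP-FCM or the SGIFP-FCM variant proposed in the paper. The function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, and the toy data are assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM with squared Euclidean distance; returns (centers, U)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row sums to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # u_ij is proportional to d_ij^(-2/(m-1)), normalised per sample.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy data: two well-separated groups of five points each.
X = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)
```

Each row of `U` holds one sample's degrees of membership in all clusters, which is what distinguishes FCM from crisp methods such as k-means.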


2021 ◽  
Vol 9 (1) ◽  
pp. 1250-1264
Author(s):  
P Gopala Krishna, D Lalitha Bhaskari

In data analysis, items are mostly described by a set of characteristics called features, in which each feature contains only a single value for each object. In practice, however, some features may include more than one value, such as a person with different job descriptions, activities, phone numbers, skills, and mailing addresses. Such features may be called multi-valued features, and they are mostly treated as null features when the data are analyzed with machine learning and data mining techniques. In this paper, we propose a proximity function between two objects with multi-valued features, which is applied to clustering. The suggested distance approach allows iterative measurement of the similarities between objects as well as between their characteristics. To select the most suitable multi-valued factors, we put forward a model aimed at determining each factor's relative importance for diverse data mining problems. The proposed algorithm is a partition clustering strategy that uses fuzzy c-means clustering for evolutions, with a novel membership function based on the proposed similarity measure. The proposed clustering algorithm is named fuzzy c-means based Clustering of Multivalued Attribute Data (FCM-MVA). It therefore becomes feasible to use any cluster analysis mechanism to group similar data. The findings demonstrate that our approach not only improves on the performance of the traditional similarity measure but also outperforms other clustering algorithms on the multi-valued clustering framework.


2014 ◽  
Vol 998-999 ◽  
pp. 873-877
Author(s):  
Zhen Bo Wang ◽  
Bao Zhi Qiu

To reduce the impact of irrelevant attributes on clustering results and to increase the importance of relevant attributes to clustering, this paper proposes a fuzzy C-means clustering algorithm based on the coefficient of variation (CV-FCM). In the algorithm, the coefficient of variation is used to assign a different weight to each attribute in the data set, and the magnitude of the weight expresses the importance of that attribute to the clusters. In addition, because the fuzzy C-means clustering algorithm is sensitive to the initial cluster center values, a method for selecting the initial cluster centers based on maximum distance is introduced on the basis of the weighted coefficient of variation. Experimental results on real data sets show that this algorithm can select cluster centers effectively, with clustering results superior to those of general fuzzy C-means clustering algorithms.
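The attribute-weighting idea can be sketched as follows. The abstract does not give the exact weighting formula, so this example assumes the coefficient of variation of each attribute (standard deviation divided by the absolute mean) is normalised so that the weights sum to 1, and that the weights enter a weighted squared Euclidean distance. The names `cv_weights` and `weighted_sq_distance` are hypothetical.

```python
import numpy as np

def cv_weights(X):
    """Per-attribute weights from the coefficient of variation (std / |mean|).

    ASSUMPTION: weights are normalised to sum to 1; the paper may use a
    different normalisation. Requires at least one non-constant attribute.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    cv = std / np.maximum(np.abs(mean), 1e-12)  # guard against zero means
    return cv / cv.sum()

def weighted_sq_distance(x, center, w):
    """Squared Euclidean distance with one weight per attribute."""
    return float(np.sum(w * (x - center) ** 2))

# Toy data: the first attribute varies, the second is constant.
X = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
w = cv_weights(X)
d = weighted_sq_distance(X[0], X[2], w)
```

In this toy data the constant attribute receives zero weight, so it no longer contributes to the distance used for clustering.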


Author(s):  
Suneetha Chittinen ◽  
Dr. Raveendra Babu Bhogapathi

In this paper, a fuzzy c-means algorithm that uses a neural network algorithm is presented. In pattern recognition, fuzzy clustering algorithms have demonstrated an advantage over crisp clustering algorithms in grouping high-dimensional data into clusters. The proposed work involves two steps. First, a recently developed Enhanced K-means Fast Learning Artificial Neural Network (KFLANN) framework is used to determine cluster centers. Second, fuzzy c-means uses these cluster centers to generate fuzzy membership functions. Enhanced KFLANN is an algorithm that produces consistent classification of the vectors into the same clusters regardless of the data presentation sequence. Experiments are conducted on two artificial data sets, Iris and New Thyroid. The results show that Enhanced KFLANN is faster at generating consistent cluster centers and utilizes these for the elicitation of efficient fuzzy memberships.


Author(s):  
Bruno Almeida Pimentel ◽  
Renata M. C. R. de Souza

Fuzzy c-Means (FCM) and Possibilistic c-Means (PCM) are the most popular algorithms of the fuzzy and possibilistic clustering approaches, respectively. A hybridization of these methods, called Possibilistic Fuzzy c-Means (PFCM), solves the noise sensitivity defect of FCM and overcomes the coincident clusters problem of PCM. Although PFCM has shown good performance in cluster detection, it does not consider that different variables can produce different membership and possibility degrees, which can improve the clustering quality, as has been done with the Multivariate Fuzzy c-Means (MFCM). This work presents a generalized multivariate approach for possibilistic fuzzy c-means clustering. This approach gives a general form for the clustering criterion of possibilistic fuzzy clustering, with membership and possibility degrees that differ by cluster and variable and a weighted squared Euclidean distance that takes the shape of clusters into account. Six multivariate clustering models (special cases) can be derived from this general form, and their properties are presented. Experiments with real and synthetic data sets validate the usefulness of the approach introduced in this paper using the special cases.


2013 ◽  
Vol 765-767 ◽  
pp. 670-673
Author(s):  
Li Bo Hou

The fuzzy C-means (FCM) clustering algorithm is one of the most widely applied algorithms in unsupervised pattern recognition. However, the FCM algorithm requires a lot of computation in its iterative process, especially when the feature vectors are high-dimensional; using the clustering algorithm to partition such data is not only inefficient but may also lead to the "curse of dimensionality." To address this problem, this paper analyzes the fuzzy C-means clustering algorithm on high-dimensional features, where the cluster center problem is NP-hard. To improve the effectiveness and real-time performance of the fuzzy C-means clustering algorithm in high-dimensional feature analysis, it is combined with the landmark isometric mapping (L-ISOMAP) algorithm, and an improved algorithm, FCM-LI, is proposed. The samples are first analyzed preliminarily; then, using the clustering results and the correlation of the sample data, the L-ISOMAP algorithm is used to reduce the dimension, and further analysis on this basis yields the final results. Finally, experimental results show the effectiveness and real-time performance of the FCM-LI algorithm in high-dimensional feature analysis.


2021 ◽  
pp. 1-18
Author(s):  
Angeliki Koutsimpela ◽  
Konstantinos D. Koutroumbas

Several well-known clustering algorithms have their own online counterparts, in order to deal effectively with the big data issue, as well as with the case where the data become available in a streaming fashion. However, very few of them follow the stochastic gradient descent philosophy, despite the fact that the latter enjoys certain practical advantages (such as the possibility of (a) running faster than their batch processing counterparts and (b) escaping from local minima of the associated cost function), while, in addition, strong theoretical convergence results have been established for it. In this paper, a novel stochastic gradient descent possibilistic clustering algorithm, called O-PCM2, is introduced. The algorithm is presented in detail, and it is rigorously proved that the gradient of the associated cost function tends to zero in the L2 sense, based on general convergence results established for the family of stochastic gradient descent algorithms. Furthermore, an additional discussion is provided on the nature of the points where the algorithm may converge. Finally, the performance of the proposed algorithm is tested against other related algorithms on both synthetic and real data sets.

