Hard and Fuzzy c-Means Clustering with Conditionally Positive Definite Kernel

Author(s):  
Yuchi Kanzawa ◽  
Yasunori Endo ◽  
Sadaaki Miyamoto ◽  

In this paper, we investigate three types of c-means clustering algorithms with a conditionally positive definite (cpd) kernel. One is based on hard c-means and two are based on standard and entropy-regularized fuzzy c-means. First, based on a cpd kernel describing a squared Euclidean distance between data in feature space, these algorithms are derived from revised optimization problems of the conventional kernel c-means. Next, based on the relationship between the positive definite (pd) kernel and cpd kernel, the revised dissimilarity between a datum and a cluster center in the feature space is shown. Finally, it is shown that a cpd kernel c-means algorithm and a kernel c-means algorithm with a pd kernel derived from the cpd kernel are essentially identical to each other. Explicit mapping for a cpd kernel is also described geometrically.
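The relationship between cpd and pd kernels mentioned in this abstract can be illustrated with a small sketch. It uses the standard textbook construction k~(x, y) = (k(x, y) - k(x, x0) - k(x0, y) + k(x0, x0)) / 2, which is pd whenever k is cpd, and is not necessarily the exact formulation of the paper. The kernel k(x, y) = -||x - y||^2, the reference point x0, and all function names are assumptions chosen for illustration.

```python
def cpd_kernel(x, y):
    """k(x, y) = -||x - y||^2, a standard example of a cpd kernel (assumed here)."""
    return -sum((a - b) ** 2 for a, b in zip(x, y))


def pd_from_cpd(k, x0):
    """Derive a pd kernel from a cpd kernel k, using an arbitrary reference point x0."""
    def k_tilde(x, y):
        return 0.5 * (k(x, y) - k(x, x0) - k(x0, y) + k(x0, x0))
    return k_tilde


def feature_sq_distance(k, x, y):
    """Squared distance between x and y in the feature space induced by a pd kernel k."""
    return k(x, x) - 2.0 * k(x, y) + k(y, y)


x, y = (0.0, 0.0), (3.0, 4.0)
for x0 in [(1.0, 1.0), (-2.0, 5.0)]:
    k_tilde = pd_from_cpd(cpd_kernel, x0)
    # The induced distance does not depend on the choice of x0.
    print(feature_sq_distance(k_tilde, x, y))  # 25.0 for both reference points
```

Because the terms involving x0 cancel, the induced distance equals (k(x, x) + k(y, y)) / 2 - k(x, y). This is consistent with the abstract's claim that clustering with the cpd kernel and clustering with a pd kernel derived from it are essentially identical.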

2014 ◽  
Vol 998-999 ◽  
pp. 873-877
Author(s):  
Zhen Bo Wang ◽  
Bao Zhi Qiu

To reduce the impact of irrelevant attributes on clustering results and increase the influence of relevant attributes, this paper proposes a fuzzy C-means clustering algorithm based on the coefficient of variation (CV-FCM). The algorithm uses the coefficient of variation to assign a different weight to each attribute in the data set, and the magnitude of a weight expresses the importance of that attribute to the clusters. In addition, because fuzzy C-means clustering is sensitive to the initial cluster centers, a method for selecting the initial cluster centers based on maximum distance is introduced on top of the coefficient-of-variation weighting. Experiments on real data sets show that the algorithm selects cluster centers effectively and yields clustering results superior to those of general fuzzy C-means clustering algorithms.
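As a rough illustration of attribute weighting by coefficient of variation, the sketch below computes one weight per attribute and uses the weights in a squared distance. The abstract does not give the exact weighting formula, so the normalization (weights proportional to CV and summing to 1), the handling of zero means, and the function names `cv_weights` and `weighted_sq_distance` are all assumptions.

```python
import statistics


def cv_weights(data):
    """Return one weight per attribute, proportional to its coefficient of variation.

    Assumption: CV = population std / |mean|, with weights normalized to sum to 1.
    """
    cvs = []
    for col in zip(*data):  # iterate over attributes (columns)
        mean = statistics.mean(col)
        std = statistics.pstdev(col)
        cvs.append(std / abs(mean) if mean != 0 else 0.0)
    total = sum(cvs)
    if total == 0:  # all attributes constant: fall back to uniform weights
        return [1.0 / len(cvs)] * len(cvs)
    return [cv / total for cv in cvs]


def weighted_sq_distance(x, v, weights):
    """Attribute-weighted squared Euclidean distance between datum x and center v."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, v))


data = [(1.0, 10.0), (2.0, 10.0), (3.0, 10.0)]
weights = cv_weights(data)
print(weights)  # the constant second attribute receives weight 0
print(weighted_sq_distance(data[0], data[2], weights))
```

Under this assumed scheme an attribute that never varies contributes nothing to the distance, which matches the stated goal of suppressing irrelevant attributes.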


2020 ◽  
Vol 2020 ◽  
pp. 1-22
Author(s):  
Yao Yang ◽  
Chengmao Wu ◽  
Yawen Li ◽  
Shaoyu Zhang

To improve the effectiveness and robustness of existing semisupervised fuzzy clustering for segmenting images corrupted by noise, this paper proposes a kernel-space semisupervised fuzzy C-means clustering segmentation algorithm that combines neighborhood spatial gray information with fuzzy membership information. The mean intensity of the neighborhood window is embedded into the objective function of the existing semisupervised fuzzy C-means clustering, and the Lagrange multiplier method is used to derive the iterative update expressions that solve the optimization problem. Meanwhile, a local Gaussian kernel function maps the pixel samples from Euclidean space to a high-dimensional feature space, which improves the adaptability of the clustering to different types of image segmentation. Experimental results on different types of noisy images indicate that the proposed algorithm achieves better segmentation performance than existing typical robust fuzzy clustering algorithms and significantly improves noise robustness.
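The kernel mapping described here never needs to be computed explicitly, because a Gaussian kernel gives the feature-space distance in closed form. The following is a minimal sketch of that general identity, not the paper's algorithm. The bandwidth parameter `sigma`, the particular form exp(-||x - v||^2 / sigma^2), and the function names are assumptions.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (bandwidth form assumed)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / sigma ** 2)


def kernel_sq_distance(x, v, sigma=1.0):
    """Squared feature-space distance K(x,x) - 2K(x,v) + K(v,v) = 2 * (1 - K(x,v))."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


pixel, center = (120.0,), (100.0,)
print(kernel_sq_distance(pixel, center, sigma=50.0))
print(kernel_sq_distance(pixel, (250.0,), sigma=50.0))  # larger, but never above 2
```

Because the distance is bounded above by 2, a single outlying gray value cannot dominate the objective. This boundedness is one commonly cited reason for using Gaussian kernels in noise-robust fuzzy clustering.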


Author(s):  
Yajin Xu ◽  
Qiong Luo ◽  
Hong Shu

Excess commuting refers to the value of unnecessary commuting or distance costs. Traditional commuting distance models adopt the most efficient scenario, in which people work at the geographically nearest workplace. Even though there have been some attempts to include constraints based on commuter attributes and neighborhood features, problems arise from the use of traditional geographical space and the subjectivity of these predefined characteristics. In this paper, we propose a method to calculate theoretical local minimal costs that, in the process of reassigning the work–home configuration, considers inherently behavioral preferences derived from current work–home trips. Our method is based on a higher-dimensional feature space that enlarges the set of attributes of, and relations between, commuters and neighborhoods. The solution is obtained by combining an improved Fuzzy C-Means clustering with linear programming. Unlike traditional clustering algorithms, our improved method incorporates entropy information and selects the initial parameters based on the actual data rather than on prior knowledge. Using the real origin–destination matrix, theoretical minimal costs are calculated within each cluster, referred to as local minimal costs, and the average sum of local minimal costs is our theoretical minimal cost. The difference between the theoretical minimal cost and the actual cost is the excess commuting. Using our method, experimental results show that only 13% of the daily commuting distance in Wuhan could be avoided, and the theoretical distance is approximately 1.06 km shorter than the actual commuting distance.


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Xiujuan Lei ◽  
Fang-Xiang Wu ◽  
Jianfang Tian ◽  
Jie Zhao

Many clustering algorithms are unable to solve the clustering problem of protein-protein interaction (PPI) networks effectively. This paper proposes a novel clustering model that combines the optimization mechanism of the artificial bee colony (ABC) with the fuzzy membership matrix. The proposed ABC-IFC clustering model contains two parts: searching for the optimum cluster centers using the ABC mechanism, and forming clusters using the intuitionistic fuzzy clustering (IFC) method. First, the cluster centers are set randomly and the initial clustering results are obtained using the fuzzy membership matrix. Then the cluster centers are updated through the different functions of the bees in the ABC algorithm, and the clustering result is obtained through the IFC method based on the newly optimized cluster centers. To illustrate its performance, the ABC-IFC method is compared with traditional fuzzy C-means clustering and the IFC method. The experimental results on the MIPS dataset show that the proposed ABC-IFC method not only improves several commonly used evaluation criteria, such as precision, recall, and P value, but also obtains a better clustering result.


Author(s):  
R. R. Gharieb ◽  
G. Gendy ◽  
H. Selim

In this paper, the standard hard C-means (HCM) clustering approach to image segmentation is modified by incorporating weighted membership Kullback–Leibler (KL) divergence and local data information into the HCM objective function. The membership KL divergence, used for fuzzification, measures the proximity between each cluster membership function of a pixel and the locally-smoothed value of the membership in the pixel vicinity. The fuzzification weight is a function of the pixel to cluster-center distances. The pixel to cluster-center distance used is composed of the original pixel data distance plus a fraction of the distance generated from the locally-smoothed pixel data. It is shown that the obtained membership function of a pixel is proportional to the locally-smoothed membership function of this pixel multiplied by an exponential function of the negative pixel distance, taken relative to the minimum distance, that is, the distance from the pixel to its nearest cluster center. Therefore, because it incorporates the locally-smoothed membership and data information in addition to the relative distance, which is more tolerant to additive noise than the absolute distance, the proposed algorithm has a threefold noise-handling process. The presented algorithm, named local data and membership KL divergence based fuzzy C-means (LDMKLFCM), is tested on synthetic and real-world noisy images, and its results are compared with those of several FCM-based clustering algorithms.
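The membership form stated in this abstract (smoothed membership times an exponential of the negative relative distance) can be sketched as follows. This illustrates only the stated proportionality and is not the LDMKLFCM update itself: the abstract says the fuzzification weight depends on the pixel to cluster-center distances, whereas this sketch replaces it with a single hypothetical constant `lam`. The function name is also an assumption.

```python
import math


def relative_distance_membership(smoothed, distances, lam=1.0):
    """Memberships proportional to smoothed[i] * exp(-(d_i - d_min) / lam).

    smoothed:  locally-smoothed memberships of the pixel, one per cluster
    distances: pixel to cluster-center distances, one per cluster
    lam:       hypothetical constant standing in for the fuzzification weight
    """
    d_min = min(distances)  # distance to the nearest cluster center
    unnorm = [p * math.exp(-(d - d_min) / lam) for p, d in zip(smoothed, distances)]
    total = sum(unnorm)
    if total == 0.0:  # degenerate case: no smoothed support at all
        return [1.0 / len(unnorm)] * len(unnorm)
    return [u / total for u in unnorm]


u = relative_distance_membership([0.5, 0.5], [0.0, math.log(2.0)])
print(u)  # roughly [2/3, 1/3]
```

Subtracting the minimum distance leaves the normalized memberships unchanged but keeps every exponent at or below zero, which avoids numerical underflow when all distances are large.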


2013 ◽  
Vol 765-767 ◽  
pp. 670-673
Author(s):  
Li Bo Hou

The fuzzy C-means (FCM) clustering algorithm is one of the most widely applied algorithms in unsupervised pattern recognition. However, FCM requires a large amount of computation in its iterative process, especially when the feature vectors are high-dimensional. Using the clustering algorithm to partition such data directly is not only inefficient but may also lead to the "curse of dimensionality." To address this problem, this paper analyzes the behavior of fuzzy C-means clustering on high-dimensional features, where finding the cluster centers is an NP-hard problem. To improve the effectiveness and real-time performance of fuzzy C-means clustering in high-dimensional feature analysis, an improved algorithm, FCM-LI, is proposed that combines FCM with the landmark isometric mapping (L-ISOMAP) algorithm. After a preliminary analysis of the samples, the clustering results and the correlation of the sample data are used together with L-ISOMAP to reduce the dimension, and further analysis on this basis yields the final results. Experimental results demonstrate the effectiveness and real-time performance of the FCM-LI algorithm in high-dimensional feature analysis.

