AN EXTENDED FUZZY CLUSTERING ALGORITHM AND ITS APPLICATION

1995 ◽  
Vol 05 (02) ◽  
pp. 239-259
Author(s):  
SU HWAN KIM ◽  
SEON WOOK KIM ◽  
TAE WON RHEE

For data analyses, it is very important to combine data with similar attribute values into a categorically homogeneous subset, called a cluster, and this technique is called clustering. Generally crisp clustering algorithms are weak in noise, because each datum should be assigned to exactly one cluster. In order to solve the problem, a fuzzy c-means, a fuzzy maximum likelihood estimation, and an optimal fuzzy clustering algorithms in the fuzzy set theory have been proposed. They, however, require a lot of processing time because of exhaustive iteration with an amount of data and their memberships. Especially large memory space results in the degradation of performance in real-time processing applications, because it takes too much time to swap between the main memory and the secondary memory. To overcome these limitations, an extended fuzzy clustering algorithm based on an unsupervised optimal fuzzy clustering algorithm is proposed in this paper. This algorithm assigns a weight factor to each distinct datum considering its occurrence rate. Also, the proposed extended fuzzy clustering algorithm considers the degree of importances of each attribute, which determines the characteristics of the data. The worst case is that the whole data has an uniformly normal distribution, which means the importance of all attributes are the same. The proposed extended fuzzy clustering algorithm has better performance than the unsupervised optimal fuzzy clustering algorithm in terms of memory space and execution time in most cases. For simulation the proposed algorithm is applied to color image segmentation. Also automatic target detection and multipeak detection are considered as applications. These schemes can be applied to any other fuzzy clustering algorithms.

2010 ◽  
Vol 29-32 ◽  
pp. 802-808
Author(s):  
Min Min

On analyzing the common problems in fuzzy clustering algorithms, we put forward the combined fuzzy clustering one, which will automatically generate a reasonable clustering numbers and initial cluster center. This clustering algorithm has been tested by real evaluation data of teaching designs. The result proves that the combined fuzzy clustering based on F-statistic is more effective.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Hong Xia ◽  
Qingyi Dong ◽  
Hui Gao ◽  
Yanping Chen ◽  
ZhongMin Wang

It is difficult to accurately classify a service into specific service clusters for the multirelationships between services. To solve this problem, this paper proposes a service partition method based on particle swarm fuzzy clustering, which can effectively consider multirelationships between services by using a fuzzy clustering algorithm. Firstly, the algorithm for automatically determining the number of clusters is to determine the number of service clusters based on the density of the service core point. Secondly, the fuzzy c -means combined with particle swarm optimization algorithm to find the optimal cluster center of the service. Finally, the fuzzy clustering algorithm uses the improved Gram-cosine similarity to obtain the final results. Extensive experiments on real web service data show that our method is better than mainstream clustering algorithms in accuracy.


2019 ◽  
Vol 28 (04) ◽  
pp. 1950065 ◽  
Author(s):  
Wei Zhang ◽  
Gongxuan Zhang ◽  
Xiaohui Chen ◽  
Yueqi Liu ◽  
Xiumin Zhou ◽  
...  

Hierarchical clustering is a classical method to provide a hierarchical representation for the purpose of data analysis. However, in practical applications, it is difficult to deal with massive datasets due to their high computation complexity. To overcome this challenge, this paper presents a novel distributed storage and computation hierarchical clustering algorithm, which has a lower time complexity than the standard hierarchical clustering algorithms. Our proposed approach is suitable for hierarchical clustering on massive datasets, which has the following advantages. First, the algorithm is able to store massive dataset exceeding the main memory space by using distributed storage nodes. Second, the algorithm is able to efficiently process nearest neighbor searching along parallel lines by using distributed computation at each node. Extensive experiments are carried out to validate the effectiveness of the DHC algorithm. Experimental results demonstrate that the algorithm is 10 times faster than the standard hierarchical clustering algorithm, which is an effective and flexible distributed algorithm of hierarchical clustering for massive datasets.


2021 ◽  
pp. 1-14
Author(s):  
Maolin Shi ◽  
Zihao Wang ◽  
Lizhang Xu

Data clustering based on regression relationship is able to improve the validity and reliability of the engineering data mining results. Surrogate models are widely used to evaluate the regression relationship in the process of data clustering, but there is no single surrogate model that always performs the best for all the regression relationships. To solve this issue, a fuzzy clustering algorithm based on hybrid surrogate model is proposed in this work. The proposed algorithm is based on the framework of fuzzy c-means algorithm, in which the differences between the clusters are evaluated by the regression relationship instead of Euclidean distance. Several surrogate models are simultaneously utilized to evaluate the regression relationship through a weighting scheme. The clustering objective function is designed based on the prediction errors of multiple surrogate models, and an alternating optimization method is proposed to minimize it to obtain the memberships of data and the weights of surrogate models. The synthetic datasets are used to test single surrogate model-based fuzzy clustering algorithms to choose the surrogate models used in the proposed algorithm. It is found that support vector regression-based and response surface-based fuzzy clustering algorithms show competitive clustering performance, so support vector regression and response surface are used to construct the hybrid surrogate model in the proposed algorithm. The experimental results of synthetic datasets and engineering datasets show that the proposed algorithm can provide more competitive clustering performance compared with single surrogate model-based fuzzy clustering algorithms for the datasets with regression relationships.


Sign in / Sign up

Export Citation Format

Share Document