A fuzzy clustering algorithm based on hybrid surrogate model

2021 ◽  
pp. 1-14
Author(s):  
Maolin Shi ◽  
Zihao Wang ◽  
Lizhang Xu

Data clustering based on regression relationships can improve the validity and reliability of engineering data mining results. Surrogate models are widely used to evaluate the regression relationship during clustering, but no single surrogate model performs best for all regression relationships. To address this issue, a fuzzy clustering algorithm based on a hybrid surrogate model is proposed in this work. The proposed algorithm follows the framework of the fuzzy c-means algorithm, but the differences between clusters are evaluated by the regression relationship instead of Euclidean distance. Several surrogate models are used simultaneously to evaluate the regression relationship through a weighting scheme. The clustering objective function is built from the prediction errors of the multiple surrogate models, and an alternating optimization method is proposed to minimize it, yielding the memberships of the data and the weights of the surrogate models. Synthetic datasets are first used to test fuzzy clustering algorithms based on a single surrogate model in order to choose the surrogate models for the proposed algorithm. The support vector regression-based and response surface-based fuzzy clustering algorithms show competitive clustering performance, so support vector regression and response surface models are used to construct the hybrid surrogate model. Experimental results on synthetic and engineering datasets show that, for datasets with regression relationships, the proposed algorithm provides more competitive clustering performance than fuzzy clustering algorithms based on a single surrogate model.
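The abstract gives no formulas, so the following is only a minimal sketch of how memberships could be derived from surrogate prediction errors in a fuzzy c-means style update. It assumes the hybrid error is a weighted sum of squared prediction errors and that memberships follow the standard fuzzy c-means rule with the error taking the place of the squared distance. The function names `hybrid_error` and `memberships_from_errors`, the fuzzifier `m`, and the guard `eps` are illustrative and do not come from the paper.

```python
import numpy as np

def hybrid_error(sq_errors, weights):
    """Combine squared prediction errors of several surrogate models.

    sq_errors: shape (n_models, n_points, n_clusters)
    weights:   shape (n_models,), assumed non-negative and summing to 1
    Returns an array of shape (n_points, n_clusters).
    """
    sq_errors = np.asarray(sq_errors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Weighted sum over the model axis.
    return np.tensordot(weights, sq_errors, axes=1)

def memberships_from_errors(errors, m=2.0, eps=1e-12):
    """Fuzzy c-means style membership update.

    The per-cluster error plays the role of the squared distance:
    u_ik is proportional to errors_ik ** (-1 / (m - 1)), normalized per point.
    """
    errors = np.asarray(errors, dtype=float) + eps  # avoid division by zero
    inv = errors ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)
```

In an alternating scheme, a step like this would be interleaved with refitting the surrogate models per cluster and updating the model weights; those steps are omitted here because the abstract does not specify them.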

1995 ◽  
Vol 05 (02) ◽  
pp. 239-259
Author(s):  
SU HWAN KIM ◽  
SEON WOOK KIM ◽  
TAE WON RHEE

In data analysis, it is important to group data with similar attribute values into categorically homogeneous subsets, called clusters; this technique is called clustering. Crisp clustering algorithms are generally sensitive to noise, because each datum must be assigned to exactly one cluster. To address this problem, fuzzy c-means, fuzzy maximum likelihood estimation, and optimal fuzzy clustering algorithms have been proposed within fuzzy set theory. They, however, require a lot of processing time because they iterate exhaustively over large amounts of data and their memberships. In particular, a large memory footprint degrades performance in real-time processing applications, because too much time is spent swapping between main memory and secondary memory. To overcome these limitations, an extended fuzzy clustering algorithm based on an unsupervised optimal fuzzy clustering algorithm is proposed in this paper. The algorithm assigns a weight factor to each distinct datum according to its occurrence rate. It also considers the degree of importance of each attribute, which determines the characteristics of the data. The worst case is when the whole data set has a uniformly normal distribution, which means that all attributes are equally important. In most cases, the proposed extended fuzzy clustering algorithm outperforms the unsupervised optimal fuzzy clustering algorithm in terms of memory space and execution time. For simulation, the proposed algorithm is applied to color image segmentation. Automatic target detection and multipeak detection are also considered as applications. These schemes can be applied to any other fuzzy clustering algorithm.
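The idea of weighting each distinct datum by its occurrence rate can be illustrated with a short sketch. It assumes the weight is simply the occurrence count and applies it to the standard fuzzy c-means center update. This is not the paper's optimal fuzzy clustering procedure, and the names `compress` and `weighted_centers` are hypothetical.

```python
import numpy as np

def compress(data):
    """Collapse repeated rows into distinct points plus occurrence counts."""
    points, counts = np.unique(
        np.asarray(data, dtype=float), axis=0, return_counts=True
    )
    return points, counts

def weighted_centers(points, counts, u, m=2.0):
    """Fuzzy c-means center update with one weight per distinct point.

    points: shape (n_distinct, n_features)
    counts: shape (n_distinct,), occurrence count of each distinct point
    u:      memberships, shape (n_distinct, n_clusters)
    """
    w = counts[:, None] * np.asarray(u, dtype=float) ** m
    return (w.T @ points) / w.sum(axis=0)[:, None]
```

Because repeated data share one membership row, memory use and per-iteration work scale with the number of distinct values instead of the number of samples. This helps most for data such as quantized color images, where many pixels share the same value.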


2010 ◽  
Vol 29-32 ◽  
pp. 802-808
Author(s):  
Min Min

Based on an analysis of the common problems in fuzzy clustering algorithms, we put forward a combined fuzzy clustering algorithm that automatically generates a reasonable number of clusters and initial cluster centers. The algorithm has been tested on real evaluation data of teaching designs. The results show that the combined fuzzy clustering algorithm based on the F-statistic is more effective.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Hong Xia ◽  
Qingyi Dong ◽  
Hui Gao ◽  
Yanping Chen ◽  
ZhongMin Wang

It is difficult to accurately assign a service to a specific service cluster because of the multiple relationships between services. To solve this problem, this paper proposes a service partition method based on particle swarm fuzzy clustering, which can effectively account for the multiple relationships between services by using a fuzzy clustering algorithm. First, the number of service clusters is determined automatically based on the density of the service core point. Second, fuzzy c-means is combined with a particle swarm optimization algorithm to find the optimal cluster center of the service. Finally, the fuzzy clustering algorithm uses the improved Gram-cosine similarity to obtain the final results. Extensive experiments on real web service data show that our method is more accurate than mainstream clustering algorithms.


2012 ◽  
Vol 2012 ◽  
pp. 1-20
Author(s):  
Xuesong Guo ◽  
Zhengwei Zhu ◽  
Jia Shi

Corporate credit-rating prediction using statistical and artificial intelligence techniques has received considerable attention in the literature. In contrast to the various techniques that build on support vector machines, which were originally binary classifiers, this paper proposes a new multiclassification method, based on support vector domain combined with a fuzzy clustering algorithm, for corporate credit rating. The data are preprocessed with the fuzzy clustering algorithm so that only the boundary data points are selected as training samples for support vector domain specification, which reduces computational cost and also achieves better performance. To validate the proposed methodology, real-world cases are used for experiments, and the results are compared with conventional multiclassification support vector machine approaches and other artificial intelligence techniques. The results show that the proposed model improves the performance of corporate credit rating with less computational cost.
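The abstract does not say how boundary points are identified. One plausible criterion, shown here purely as an assumption, is to keep the points whose two largest fuzzy memberships are close, since such points lie between clusters. The function name `select_boundary` and the `margin` parameter are hypothetical.

```python
import numpy as np

def select_boundary(u, margin=0.2):
    """Return indices of points with ambiguous fuzzy memberships.

    u: membership matrix, shape (n_points, n_clusters), rows summing to 1
    A point counts as a boundary point (by assumption) when the gap between
    its largest and second-largest membership is below `margin`.
    """
    u = np.asarray(u, dtype=float)
    top_two = np.sort(u, axis=1)[:, -2:]   # second-largest, largest
    gap = top_two[:, 1] - top_two[:, 0]
    return np.flatnonzero(gap < margin)
```

Training only on the selected indices shrinks the training set, which is where the reduction in computational cost would come from under this reading.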


2016 ◽  
Vol 7 (2) ◽  
pp. 47-74
Author(s):  
Duggirala Raja Kishor ◽  
N.B. Venkateswarlu

Expectation Maximization (EM) is a widely employed mixture model-based data clustering algorithm and produces exceptionally good results. However, many researchers have reported that the EM algorithm requires far more computational effort than other clustering algorithms. This paper presents a novel hybridization of the EM and K-Means techniques for achieving better clustering performance (NovHbEMKM). The algorithm first performs K-Means and then, using these results, performs EM and K-Means in alternating iterations. Along with NovHbEMKM, experiments are carried out with EM, EM using the results of K-Means, and the Cluster package of Purdue University. The experiments use datasets from the UCI ML repository and synthetic datasets. Execution time, Clustering Fitness, and Sum of Squared Errors (SSE) are computed as performance criteria. In all the experiments, the proposed NovHbEMKM algorithm takes less execution time while producing results with higher clustering fitness and lower SSE than the other algorithms, including the Cluster package.
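The following sketch is one possible reading of "K-Means first, then EM and K-Means in alternating iterations". It is not the NovHbEMKM implementation: it assumes spherical Gaussian components, a fixed iteration count, and caller-supplied initial centers. All function names are illustrative.

```python
import numpy as np

def _sq_dists(X, centers):
    # Squared Euclidean distance of every point to every center.
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def kmeans_step(X, centers):
    """One hard-assignment step; empty clusters keep their old center."""
    labels = _sq_dists(X, centers).argmin(axis=1)
    new_centers = centers.copy()
    for k in range(len(centers)):
        if np.any(labels == k):
            new_centers[k] = X[labels == k].mean(axis=0)
    return new_centers, labels

def em_step(X, centers, var, pi):
    """One EM step for a mixture of spherical Gaussians (assumed model)."""
    d = X.shape[1]
    d2 = _sq_dists(X, centers)
    # E-step: responsibilities computed in log space for stability.
    log_p = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - d2 / (2 * var)
    log_p -= log_p.max(axis=1, keepdims=True)
    r = np.exp(log_p)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: update means, per-cluster variances, and mixing weights.
    nk = r.sum(axis=0) + 1e-12
    centers = (r.T @ X) / nk[:, None]
    var = (r * _sq_dists(X, centers)).sum(axis=0) / (d * nk) + 1e-6
    return centers, var, nk / nk.sum()

def hybrid_em_kmeans(X, init_centers, n_iter=10):
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    k = len(centers)
    centers, labels = kmeans_step(X, centers)      # K-Means first
    var, pi = np.ones(k), np.full(k, 1.0 / k)
    for it in range(n_iter):                       # then alternate
        if it % 2 == 0:
            centers, var, pi = em_step(X, centers, var, pi)
        else:
            centers, labels = kmeans_step(X, centers)
    centers, labels = kmeans_step(X, centers)      # final hard assignment
    return centers, labels
```

The K-Means steps are cheap compared with the EM steps, which suggests where a hybrid of this kind could save execution time. The abstract does not describe the actual alternation schedule or stopping rule.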

