Fuzzy c-Means with Quadratic Penalty-Vector Regularization Using Kullback-Leibler Information for Uncertain Data

Author(s): Naohiko Kinoshita, Yasunori Endo, Yukihiro Hamasuna, ...

Clustering, a highly useful unsupervised classification technique, has been applied in many fields. When clustering is used to classify a set of objects, however, it generally ignores any uncertainty in those objects, because uncertainty is difficult to model and handle. It is desirable to handle individual objects as they are, uncertainty included, so that they can be classified more precisely. In this paper, we propose new clustering algorithms that handle objects with uncertainty by introducing penalty vectors. We show the theoretical relationship between our proposal and conventional algorithms, and we verify the effectiveness of the proposed algorithms through numerical examples.

Author(s): Yasunori Endo, Arisa Taniguchi, Yukihiro Hamasuna, ...

Clustering is an unsupervised classification technique for data analysis. In general, each datum in real space is transformed into a point in a pattern space so that clustering methods can be applied. Often, however, a datum cannot be represented by a point because of uncertainty, e.g., measurement error margins or missing values. In this paper, we introduce quadratic penalty-vector regularization to handle such uncertain data using Hard c-Means (HCM), one of the most typical clustering algorithms. First, we propose a new clustering algorithm called hard c-means using quadratic penalty-vector regularization for uncertain data (HCMP). Second, we propose sequential extraction hard c-means using quadratic penalty-vector regularization (SHCMP) to handle datasets whose cluster number is unknown. Finally, we verify the effectiveness of the proposed algorithms through numerical examples.
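
To make the idea of a quadratic penalty vector concrete, the following is a minimal sketch, not the authors' HCMP implementation. It assumes the objective J = Σ_k ||x_k + δ_k − v_a(k)||² + Σ_k w_k ||δ_k||², where δ_k is the penalty vector of datum x_k, v_a(k) is the center of the cluster to which x_k is assigned, and w_k > 0 is a penalty weight. The objective, the function name `hcmp`, and all parameter names are assumptions made for illustration; the abstract does not state them.

```python
import numpy as np

def hcmp(x, v_init, w, n_iter=100, tol=1e-9):
    """Hard c-means with quadratic penalty vectors (illustrative sketch).

    Assumed objective, not quoted from the paper:
        J = sum_k ||x_k + delta_k - v_a(k)||^2 + sum_k w_k ||delta_k||^2
    x: (n, p) data, v_init: (c, p) initial centers, w: scalar or (n,) weights.
    """
    x = np.asarray(x, dtype=float)
    v = np.array(v_init, dtype=float)  # copy of the initial centers
    w = np.broadcast_to(np.asarray(w, dtype=float), (x.shape[0],))
    delta = np.zeros_like(x)  # penalty vectors start at zero
    labels = np.zeros(x.shape[0], dtype=int)
    for _ in range(n_iter):
        # Assignment step: nearest center to the shifted datum x_k + delta_k.
        y = x + delta
        dist = ((y[:, None, :] - v[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Center step: mean of the shifted data in each non-empty cluster.
        v_new = v.copy()
        for i in range(v.shape[0]):
            members = labels == i
            if members.any():
                v_new[i] = y[members].mean(axis=0)
        # Penalty step: setting dJ/d(delta_k) = 0 gives
        # delta_k = -(x_k - v_a(k)) / (1 + w_k).
        delta_new = -(x - v_new[labels]) / (1.0 + w[:, None])
        shift = max(np.abs(v_new - v).max(), np.abs(delta_new - delta).max())
        v, delta = v_new, delta_new
        if shift < tol:
            break
    return labels, v, delta
```

Under this assumed objective, a large weight w_k drives δ_k toward zero, so the procedure reduces to ordinary hard c-means, while a small w_k lets the datum move most of the way toward its cluster center.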


Author(s): Naohiko Kinoshita, Yasunori Endo

Clustering is one of the most popular unsupervised classification methods. In this paper, we focus on rough clustering methods based on rough-set representation. Rough k-Means (RKM) is one of the rough clustering methods proposed by Lingras et al. The outputs of many clustering algorithms, including RKM, depend strongly on initial values, so we must evaluate the validity of the outputs. In objective-based clustering algorithms, the objective function serves as the measure. It is difficult, however, to evaluate the output of RKM, which is not objective-based. To solve this problem, we propose new objective-based rough clustering algorithms and verify their usefulness through numerical examples.


Author(s): Yukihiro Hamasuna, Yasunori Endo, Sadaaki Miyamoto

This paper presents a new type of clustering algorithm that uses a tolerance vector, called tolerant fuzzy c-means clustering (TFCM). In the proposed algorithms, the new concept of a tolerance vector plays a very important role. In the original concept of tolerance, a tolerance vector is attributed to each datum. Here the concept is extended to handle data more flexibly: a tolerance vector is attributed not only to each datum but also to each cluster. Using the new concept, we can take into account the influence of clusters on each datum through the tolerance. First, the new concept of tolerance is introduced into optimization problems based on conventional fuzzy c-means clustering (FCM). Second, the optimization problems with tolerance are solved using the Karush-Kuhn-Tucker conditions. Third, new clustering algorithms are constructed based on the explicit optimal solutions of the optimization problems. Finally, the effectiveness of the proposed algorithms is verified through numerical examples using fuzzy classification functions.


Author(s): Yukihiro Hamasuna, Yasunori Endo, Sadaaki Miyamoto

Detecting various kinds of cluster shapes is an important problem in the field of clustering. In general, it is difficult to obtain clusters with different sizes or shapes using a single objective function. For that reason, we have proposed the concept of clusterwise tolerance and constructed clustering algorithms based on it. In the field of data mining, regularization techniques are used to derive significant classifiers. In this paper, we propose another concept of clusterwise tolerance from the viewpoint of regularization. Moreover, we construct clustering algorithms for data with clusterwise tolerance based on L2- and L1-regularization. We then describe the fuzzy classification functions of the proposed algorithms. Finally, we show the effectiveness of the proposed algorithms through numerical examples.


Author(s): Yasunori Endo, Tomoyuki Suzuki, Naohiko Kinoshita, Yukihiro Hamasuna, ...

The fuzzy non-metric model (FNM) is a representative non-hierarchical clustering method. It is very useful because the belongingness, or membership degree, of each datum to each cluster can be calculated directly from the dissimilarities between data, without using cluster centers. However, the original FNM cannot handle data with uncertainty. In this study, we refer to data with uncertainty as “uncertain data,” e.g., incomplete data or data that contain errors. Previously, a method for handling uncertain data was proposed based on the concept of a tolerance vector, and some clustering methods were constructed according to this concept, e.g., fuzzy c-means for data with tolerance. These methods can handle uncertain data in the framework of optimization. Thus, in the present study, we apply the concept to FNM. First, we propose a new clustering algorithm based on FNM using the concept of tolerance, which we refer to as the fuzzy non-metric model for data with tolerance. Second, we show that the proposed algorithm can handle incomplete data sets. Third, we verify the effectiveness of the proposed algorithm through comparisons with conventional methods for incomplete data sets in some numerical examples.


2016, Vol 54 (3), pp. 300
Author(s): Mai Dinh Sinh, Le Hung Trinh, Ngo Thanh Long

This paper proposes a method that combines fuzzy probability with a fuzzy clustering algorithm to classify multispectral satellite images: fuzzy probability is used to calculate the number of clusters and their centroids, and fuzzy clustering is then used to classify land cover in the image. In classification algorithms, the initialization of the clusters and their initial centroids has a great influence on stability, processing time, and classification results. Unsupervised classification algorithms such as k-Means, c-Means, and Iso-data are commonly used for many problems, but they suffer from low accuracy and instability, especially on satellite images. Results of the proposed algorithm show a significant reduction of noise in the clusters in comparison with clustering algorithms such as k-means and iso-data.


Author(s): Naohiko Kinoshita, Yasunori Endo, Akira Sugawara, ...

Clustering is a representative unsupervised classification technique. Many researchers have proposed clustering algorithms based on mathematical models, methods we call model-based clustering. Clustering techniques are very useful for determining data structures, but model-based clustering is difficult to use for analyzing data correctly because we cannot select a suitable method unless we know the data structure at least partially. The new clustering algorithm we propose introduces soft computing techniques such as fuzzy reasoning in what we call linguistic-based clustering, whose features are not incident to the data structure. We verify the method’s effectiveness through numerical examples.


Author(s): Naohiko Kinoshita, Yasunori Endo, Ken Onishi, ...

The rough clustering algorithm we proposed, based on the optimization of an objective function (RCM), has a problem because conventional rough clustering algorithm results do not ensure that solutions are optimal. To solve this problem, we propose rough clustering algorithms based on the optimization of an objective function with fuzzy-set representation. This yields more flexible results than RCM. We verify the effectiveness of the algorithms through numerical examples.


Author(s): Yasunori Endo, Yasushi Hasegawa, Yukihiro Hamasuna, Sadaaki Miyamoto, ...

This paper provides new clustering algorithms for data with tolerance. Tolerance is understood in a broad sense, e.g., calculation errors and loss of attributes of data. The concept of tolerance is modified by using the new concept of a tolerance vector. First, the concept is explained and optimization problems of clustering are formulated using the vectors. Second, the problems are solved using the Karush-Kuhn-Tucker conditions. Third, the new clustering algorithms are constructed using the solutions of the problems. Finally, the effectiveness of the proposed algorithms is verified through some numerical examples.
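
The steps named in this abstract can be illustrated with a short worked derivation. The formulation below is an assumption made for illustration and is not quoted from the paper: a fuzzy c-means objective with fuzzifier m, memberships u_{ki}, centers v_i, tolerance vectors ε_k, and tolerance bounds κ_k. Only the part of the Karush-Kuhn-Tucker system involving ε_k is shown.

```latex
% Assumed problem (illustrative notation, not taken from the abstract):
\begin{align*}
&\min_{U,V,E}\; J = \sum_{k=1}^{n}\sum_{i=1}^{c} u_{ki}^{m}\,
   \lVert x_k + \varepsilon_k - v_i \rVert^{2}
 \quad\text{s.t.}\quad \sum_{i=1}^{c} u_{ki} = 1,\;\;
   \lVert \varepsilon_k \rVert^{2} \le \kappa_k^{2}. \\[4pt]
% Part of the Lagrangian that involves the tolerance vectors:
&L = J + \sum_{k=1}^{n} \gamma_k \bigl( \lVert \varepsilon_k \rVert^{2} - \kappa_k^{2} \bigr),
 \qquad \gamma_k \ge 0. \\[4pt]
% Stationarity with respect to \varepsilon_k:
&\frac{\partial L}{\partial \varepsilon_k}
 = 2\sum_{i=1}^{c} u_{ki}^{m}\,(x_k + \varepsilon_k - v_i) + 2\gamma_k \varepsilon_k = 0
 \;\Longrightarrow\;
 \varepsilon_k = -\frac{s_k}{\sum_{i=1}^{c} u_{ki}^{m} + \gamma_k},
 \qquad s_k = \sum_{i=1}^{c} u_{ki}^{m}\,(x_k - v_i). \\[4pt]
% Complementary slackness: either \gamma_k = 0 (bound inactive)
% or \lVert \varepsilon_k \rVert = \kappa_k (bound active), hence
&\varepsilon_k = -\alpha_k\, s_k,
 \qquad
 \alpha_k = \min\!\left\{ \frac{1}{\sum_{i=1}^{c} u_{ki}^{m}},\;
   \frac{\kappa_k}{\lVert s_k \rVert} \right\}.
\end{align*}
```

When s_k = 0 the second term in the minimum is unbounded, so the first term applies and ε_k = 0. Under these assumptions, κ_k = 0 forces ε_k = 0 and the problem reduces to conventional fuzzy c-means.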


2016, Vol 13 (10), pp. 7093-7098
Author(s): Shivakumar Nagarajan, Balaji Narayanan

Software development effort estimation is a way of predicting effort in order to improve software economics. Accurate estimation of effort is one of the most tedious tasks in software projects. Several methods are used to estimate software development effort accurately; however, imprecise estimation caused by uncertain data can lead to project failure. In this paper, a hybrid model based on a combination of Particle Swarm Optimization (PSO), K-means clustering, a neural network, and the ABE method is proposed. The proposed method is intended to give better clustering and more accurate estimation in the presence of clustering difficulties and outliers in software projects. The obtained results show better clustering, which in turn yields more accurate estimation. Neural network and analogy methods are then used, which enhance the accuracy significantly.

