Fuzzy c-Means with Quadratic Penalty-Vector Regularization Using Kullback-Leibler Information for Uncertain Data

Clustering, a highly useful unsupervised classification technique, has been applied in many fields. When we use clustering to classify a set of objects, for example, the method generally ignores any uncertainty in the objects, because uncertainty is difficult to model and handle. It is desirable, however, to handle individual objects as they are so that we can classify them more precisely. In this paper, we propose new clustering algorithms that handle objects with uncertainty by introducing penalty vectors. We show the theoretical relationship between our proposal and conventional algorithms, and we verify the effectiveness of the proposed algorithms through numerical examples.

Hard c-Means Using Quadratic Penalty-Vector Regularization for Uncertain Data

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 16 (7), pp. 831-840, 2012. DOI: 10.20965/jaciii.2012.p0831. Cited By ~ 1

Author(s): Yasunori Endo, Arisa Taniguchi, Yukihiro Hamasuna, ...

Keyword(s): Missing Values, Clustering Algorithm, Clustering Algorithms, Uncertain Data, Unsupervised Classification, Real Space, Clustering Methods, Cluster Number, Numerical Examples, Classification Technique

Clustering is an unsupervised classification technique for data analysis. In general, each datum in real space is transformed into a point in a pattern space so that clustering methods can be applied. Data often cannot be represented by a point, however, because of uncertainty such as measurement error margins and missing values. In this paper, we introduce quadratic penalty-vector regularization to handle such uncertain data with hard c-means (HCM), one of the most typical clustering algorithms. First, we propose a new clustering algorithm called hard c-means using quadratic penalty-vector regularization for uncertain data (HCMP). Second, we propose sequential extraction hard c-means using quadratic penalty-vector regularization (SHCMP) to handle datasets whose cluster number is unknown. Finally, we verify the effectiveness of the proposed algorithms through numerical examples.
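The idea of attaching a quadratically penalized vector to each datum can be illustrated with a minimal sketch. The abstract does not state the objective function, so the form used here is an assumption: J = sum_k ||x_k + d_k - v_{a_k}||^2 + w * sum_k ||d_k||^2, where d_k is a per-datum shift vector, a_k is the hard assignment, and w is a single penalty weight. The names `hcm_penalty`, `sq_dist`, `v_init`, and `w` are hypothetical, and the code is not the authors' HCMP or SHCMP.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def hcm_penalty(x, v_init, w, n_iter=50):
    """Hard c-means with quadratically penalized per-datum shift vectors.

    Assumed objective (an illustration, not quoted from the paper):
        J = sum_k ||x_k + d_k - v_{a_k}||^2 + w * sum_k ||d_k||^2
    x: list of points, v_init: list of initial centers, w: penalty weight > 0.
    Returns (assignments a, centers v, shift vectors d).
    """
    v = [list(map(float, vi)) for vi in v_init]
    d = [[0.0] * len(xk) for xk in x]
    a = [0] * len(x)
    for _ in range(n_iter):
        # Shifted data y_k = x_k + d_k
        y = [[xj + dj for xj, dj in zip(xk, dk)] for xk, dk in zip(x, d)]
        # Assignment step: nearest center to each shifted datum
        a = [min(range(len(v)), key=lambda i: sq_dist(yk, v[i])) for yk in y]
        # Shift step: setting dJ/dd_k = 0 gives d_k = (v_{a_k} - x_k) / (1 + w)
        d = [[(vj - xj) / (1.0 + w) for xj, vj in zip(xk, v[ak])]
             for xk, ak in zip(x, a)]
        y = [[xj + dj for xj, dj in zip(xk, dk)] for xk, dk in zip(x, d)]
        # Center step: mean of the shifted data in each non-empty cluster
        for i in range(len(v)):
            members = [yk for yk, ak in zip(y, a) if ak == i]
            if members:
                v[i] = [sum(col) / len(members) for col in zip(*members)]
    return a, v, d
```

Under this assumed objective, a large `w` drives the shift vectors toward zero, so the procedure approaches ordinary hard c-means, while a small `w` lets each datum move most of the way to its cluster center.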

On Objective-Based Rough Hard and Fuzzy c-Means Clustering

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 19 (1), pp. 29-35, 2015. DOI: 10.20965/jaciii.2015.p0029. Cited By ~ 1

Author(s): Naohiko Kinoshita, Yasunori Endo

Keyword(s): Objective Function, Rough Set, Clustering Algorithms, Unsupervised Classification, Classification Methods, Clustering Methods, Clustering Method, Numerical Examples, Initial Values

Clustering is one of the most popular unsupervised classification methods. In this paper, we focus on rough clustering methods based on rough-set representation. Rough k-Means (RKM) is one of the rough clustering methods proposed by Lingras et al. The outputs of many clustering algorithms, including RKM, depend strongly on initial values, so we must evaluate the validity of the outputs. In the case of objective-based clustering algorithms, the objective function serves as the measure. It is difficult, however, to evaluate the output of RKM, which is not objective-based. To solve this problem, we propose new objective-based rough clustering algorithms and verify their usefulness through numerical examples.

On Tolerant Fuzzy c-Means Clustering

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 13 (4), pp. 421-428, 2009. DOI: 10.20965/jaciii.2009.p0421. Cited By ~ 12

Author(s): Yukihiro Hamasuna, Yasunori Endo, Sadaaki Miyamoto

Keyword(s): Optimization Problems, Clustering Algorithms, Fuzzy Classification, Optimal Solutions, Numerical Examples, New Type, Original Concept, Classification Function

This paper presents a new type of clustering algorithm based on a tolerance vector, called tolerant fuzzy c-means clustering (TFCM). In the proposed algorithms, the new concept of a tolerance vector plays a very important role. In the original concept of tolerance, a tolerance vector is attributed to each datum. This concept is extended to handle data more flexibly: a tolerance vector is attributed not only to each datum but also to each cluster. Using the new concept, we can take into account the influence of clusters on each datum through the tolerance. First, the new concept of tolerance is introduced into optimization problems based on conventional fuzzy c-means clustering (FCM). Second, the optimization problems with tolerance are solved using the Karush-Kuhn-Tucker conditions. Third, new clustering algorithms are constructed based on the explicit optimal solutions of the optimization problems. Finally, the effectiveness of the proposed algorithms is verified through numerical examples using fuzzy classification functions.
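For orientation, the conventional FCM baseline that the tolerance-based variants start from can be sketched as the standard alternating update of memberships and cluster centers. This is only the textbook FCM: it contains no tolerance vectors and none of the KKT-derived solutions of TFCM. The function name `fcm`, the parameter names, and the small constant `eps` are illustrative assumptions.

```python
def fcm(x, v_init, m=2.0, n_iter=100, eps=1e-12):
    """Conventional fuzzy c-means by alternating optimization (sketch).

    x: list of points, v_init: list of initial centers, m > 1: fuzzifier.
    Returns (memberships u[k][i], centers v[i]).
    """
    v = [list(map(float, vi)) for vi in v_init]
    u = []
    power = 1.0 / (m - 1.0)
    for _ in range(n_iter):
        # Membership step: u_ki is proportional to (1 / ||x_k - v_i||^2)^(1/(m-1))
        u = []
        for xk in x:
            # eps keeps the inverse finite when a datum coincides with a center
            inv = [(1.0 / (sum((a - b) ** 2 for a, b in zip(xk, vi)) + eps)) ** power
                   for vi in v]
            total = sum(inv)
            u.append([t / total for t in inv])
        # Center step: v_i is the mean of the data weighted by u_ki^m
        for i in range(len(v)):
            weights = [uk[i] ** m for uk in u]
            wsum = sum(weights)
            v[i] = [sum(wk * xk[j] for wk, xk in zip(weights, x)) / wsum
                    for j in range(len(v[i]))]
    return u, v
```

Each row of the returned `u` sums to one, and a hard labeling can be read off by taking the largest membership in each row.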

Fuzzy c-Means Clustering for Data with Clusterwise Tolerance Based on L2- and L1-Regularization

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 15 (1), pp. 68-75, 2011. DOI: 10.20965/jaciii.2011.p0068. Cited By ~ 4

Author(s): Yukihiro Hamasuna, Yasunori Endo, Sadaaki Miyamoto

Keyword(s): Data Mining, Objective Function, Clustering Algorithms, Fuzzy Classification, Numerical Examples, Cluster Shape, Regularization Techniques, Single Objective

Detecting clusters of various shapes is an important problem in the field of clustering. In general, it is difficult to obtain clusters with different sizes or shapes with a single objective function. For this reason, we have proposed the concept of clusterwise tolerance and constructed clustering algorithms based on it. In the field of data mining, regularization techniques are used to derive significant classifiers. In this paper, we propose another concept of clusterwise tolerance from the viewpoint of regularization. Moreover, we construct clustering algorithms for data with clusterwise tolerance based on L2- and L1-regularization. We then describe the fuzzy classification functions of the proposed algorithms. Finally, we show the effectiveness of the proposed algorithms through numerical examples.

On Fuzzy Non-Metric Model for Data with Tolerance and its Application to Incomplete Data Clustering

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 20 (4), pp. 571-579, 2016. DOI: 10.20965/jaciii.2016.p0571. Cited By ~ 1

Author(s): Yasunori Endo, Tomoyuki Suzuki, Naohiko Kinoshita, Yukihiro Hamasuna, ...

Keyword(s): Data Clustering, Incomplete Data, Clustering Algorithm, Uncertain Data, Data Sets, Membership Degree, Clustering Methods, Clustering Method, Numerical Examples, Metric Model

The fuzzy non-metric model (FNM) is a representative non-hierarchical clustering method. It is very useful because the belongingness, or membership degree, of each datum to each cluster can be calculated directly from the dissimilarities between data, without using cluster centers. However, the original FNM cannot handle data with uncertainty. In this study, we refer to data with uncertainty, e.g., incomplete data or data that contain errors, as “uncertain data.” Previously, a method based on the concept of a tolerance vector was proposed for handling uncertain data, and some clustering methods were constructed according to this concept, e.g., fuzzy c-means for data with tolerance. These methods can handle uncertain data in the framework of optimization. Thus, in the present study, we apply the concept to FNM. First, we propose a new clustering algorithm based on FNM using the concept of tolerance, which we refer to as the fuzzy non-metric model for data with tolerance. Second, we show that the proposed algorithm can handle incomplete data sets. Third, we verify the effectiveness of the proposed algorithm through comparisons with conventional methods for incomplete data sets in some numerical examples.

COMBINING FUZZY PROBABILITY AND FUZZY CLUSTERING FOR MULTISPECTRAL SATELLITE IMAGERY CLASSIFICATION

Vietnam Journal of Science and Technology, Vol 54 (3), pp. 300, 2016. DOI: 10.15625/0866-708x/54/3/6463. Cited By ~ 2

Author(s): Mai Dinh Sinh, Le Hung Trinh, Ngo Thanh Long

Keyword(s): Fuzzy Clustering, Clustering Algorithm, Clustering Algorithms, Satellite Image, Great Influence, Unsupervised Classification, Classification Algorithms, Fuzzy Probability, Multispectral Satellite Images, The Stability

This paper proposes a method that combines fuzzy probability and a fuzzy clustering algorithm to classify multispectral satellite images: fuzzy probability is used to calculate the number of clusters and the cluster centroids, and fuzzy clustering is then used to classify land cover in the satellite image. In classification algorithms, the initialization of the clusters and of their centroids has a great influence on the stability of the algorithm, the processing time, and the classification results. Unsupervised classification algorithms such as k-Means, c-Means, and Iso-data are commonly used for many problems, but their disadvantages are low accuracy and instability, especially on satellite images. Results of the proposed algorithm show a significant reduction of noise in the clusters in comparison with other clustering algorithms such as k-means and Iso-data.

On Hierarchical Linguistic-Based Clustering

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 19 (6), pp. 900-906, 2015. DOI: 10.20965/jaciii.2015.p0900. Cited By ~ 1

Author(s): Naohiko Kinoshita, Yasunori Endo, Akira Sugawara, ...

Keyword(s): Data Structure, Data Structures, Clustering Algorithm, Clustering Algorithms, Fuzzy Reasoning, Unsupervised Classification, Clustering Techniques, Model Based Clustering, Model Based, Soft Computing Techniques

Clustering is a representative unsupervised classification technique. Many researchers have proposed clustering algorithms based on mathematical models, which we call model-based clustering. Clustering techniques are very useful for determining data structures, but model-based clustering is difficult to use for analyzing data correctly because we cannot select a suitable method unless we know the data structure at least partially. The new clustering algorithm we propose introduces soft computing techniques such as fuzzy reasoning in what we call linguistic-based clustering, whose features do not depend on the data structure. We verify the method’s effectiveness through numerical examples.

On Objective-Based Rough Clustering with Fuzzy-Set Representation

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 19 (5), pp. 632-638, 2015. DOI: 10.20965/jaciii.2015.p0632

Author(s): Naohiko Kinoshita, Yasunori Endo, Ken Onishi, ...

Keyword(s): Objective Function, Fuzzy Set, Clustering Algorithm, Clustering Algorithms, Numerical Examples

The rough clustering algorithm we proposed based on the optimization of an objective function (RCM) has a problem because conventional rough clustering algorithm results do not ensure that solutions are optimal. To solve this problem, we propose rough clustering algorithms based on the optimization of an objective function with fuzzy-set representation. This yields more flexible results than RCM. We verify the effectiveness of the algorithms through numerical examples.

Fuzzy c-Means for Data with Rectangular Maximum Tolerance Range

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol 12 (5), pp. 461-466, 2008. DOI: 10.20965/jaciii.2008.p0461. Cited By ~ 8

Author(s): Yasunori Endo, Yasushi Hasegawa, Yukihiro Hamasuna, Sadaaki Miyamoto, ...

Keyword(s): Optimization Problems, Clustering Algorithms, Broad Sense, Tolerance Range, Numerical Examples, Maximum Tolerance

This paper provides new clustering algorithms for data with tolerance. Tolerance is understood in a broad sense, e.g., calculation errors and missing attributes of data. The concept of tolerance is modified using the new concept of a tolerance vector. First, the concept is explained and optimization problems of clustering are formulated using the vectors. Second, the problems are solved using the Karush-Kuhn-Tucker conditions. Third, new clustering algorithms are constructed using the solutions of the problems. Finally, the effectiveness of the proposed algorithms is verified through numerical examples.

K-Means Clustering Algorithms to Compute Software Effort Estimation

Journal of Computational and Theoretical Nanoscience, Vol 13 (10), pp. 7093-7098, 2016. DOI: 10.1166/jctn.2016.5676. Cited By ~ 1

Author(s): Shivakumar Nagarajan, Balaji Narayanan

Keyword(s): Neural Network, Software Development, Clustering Algorithms, Uncertain Data, Accurate Estimation, Development Effort, Effort Estimation, Software Effort Estimation, Software Projects, Software Development Effort

Software development effort estimation is the process of predicting effort in order to improve software economics. Accurate estimation of effort is one of the most tedious tasks in software projects, although several methods are used to estimate software development effort. Imprecise estimation can lead to project failure due to uncertain data. In this paper, a hybrid model based on a combination of Particle Swarm Optimization (PSO), K-means clustering algorithms, a neural network, and the ABE method is proposed. The proposed method can be useful for obtaining better clustering and more accurate estimation, given the difficulties caused by clustering and outliers in software projects. The obtained results show better clustering, which in turn provides accurate estimation results. Neural network and analogy methods are then used, which enhance the accuracy significantly.
