On Entropy Based Fuzzy Non Metric Model – Proposal, Kernelization and Pairwise Constraints –

Author(s):  
Yasunori Endo ◽  

The fuzzy non metric model is a kind of clustering method in which belongingness or the membership grade of each datum to each cluster is calculated directly from dissimilarities between data, and cluster centers are not used. In this paper, we first construct a new fuzzy non metric model with entropy regularization. Second, we kernelize the proposed method by introducing kernel functions. Third, we consider pairwise constraints with the proposed method. We then confirm the above methods through some simple numerical examples.
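As a rough illustration of how memberships can be computed directly from dissimilarities without cluster centers, the sketch below minimizes an entropy-regularized objective of the form sum_i sum_k sum_t u_ki u_ti r_kt + (1/lam) sum_k sum_i u_ki log u_ki, subject to each object's memberships summing to 1. This objective and the resulting softmax-style update are assumptions about a typical entropy-regularized formulation, not details taken from the abstract. The names `entropy_fnm`, `lam`, `n_iter`, and `seed` are hypothetical.

```python
import math
import random


def entropy_fnm(r, n_clusters, lam=5.0, n_iter=100, seed=0):
    """Sketch of an entropy-regularized fuzzy non-metric clustering.

    r is a symmetric n x n dissimilarity matrix (list of lists).
    Returns u, where u[k][i] is the membership of object k in cluster i.
    The update rule is an assumed form, not the paper's stated algorithm.
    """
    rng = random.Random(seed)
    n = len(r)

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    for _ in range(n_iter):
        for k in range(n):
            # m[i]: membership-weighted dissimilarity between object k and
            # the objects currently assigned to cluster i (no cluster center).
            m = [sum(u[t][i] * r[k][t] for t in range(n))
                 for i in range(n_clusters)]
            # Softmax over -lam * m; subtracting the minimum avoids underflow.
            m_min = min(m)
            w = [math.exp(-lam * (mi - m_min)) for mi in m]
            total = sum(w)
            u[k] = [wi / total for wi in w]
    return u


# Usage: two well-separated groups on a line, squared differences as dissimilarities.
xs = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
r = [[(a - b) ** 2 for b in xs] for a in xs]
u = entropy_fnm(r, n_clusters=2)
```

A larger `lam` makes the memberships crisper, while a smaller `lam` pushes them toward uniform values.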

Author(s):  
Yasunori Endo ◽  
Ayako Heki ◽  
Yukihiro Hamasuna ◽  
...  

The non metric model is a kind of clustering method in which the belongingness, or membership grade, of each object in each cluster is calculated directly from dissimilarities between objects and in which cluster centers are not used. The clustering field has recently begun to focus on rough set representation instead of fuzzy set representation. Conventional clustering algorithms classify a set of objects into clusters with clear boundaries, that is, each object must belong to exactly one cluster. Many objects in the real world, however, belong to more than one cluster because cluster boundaries overlap each other. Fuzzy set representation of clusters makes it possible for each object to belong to more than one cluster. The fuzzy degree of membership may, however, be too descriptive for interpreting clustering results. Rough set representation handles such cases. Clustering based on rough sets could provide a solution that is less restrictive than conventional clustering and less descriptive than fuzzy clustering. This paper covers two types of Rough-set-based Non Metric model (RNM). One algorithm is the Rough-set-based Hard Non Metric model (RHNM) and the other is the Rough-set-based Fuzzy Non Metric model (RFNM). In both algorithms, clusters are represented by rough sets and each cluster consists of lower and upper approximations. The effectiveness of the proposed algorithms is evaluated through numerical examples.


Author(s):  
Yukihiro Hamasuna ◽  
Yasunori Endo ◽  

This paper proposes entropy-based L1-regularized possibilistic clustering and a method of sequential cluster extraction from relational data. Sequential cluster extraction means that the algorithm extracts clusters one by one. The assignment prototype algorithm is a typical clustering method for relational data. The membership degree of each object to each cluster is calculated directly from dissimilarities between objects. An entropy-based L1-regularized possibilistic assignment prototype algorithm is proposed first to induce belongingness for a membership grade. An algorithm of sequential cluster extraction based on the proposed method is then constructed, and the effectiveness of the proposed methods is shown through numerical examples.


Author(s):  
Yasunori Endo ◽  
Tomoyuki Suzuki ◽  
Naohiko Kinoshita ◽  
Yukihiro Hamasuna ◽  
...  

The fuzzy non-metric model (FNM) is a representative non-hierarchical clustering method, which is very useful because the belongingness or the membership degree of each datum to each cluster can be calculated directly from the dissimilarities between data and the cluster centers are not used. However, the original FNM cannot handle data with uncertainty. In this study, we refer to the data with uncertainty as “uncertain data,” e.g., incomplete data or data that have errors. Previously, a method was proposed based on the concept of a tolerance vector for handling uncertain data, and some clustering methods were constructed according to this concept, e.g., fuzzy c-means for data with tolerance. These methods can handle uncertain data in the framework of optimization. Thus, in the present study, we apply the concept to FNM. First, we propose a new clustering algorithm based on FNM using the concept of tolerance, which we refer to as the fuzzy non-metric model for data with tolerance. Second, we show that the proposed algorithm can handle incomplete data sets. Third, we verify the effectiveness of the proposed algorithm based on comparisons with conventional methods for incomplete data sets in some numerical examples.


2019 ◽  
Vol 15 (1) ◽  
pp. 19-38
Author(s):  
Toshihiro Osaragi

It is necessary to classify numerical values of spatial data when representing them on a map so that, visually, they can be understood as clearly as possible. Inevitably some loss of information from the original data occurs in the process of this classification. A great loss of information might lead to a misunderstanding of the nature of the original data. At the same time, when we seek to understand the spatial distribution of attribute values, forming spatial clusters is regarded as an effective means, in which values can be regarded as statistically equivalent and are distributed continuously within the same patches. In this study, a classification method for organizing spatial data is proposed, in which any loss of information is minimized. Also, a spatial clustering method based on Akaike's Information Criterion is proposed. Some numerical examples of their applications are shown using actual spatial data for the Tokyo metropolitan area.


Author(s):  
Vicenç Torra ◽  
Yasuo Narukawa ◽  
Mark Daumas

This issue features decision making and other tools used in artificial intelligence applications. More specifically, the issue includes five papers focused on aggregation operators and clustering. The series starts with a paper by Yoshida on weighted quasiarithmetic means that focuses on their monotonicity viewed from utility and weighting functions. In the second paper, Nohmi, Honda and Okazaki focus on trust evaluation for networks, studying matrix operations based on t-norms and t-conorms. The authors also propose fuzzy graphs using adjacent matrices. These works are followed by three on fuzzy clustering. Kanzawa, Endo and Miyamoto present a variation of fuzzy c-means based on kernel functions in an approach developed for data with tolerance. Endo covers clustering using kernel functions. The paper is based on a fuzzy nonmetric model including pairwise constraints in the clustering process. The concluding paper also uses pairwise constraints, but within agglomerative hierarchical clustering. Hamasuna, Endo and Miyamoto include clusterwise tolerance in their model. As the editors of this issue, we would like to thank the referees for their work in the reviews and journal editors-in-chief Profs. Toshio Fukuda and Kaoru Hirota and the journal staff for their support.


Author(s):  
Kei Kitajima ◽  
Yasunori Endo ◽  
Yukihiro Hamasuna ◽  
...  

Clustering is a method of data analysis without the use of supervised data. Even-sized clustering based on optimization (ECBO) is a clustering algorithm that focuses on cluster size with the constraint that cluster sizes must be the same. However, this constraint makes ECBO inconvenient to apply in cases where a certain margin of cluster size is allowed. It is believed that this issue can be overcome by applying a fuzzy clustering method. Fuzzy clustering can represent the membership of data to clusters more flexibly. In this paper, we propose a new even-sized clustering algorithm based on fuzzy clustering and verify its effectiveness through numerical examples.


Author(s):  
Sadaaki Miyamoto ◽  
Youichi Nakayama ◽  

We discuss hard c-means clustering using a mapping into a high-dimensional space considered within the theory of support vector machines. Two types of iterative algorithms are developed. Effectiveness of the proposed method is shown by numerical examples.
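To make the idea of hard c-means in a high-dimensional feature space concrete, the following is a minimal kernel-based sketch. It relies on the standard identity that the squared distance from a mapped point to a cluster mean can be written using kernel values only: K[k][k] - (2/|G|) sum_{j in G} K[k][j] + (1/|G|^2) sum_{j,l in G} K[j][l]. The Gaussian kernel, the deterministic initialization, and the function names are assumptions made for illustration, and this sketch is not claimed to match either of the two algorithms developed in the paper.

```python
import math


def rbf_kernel_matrix(xs, sigma=1.0):
    """Gaussian kernel matrix for a list of scalars (an assumed kernel choice)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs]
            for a in xs]


def kernel_hard_c_means(K, n_clusters, n_iter=50):
    """Hard c-means in feature space using only the kernel matrix K."""
    n = len(K)
    # Simple deterministic start (hypothetical): round-robin labels.
    labels = [k % n_clusters for k in range(n)]
    for _ in range(n_iter):
        groups = [[k for k in range(n) if labels[k] == i]
                  for i in range(n_clusters)]
        # Squared norm of each cluster mean in feature space.
        within = [sum(K[j][l] for j in g for l in g) / len(g) ** 2 if g else 0.0
                  for g in groups]
        new_labels = []
        for k in range(n):
            best, best_d = labels[k], math.inf
            for i, g in enumerate(groups):
                if not g:
                    continue  # skip empty clusters
                d = K[k][k] - 2.0 * sum(K[k][j] for j in g) / len(g) + within[i]
                if d < best_d:
                    best, best_d = i, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
    return labels


# Usage: two well-separated groups of scalars.
xs = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels = kernel_hard_c_means(rbf_kernel_matrix(xs), n_clusters=2)
```

Because cluster means are never formed explicitly, the same loop works for any positive semidefinite kernel matrix.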


2020 ◽  
Vol 34 (10) ◽  
pp. 13863-13864
Author(s):  
Ting-En Lin ◽  
Hua Xu ◽  
Hanlei Zhang

Discovering new user intents is an emerging task in the dialogue system. In this paper, we propose a self-supervised clustering method that can naturally incorporate pairwise constraints as prior knowledge to guide the clustering process and does not require intensive feature engineering. Extensive experiments on three benchmark datasets show that our method can yield significant improvements over strong baselines.


Author(s):  
Yukihiro Hamasuna ◽  
Yasunori Endo ◽  
Sadaaki Miyamoto ◽  

This paper presents a semi-supervised agglomerative hierarchical clustering algorithm using clusterwise tolerance based pairwise constraints. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link, are frequently used in order to improve clustering properties. In that sense, we propose another approach, named clusterwise tolerance based pairwise constraints, to handle must-link and cannot-link constraints in L2-space. In addition, we propose a semi-supervised agglomerative hierarchical clustering algorithm based on it. Moreover, we show the effectiveness of the proposed method through numerical examples.
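The must-link and cannot-link constraints themselves can be illustrated independently of the clusterwise tolerance approach, which the abstract does not spell out. The sketch below only shows the generic bookkeeping: must-link pairs are merged transitively with a union-find structure, and any cannot-link pair that ends up inside one merged group is reported as a conflict. The function name `check_constraints` and its interface are hypothetical.

```python
def check_constraints(n, must_link, cannot_link):
    """Merge must-link pairs and report cannot-link pairs they contradict.

    n is the number of objects, indexed 0..n-1.
    Returns (groups, violations): the must-link groups as sorted lists,
    and the cannot-link pairs whose members fall in the same group.
    """
    parent = list(range(n))

    def find(x):
        # Find the group representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Must-link is transitive, so union the endpoints of every pair.
    for a, b in must_link:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    violations = [(a, b) for a, b in cannot_link if find(a) == find(b)]

    groups = {}
    for k in range(n):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values()), violations


# Usage: objects 0, 1, 2 are chained by must-link, so cannot-link (0, 2) conflicts.
groups, violations = check_constraints(
    5, must_link=[(0, 1), (1, 2)], cannot_link=[(0, 2), (3, 4)]
)
```

In an agglomerative setting, the merged groups could serve as initial clusters, and a merge would be rejected whenever it joins the two sides of a cannot-link pair.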

