Fuzzy Clustering of Incomplete Data by Means of Similarity Measures

Context. In most clustering (unsupervised classification) tasks that involve real data, the initial information is distorted by abnormal outliers (noise) and gaps. "Classical" methods of artificial intelligence (both batch and online) are ineffective in this situation. Objective. The goal of the work is to propose a procedure for credibilistic fuzzy clustering of distorted, incomplete data, based on credibility theory and a similarity measure of a special type. Method. The procedure relies on both robust goal functions of a special type and similarity measures that are insensitive to outliers. It is designed to work both in batch mode and in a recurrent online version intended for Data Stream Mining problems, in which data are fed for processing sequentially in real time. Results. The introduced methods are simple to implement numerically and are free from the drawbacks inherent in traditional probabilistic and possibilistic fuzzy clustering methods when the data are distorted by abnormal outliers (noise) and gaps. Conclusions. The conducted experiments confirmed the effectiveness of the proposed methods of credibilistic fuzzy clustering of distorted data and allow them to be recommended for practical use in the automatic clustering of distorted data. The proposed method is intended for use in hybrid systems of computational intelligence and, above all, in the problems of learning artificial neural networks and neuro-fuzzy systems, as well as in the problems of clustering and classification.

A new iterative fuzzy clustering approach for incomplete data

Journal of Statistics and Management Systems ◽

10.1080/09720510.2020.1714150 ◽

2020 ◽

Vol 23 (1) ◽

pp. 91-102 ◽

Cited By ~ 1

Author(s):

Sonia Goel ◽

Meena Tushir

Keyword(s):

Fuzzy Clustering ◽

Incomplete Data ◽

Clustering Approach

A Study of Different Similarity Measures on the Performance of Fuzzy Clustering

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i4.168173 ◽

2018 ◽

Vol 6 (4) ◽

pp. 168-173

Author(s):

O.A. Mohamed Jafar ◽

Keyword(s):

Fuzzy Clustering ◽

Similarity Measures

Fuzzy Clustering of Incomplete Data Based on Cluster Dispersion

Computational Intelligence for Knowledge-Based Systems Design - Lecture Notes in Computer Science ◽

10.1007/978-3-642-14049-5_7 ◽

2010 ◽

pp. 59-68 ◽

Cited By ~ 11

Author(s):

Ludmila Himmelspach ◽

Stefan Conrad

Keyword(s):

Fuzzy Clustering ◽

Incomplete Data

Simultaneous Application of Fuzzy Clustering and Quantification with Incomplete Categorical Data

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2004.p0397 ◽

2004 ◽

Vol 8 (4) ◽

pp. 397-402 ◽

Cited By ~ 7

Author(s):

Katsuhiro Honda ◽

Yoshihito Nakamura ◽

Hidetomo Ichihashi

Keyword(s):

Principal Component Analysis ◽

Loss Function ◽

Fuzzy Clustering ◽

Categorical Data ◽

Incomplete Data ◽

Numerical Experiments ◽

Missing Values ◽

Principal Component ◽

Homogeneity Analysis ◽

Simultaneous Application

This paper proposes the simultaneous application of homogeneity analysis and fuzzy clustering with incomplete data. Taking into account the similarity between the loss function for homogeneity analysis and the least squares criterion for principal component analysis, we define the new objective function in a formulation similar to linear fuzzy clustering with missing values. Numerical experiments demonstrate the feasibility of the proposed method.

Different Approaches for Missing Data Handling in Fuzzy Clustering: A Review

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽

10.2174/2352096512666191127121710 ◽

2020 ◽

Vol 13 (6) ◽

pp. 833-846

Author(s):

Sonia Goel ◽

Meena Tushir

Keyword(s):

Missing Data ◽

Fuzzy Clustering ◽

Incomplete Data ◽

Clustering Algorithm ◽

Linear Interpolation ◽

Performance Criteria ◽

Data Sets ◽

Data Set ◽

Fcm Clustering ◽

Missing Attributes

Introduction: Incomplete data sets containing missing attributes are a prevailing problem in many research areas. There may be several reasons for missing attributes: human error in tabulating or recording the data, machine failure, errors in data acquisition, or refusal of a patient or customer to answer a few questions in a questionnaire or survey. Clustering such data sets is therefore a challenge. Objective: In this paper, we present a critical review of various methodologies proposed for handling missing data in clustering. The focus is a comparison of FCM clustering based on various imputation techniques with the four clustering strategies proposed by Hathway and Bezdek. Methods: We impute the missing values in incomplete data sets using various imputation/non-imputation techniques to complete the data set, and then apply the conventional fuzzy clustering algorithm to obtain the clustering results. Results: Experiments are carried out on various synthetic data sets and on real data sets from the UCI repository. To evaluate the performance of the various imputation/non-imputation based FCM clustering algorithms, several performance criteria and statistical tests are considered. Experimental results on various data sets show that linear interpolation based FCM clustering performs significantly better than the other imputation and non-imputation techniques. Conclusion: It is concluded that the clustering algorithm is data specific: no clustering technique can give good results on all data sets. Performance depends on both the data type and the percentage of missing attributes in the data set. Through this study, we have shown that the linear interpolation based FCM clustering algorithm can be used effectively for clustering incomplete data sets.
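
The impute-then-cluster pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `interpolate_missing` and `fcm`, the fuzzifier `m = 2`, the toy data, and the choice to interpolate each attribute along the row order of the data set are all assumptions made for illustration. The abstract does not specify the exact interpolation scheme.

```python
import numpy as np

def interpolate_missing(X):
    """Fill NaNs column by column with linear interpolation over the row index.

    Assumption: the row order is meaningful (e.g. samples are ordered).
    At the edges, np.interp holds the nearest observed value.
    """
    X = np.asarray(X, dtype=float).copy()
    idx = np.arange(X.shape[0])
    for j in range(X.shape[1]):
        col = X[:, j]                      # view into X, so writes update X
        known = ~np.isnan(col)
        if not known.any():
            raise ValueError(f"column {j} has no observed values")
        col[~known] = np.interp(idx[~known], idx[known], col[known])
    return X

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Conventional fuzzy c-means on complete data; returns (U, V)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)      # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]            # cluster centers
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                        # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)      # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

# Toy data (hypothetical): two well-separated groups with two missing entries.
X_demo = np.array([
    [0.0, 0.1], [np.nan, 0.2], [0.2, 0.3],
    [10.0, 10.1], [10.1, np.nan], [10.2, 10.3],
])
X_filled = interpolate_missing(X_demo)
U, V = fcm(X_filled, c=2)
print(U.argmax(axis=1))   # hard labels derived from the fuzzy memberships
```

Because imputation happens once, before clustering, any conventional FCM routine can be substituted for `fcm` here. The non-imputation strategies mentioned in the abstract instead modify the clustering step itself and are not covered by this sketch.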

A study on a fuzzy clustering for mixed numerical and categorical incomplete data

2013 International Conference on Fuzzy Theory and Its Applications (iFUZZY) ◽

10.1109/ifuzzy.2013.6825477 ◽

2013 ◽

Cited By ~ 2

Author(s):

Takashi Furukawa ◽

Shin-ichi Ohnishi ◽

Takahiro Yamanoi

Keyword(s):

Fuzzy Clustering ◽

Incomplete Data

Fuzzy clustering of incomplete data based on missing attribute interval size

2015 IEEE 9th International Conference on Anti-counterfeiting, Security, and Identification (ASID) ◽

10.1109/icasid.2015.7405670 ◽

2015 ◽

Cited By ~ 2

Author(s):

Li Zhang ◽

Baoxing Li ◽

Liyong Zhang ◽

Dawei Li

Keyword(s):

Fuzzy Clustering ◽

Incomplete Data ◽

Interval Size

A Comparative Study on TIBA Imputation Methods in FCMdd-Based Linear Clustering with Relational Data

Advances in Fuzzy Systems ◽

10.1155/2011/265170 ◽

2011 ◽

Vol 2011 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Takeshi Yamamoto ◽

Katsuhiro Honda ◽

Akira Notsu ◽

Hidetomo Ichihashi

Keyword(s):

Comparative Study ◽

Iterative Algorithm ◽

Fuzzy Clustering ◽

Incomplete Data ◽

Numerical Experiments ◽

Missing Values ◽

Relational Data ◽

Imputation Methods ◽

Clustering Model ◽

Linear Cluster

Relational fuzzy clustering was developed to extract intrinsic cluster structures from relational data, and it has been extended to a linear fuzzy clustering model based on the Fuzzy c-Medoids (FCMdd) concept. In that model, a Fuzzy c-Means (FCM)-like iterative algorithm is performed by defining linear cluster prototypes using two representative medoids for each line prototype. In this paper, the FCMdd-type linear clustering model is further modified to handle incomplete data with missing values, and the applicability of several imputation methods is compared. Several numerical experiments demonstrate that some pre-imputation strategies contribute to properly selecting the representative medoids of each cluster.

On Cluster Validity for Fuzzy Clustering of Incomplete Data

Lecture Notes in Computer Science - Scalable Uncertainty Management ◽

10.1007/978-3-642-33362-0_50 ◽

2012 ◽

pp. 612-618

Author(s):

Ludmila Himmelspach ◽

João Paulo Carvalho ◽

Stefan Conrad

Keyword(s):

Fuzzy Clustering ◽

Incomplete Data ◽

Cluster Validity
