Simultaneous Application of Fuzzy Clustering and Quantification with Incomplete Categorical Data

Author(s):  
Katsuhiro Honda ◽  
Yoshihito Nakamura ◽  
Hidetomo Ichihashi

This paper proposes the simultaneous application of homogeneity analysis and fuzzy clustering to incomplete data. Taking into account the similarity between the loss function for homogeneity analysis and the least squares criterion for principal component analysis, we define a new objective function in a formulation similar to that of linear fuzzy clustering with missing values. Numerical experiments demonstrate the feasibility of the proposed method.

2011 ◽  
Vol 2011 ◽  
pp. 1-10 ◽  
Author(s):  
Takeshi Yamamoto ◽  
Katsuhiro Honda ◽  
Akira Notsu ◽  
Hidetomo Ichihashi

Relational fuzzy clustering was developed to extract intrinsic cluster structures from relational data and was extended to a linear fuzzy clustering model based on the Fuzzy c-Medoids (FCMdd) concept, in which a Fuzzy c-Means (FCM)-like iterative algorithm is performed by defining linear cluster prototypes using two representative medoids for each line prototype. In this paper, the FCMdd-type linear clustering model is further modified to handle incomplete data that include missing values, and the applicability of several imputation methods is compared. Several numerical experiments demonstrate that some pre-imputation strategies contribute to properly selecting the representative medoids of each cluster.


2019 ◽  
Vol 13 ◽  
pp. 174830261986744
Author(s):  
Ran Zhang ◽  
Bin Ye ◽  
Peng Liu

Datasets containing a very large number of variables or features are now routinely generated in many fields. Dimension reduction techniques are usually applied before statistical analysis of these datasets in order to avoid the effects of the curse of dimensionality. Principal component analysis is one of the most important techniques for dimension reduction and data visualization. However, missing values, which arise in almost every field, produce biased estimates and are difficult to handle, especially in high-dimension, low-sample-size settings. By exploiting a Lasso estimator of the population covariance matrix, we propose to regularize principal component analysis to reduce the dimensionality of datasets with missing data. The Lasso estimator of the covariance matrix is computationally tractable, as it is obtained by solving a convex optimization problem. To illustrate the effectiveness of our method for dimension reduction, the principal component directions are evaluated using the Frobenius norm and cosine distance. Its performance is compared with that of other incomplete-data handling methods such as mean substitution and multiple imputation. Simulation results also show that our method is superior to other incomplete-data handling methods in the context of discriminant analysis of real-world high-dimensional datasets.
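As a rough illustration of the evaluation described in this abstract, the sketch below implements only the mean-substitution baseline, not the Lasso covariance estimator the authors propose. It imputes missing entries, estimates the leading principal component direction by power iteration, and compares two directions by cosine distance. All function names and the toy data are assumptions made for illustration.

```python
import math

def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column."""
    means = []
    for col in zip(*rows):
        seen = [v for v in col if v is not None]
        means.append(sum(seen) / len(seen))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def covariance(rows):
    """Sample covariance matrix (denominator n - 1) of complete data."""
    n, p = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]

def leading_direction(cov, iters=500):
    """Leading eigenvector by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector
    and that the covariance matrix is not all zeros.
    """
    p = len(cov)
    v = [1.0 + j for j in range(p)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def cosine_distance(u, v):
    """1 - |cos(angle)|; the absolute value ignores the arbitrary sign of a PC."""
    dot = abs(sum(a * b for a, b in zip(u, v)))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

# Toy data: the second column is roughly twice the first.
complete = [[1, 2.0], [2, 4.1], [3, 5.9], [4, 8.2], [5, 9.8]]
incomplete = [[1, 2.0], [2, 4.1], [3, None], [4, 8.2], [5, 9.8]]

d_true = leading_direction(covariance(complete))
d_imputed = leading_direction(covariance(mean_impute(incomplete)))
gap = cosine_distance(d_true, d_imputed)
```

A small `gap` means the direction recovered from the imputed data is close to the one from the complete data. With heavier missingness, mean substitution shrinks the variance estimates, which is the kind of bias the abstract refers to.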


Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1498
Author(s):  
Karel J. in’t Hout ◽  
Jacob Snoeijer

We study the principal component analysis based approach introduced by Reisinger and Wittum (2007) and the comonotonic approach considered by Hanbali and Linders (2019) for the approximation of American basket option values via multidimensional partial differential complementarity problems (PDCPs). Both approximation approaches require the solution of just a limited number of low-dimensional PDCPs. It is demonstrated by ample numerical experiments that they define approximations that lie close to each other. Next, an efficient discretisation of the pertinent PDCPs is presented that leads to a favourable convergence behaviour.


Talanta ◽  
2007 ◽  
Vol 72 (1) ◽  
pp. 172-178 ◽  
Author(s):  
I STANIMIROVA ◽  
M DASZYKOWSKI ◽  
B WALCZAK

Kursor ◽  
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Annisa Eka Haryati ◽  
Sugiyarto Sugiyarto ◽  
Rizki Desi Arindra Putri

Multivariate statistics faces problems related to large data dimensions. One method that can be used to address them is principal component analysis (PCA), a technique that reduces the dimension of data consisting of several dependent variables while retaining the variance in the data. PCA can be used to stabilize measurements in statistical analyses, one of which is cluster analysis. Fuzzy clustering is a grouping method based on membership values that uses fuzzy sets as the weighting basis for grouping. In this study, the fuzzy clustering methods used are Fuzzy Subtractive Clustering (FSC) and Fuzzy C-Means (FCM), combined with the Minkowski-Chebyshev distance. The purpose of this study was to compare the cluster results obtained from FSC and FCM using the DBI validity index. The results indicate that clustering with FCM is better than clustering with FSC.
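For readers unfamiliar with FCM, the following is a minimal sketch of the standard alternating updates of memberships and cluster centers. It assumes plain Euclidean distance rather than the combined Minkowski-Chebyshev distance used in the study, and the function name, parameters, and toy data are illustrative only.

```python
import math
import random

def fcm(points, c, m=2.0, iters=100, seed=0):
    """Standard Fuzzy C-Means with Euclidean distance (illustrative sketch).

    points: list of equal-length numeric tuples
    c:      number of clusters
    m:      fuzzifier (> 1); larger values give softer memberships
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, c)]
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_ki proportional to d_ki^(-2/(m-1)), rows sum to 1.
        u = []
        for p in points:
            d = [math.dist(p, v) for v in centers]
            if min(d) == 0.0:
                # Point coincides with a center: split membership among those centers.
                hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
                s = sum(hits)
                u.append([h / s for h in hits])
            else:
                inv = [dj ** (-2.0 / (m - 1.0)) for dj in d]
                s = sum(inv)
                u.append([x / s for x in inv])
        # Center update: mean of the points weighted by u_ki^m.
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            centers[i] = [sum(wk * p[j] for wk, p in zip(w, points)) / total
                          for j in range(dim)]
    return centers, u

# Two well-separated toy groups.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships = fcm(data, c=2)
labels = [row.index(max(row)) for row in memberships]
```

Each row of `memberships` sums to 1, and `labels` assigns every point to its highest-membership cluster. Swapping in another distance changes the membership update directly, but the weighted-mean center update is derived for squared Euclidean distance and would need to be revisited.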

