A Fast Multiobjective Fuzzy Clustering with Multimeasures Combination

Most existing clustering algorithms rely on the Euclidean distance measure. However, the Euclidean distance alone may not be sufficient to partition datasets with different structures, so it is necessary to combine multiple distance measures in clustering. Because the weights for different distance measures are hard to set, it is natural to keep the distance measures separate and optimize them simultaneously with a multiobjective optimization technique. Recently, a clustering algorithm called ‘multiobjective evolutionary clustering based on combining multiple distance measures’ (MOECDM) was proposed to integrate the Euclidean and Path distance measures for partitioning datasets with different structures. However, it is time-consuming because of its large-sized genes. This paper proposes a fast multiobjective fuzzy clustering algorithm for partitioning datasets with different structures. In this algorithm, a real encoding scheme is adopted to represent each individual. Two fuzzy clustering objective functions, based on the Euclidean and Path distance measures respectively, are designed to evaluate the goodness of each individual. An improved evolutionary operator is also introduced to increase the convergence speed and the diversity of the population. In the final generation, a set of nondominated solutions is obtained. The best solution and the best distance measure are selected using a semisupervised method. An updated algorithm is also designed to detect the optimal cluster number automatically. The proposed algorithms are applied to many datasets with different structures, and the results on eight artificial and six real-life datasets are shown in the experiments. Experimental results show that the proposed algorithms not only successfully partition datasets with different structures but also reduce the computational cost.
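The two ideas in this abstract, a second distance measure that follows the data's structure and a final set of nondominated solutions, can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes the Path distance is the minimax path distance (the smallest achievable "largest hop" between two points), which the abstract does not confirm, and the function names are hypothetical.

```python
import numpy as np

def euclidean_matrix(X):
    """Pairwise Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def minimax_path_matrix(D):
    """Path distance under an ASSUMED minimax definition: for each pair,
    the smallest possible value of the largest edge on a connecting path."""
    P = D.copy()
    for k in range(len(P)):
        # Floyd-Warshall update with (min, max) in place of (min, +).
        via_k = np.maximum(P[:, k][:, None], P[k, :][None, :])
        P = np.minimum(P, via_k)
    return P

def nondominated(objectives):
    """Indices of solutions that no other solution dominates
    (every objective is minimized)."""
    F = np.asarray(objectives, dtype=float)
    keep = []
    for i, f in enumerate(F):
        dominated = np.any(np.all(F <= f, axis=1) & np.any(F < f, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Four points on a chain: far apart in Euclidean terms, close along the path.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
D = euclidean_matrix(X)
P = minimax_path_matrix(D)
print(D[0, 3], P[0, 3])  # 3.0 1.0

# Hypothetical (Euclidean objective, Path objective) values for four candidates.
front = nondominated([[1, 5], [2, 2], [3, 3], [5, 1]])
print(front)  # [0, 1, 3]
```

The chain example shows why the two measures can disagree on elongated clusters, which is the motivation for keeping them as separate objectives rather than fixing weights in advance.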


A convolution-based distance measure for fuzzy singletons and its application in a pattern recognition problem

Integrated Computer-Aided Engineering ◽

10.3233/ica-200629 ◽

2020 ◽

Vol 28 (1) ◽

pp. 51-63 ◽

Cited By ~ 2

Author(s):

Rodrigo Naranjo ◽

Matilde Santos ◽

Luis Garmendia

Keyword(s):

Pattern Recognition ◽

Fuzzy Number ◽

Euclidean Distance ◽

Distance Measure ◽

Distance Measures ◽

Trading System ◽

Weighted Distance ◽

Pattern Recognition Problem ◽

Euclidean Distance Measure ◽

Desirable Behavior

A new method to measure the distance between fuzzy singletons (FSNs) is presented. It first fuzzifies a crisp number into a generalized trapezoidal fuzzy number (GTFN) using the Mamdani fuzzification method. It then treats an FSN as an impulse signal and transforms the FSN into a new GTFN by convolving it with the original GTFN. In this way, an existing distance measure for GTFNs can be used to measure the distance between FSNs. The new measure is shown to behave more desirably than the Euclidean and weighted distance measures in the following sense: under the new measure, the distance between two FSNs is larger when they are in different GTFNs and smaller when they are in the same GTFN. The advantage of the new measure is demonstrated on a fuzzy forecasting trading system over two different real stock markets, where it provides better predictions and larger profits than the same system using the Euclidean distance measure.
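The convolution step can be illustrated with a short sketch. It shows only the general signal-processing fact the method relies on, namely that convolving a shape with a unit impulse translates the shape to the impulse's position. The grid, the trapezoid parameters and the function name are hypothetical, and the paper's fuzzification and GTFN distance are not reproduced.

```python
import numpy as np

def trapezoid(x, a, b, c, d, w=1.0):
    """Membership of a generalized trapezoidal fuzzy number (a, b, c, d; w).
    Assumes a < b <= c < d so that both slopes are well defined."""
    x = np.asarray(x, dtype=float)
    rising = (x - a) / (b - a)
    falling = (d - x) / (d - c)
    return w * np.clip(np.minimum(rising, falling), 0.0, 1.0)

x = np.arange(0, 11)                  # integer grid 0..10
gtfn = trapezoid(x, 2, 4, 6, 8)       # hypothetical GTFN sampled on the grid

impulse = np.zeros(5)
impulse[3] = 1.0                      # discrete unit impulse at index 3

# Full discrete convolution: the result is the trapezoid shifted by 3 samples.
shifted = np.convolve(gtfn, impulse)

print(np.allclose(shifted[3:3 + len(gtfn)], gtfn))  # True
```

Because each singleton becomes a trapezoid centered on its own position, any distance already defined for trapezoidal fuzzy numbers can then be applied to the shifted shapes.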


Clustering of OECD Countries Out of Pocket Health Expenditure Time Series Data

Research in Applied Economics ◽

10.5296/rae.v8i2.9377 ◽

2016 ◽

Vol 8 (2) ◽

pp. 23

Author(s):

Songul Cinaroglu

Keyword(s):

Euclidean Distance ◽

Time Series Data ◽

Distance Measure ◽

Health Expenditures ◽

OECD Countries ◽

Distance Measures ◽

Series Data ◽

Longest Common Subsequences ◽

Study Results ◽

Euclidean Distance Measure

Out-of-pocket health expenditures are the payments made by households at the point where they receive health services. These frequently include doctor consultation fees, medication purchases and hospital bills. In this study, hierarchical clustering was used to classify the 34 member countries of the OECD (Organization for Economic Cooperation and Development) in terms of out-of-pocket health expenditures for the years 1995-2011. Longest common subsequences (LCS), the correlation coefficient and the Euclidean distance were used as measures of similarity and distance in hierarchical clustering. The analysis found that the LCS and Euclidean distance measures were the best for determining clusters. Furthermore, the study results clarify how OECD countries group according to health expenditures.
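A small sketch may help show how an LCS-based dissimilarity differs from the Euclidean distance on time series. It assumes a common variant in which two real values "match" when they lie within a tolerance `eps`; the abstract does not state which LCS variant the study used. The series values are made up for illustration.

```python
def lcs_length(a, b, eps=0.5):
    """Length of the longest common subsequence of two numeric series,
    counting two values as a match when they differ by at most eps."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]

def lcs_distance(a, b, eps=0.5):
    """Dissimilarity in [0, 1]: 0 when the shorter series is fully matched."""
    return 1.0 - lcs_length(a, b, eps) / min(len(a), len(b))

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical yearly out-of-pocket shares (%) for three countries.
country_a = [20.1, 19.8, 19.5, 19.0]
country_b = [20.0, 19.9, 19.4, 18.9]
country_c = [35.0, 34.2, 33.9, 33.1]

print(lcs_distance(country_a, country_b))  # 0.0
print(lcs_distance(country_a, country_c))  # 1.0
```

Either dissimilarity can be evaluated for every pair of countries to build the distance matrix that a hierarchical clustering routine takes as input.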


Kernel-Based Robust Bias-Correction Fuzzy Weighted C-Ordered-Means Clustering Algorithm

Symmetry ◽

10.3390/sym11060753 ◽

2019 ◽

Vol 11 (6) ◽

pp. 753

Author(s):

Wenyuan Zhang ◽

Xijuan Guo ◽

Tianyu Huang ◽

Jiale Liu ◽

Jun Chen

Keyword(s):

Bias Correction ◽

Euclidean Distance ◽

Clustering Algorithm ◽

Distance Measure ◽

Similarity Measures ◽

Distance Measures ◽

Background Information ◽

Local Similarity ◽

Original Algorithm ◽

FCM Clustering

The spatially constrained fuzzy C-means (FCM) clustering algorithm is effective for image segmentation. Its background information improves its insensitivity to noise to some extent. However, the membership degree based on the Euclidean distance is not suitable for revealing the non-Euclidean structure of input data, since it still lacks robustness to noise and outliers. To overcome this problem, this paper proposes a new kernel-based algorithm built on a kernel-induced distance measure, called the Kernel-based Robust Bias-correction Fuzzy Weighted C-ordered-means Clustering Algorithm (KBFWCM). In constructing the objective function, KBFWCM takes into account that the spatially constrained FCM clustering algorithm is insensitive to image noise and involves highly intensive computation. To address the noise insensitivity and image detail processing of the spatially constrained FCM algorithm, KBFWCM combines fuzzy local similarity measures (spatial and grayscale) with the typicality of data attributes. To address the original algorithm's poor robustness to noise and outliers and its highly intensive computation, a kernel-based clustering method that includes a class of robust non-Euclidean distance measures is proposed. The experimental results show that the KBFWCM algorithm denoises more strongly and is more robust on noisy images.
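The role of a kernel-induced distance can be sketched briefly. The example assumes a Gaussian kernel, which the abstract does not specify, and does not reproduce the KBFWCM objective function. For any kernel K, the squared distance in feature space is K(x,x) - 2K(x,v) + K(v,v); for a Gaussian kernel this reduces to 2(1 - K(x,v)), which is bounded by 2.

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel value K(x, v); sigma is an assumed bandwidth."""
    diff = np.asarray(x, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def kernel_distance_sq(x, v, sigma=1.0):
    """Squared kernel-induced distance ||phi(x) - phi(v)||^2.
    With a Gaussian kernel K(x, x) = K(v, v) = 1, so it equals 2 * (1 - K(x, v))."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

center = [0.0, 0.0]
inlier = [1.0, 0.0]
outlier = [100.0, 0.0]

# The squared Euclidean distance of the outlier is 10000; the kernel-induced
# distance saturates near 2, so one outlier cannot dominate the objective.
d_in = kernel_distance_sq(inlier, center)
d_out = kernel_distance_sq(outlier, center)
print(d_in < d_out <= 2.0)  # True
```

This bounded growth is the usual reason kernel-induced distances are described as more robust to noise and outliers than the Euclidean distance.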


Computing Expectiles Using k-Nearest Neighbours Approach

Symmetry ◽

10.3390/sym13040645 ◽

2021 ◽

Vol 13 (4) ◽

pp. 645

Author(s):

Muhammad Farooq ◽

Sehrish Sarfraz ◽

Christophe Chesneau ◽

Mahmood Ul Hassan ◽

Muhammad Ali Raza ◽

...

Keyword(s):

Computational Cost ◽

Real Life ◽

Distance Measures ◽

Computational Time ◽

High Dimensional ◽

Test Error ◽

Nearest Neighbours ◽

Comparable Performance ◽

Asymmetric Least Squares ◽

Low Computational Cost

Expectiles have gained considerable attention in recent years because of their wide range of applications. In this study, the k-nearest neighbours approach combined with the asymmetric least squares loss function, called ex-kNN, is proposed for computing expectiles. First, the effect of various distance measures on ex-kNN is evaluated in terms of test error and computational time. The Canberra, Lorentzian and Soergel distance measures lead to the minimum test error, whereas the Euclidean, Canberra and Average of (L1,L∞) measures lead to a low computational cost. Second, the performance of ex-kNN is compared with the existing packages er-boost and ex-svm for computing expectiles on nine real-life examples. Depending on the nature of the data, ex-kNN showed two to 10 times better performance than er-boost and comparable performance to ex-svm in terms of test error. Computationally, ex-kNN is found to be two to five times faster than ex-svm and much faster than er-boost, particularly for high-dimensional data.
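The combination described above can be sketched as follows. This is not the ex-kNN package: the function names are hypothetical, the expectile is found by a simple iteratively reweighted mean, and the Canberra distance is used only because the abstract lists it among the better-performing measures.

```python
import numpy as np

def canberra(x, y):
    """Canberra distance; coordinates where both values are 0 contribute 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return float(terms.sum())

def expectile(y, tau, iters=100, tol=1e-12):
    """tau-expectile (0 < tau < 1): minimizer of the asymmetric least squares
    loss sum |tau - 1[y < e]| * (y - e)^2, via iteratively reweighted means."""
    y = np.asarray(y, dtype=float)
    e = y.mean()
    for _ in range(iters):
        w = np.where(y > e, tau, 1.0 - tau)
        new_e = np.sum(w * y) / np.sum(w)
        if abs(new_e - e) < tol:
            return float(new_e)
        e = new_e
    return float(e)

def ex_knn_predict(X_train, y_train, x_query, k, tau, dist=canberra):
    """Expectile of the responses of the k training points nearest to x_query."""
    d = np.array([dist(row, x_query) for row in X_train])
    nearest = np.argsort(d)[:k]
    return expectile(np.asarray(y_train, dtype=float)[nearest], tau)

# Hypothetical toy data.
X_train = np.array([[1.0, 1.0], [1.1, 1.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([1.0, 2.0, 10.0, 12.0])

print(expectile([1, 2, 3, 4], 0.5))   # 2.5 (the mean)
print(expectile([1, 2, 3, 4], 0.9))   # 3.5
print(ex_knn_predict(X_train, y_train, [1.0, 1.05], k=2, tau=0.5))  # 1.5
```

With tau = 0.5 the expectile is the ordinary mean, so the sketch reduces to standard kNN regression; other values of tau shift the estimate toward the upper or lower tail of the neighbours' responses.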


Linkage analysis using geographical proximity: a test of the efficacy of distance measures

Journal of Criminological Research Policy and Practice ◽

10.1108/jcrpp-01-2020-0006 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Shumpei Haginoya ◽

Aiko Hanayama ◽

Tamae Koike

Keyword(s):

Environmental Factors ◽

Euclidean Distance ◽

Distance Measure ◽

Distance Measures ◽

Manhattan Distance ◽

Geographical Proximity ◽

Discrimination Accuracy ◽

Content Type ◽

Shortest Route ◽

The Impact

Purpose: The purpose of this paper was to compare the accuracy of linking crimes using geographical proximity between three distance measures: Euclidean (distance measured by the length of a straight line between two locations), Manhattan (distance obtained by summing north-south distance and east-west distance) and the shortest route distances.

Design/methodology/approach: A total of 194 cases committed by 97 serial residential burglars in Aomori Prefecture in Japan between 2004 and 2015 were used in the present study. The Mann–Whitney U test was used to compare linked (two offenses committed by the same offender) and unlinked (two offenses committed by different offenders) pairs for each distance measure. Discrimination accuracy between linked and unlinked crime pairs was evaluated using the area under the receiver operating characteristic curve (AUC).

Findings: The Mann–Whitney U test showed that the distances of the linked pairs were significantly shorter than those of the unlinked pairs for all distance measures. Comparison of the AUCs showed that the shortest route distance achieved significantly higher accuracy than the Euclidean distance, whereas there was no significant difference between the Euclidean and the Manhattan distance or between the Manhattan and the shortest route distance. These findings give partial support to the idea that distance measures that take the impact of environmental factors into consideration might identify a crime series more accurately than Euclidean distances.

Research limitations/implications: Although the results suggested a difference between the Euclidean and the shortest route distance, it was small, and all distance measures resulted in outstanding AUC values, probably because of ceiling effects. Further investigation that makes the same comparison in a narrower area is needed to avoid this potential inflation of discrimination accuracy.

Practical implications: The shortest route distance might contribute to improving the accuracy of crime linkage based on geographical proximity. However, further investigation is needed before the shortest route distance can be recommended in practice. Given that the targeted area in the present study was relatively large, the findings may especially help improve the accuracy of proactive comparative case analysis, which estimates the whole picture of the distribution of serial crimes in a region, by allowing a more effective distance measure to be selected.

Social implications: Improving the accuracy of crime linkage may assist crime investigations and lead to the earlier arrest of offenders.

Originality/value: The results of the present study provide an initial indication of the efficacy of using distance measures that take environmental factors into account.
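The comparison described in this abstract rests on two simple computations, sketched here under stated assumptions: locations are treated as planar coordinates in a common unit, and the AUC is obtained from the Mann–Whitney U statistic as U / (n1 × n2). The shortest route distance needs a road network and is not shown. All distances in the example are invented.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """Sum of the east-west and north-south distances."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def auc_from_distances(linked, unlinked):
    """AUC = U / (n1 * n2): the probability that a randomly chosen linked pair
    lies closer together than a randomly chosen unlinked pair (ties count 0.5)."""
    u = 0.0
    for d_linked in linked:
        for d_unlinked in unlinked:
            if d_linked < d_unlinked:
                u += 1.0
            elif d_linked == d_unlinked:
                u += 0.5
    return u / (len(linked) * len(unlinked))

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)))  # 5.0 7

# Hypothetical inter-crime distances (km) for linked and unlinked pairs.
linked = [0.5, 1.0, 2.0]
unlinked = [1.5, 3.0, 4.0, 6.0]
auc = auc_from_distances(linked, unlinked)
print(round(auc, 3))  # 0.917
```

An AUC of 0.5 would mean distance carries no information about linkage, and values near 1 correspond to the ceiling effect the authors mention.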


Image Retrieval of Self-Adapt Distance Measure Based on SLLE

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.3675 ◽

2014 ◽

Vol 989-994 ◽

pp. 3675-3678

Author(s):

Xiao Fen Wang ◽

Hai Na Zhang ◽

Xiu Rong Qiu ◽

Jiang Ping Song ◽

Ke Xin Zhang

Keyword(s):

Image Retrieval ◽

High Dimension ◽

Topological Structure ◽

Euclidean Distance ◽

Distance Measure ◽

Content Based Image Retrieval ◽

Locally Linear Embedding ◽

Euclidean Distance Measure ◽

Linear Embedding ◽

Dimension Space

Supervised locally linear embedding with a self-adaptive distance measure (ADM-SLLE) addresses the problem that the Euclidean distance measure cannot separate samples well in content-based image retrieval. The method uses a discriminative distance measure to construct the k-NN neighborhoods and effectively preserves the topological structure of the high-dimensional space, while widening the interval between samples and strengthening classification ability. Experimental results show that the ADM-SLLE dimension-reduction method speeds up image retrieval and achieves high retrieval accuracy.


A new decision based unsymmetric trimmed median filter using Euclidean distance measure for removal of high density Salt and Pepper noise from images

International Conference on Information Communication and Embedded Systems (ICICES2014) ◽

10.1109/icices.2014.7033825 ◽

2014 ◽

Cited By ~ 1

Author(s):

T. Santhanam ◽

K. Chithra

Keyword(s):

Euclidean Distance ◽

Distance Measure ◽

Median Filter ◽

High Density ◽

Salt And Pepper Noise ◽

Euclidean Distance Measure ◽

Salt And Pepper


Fuzzy Distance Measure and Fuzzy Clustering Algorithm

Journal of Interdisciplinary Mathematics ◽

10.1080/09720502.2013.842049 ◽

2015 ◽

Vol 18 (5) ◽

pp. 471-492 ◽

Cited By ~ 3

Author(s):

Ismat Beg ◽

Tabasam Rashid

Keyword(s):

Fuzzy Clustering ◽

Clustering Algorithm ◽

Distance Measure ◽

Fuzzy Distance ◽

Fuzzy Clustering Algorithm


An Euclidean distance measure between covariance matrices of speech cepstra for text-independent speaker recognition

Proceedings of the 1997 South African Symposium on Communications and Signal Processing. COMSIG '97 ◽

10.1109/comsig.1997.630003 ◽

2002 ◽

Cited By ~ 3

Author(s):

J.N.L. Brummer ◽

L.R. Strydom

Keyword(s):

Speaker Recognition ◽

Euclidean Distance ◽

Distance Measure ◽

Covariance Matrices ◽

Euclidean Distance Measure


Calibration Approach Product Type Estimators of Population Mean in Stratified Sampling with Single Constraint: A Comparison of Three Distance Measures

Asian Journal of Probability and Statistics ◽

10.9734/ajpas/2021/v15i230350 ◽

2021 ◽

pp. 41-58

Author(s):

Enang, Ekaette Inyang ◽

Ojua, Doris Nkan ◽

T. T. Ojewale

Keyword(s):

Distance Measure ◽

Real Life ◽

High Gain ◽

Product Type ◽

Distance Measures ◽

Minimum Entropy ◽

Chi Square ◽

Data Set ◽

Population Mean ◽

Single Constraint

This study applied calibration to the product type estimator to propose calibration product type estimators using three distance measures for a single constraint: the chi-square distance measure, the minimum entropy distance measure and the modified chi-square distance measure. Estimators of the variances of the proposed estimators were also obtained. An empirical study to assess the performance of these estimators was carried out using real-life and simulated data sets. The results with the real-life data showed that the proposed calibration product type estimator produced better estimates of the population mean compared to and . Results from the simulation study showed that the proposed calibration product type estimators had a high gain in efficiency compared with the product type estimator. The simulation results also showed that the proposed estimators were more consistent and reliable under the Gamma and Exponential distributions, with the Exponential distribution taking the lead. The conventional product type estimator, however, was found to be better when the underlying distribution is normal.
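The calibration idea can be illustrated for the chi-square distance alone. The sketch assumes the generic setup in which new stratum weights Ω_h are chosen to minimize Σ (Ω_h − W_h)² / (W_h q_h) subject to the single constraint Σ Ω_h x̄_h = X̄, where x̄_h are auxiliary stratum sample means and X̄ is the known auxiliary population mean. The closed form follows from a Lagrange multiplier. The paper's product type estimator and the other two distance measures are not reproduced, and all numbers are invented.

```python
import numpy as np

def chi_square_calibration_weights(W, x_bar, X_bar, q=None):
    """Calibrated stratum weights under the chi-square distance with one
    constraint sum(omega * x_bar) == X_bar.
    Solution: omega_h = W_h + lam * W_h * q_h * x_bar_h, with
    lam = (X_bar - sum(W * x_bar)) / sum(W * q * x_bar**2)."""
    W = np.asarray(W, dtype=float)
    x_bar = np.asarray(x_bar, dtype=float)
    q = np.ones_like(W) if q is None else np.asarray(q, dtype=float)
    lam = (X_bar - np.sum(W * x_bar)) / np.sum(W * q * x_bar ** 2)
    return W + lam * W * q * x_bar

# Hypothetical three-stratum example.
W = np.array([0.5, 0.3, 0.2])          # design stratum weights
x_bar = np.array([10.0, 20.0, 30.0])   # auxiliary sample means per stratum
X_bar = 18.5                           # known auxiliary population mean

omega = chi_square_calibration_weights(W, x_bar, X_bar)
print(np.isclose(np.sum(omega * x_bar), X_bar))  # True
```

The calibrated weights stay as close as possible to the design weights in the chi-square sense while reproducing the known auxiliary mean exactly; with only this one constraint they are not forced to sum to 1.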
