A dissimilarity measure for the k-Modes clustering algorithm

A dissimilarity measure based Fuzzy c-means (FCM) clustering algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/ifs-120730 ◽

2014 ◽

Vol 26 (1) ◽

pp. 229-238 ◽

Cited By ~ 5

Author(s):

Usman Qamar

Keyword(s):

Clustering Algorithm ◽

Dissimilarity Measure ◽

Fuzzy C Means ◽

Fcm Clustering

A Global-Relationship Dissimilarity Measure for the k-Modes Clustering Algorithm

Computational Intelligence and Neuroscience ◽

10.1155/2017/3691316 ◽

2017 ◽

Vol 2017 ◽

pp. 1-7 ◽

Cited By ~ 3

Author(s):

Hongfang Zhou ◽

Yihui Zhang ◽

Yibin Liu

Keyword(s):

Categorical Data ◽

Clustering Algorithm ◽

Real Data ◽

Dissimilarity Measure ◽

Data Sets ◽

Dissimilarity Measures

The k-modes clustering algorithm has been widely used to cluster categorical data. In this paper, we first analyze the k-modes algorithm and its dissimilarity measure. Based on this analysis, we propose a novel dissimilarity measure named GRD. GRD considers not only the relationships between an object and all cluster modes but also the differences among attributes. Experiments on four real data sets from UCI show that GRD achieves better performance than the two existing dissimilarity measures used in the k-modes and Cao’s algorithms.
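
For orientation, the baseline that GRD is compared against can be sketched in a few lines. This is a minimal Python sketch of the simple matching dissimilarity commonly used with k-modes, not the GRD measure itself, whose formula is not given in this abstract. The function names `simple_matching`, `mode_of`, and `assign` are hypothetical.

```python
from collections import Counter


def simple_matching(x, y):
    """Number of attributes on which two categorical records differ."""
    return sum(a != b for a, b in zip(x, y))


def mode_of(records):
    """Cluster mode: the most frequent category of each attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def assign(records, modes):
    """Index of the nearest mode for each record (ties go to the lowest index)."""
    return [
        min(range(len(modes)), key=lambda k: simple_matching(r, modes[k]))
        for r in records
    ]
```

Simple matching treats every attribute and every mismatch equally; the abstract positions GRD as a measure that additionally accounts for all cluster modes and for differences among attributes.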

A Fuzzy Relational Clustering Algorithm Based on a Dissimilarity Measure Extracted From Data

IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics) ◽

10.1109/tsmcb.2003.817041 ◽

2004 ◽

Vol 34 (1) ◽

pp. 775-781 ◽

Cited By ~ 15

Author(s):

P. Corsini ◽

B. Lazzerini ◽

F. Marcelloni

Keyword(s):

Clustering Algorithm ◽

Dissimilarity Measure ◽

Relational Clustering

On the Impact of Dissimilarity Measure in k-Modes Clustering Algorithm

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.2007.53 ◽

2007 ◽

Vol 29 (3) ◽

pp. 503-507 ◽

Cited By ~ 112

Author(s):

Michael Ng ◽

Mark Li ◽

Joshua Huang ◽

Zengyou He

Keyword(s):

Clustering Algorithm ◽

Dissimilarity Measure ◽

The Impact

Augmented intuitive dissimilarity metric for clustering of Web user sessions

Journal of Information Science ◽

10.1177/0165551516648259 ◽

2016 ◽

Vol 43 (4) ◽

pp. 480-491 ◽

Cited By ~ 7

Author(s):

Dilip Singh Sisodia ◽

Shrish Verma ◽

Om Prakash Vyas

Keyword(s):

Clustering Algorithm ◽

Syntactic Structure ◽

Dissimilarity Measure ◽

Cluster Validity ◽

Dissimilarity Measures ◽

User Clustering ◽

The Right ◽

Access Patterns ◽

Principle Objective ◽

Group Similarity

Clustering is a very useful technique to categorise Web users with common browsing activities, access patterns and navigational behaviour. Web user clustering is used to build Web visitor profiles that form the core of a personalised information recommender system. These systems are used to comprehend Web users’ surfing activities by offering tailored content to Web users with similar interests. The principal objective of Web user session clustering is to maximise intra-group similarity while minimising inter-group similarity. Efficient clustering of Web users’ sessions depends not only on the nature of the clustering algorithm but also on how well user concerns are captured and accommodated by the dissimilarity measure that is used. Determining the right dissimilarity measure to capture the access behaviour of the Web user is very significant for substantial clustering. In this paper, an intuitive dissimilarity measure is presented to estimate a Web user’s concern from augmented Web user sessions. The proposed usage dissimilarity measure between two Web user sessions is based on the relevance of the accessed pages, the syntactic structure of the page URLs and the hierarchical structure of the website. The proposed intuitive dissimilarity measure was used with the K-Medoids clustering algorithm for experimentation, and the results were compared with other independent dissimilarity measures. The quality of the generated clusters was evaluated by two unsupervised cluster validity indexes. The experimental results show that the intuitive augmented session dissimilarity measure is more realistic than, and superior to, the other independent dissimilarity measures with respect to the cluster validity indexes.
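
As a rough illustration of how a precomputed session dissimilarity feeds K-Medoids, the sketch below picks the medoid of a cluster from a dissimilarity matrix. The paper's own measure (page relevance, URL syntax, site hierarchy) is not reproduced here; the matrix values and the name `medoid` are assumptions made for illustration only.

```python
def medoid(dissim, members):
    """Member whose total dissimilarity to all cluster members is smallest."""
    return min(members, key=lambda i: sum(dissim[i][j] for j in members))


# Hypothetical symmetric dissimilarities between three user sessions.
D = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
]
print(medoid(D, [0, 1, 2]))  # prints 1 (row sums are 5.0, 3.0, 6.0)
```

Because K-Medoids only needs pairwise dissimilarities, any session measure can be swapped in by changing how `D` is filled.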

The Choice of an Appropriate Information Dissimilarity Measure for Hierarchical Clustering of River Streamflow Time Series, Based on Calculated Lyapunov Exponent and Kolmogorov Measures

Entropy ◽

10.3390/e21020215 ◽

2019 ◽

Vol 21 (2) ◽

pp. 215 ◽

Cited By ~ 2

Author(s):

Dragutin Mihailović ◽

Emilija Nikolić-Đorić ◽

Slavica Malinović-Milićević ◽

Vijay Singh ◽

Anja Mihailović ◽

...

Keyword(s):

Lyapunov Exponent ◽

Hierarchical Clustering ◽

Kolmogorov Complexity ◽

Clustering Algorithm ◽

Dissimilarity Measure ◽

Brazos River ◽

Discharge Data ◽

Hierarchical Algorithm ◽

Daily Streamflow ◽

Average Linkage

The purpose of this paper was to choose an appropriate information dissimilarity measure for hierarchical clustering of daily streamflow discharge data from twelve gauging stations on the Brazos River in Texas (USA) for the period 1989–2016. For that purpose, we selected and compared the average-linkage hierarchical clustering algorithm based on the compression-based dissimilarity measure (NCD), the permutation distribution dissimilarity measure (PDDM), and the Kolmogorov distance (KD). The algorithm was also compared with K-means clustering based on Kolmogorov complexity (KC), the highest value of the Kolmogorov complexity spectrum (KCM), and the largest Lyapunov exponent (LLE). The agglomerative average-linkage hierarchical algorithm was applied to a dissimilarity matrix based on NCD, PDDM, and KD for daily streamflow. The key findings of this study are that: (i) the KD clustering algorithm is the most suitable of those compared; (ii) ANOVA shows that there exist highly significant differences between the mean values of the four clusters, confirming that the number of clusters was suitably chosen; and (iii) from the clustering we found that the predictability of streamflow data of the Brazos River, given by the Lyapunov time (LT) corrected for randomness by the Kolmogorov time (KT) in days, lies in the interval from two to five days.
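
The compression-based measure and the average-linkage step can be sketched as follows. This is a minimal Python sketch, assuming the standard normalized compression distance formula with `zlib` as a stand-in compressor; the paper's actual compressor, data encoding, and the PDDM and KD measures are not reproduced. The helper names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a proxy for complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)


def ncd_matrix(series):
    """Pairwise NCD matrix for a list of byte-encoded time series."""
    return [[ncd(a, b) for b in series] for a in series]


def average_linkage(dissim, cluster_a, cluster_b):
    """Average-linkage distance: mean pairwise dissimilarity between two clusters."""
    total = sum(dissim[i][j] for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))
```

An agglomerative procedure would repeatedly merge the two clusters with the smallest `average_linkage` value until the desired number of clusters remains. With a real compressor, `ncd(x, x)` is small but generally not exactly zero.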

An Improved K-modes Clustering Algorithm Based on Intra-cluster and Inter-cluster Dissimilarity Measure

Proceedings of the 2nd International Conference on Computer Engineering, Information Science & Application Technology (ICCIA 2017) ◽

10.2991/iccia-17.2017.67 ◽

2017 ◽

Author(s):

Hongfang Zhou ◽

Yihui Zhang ◽

Yibin Liu

Keyword(s):

Clustering Algorithm ◽

Dissimilarity Measure

Supplemental Material for A New Dissimilarity Measure for Finding Semantic Structure in Category Fluency Data With Implications for Understanding Memory Organization in Schizophrenia

Neuropsychology ◽

10.1037/0894-4105.20.6.685.supp ◽

2006 ◽

Keyword(s):

Dissimilarity Measure ◽

Semantic Structure ◽

Memory Organization ◽

Category Fluency

Distributed Entropy Energy-Efficient Clustering algorithm for cluster head selection (DEEEC)

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189135 ◽

2020 ◽

Vol 39 (6) ◽

pp. 8139-8147

Author(s):

Ranganathan Arun ◽

Rangaswamy Balamurugan

Keyword(s):

Energy Efficient ◽

Clustering Algorithm ◽

Cluster Head ◽

Residual Energy ◽

Energy Utilization ◽

Sensor Nodes ◽

Second Stage ◽

Energy Efficient Clustering ◽

Two Stages ◽

Ch Selection

In Wireless Sensor Networks (WSNs), the energy available to sensor nodes is limited. To extend the lifetime of a WSN, it is essential to minimize energy consumption. Using a group head, or Cluster Head (CH), with higher energy is a prominent method for extending the lifetime of a WSN. The cluster depends on the CH for both intra-cluster and inter-cluster communication, so the energy level of the CH determines the lifetime of its cluster. When developing clustering algorithms, the difficult task is to identify the amount of energy utilized in heterogeneous WSNs (HWSNs). Based on Chaotic Firefly Algorithm CH (CFACH) selection, the proposed work is named the “Novel Distributed Entropy Energy-Efficient Clustering Algorithm”, or DEEEC for short, for HWSNs. The proposed DEEEC algorithm, which performs CH selection, has two main stages. In the first stage, temporary CHs are identified, along with their entropy values, using a correlative measure of residual and original energy. In addition, the rotating epoch and its entropy value must be predicted automatically by the sensor nodes. In the second stage, any cluster member with larger residual energy may replace a temporary CH in the deciding set. Through these two stages of CH selection, nodes with more energy have a higher probability of becoming CHs. The DEEEC algorithm is simulated in MATLAB. The simulation results show that the proposed DEEEC algorithm performs well in terms of energy and network lifetime when compared with the conventional clustering protocols currently used in heterogeneous WSNs.

Handling WSD using Hierarchical Clustering Algorithm with sentences

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1841120 ◽

2018 ◽

pp. 83-88

Author(s):

Mohana Priya K ◽

Pooja Ragavi S ◽

Krishna Priya G

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Cosine Similarity Measure ◽

Hierarchical Clustering Algorithm ◽

Multiple Levels ◽

Pos Tagger ◽

Sentence Clustering ◽

The Right

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger is used for tagging and the Porter stemmer for stemming. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, with a mean threshold. This paper incorporates many parameters for finding the similarity between words. In order to identify the disambiguated words, sense identification is performed for the adjectives and a comparison is made. SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work shows a substantial improvement, reaching 91.2%.
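
Of the two measures named above, cosine similarity is simple enough to sketch directly. The following is a minimal Python sketch over bag-of-words counts, assuming whitespace tokenization and lowercasing; it does not include the POS tagging, stemming, WordNet lookup, or Jiang-Conrath measure used in the paper, and the function name is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a: str, sentence_b: str) -> float:
    """Cosine of the angle between the word-count vectors of two sentences."""
    a = Counter(sentence_a.lower().split())
    b = Counter(sentence_b.lower().split())
    # Only words present in both sentences contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0
```

Sentences with no words in common score 0.0, and identical sentences score 1.0 up to floating-point rounding; a clustering step would then group sentences whose similarity exceeds the chosen threshold.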
