The Discretization of Continuous Attributes Based on Improved SOM Clustering

To address the discretization of continuous attributes, an improved SOM clustering algorithm is proposed. The algorithm first uses a SOM to produce the initial clustering and to obtain an upper limit on the number of clusters. It then treats the initial cluster centers as samples and applies the BIRCH hierarchical clustering algorithm to them as a secondary clustering step, which resolves the problem of inflated clusters and identifies the set of discrete breakpoints. Finally, for each cluster center, the algorithm finds the nearest neighbor among the samples of the breakpoint set that belong to the same attribute and uses it as the basis for trimming the discretization. The experimental results show that the proposed algorithm outperforms the conventional discrete SOM clustering algorithm both in the breakpoint set (the contour factor improves by 75%) and in discretization accuracy (the degree of incompatibility is closer to 0).
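As a rough illustration of how cluster centers on one attribute can be turned into discrete intervals, here is a minimal sketch. It does not reproduce the SOM or BIRCH stages. It assumes the centers for one attribute are already available, and it assumes a simple cut rule (the midpoint between adjacent centers) that the abstract does not specify. The function names and the sample values are hypothetical.

```python
from bisect import bisect_right


def breakpoints_from_centers(centers):
    """Return cut points for one attribute from its 1-D cluster centers.

    Assumed rule (not taken from the paper): each breakpoint is the
    midpoint between two adjacent centers after sorting.
    """
    ordered = sorted(set(centers))
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


def discretize(values, cuts):
    """Map each continuous value to the index of the interval it falls in."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical cluster centers for a single continuous attribute.
centers = [1.0, 5.0, 9.0]
cuts = breakpoints_from_centers(centers)
labels = discretize([0.5, 3.0, 4.2, 8.8], cuts)
print(cuts, labels)
```

With k centers this rule yields k - 1 breakpoints and k intervals, so the number of clusters kept after the secondary clustering step directly controls how coarse the discretization is.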

Improved minimum-minimum roughness algorithm for clustering categorical data

International Journal of ADVANCED AND APPLIED SCIENCES ◽

10.21833/ijaas.2021.10.006 ◽

2021 ◽

Vol 8 (10) ◽

pp. 43-50

Author(s):

Truong et al. ◽

Keyword(s):

Machine Learning ◽

Data Mining ◽

Hierarchical Clustering ◽

Categorical Data ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Experimental Results ◽

Data Sets ◽

Top Down ◽

Hierarchical Clustering Algorithm

Clustering is a fundamental technique in data mining and machine learning. Recently, many researchers have become interested in the problem of clustering categorical data, and several new approaches have been proposed. One of the successful and pioneering clustering algorithms is the Minimum-Minimum Roughness (MMR) algorithm, a top-down hierarchical clustering algorithm that can handle the uncertainty in clustering categorical data. However, MMR tends to choose the category with less value leaf node with more objects, leading to undesirable clustering results. To overcome this shortcoming, this paper proposes an improved version of the MMR algorithm for clustering categorical data, called IMMR (Improved Minimum-Minimum Roughness). Experimental results on real data sets taken from UCI show that the IMMR algorithm outperforms MMR in clustering categorical data.
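The roughness that MMR builds on comes from rough set theory and can be sketched briefly. The code below is an illustration under assumptions, not the paper's IMMR: it computes the roughness of a set of objects with respect to one attribute as 1 minus the ratio of the lower to the upper approximation, and then the mean roughness of one attribute with respect to another. The function names and the toy table are hypothetical.

```python
from collections import defaultdict


def equivalence_classes(rows, attr):
    """Group row indices that share the same value of `attr`."""
    classes = defaultdict(set)
    for idx, row in enumerate(rows):
        classes[row[attr]].add(idx)
    return list(classes.values())


def roughness(target, rows, attr):
    """Roughness of the non-empty index set `target` with respect to `attr`."""
    lower, upper = set(), set()
    for block in equivalence_classes(rows, attr):
        if block <= target:      # block lies entirely inside the target
            lower |= block
        if block & target:       # block overlaps the target
            upper |= block
    return 1 - len(lower) / len(upper)


def mean_roughness(rows, attr_i, attr_j):
    """Average roughness of the classes of attr_i with respect to attr_j."""
    blocks = equivalence_classes(rows, attr_i)
    return sum(roughness(b, rows, attr_j) for b in blocks) / len(blocks)


# Hypothetical categorical table.
rows = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "L"},
    {"color": "blue", "size": "S"},
]
print(mean_roughness(rows, "color", "size"))
```

A roughness of 0 means the other attribute describes the set exactly, while a value near 1 means it barely describes it at all. A top-down method in this family picks the splitting attribute by comparing such values across attributes.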

Weighted k-Prototypes Clustering Algorithm Based on the Hybrid Dissimilarity Coefficient

Mathematical Problems in Engineering ◽

10.1155/2020/5143797 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Ziqi Jia ◽

Ling Song

Keyword(s):

Categorical Data ◽

Clustering Algorithm ◽

Numerical Data ◽

Experimental Results ◽

Cluster Center ◽

Real Dataset ◽

Dissimilarity Coefficient ◽

Initial Cluster ◽

Data Objects ◽

Selection Of

The k-prototypes algorithm is a hybrid clustering algorithm that can process both categorical and numerical data. In this study, the method of selecting initial cluster centers is improved and a new hybrid dissimilarity coefficient is proposed. Based on this coefficient, a weighted k-prototypes clustering algorithm (WKPCA) is proposed. WKPCA not only improves the selection of initial cluster centers but also introduces a new method for calculating the dissimilarity between data objects and cluster centers. A real dataset from UCI was used to test WKPCA. Experimental results show that WKPCA is more efficient and robust than other k-prototypes algorithms.
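For orientation, the classic k-prototypes dissimilarity that such work builds on can be sketched as follows. This is the standard form (squared Euclidean distance on numerical attributes plus a weighted mismatch count on categorical ones), not the hybrid dissimilarity coefficient proposed in the paper, whose definition the abstract does not give. The function names, the weight gamma and the sample objects are illustrative assumptions.

```python
def mixed_dissimilarity(obj, proto, numeric_idx, categorical_idx, gamma=1.0):
    """Classic k-prototypes dissimilarity between an object and a prototype."""
    # Squared Euclidean distance over the numerical attributes.
    numeric_part = sum((obj[i] - proto[i]) ** 2 for i in numeric_idx)
    # Simple matching: count the categorical attributes that differ.
    categorical_part = sum(obj[i] != proto[i] for i in categorical_idx)
    return numeric_part + gamma * categorical_part


def nearest_prototype(obj, prototypes, numeric_idx, categorical_idx, gamma=1.0):
    """Index of the prototype with the smallest dissimilarity to obj."""
    return min(
        range(len(prototypes)),
        key=lambda k: mixed_dissimilarity(
            obj, prototypes[k], numeric_idx, categorical_idx, gamma
        ),
    )


# Hypothetical objects: two numerical and two categorical attributes.
obj = (1.0, 2.0, "red", "small")
prototypes = [(1.0, 4.0, "red", "large"), (3.0, 2.0, "blue", "large")]
d0 = mixed_dissimilarity(obj, prototypes[0], [0, 1], [2, 3], gamma=0.5)
d1 = mixed_dissimilarity(obj, prototypes[1], [0, 1], [2, 3], gamma=0.5)
best = nearest_prototype(obj, prototypes, [0, 1], [2, 3], gamma=0.5)
print(d0, d1, best)
```

The weight gamma balances the two parts. A value that is too small lets the numerical attributes dominate, which is one reason weighted variants of the measure are studied.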

Simple primitive recognition via hierarchical face clustering

Computational Visual Media ◽

10.1007/s41095-020-0192-6 ◽

2020 ◽

Vol 6 (4) ◽

pp. 431-443

Author(s):

Xiaolong Yang ◽

Xiaohong Jia

Keyword(s):

Hierarchical Clustering ◽

Test Data ◽

Efficient Algorithm ◽

Clustering Algorithm ◽

State Of The Art ◽

Experimental Results ◽

Face Clustering ◽

Wide Range ◽

Hierarchical Clustering Algorithm ◽

Bottom To Top

We present a simple yet efficient algorithm for recognizing simple quadric primitives (plane, sphere, cylinder, cone) from triangular meshes. Our approach is an improved version of a previous hierarchical clustering algorithm, which performs pairwise clustering of triangle patches from bottom to top. The key contributions of our approach include a strategy for priority and fidelity consideration of the detected primitives, and a scheme for boundary smoothness between adjacent clusters. Experimental results demonstrate that the proposed method produces qualitatively and quantitatively better results than representative state-of-the-art methods on a wide range of test data.

A Novel Local Density Hierarchical Clustering Algorithm Based on Reverse Nearest Neighbors

Mathematical Problems in Engineering ◽

10.1155/2019/2959017 ◽

2019 ◽

Vol 2019 ◽

pp. 1-10

Author(s):

Yaohui Liu ◽

Dong Liu ◽

Fang Yu ◽

Zhengming Ma

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Local Density ◽

Clustering Algorithms ◽

Real Data ◽

Nearest Neighbors ◽

Clustering Methods ◽

Density Peak ◽

Hierarchical Clustering Algorithm

Clustering is widely used in data analysis, and density-based methods have developed rapidly over the past 10 years. Although the state-of-the-art density peak clustering algorithms are efficient and can detect clusters of arbitrary shape, they are essentially non-spherical centroid-based methods. In this paper, a novel local density hierarchical clustering algorithm based on reverse nearest neighbors, RNN-LDH, is proposed. By constructing and using a reverse nearest neighbor graph, the extended core regions are identified as initial clusters. Then, a new local density metric is defined to calculate the density of each object; meanwhile, the density hierarchical relationships among the objects are built according to their densities and neighbor relations. Finally, each unclustered object is assigned to one of the initial clusters or labeled as noise. Experimental results on synthetic and real data sets show that RNN-LDH outperforms current clustering methods based on density peaks or reverse nearest neighbors.
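The notion of reverse nearest neighbors can be illustrated with a small sketch. It is not the RNN-LDH algorithm and does not implement the paper's local density metric. It only builds the k-nearest-neighbor lists by brute force and inverts them, so that the reverse neighbors of a point are the points that count it among their own k nearest. The function names, the value of k and the sample points are assumptions.

```python
import math


def k_nearest(points, k):
    """Brute-force k-nearest-neighbor index lists for every point."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def reverse_nearest(points, k):
    """For each point, list the points that have it among their k nearest."""
    rnn = [[] for _ in points]
    for i, nbrs in enumerate(k_nearest(points, k)):
        for j in nbrs:
            rnn[j].append(i)
    return rnn


# Hypothetical data: three nearby points and one distant outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10)]
rnn = reverse_nearest(points, k=2)
print([len(r) for r in rnn])
```

Every point has exactly k nearest neighbors, but the number of reverse neighbors varies. In this toy data the outlier has none, which is why reverse neighbor counts are a natural signal for separating core regions from noise.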

Hierarchical clustering of maximum parsimony reconciliations

BMC Bioinformatics ◽

10.1186/s12859-019-3223-5 ◽

2019 ◽

Vol 20 (1) ◽

Author(s):

Ross Mawhorter ◽

Ran Libeskind-Hadas

Keyword(s):

Hierarchical Clustering ◽

Maximum Parsimony ◽

General Framework ◽

Clustering Algorithm ◽

Experimental Results ◽

The Other ◽

Worst Case ◽

New Approach ◽

Hierarchical Clustering Algorithm ◽

Symbiont Species

Background: Maximum parsimony reconciliation in the duplication-transfer-loss model is a widely used method for analyzing the evolutionary histories of pairs of entities such as hosts and parasites, symbiont species, and species and genes. While efficient algorithms are known for finding maximum parsimony reconciliations, the number of such reconciliations can be exponential in the size of the trees. Since these reconciliations can differ substantially from one another, making inferences from any one reconciliation may lead to conclusions that are not supported, or may even be contradicted, by other maximum parsimony reconciliations. Therefore, there is a need to find small sets of best representative reconciliations when the space of solutions is large and diverse. Results: We provide a general framework for hierarchically clustering the space of maximum parsimony reconciliations. We demonstrate this framework for two specific linkage criteria, one that seeks to maximize the average support of the events found in the reconciliations in each cluster and the other that seeks to minimize the distance between reconciliations in each cluster. We analyze the asymptotic worst-case running times and provide experimental results that demonstrate the viability and utility of this approach. Conclusions: The hierarchical clustering method proposed here provides a new approach to finding a set of representative reconciliations in the potentially vast and diverse space of maximum parsimony reconciliations.

Research on Improved K-Means Clustering Algorithm

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.403-408.1977 ◽

2011 ◽

Vol 403-408 ◽

pp. 1977-1980

Author(s):

Yin Sheng Zhang ◽

Hui Lin Shan ◽

Jia Qiang Li ◽

Jie Zhou

Keyword(s):

Clustering Algorithm ◽

Cluster Center ◽

Local Optimum ◽

Java Language ◽

Initial Cluster ◽

Feature Extraction And Selection ◽

Clustering Quality ◽

Fuzzy Neural ◽

Hierarchical Clustering Algorithm ◽

Selection Of

The traditional K-means clustering algorithm tends to fall prematurely into a local optimum because it is sensitive to the selection of the initial cluster centers. A hierarchical clustering algorithm can be used to generate the initial cluster centers for the K-means clustering algorithm. The geometric features of the input data can achieve a good distribution by means of preprocessing and feature extraction and selection. In the learning of the fuzzy neural network, the Java language is used to write the source code of the algorithm. The experimental results show that the new algorithm effectively improves the clustering quality.
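The idea of seeding K-means with a hierarchical clustering can be sketched as follows. The paper's implementation is in Java and its details are not given in the abstract, so this Python version is an assumption throughout: it uses a naive agglomerative merge on centroid distance to obtain k seeds and then runs plain Lloyd iterations. The function names and the sample points are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def hierarchical_seeds(points, k):
    """Merge the two clusters with the closest centroids until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return [centroid(c) for c in clusters]


def k_means(points, centers, iterations=20):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)), key=lambda c: math.dist(p, centers[c])
            )
            groups[nearest].append(p)
        # Keep the previous center if a group ends up empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Hypothetical data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
seeds = hierarchical_seeds(points, k=2)
final = sorted(k_means(points, seeds))
print(final)
```

Because the seeds are derived deterministically from the data instead of drawn at random, repeated runs give the same result. The cost is the quadratic pair search in the merge step, which this naive version repeats on every merge.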

Handling WSD using Hierarchical Clustering Algorithm with sentences

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1841120 ◽

2018 ◽

pp. 83-88

Author(s):

Mohana Priya K ◽

Pooja Ragavi S ◽

Krishna Priya G

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Cosine Similarity Measure ◽

Hierarchical Clustering Algorithm ◽

Multiple Levels ◽

Pos Tagger ◽

Sentence Clustering ◽

The Right

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger and the Porter stemmer are used for tagging. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, subject to a mean threshold. This paper incorporates many parameters for finding the similarity between words. To identify the disambiguated words, sense identification is performed for the adjectives and a comparison is carried out. The SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work shows a considerable improvement, reaching 91.2%.
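The cosine similarity measure mentioned in the abstract can be shown on raw word counts. This minimal sketch leaves out the POS tagging, stemming and WordNet-based Jiang-Conrath measure that the paper combines with it. Whitespace tokenization, the function name and the sample sentences are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a, sentence_b):
    """Cosine similarity between the bag-of-words vectors of two sentences."""
    va = Counter(sentence_a.lower().split())
    vb = Counter(sentence_b.lower().split())
    # Dot product over the words the two sentences share.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentences using the ambiguous word "bank".
sim = cosine_similarity("the bank approved the loan", "the bank of the river")
print(round(sim, 3))
```

On raw counts the two sentences look fairly similar because they share "the" and "bank", even though "bank" has a different sense in each. This is the kind of case where a sense-aware measure adds information that surface overlap cannot.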

Based on a hierarchical clustering algorithm: detecting the community structure of a resting state brain network

Future Computer and Information Technology ◽

10.2495/icfcit130781 ◽

2013 ◽

Author(s):

Wenzhao Liu ◽

Limin Niu ◽

Junjie Chen

Keyword(s):

Community Structure ◽

Hierarchical Clustering ◽

Resting State ◽

Clustering Algorithm ◽

Brain Network ◽

Hierarchical Clustering Algorithm

Evaluation of students programming skills on a computer programming course with a hierarchical clustering algorithm

2020 IEEE Frontiers in Education Conference (FIE) ◽

10.1109/fie44824.2020.9274130 ◽

2020 ◽

Author(s):

Davi Bernardo Silva ◽

Carlos N. Silla

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Computer Programming ◽

Programming Skills ◽

Hierarchical Clustering Algorithm

Hesitant Fuzzy Linguistic Agglomerative Hierarchical Clustering Algorithm and Its Application in Judicial Practice

Mathematics ◽

10.3390/math9040370 ◽

2021 ◽

Vol 9 (4) ◽

pp. 370

Author(s):

Shuangsheng Wu ◽

Jie Lin ◽

Zhenyu Zhang ◽

Yushu Yang

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Agglomerative Hierarchical Clustering ◽

Research Gaps ◽

Judicial Practice ◽

Linguistic Term ◽

Clustering Effect ◽

Hierarchical Clustering Algorithm ◽

Fuzzy Linguistic

The fuzzy clustering algorithm has become a research hotspot in many fields because of its better clustering effect and data expression ability. However, little research focuses on the clustering of hesitant fuzzy linguistic term sets (HFLTSs). To fill in the research gaps, we extend the data type of clustering to hesitant fuzzy linguistic information. A kind of hesitant fuzzy linguistic agglomerative hierarchical clustering algorithm is proposed. Furthermore, we propose a hesitant fuzzy linguistic Boole matrix clustering algorithm and compare the two clustering algorithms. The proposed clustering algorithms are applied in the field of judicial execution, which provides decision support for the executive judge to determine the focus of the investigation and the control. A clustering example verifies the clustering algorithm’s effectiveness in the context of hesitant fuzzy linguistic decision information.
