HIERARCHICAL SPHERICAL CLUSTERING

This work introduces an alternative representation for high-dimensional data sets. Instead of using 2D or 3D representations, the data are placed on the surface of a sphere. Together with this representation, a hierarchical clustering algorithm is defined to analyse and extract the structure of the data. The algorithm builds a hierarchical structure (a dendrogram) in such a way that different cuts of the structure yield different partitions of the surface of the sphere. The result can be seen as a set of concentric spheres, each of a different granularity. To obtain an initial assignment of the data on the surface of the sphere, a method based on Sammon's mapping has also been developed.
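The idea that different cuts of one dendrogram give partitions of different granularity can be sketched as follows. This is an illustration only, under stated assumptions: points are given as (latitude, longitude) pairs in radians on a unit sphere, distance is the great-circle angle, and plain single-linkage merging stands in for the paper's own algorithm, which the abstract does not specify. The Sammon-based initial placement is not shown, and the names `great_circle`, `single_linkage_merges` and `cut` are hypothetical.

```python
import math
from itertools import combinations


def great_circle(p, q):
    """Angular distance in radians between two (lat, lon) points on a unit sphere."""
    lat1, lon1 = p
    lat2, lon2 = q
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def single_linkage_merges(points):
    """Return the merge history [(distance, cluster_a, cluster_b), ...]."""
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            d = min(great_circle(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((d, a, b))
    return merges


def cut(points, merges, threshold):
    """Partition obtained by replaying only merges at distance <= threshold."""
    clusters = [frozenset([i]) for i in range(len(points))]
    for d, a, b in merges:
        if d > threshold:
            break  # single-linkage merge distances never decrease
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return sorted(sorted(c) for c in clusters)


# Assumed toy data: two points near the north pole, two near the equator,
# and one near the south pole.
points = [(1.5, 0.0), (1.4, 0.1), (0.0, 0.0), (0.05, 0.05), (-1.5, 2.0)]
merges = single_linkage_merges(points)
for threshold in (0.09, 0.2, 1.4, 2.0):
    print(threshold, cut(points, merges, threshold))
```

Raising the threshold only ever merges clusters, so each coarser partition is a union of cells of the finer one, which is the nesting the abstract describes as concentric spheres.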

Improved minimum-minimum roughness algorithm for clustering categorical data

International Journal of Advanced and Applied Sciences ◽

10.21833/ijaas.2021.10.006 ◽

2021 ◽

Vol 8 (10) ◽

pp. 43-50

Author(s):

Truong et al. ◽

Keyword(s):

Machine Learning ◽

Data Mining ◽

Hierarchical Clustering ◽

Categorical Data ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Experimental Results ◽

Data Sets ◽

Top Down ◽

Hierarchical Clustering Algorithm

Clustering is a fundamental technique in data mining and machine learning. Recently, many researchers have become interested in the problem of clustering categorical data, and several new approaches have been proposed. One of the successful and pioneering clustering algorithms is the Minimum-Minimum Roughness (MMR) algorithm, a top-down hierarchical clustering algorithm that can handle the uncertainty in clustering categorical data. However, MMR tends to favor the category with fewer values and the leaf node with more objects, leading to undesirable clustering results. To overcome these shortcomings, this paper proposes an improved version of the MMR algorithm for clustering categorical data, called IMMR (Improved Minimum-Minimum Roughness). Experimental results on real data sets taken from UCI show that the IMMR algorithm outperforms MMR in clustering categorical data.
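The abstract does not spell out how roughness is computed, so the following is only a sketch of the baseline quantity using standard rough-set definitions: for a target set X and the equivalence classes of another attribute, roughness is 1 - |lower approximation| / |upper approximation|. The selection rule shown (take, per attribute, the minimum mean roughness against every other attribute, then the minimum of those) is an assumed reading of "minimum-minimum". It is not the IMMR improvement, and the function names and toy table are hypothetical.

```python
from collections import defaultdict


def partition(rows, attr):
    """Equivalence classes (sets of row indices) induced by one attribute."""
    classes = defaultdict(set)
    for idx, row in enumerate(rows):
        classes[row[attr]].add(idx)
    return list(classes.values())


def roughness(target, classes):
    """1 - |lower| / |upper| of `target` with respect to `classes`."""
    lower = sum(len(c) for c in classes if c <= target)  # classes fully inside
    upper = sum(len(c) for c in classes if c & target)   # classes that overlap
    return 1.0 - lower / upper


def mean_roughness(rows, a_i, a_j):
    """Average roughness of each value-set of a_i with respect to a_j."""
    targets = partition(rows, a_i)
    classes = partition(rows, a_j)
    return sum(roughness(t, classes) for t in targets) / len(targets)


def min_min_roughness(rows):
    """Return (chosen attribute, per-attribute minimum mean roughness)."""
    attrs = list(rows[0].keys())
    per_attr = {
        a: min(mean_roughness(rows, a, b) for b in attrs if b != a)
        for a in attrs
    }
    return min(per_attr, key=per_attr.get), per_attr


# Assumed toy table: "color" and "size" determine each other, "shape" is independent.
rows = [
    {"color": "red", "size": "small", "shape": "round"},
    {"color": "red", "size": "small", "shape": "square"},
    {"color": "blue", "size": "large", "shape": "round"},
    {"color": "blue", "size": "large", "shape": "square"},
]
best, scores = min_min_roughness(rows)
print(best, scores)
```

In this sketch an attribute that another attribute describes exactly gets roughness 0 and is preferred as the splitting attribute, while an attribute that no other attribute can approximate gets roughness 1.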

A local cores-based hierarchical clustering algorithm for data sets with complex structures

Neural Computing and Applications ◽

10.1007/s00521-018-3641-8 ◽

2018 ◽

Vol 31 (11) ◽

pp. 8051-8068 ◽

Cited By ~ 3

Author(s):

Dongdong Cheng ◽

Qingsheng Zhu ◽

Jinlong Huang ◽

Quanwang Wu ◽

Lijun Yang

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Data Sets ◽

Complex Structures ◽

Hierarchical Clustering Algorithm

A fast hierarchical clustering algorithm for large-scale protein sequence data sets

Computers in Biology and Medicine ◽

10.1016/j.compbiomed.2014.02.016 ◽

2014 ◽

Vol 48 ◽

pp. 94-101 ◽

Cited By ~ 10

Author(s):

Sándor M. Szilágyi ◽

László Szilágyi

Keyword(s):

Hierarchical Clustering ◽

Protein Sequence ◽

Large Scale ◽

Clustering Algorithm ◽

Sequence Data ◽

Data Sets ◽

Protein Sequence Data ◽

Hierarchical Clustering Algorithm

Leaders–Subleaders: An efficient hierarchical clustering algorithm for large data sets

Pattern Recognition Letters ◽

10.1016/j.patrec.2003.12.013 ◽

2004 ◽

Vol 25 (4) ◽

pp. 505-513 ◽

Cited By ~ 46

Author(s):

P.A Vijaya ◽

M Narasimha Murty ◽

D.K Subramanian

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Hierarchical Clustering Algorithm

A Local Cores-Based Hierarchical Clustering Algorithm for Data Sets with Complex Structures

2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC) ◽

10.1109/compsac.2018.00063 ◽

2018 ◽

Cited By ~ 5

Author(s):

Dongdong Cheng ◽

Qingsheng Zhu ◽

Quanwang Wu

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Data Sets ◽

Complex Structures ◽

Hierarchical Clustering Algorithm

STUDY OF HIERARCHICAL CLUSTERING PARALLEL COMPUTATION ON PRAM MODEL

International Journal of Computational Methods ◽

10.1142/s021987621100271x ◽

2011 ◽

Vol 08 (03) ◽

pp. 597-609 ◽

Cited By ~ 2

Author(s):

Y. T. ZHOU ◽

Z. H. HE ◽

Z. G. WU

Keyword(s):

Parallel Algorithm ◽

Hierarchical Clustering ◽

Clustering Algorithm ◽

Spanning Trees ◽

Computing Time ◽

Data Sets ◽

Data Set ◽

Pram Model ◽

Hierarchical Clustering Algorithm ◽

Memory Conflicts

An adaptive parallel algorithm for hierarchical clustering based on the PRAM model is presented. Several approaches were devised to produce the optimized clustered data set: data preprocessing based on the "90-10" rule to reduce the size of the data set, a progressive parallel algorithm to create Euclidean minimum spanning trees on the absolute graph, and an algorithm that determines the split strategies and handles memory conflicts. The data set was clustered on the PRAM-EREW model, which is free of memory collisions, lowest in cost, and the weakest PRAM variant. A data set of n items was clustered in O((λn)^2/p) time (0.1 ≤ λ ≤ 0.3) by running this algorithm on p processors (1 ≤ p ≤ n/log(n)). The parallel hierarchical clustering algorithm based on the PRAM model is adaptive and free of memory collisions. The computing time can be significantly reduced once the original input data has been effectively preprocessed with the improved preprocessing methods presented in this paper.
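The minimum-spanning-tree step and the split step named above can be illustrated with a sequential sketch. It assumes Prim's algorithm on the complete Euclidean graph and the simplest possible split strategy (remove the k-1 longest tree edges). It is not the paper's PRAM-EREW algorithm, it does no "90-10" preprocessing, and the names `prim_mst` and `split_mst` are hypothetical.

```python
import math


def prim_mst(points):
    """Euclidean minimum spanning tree as a list of (length, u, v) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def split_mst(n, edges, k):
    """Drop the k-1 longest MST edges and return the connected components."""
    kept = sorted(edges)[:len(edges) - (k - 1)]
    label = list(range(n))

    def find(x):
        while label[x] != x:
            label[x] = label[label[x]]  # path halving
            x = label[x]
        return x

    for _, u, v in kept:
        label[find(u)] = find(v)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Assumed toy data: three well-separated groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
edges = prim_mst(points)
print(split_mst(len(points), edges, 3))
```

This sequential version costs O(n^2) distance evaluations. Under the bound quoted in the abstract, shrinking the input to a fraction λ of its size scales that quadratic term by λ^2, and p processors divide it by p.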

RECENT RESULTS IN HIERARCHICAL CLUSTERING: I–THE REDUCIBLE NEIGHBORHOODS CLUSTERING ALGORITHM

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001493000285 ◽

1993 ◽

Vol 07 (03) ◽

pp. 541-571 ◽

Cited By ~ 5

Author(s):

MICHEL BRUYNOOGHE

Keyword(s):

Hierarchical Clustering ◽

Speech Processing ◽

Clustering Algorithm ◽

Large Data ◽

Original Data ◽

Large Data Sets ◽

Data Sets ◽

Data Set ◽

Hierarchical Clustering Algorithm ◽

Better Than

The clustering of large data sets is of great interest in fields such as pattern recognition, numerical taxonomy, and image or speech processing. The traditional Ascendant Hierarchical Algorithm (AHC) cannot be run on sets of more than a few thousand elements. The reducible neighborhoods clustering algorithm presented in this paper overcomes the limits of the traditional hierarchical clustering algorithm by generating an exact hierarchy on a large data set. The theoretical justification of this algorithm is the so-called Bruynooghe reducibility principle, which lays down the condition under which the exact hierarchy may be constructed locally, by carrying out aggregations in restricted regions of the representation space. As with the Day and Edelsbrunner algorithm, the maximum theoretical time complexity of the reducible neighborhoods clustering algorithm is O(n^2 log n), regardless of the chosen clustering strategy. However, the reducible neighborhoods clustering algorithm uses the original data table, and its practical performance is far better than that of Day and Edelsbrunner's algorithm, thus allowing the hierarchical clustering of large data sets, i.e. those composed of more than 10 000 objects.

Handling WSD using Hierarchical Clustering Algorithm with sentences

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1841120 ◽

2018 ◽

pp. 83-88

Author(s):

Mohana Priya K ◽

Pooja Ragavi S ◽

Krishna Priya G

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Cosine Similarity Measure ◽

Hierarchical Clustering Algorithm ◽

Multiple Levels ◽

Pos Tagger ◽

Sentence Clustering ◽

The Right

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger and the Porter stemmer are used for tagging. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, with a mean threshold. This paper incorporates many parameters for finding similarity between words. To identify the disambiguated words, sense identification is performed for the adjectives and a comparison is made. SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work shows a marked improvement, reaching 91.2%.
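The grouping step described above (highest similarity, subject to a mean threshold) can be sketched as follows. This is one assumed reading of that sentence: each sentence is linked to its most similar neighbour when that similarity exceeds the mean of all pairwise similarities. It uses cosine similarity on raw word counts only. The POS tagging, Porter stemming, WordNet lookup and Jiang-Conrath measure are omitted, and the function names and sample sentences are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sentences as bag-of-words count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def group_by_best_match(sentences):
    """Link each sentence to its most similar one if above the mean similarity."""
    n = len(sentences)
    sim = [[cosine(sentences[i], sentences[j]) for j in range(n)] for i in range(n)]
    pairs = [sim[i][j] for i in range(n) for j in range(i + 1, n)]
    threshold = sum(pairs) / len(pairs)  # mean pairwise similarity

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        best = max((j for j in range(n) if j != i), key=lambda j: sim[i][j])
        if sim[i][best] > threshold:
            parent[find(i)] = find(best)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Assumed sample sentences using "bank" in two senses.
sentences = [
    "the bank approved the loan",
    "the bank raised the loan rate",
    "the river bank was muddy",
    "the muddy river flooded",
]
print(group_by_best_match(sentences))
```

On these sentences the financial and riverside uses of "bank" fall into separate groups, because each sentence's closest neighbour shares its context words rather than just the ambiguous word.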

Based on a hierarchical clustering algorithm: detecting the community structure of a resting state brain network

Future Computer and Information Technology ◽

10.2495/icfcit130781 ◽

2013 ◽

Author(s):

Wenzhao Liu ◽

Limin Niu ◽

Junjie Chen

Keyword(s):

Community Structure ◽

Hierarchical Clustering ◽

Resting State ◽

Clustering Algorithm ◽

Brain Network ◽

Hierarchical Clustering Algorithm

User Power Behavior Similarity Clustering Based on Unsupervised Extreme Learning Machine Algorithm

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽

10.2174/2352096512666191004130655 ◽

2020 ◽

Vol 13 (5) ◽

pp. 641-649

Author(s):

Yuancheng Li ◽

Yaqi Cui ◽

Xiaolong Zhang

Keyword(s):

Extreme Learning Machine ◽

Clustering Algorithm ◽

Characteristic Curve ◽

Clustering Algorithms ◽

Data Sets ◽

Residential Areas ◽

Processing Power ◽

Learning Machine ◽

Advanced Metering ◽

Matlab Programming

Background: Advanced Metering Infrastructure (AMI) for the smart grid is growing rapidly, which results in exponential growth of the data collected and transmitted by the devices. Clustering this data can give the electricity company a better understanding of the personalized and differentiated needs of the user. Objective: Existing clustering algorithms for processing this data generally have problems such as insufficient data utilization, high computational complexity and low accuracy of behavior recognition. Methods: To improve clustering accuracy, this paper proposes a new clustering method based on the electrical behavior of the user. Starting with the analysis of user load characteristics, user electricity data samples were constructed. The daily load characteristic curve was extracted through an improved extreme learning machine clustering algorithm and effective index criteria. Clustering analysis was then carried out for different users from industrial, commercial and residential areas. The improved extreme learning machine algorithm, also called Unsupervised Extreme Learning Machine (US-ELM), is an extension and improvement of the original Extreme Learning Machine (ELM) that performs the unsupervised clustering task on the basis of the original ELM. Results: Experiments were run on four different data sets using MATLAB, and the results were compared with those of other commonly used clustering algorithms. The experimental results show that the US-ELM algorithm has higher accuracy in processing power data. Conclusion: The unsupervised ELM algorithm can greatly reduce time consumption and improve the effectiveness of clustering.
