High-Dimensional Data Clustering Algorithm Based on Stacked-Random Projection

Cross Breed Clustering Algorithm for High Dimensional Data

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a5313.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 5049-5052

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Clustering ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

High Dimensional Data ◽

High Dimensional ◽

Growing Domain ◽

Present World

Clustering plays a major role in machine learning and data mining. Deep learning is a fast-growing domain at present, and the quality of clustering results can be improved by adopting deep learning algorithms. Many clustering algorithms process various datasets to obtain better results. For high-dimensional data, however, obtaining quality clustering results with existing clustering algorithms remains a problem. In this paper, a cross breed clustering algorithm for high-dimensional data is utilized, and various datasets are used to obtain the results.

High-Dimensional Text Clustering by Dimensionality Reduction and Improved Density Peak

Wireless Communications and Mobile Computing ◽

10.1155/2020/8881112 ◽

2020 ◽

Vol 2020 ◽

pp. 1-16

Author(s):

Yujia Sun ◽

Jan Platoš

Keyword(s):

Dimensionality Reduction ◽

Data Clustering ◽

High Dimensional Data ◽

Random Projection ◽

Experimental Results ◽

High Dimensional ◽

Density Peak ◽

Text Data ◽

Number Of Clusters ◽

Density Peaks

This study focuses on clustering high-dimensional text data, motivated by three limitations of K-means: it handles high-dimensional data poorly, it requires the number of clusters to be specified in advance, and it selects its initial centers at random. We propose a Stacked-Random Projection dimensionality reduction framework and an enhanced K-means algorithm, DPC-K-means, based on an improved density peaks algorithm. The improved density peaks algorithm determines the number of clusters and the initial clustering centers for K-means. The proposed algorithm is validated on seven text datasets. Experimental results show that it is suitable for clustering text data because it corrects these defects of K-means.
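The pipeline described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: "stacked" is assumed here to mean applying Gaussian random projections one after another, the density peaks step is the classic rho/delta formulation rather than the paper's improved variant, and the number of clusters `k` is passed in by hand instead of being determined automatically. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def stacked_random_projection(X, dims, rng):
    """Apply Gaussian random projections in sequence (assumed reading of 'stacked')."""
    for d in dims:
        R = rng.normal(size=(X.shape[1], d)) / np.sqrt(d)  # roughly norm-preserving
        X = X @ R
    return X

def density_peak_centers(X, k, dc_percentile=5.0):
    """Return indices of the k points with the largest rho * delta (classic density peaks)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Cutoff distance dc: a low percentile of all pairwise distances (a common heuristic).
    dc = np.percentile(dist[np.triu_indices(len(X), k=1)], dc_percentile)
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0  # kernel density, self excluded
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        # Distance to the nearest denser point; the densest point gets its max distance.
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return np.argsort(rho * delta)[-k:]

def kmeans(X, centers, n_iter=50):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for text vectors: three well-separated blobs in 200 dimensions.
rng = np.random.default_rng(0)
means = rng.normal(size=(3, 200)) * 20.0
X = np.vstack([m + rng.normal(size=(30, 200)) for m in means])

Z = stacked_random_projection(X, dims=(50, 10), rng=rng)  # 200 -> 50 -> 10
init_idx = density_peak_centers(Z, k=3)                   # k is fixed by hand here
labels, centers = kmeans(Z, Z[init_idx])
```

Seeding K-means from density peaks replaces the random initialization that the abstract identifies as a weakness; choosing `k` automatically would need an additional rule on the `rho * delta` values, which this sketch leaves out.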

High dimensional data Clustering Algorithm Based on Sparse Feature Vector for Categorical Attributes

2010 International Conference on Logistics Systems and Intelligent Management (ICLSIM) ◽

10.1109/iclsim.2010.5461099 ◽

2010 ◽

Cited By ~ 2

Author(s):

Sen Wu ◽

Guiying Wei

Keyword(s):

Data Clustering ◽

Clustering Algorithm ◽

Feature Vector ◽

High Dimensional Data ◽

High Dimensional ◽

Categorical Attributes

An Improved K-Means Algorithm of High-Dimensional Data

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.926-930.2968 ◽

2014 ◽

Vol 926-930 ◽

pp. 2968-2972

Author(s):

Cheng Cheng Zheng ◽

Hong Zhang

Keyword(s):

Data Clustering ◽

Clustering Algorithm ◽

High Efficiency ◽

High Dimensional Data ◽

High Dimensional

This paper summarizes the characteristics of high-dimensional data and the difficulties of clustering it, points out the shortcomings of traditional clustering algorithms on high-dimensional data, and proposes an improved K-means algorithm for the task. The algorithm has better scalability and high efficiency and is suitable for handling large document sets.

A Fast Clustering Algorithm for Large-scale and High Dimensional Data

ACTA AUTOMATICA SINICA ◽

10.3724/sp.j.1004.2009.00859 ◽

2009 ◽

Vol 35 (7) ◽

pp. 859-866

Author(s):

Ming LIU ◽

Xiao-Long WANG ◽

Yuan-Chao LIU

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

High Dimensional Data ◽

High Dimensional

A meta-heuristic density-based subspace clustering algorithm for high-dimensional data

Soft Computing ◽

10.1007/s00500-021-05973-1 ◽

2021 ◽

Author(s):

Parul Agarwal ◽

Shikha Mehta ◽

Ajith Abraham

Keyword(s):

Clustering Algorithm ◽

High Dimensional Data ◽

Subspace Clustering ◽

High Dimensional

Scalable hierarchical clustering by composition rank vector encoding and tree structure

10.1101/2020.04.12.038026 ◽

2020 ◽

Author(s):

Xiao Lai ◽

Pu Tian

Keyword(s):

Machine Learning ◽

Hierarchical Clustering ◽

Clustering Algorithm ◽

High Dimensional Data ◽

Machine Learning Algorithms ◽

Tree Structure ◽

Supervised Machine Learning ◽

High Dimensional ◽

Rank Vector ◽

Nonlinear Correlations

Supervised machine learning, especially deep learning based on a wide variety of neural network architectures, has contributed tremendously to fields such as marketing, computer vision and natural language processing. However, the development of unsupervised machine learning algorithms has been a bottleneck of artificial intelligence. Clustering is a fundamental unsupervised task in many different subjects. Unfortunately, no present algorithm is satisfactory for clustering high-dimensional data with strong nonlinear correlations. In this work, we propose a simple and highly efficient hierarchical clustering algorithm based on encoding by composition rank vectors and tree structure, and demonstrate its utility by clustering protein structural domains. No record comparison, an expensive and essential step common to all present clustering algorithms, is involved. Consequently, the algorithm achieves hierarchical clustering with linear time and space complexity and is thus applicable to arbitrarily large datasets. The key factor in this algorithm is the definition of composition, which depends on the physical nature of the target data and therefore needs to be constructed case by case. Nonetheless, the algorithm is general and applicable to any high-dimensional data with strong nonlinear correlations. We hope this algorithm will inspire a rich research field of encoding-based clustering well beyond composition rank vector trees.
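One possible reading of the encoding idea, sketched in Python under strong assumptions: the "composition" of a record is taken to be its symbol counts, the rank vector is the list of symbols ordered by descending count, and the tree is a nested dictionary keyed by successive rank-vector entries. None of these choices comes from the abstract, which says only that composition must be defined case by case; the function names and toy records are hypothetical. The sketch shows why no pairwise record comparison is needed: each record is encoded and inserted independently, so the cost grows linearly with the number of records.

```python
from collections import Counter

def rank_vector(record, alphabet, depth):
    """Symbols of `alphabet` ordered by descending count in `record`, cut to `depth`."""
    counts = Counter(record)
    # Ties are broken alphabetically so the encoding is deterministic (an assumption).
    ranked = sorted(alphabet, key=lambda s: (-counts[s], s))
    return tuple(ranked[:depth])

def build_tree(records, alphabet, depth):
    """Insert every record into a nested-dict tree along its rank vector."""
    root = {"children": {}, "members": []}
    for idx, rec in enumerate(records):
        node = root
        for symbol in rank_vector(rec, alphabet, depth):
            node = node["children"].setdefault(
                symbol, {"children": {}, "members": []}
            )
        node["members"].append(idx)  # records sharing a path share a cluster
    return root

def leaves(node, prefix=()):
    """Yield (rank-vector prefix, member indices) for every populated node."""
    if node["members"]:
        yield prefix, node["members"]
    for symbol, child in sorted(node["children"].items()):
        yield from leaves(child, prefix + (symbol,))

# Toy records over a three-letter alphabet (hypothetical data).
records = ["AAAABBC", "AAABBC", "CCCCAAB", "CCCAAB", "BBBCCA"]
tree = build_tree(records, alphabet="ABC", depth=2)
clusters = dict(leaves(tree))
```

Cutting the tree at a shallower depth merges clusters that share a rank-vector prefix, which is what gives the grouping its hierarchical character in this sketch.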

Fuzzy C Means Clustering Algorithm for High Dimensional Data Using Feature Subset Selection Technique

IOSR Journal of Computer Engineering ◽

10.9790/0661-16226469 ◽

2014 ◽

Vol 16 (2) ◽

pp. 64-69 ◽

Cited By ~ 1

Author(s):

N. Manjula ◽

S. Pandiarajan ◽

J. Jagadeesan

Keyword(s):

Clustering Algorithm ◽

High Dimensional Data ◽

Subset Selection ◽

Feature Subset Selection ◽

High Dimensional ◽

Feature Subset ◽

Selection Technique ◽

Fuzzy C Means ◽

Fuzzy C Means Clustering

Robust models and novel similarity measures for high-dimensional data clustering

10.32657/10356/48657 ◽

2012 ◽

Author(s):

Duc Thang Nguyen

Keyword(s):

Data Clustering ◽

High Dimensional Data ◽

Similarity Measures ◽

High Dimensional

A heuristic-based fuzzy co-clustering algorithm for categorization of high-dimensional data

Fuzzy Sets and Systems ◽

10.1016/j.fss.2007.10.003 ◽

2008 ◽

Vol 159 (4) ◽

pp. 371-389 ◽

Cited By ~ 23

Author(s):

William-Chandra Tjhi ◽

Lihui Chen

Keyword(s):

Clustering Algorithm ◽

High Dimensional Data ◽

High Dimensional
