PCA-K-Means Based Clustering Algorithm for High Dimensional and Overlapping Spectra Signals

The fuzzy C-means (FCM) clustering algorithm is one of the most widely applied algorithms in unsupervised pattern recognition. However, its iterative process requires a large amount of computation, especially when the feature vectors are high-dimensional; partitioning such data directly with a clustering algorithm is not only inefficient but may also run into the "curse of dimensionality." To address this problem, this paper analyzes the behavior of fuzzy C-means clustering on high-dimensional features, where determining the cluster centers is an NP-hard problem. To improve the effectiveness and real-time performance of fuzzy C-means clustering in high-dimensional feature analysis, the paper combines it with the landmark isometric mapping (L-ISOMAP) algorithm and proposes an improved algorithm, FCM-LI. The samples are first given a preliminary analysis; the clustering results and the correlation of the sample data are then used with L-ISOMAP to reduce the dimension, and further analysis on the reduced data yields the final results. Experimental results demonstrate the effectiveness and real-time performance of FCM-LI in high-dimensional feature analysis.
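
To make the cost argument concrete, here is a minimal sketch of the plain fuzzy C-means iteration in standard-library Python. It is not the FCM-LI method from the abstract: the L-ISOMAP reduction step is omitted, and the function name, parameters, and toy data are assumptions chosen for illustration. Every distance evaluation touches all feature dimensions, which is why reducing the dimension first lowers the per-iteration cost.

```python
import math
import random

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means on a list of equal-length numeric tuples (illustrative)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))
        # Membership update from the relative distances to every centre.
        shift = 0.0
        for i in range(n):
            dist = [max(math.dist(points[i], c), 1e-12) for c in centers]
            new_row = [
                1.0 / sum((dist[j] / dk) ** (2.0 / (m - 1.0)) for dk in dist)
                for j in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u

# Toy data (assumed): two well-separated groups in two dimensions.
demo_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
demo_centers, demo_memberships = fuzzy_c_means(demo_points, n_clusters=2)
print(demo_centers)
```

Each row of the returned membership matrix sums to 1, so a hard label can be read off as the index of the largest entry in the row.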

Scalable hierarchical clustering by composition rank vector encoding and tree structure

10.1101/2020.04.12.038026 ◽

2020 ◽

Author(s):

Xiao Lai ◽

Pu Tian

Keyword(s):

Machine Learning ◽

Hierarchical Clustering ◽

Clustering Algorithm ◽

High Dimensional Data ◽

Machine Learning Algorithms ◽

Tree Structure ◽

Supervised Machine Learning ◽

High Dimensional ◽

Rank Vector ◽

Nonlinear Correlations

Abstract Supervised machine learning, especially deep learning based on a wide variety of neural network architectures, has contributed tremendously to fields such as marketing, computer vision and natural language processing. However, the development of unsupervised machine learning algorithms has been a bottleneck of artificial intelligence. Clustering is a fundamental unsupervised task in many different subjects. Unfortunately, no present algorithm is satisfactory for clustering high-dimensional data with strong nonlinear correlations. In this work, we propose a simple and highly efficient hierarchical clustering algorithm based on encoding by composition rank vectors and tree structure, and demonstrate its utility by clustering protein structural domains. No record comparison, which is an expensive and essential step common to all present clustering algorithms, is involved. Consequently, the algorithm achieves hierarchical clustering with linear time and space complexity and is thus applicable to arbitrarily large datasets. The key factor in this algorithm is the definition of composition, which depends on the physical nature of the target data and therefore needs to be constructed case by case. Nonetheless, the algorithm is general and applicable to any high-dimensional data with strong nonlinear correlations. We hope this algorithm will inspire a rich research field of encoding-based clustering well beyond composition rank vector trees.
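
The idea of clustering by encoding, without comparing records to each other, can be illustrated with a toy sketch. The abstract does not give the authors' definition of composition, so everything below is an assumption for illustration only: composition is taken to be symbol frequency in a string, the rank vector is the alphabet ordered by decreasing frequency, and the tree is a nested dictionary keyed by successive rank entries. This is not the authors' implementation.

```python
from collections import Counter

def composition_rank_vector(record, alphabet):
    """Order the alphabet by decreasing frequency in `record` (ties alphabetical).

    Using symbol frequency as the 'composition' is an assumption for this toy.
    """
    counts = Counter(record)
    return tuple(sorted(alphabet, key=lambda s: (-counts[s], s)))

def build_rank_tree(records, alphabet, max_depth):
    """Insert each record once into a prefix tree keyed by its rank vector.

    A node at depth k holds every record sharing the same first k rank
    entries, so each tree level is one level of the cluster hierarchy.
    """
    root = {"members": [], "children": {}}
    for idx, record in enumerate(records):
        node = root
        node["members"].append(idx)
        for symbol in composition_rank_vector(record, alphabet)[:max_depth]:
            node = node["children"].setdefault(
                symbol, {"members": [], "children": {}}
            )
            node["members"].append(idx)
    return root

# Toy records (assumed): short strings over a three-letter alphabet.
demo_records = ["AAAB", "AABA", "BBBA", "CCCA"]
demo_tree = build_rank_tree(demo_records, "ABC", max_depth=2)
print(sorted(demo_tree["children"]))
```

Each record is encoded and inserted independently of all the others, so the work grows linearly with the number of records, which is the property the abstract emphasizes.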

A Novel High-Dimensional Trajectories Construction Network based on Multi-Clustering Algorithm

10.21203/rs.3.rs-1060086/v1 ◽

2021 ◽

Author(s):

Feiyang Ren ◽

Yi Han ◽

Shaohan Wang ◽

He Jiang

Keyword(s):

Economic Analysis ◽

Clustering Algorithm ◽

Transportation Network ◽

High Dimensional ◽

Clustering Methods ◽

Marine Transportation ◽

Network Construction ◽

National Economic ◽

Multi Level ◽

State Of Art

Abstract A novel marine transportation network based on high-dimensional AIS data and a multi-level clustering algorithm is proposed to discover important waypoints in trajectories from selected navigation features. The method has two parts: calculation of the major nodes with the CLIQUE and BIRCH clustering methods, and construction of the navigation network with edge construction theory. Unlike state-of-the-art work on navigation clustering, which uses only ship coordinates, the proposed method includes additional high-dimensional features such as drafting, weather, and fuel consumption. Working from historical AIS data, more than 220,133 lines of data covering 30 days were used to extract 440 major nodal points in less than 4 minutes on an ordinary PC (i5 processor). The proposed method can be applied to data with more dimensions for better ship path planning or even national economic analysis. The current work shows good performance in distinguishing complex ship trajectories and great potential for future analytical predictions of the shipping transportation market.
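
The two-part pipeline (find major nodes, then connect them) can be sketched in a heavily simplified form. The sketch below does not implement CLIQUE, BIRCH, or the paper's edge construction theory: as a stand-in, it treats dense grid cells as nodes and counts transitions between consecutive nodes along each trajectory as edges. All function names, parameters, and the toy data are assumptions.

```python
from collections import Counter

def cell_of(point, cell_size):
    """Map a feature vector to the integer grid cell that contains it."""
    return tuple(int(v // cell_size) for v in point)

def dense_cells(trajectories, cell_size, min_points):
    """Return the grid cells holding at least `min_points` observations."""
    counts = Counter(
        cell_of(p, cell_size) for traj in trajectories for p in traj
    )
    return {cell for cell, n in counts.items() if n >= min_points}

def build_network(trajectories, cell_size, min_points):
    """Nodes are dense cells; edges count moves between consecutive nodes."""
    nodes = dense_cells(trajectories, cell_size, min_points)
    edges = Counter()
    for traj in trajectories:
        previous = None
        for p in traj:
            cell = cell_of(p, cell_size)
            if cell not in nodes:
                continue  # sparse cells are not treated as waypoints
            if previous is not None and cell != previous:
                edges[(previous, cell)] += 1
            previous = cell
    return nodes, edges

# Toy trajectories (assumed): two-feature points; more features work the same way.
demo_trajectories = [
    [(0.1, 0.2), (0.4, 0.3), (5.2, 5.1), (5.3, 5.4), (9.0, 0.5)],
    [(0.3, 0.1), (5.5, 5.5)],
]
demo_nodes, demo_edges = build_network(demo_trajectories, cell_size=1.0, min_points=2)
print(sorted(demo_nodes), dict(demo_edges))
```

Because `cell_of` works on tuples of any length, extra features such as weather or fuel consumption could be appended to each point, although in practice they would need scaling to a common cell size.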

Fuzzy C Means Clustering Algorithm for High Dimensional Data Using Feature Subset Selection Technique

IOSR Journal of Computer Engineering ◽

10.9790/0661-16226469 ◽

2014 ◽

Vol 16 (2) ◽

pp. 64-69 ◽

Cited By ~ 1

Author(s):

N. Manjula ◽

S. Pandiarajan ◽

J. Jagadeesan

Keyword(s):

Clustering Algorithm ◽

High Dimensional Data ◽

Subset Selection ◽

Feature Subset Selection ◽

High Dimensional ◽

Feature Subset ◽

Selection Technique ◽

Fuzzy C Means ◽

Fuzzy C Means Clustering

A heuristic-based fuzzy co-clustering algorithm for categorization of high-dimensional data

Fuzzy Sets and Systems ◽

10.1016/j.fss.2007.10.003 ◽

2008 ◽

Vol 159 (4) ◽

pp. 371-389 ◽

Cited By ~ 23

Author(s):

William-Chandra Tjhi ◽

Lihui Chen

Keyword(s):

Clustering Algorithm ◽

High Dimensional Data ◽

High Dimensional

DPM: Fast and scalable clustering algorithm for large scale high dimensional datasets

2014 10th International Computer Engineering Conference (ICENCO) ◽

10.1109/icenco.2014.7050427 ◽

2014 ◽

Author(s):

Tamer F. Ghanem ◽

Wail S. Elkilani ◽

Hatem S. Ahmed ◽

Mohiy M. Hadhoud

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

High Dimensional ◽

Scalable Clustering ◽

High Dimensional Datasets

A Grid-Based Clustering Algorithm for High-Dimensional Data Streams

Advanced Data Mining and Applications - Lecture Notes in Computer Science ◽

10.1007/11527503_97 ◽

2005 ◽

pp. 824-831 ◽

Cited By ~ 10

Author(s):

Yansheng Lu ◽

Yufen Sun ◽

Guiping Xu ◽

Gang Liu

Keyword(s):

Data Streams ◽

Clustering Algorithm ◽

High Dimensional Data ◽

High Dimensional ◽

Grid Based

An adaptive Dendrite-HDMR metamodeling technique for high dimensional problems

Journal of Mechanical Design ◽

10.1115/1.4053526 ◽

2022 ◽

pp. 1-38

Author(s):

Qi Zhang ◽

Yizhong Wu ◽

Li Lu ◽

Ping Qiao

Keyword(s):

Adaptive Sampling ◽

Clustering Algorithm ◽

Learning Algorithm ◽

Weight Ratio ◽

Propulsion System ◽

High Dimensional ◽

High Dimensional Model Representation ◽

K Nearest Neighbor ◽

Metamodeling Technique ◽

Explicit Expressions

Abstract High dimensional model representation (HDMR), which decomposes a high-dimensional problem into summands of component terms of different orders, has been widely studied as a way to ease the “curse of dimensionality” when using surrogate techniques to approximate high-dimensional problems in engineering design. However, the available one-metamodel-based HDMRs usually encounter the predicament of prediction uncertainty, while current multi-metamodels-based HDMRs cannot provide simple explicit expressions for black-box problems and have high computational complexity in terms of constructing the model from the explored points and predicting the responses at unobserved locations. To address these problems, a new stand-alone HDMR metamodeling technique, termed Dendrite-HDMR, is proposed in this study based on the hierarchical Cut-HDMR and the white-box machine learning algorithm Dendrite Net. The proposed Dendrite-HDMR not only provides succinct and explicit expressions in the form of a Taylor expansion, but also has relatively higher accuracy and stronger stability for most mathematical functions than other classical HDMRs with the assistance of the proposed adaptive sampling strategy, named KKMC, in which the k-means clustering algorithm, the k-Nearest Neighbor classification algorithm and the maximum curvature information of the provided expression are utilized to sample new points to refine the model. Finally, the Dendrite-HDMR technique is applied to solve the design optimization problem of a solid launch vehicle propulsion system with the purpose of improving the impulse-weight ratio, which represents the design level of the propulsion system.
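
The decomposition into component terms can be illustrated with the standard first-order Cut-HDMR form, f(x) ≈ f0 + Σ_i f_i(x_i), where f0 is the value at a chosen cut point c and f_i(x_i) is the change in f when only coordinate i moves away from c. The sketch below is not Dendrite-HDMR: it evaluates the true function directly for each component instead of fitting a metamodel to sampled points, and the function names and test function are assumptions.

```python
import math

def first_order_cut_hdmr(f, cut_point):
    """Build the first-order Cut-HDMR surrogate of `f` around `cut_point`."""
    f0 = f(cut_point)  # zeroth-order term: response at the cut point

    def component(i, xi):
        # f_i(x_i) = f(c_1, ..., x_i, ..., c_n) - f0
        probe = list(cut_point)
        probe[i] = xi
        return f(probe) - f0

    def surrogate(x):
        return f0 + sum(component(i, xi) for i, xi in enumerate(x))

    return surrogate

# Toy function (assumed): purely additive, so first-order terms suffice.
def additive(x):
    return x[0] ** 2 + 3.0 * x[1] + math.sin(x[2])

demo_surrogate = first_order_cut_hdmr(additive, cut_point=[0.0, 0.0, 0.0])
print(demo_surrogate([1.5, -2.0, 0.7]), additive([1.5, -2.0, 0.7]))
```

For an additive function the first-order surrogate reproduces the original up to rounding. A function with interactions between variables, such as x[0] * x[1], would need second-order terms that this sketch leaves out.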
