Why principal component analysis is not an appropriate feature extraction method for hyperspectral data

Author(s):  
A. Cheriyadat ◽  
L.M. Bruce
Computation ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 78
Author(s):  
Shengkun Xie

Feature extraction plays an important role in machine learning for signal processing, particularly for low-dimensional data visualization and predictive analytics. Data from real-world complex systems are often high-dimensional, multi-scale, and non-stationary, which makes extracting their key features challenging. This work proposes a novel approach to analyzing epileptic EEG signals using both wavelet power spectra and functional principal component analysis. We focus on how the feature extraction method can help improve the separation of signals in a low-dimensional feature subspace. By transforming EEG signals into wavelet power spectra, the functionality of signals is significantly enhanced. Furthermore, the power spectra transformation makes functional principal component analysis suitable for extracting key signal features. We therefore refer to this approach as a double feature extraction method, since both the wavelet transform and functional PCA are feature extractors. To demonstrate the applicability of the proposed method, we have tested it on a set of publicly available epileptic EEGs and on patient-specific, multi-channel EEG signals, for both ictal and pre-ictal signals. The results demonstrate that combining wavelet power spectra and functional principal component analysis is promising for feature extraction of epileptic EEGs. The approach can therefore be useful in computer-based medical systems for epilepsy diagnosis and epileptic seizure detection.
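The two-stage idea in this abstract (wavelet power spectra first, then a principal component step) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the signals are synthetic sinusoids rather than EEG, the Morlet wavelet and its parameter `w0` are assumed choices, and ordinary PCA on discretized power-versus-scale curves stands in for functional PCA. The helper names `morlet_power` and `pca_scores` are hypothetical.

```python
import numpy as np

def morlet_power(signal, scales, w0=6.0):
    """Wavelet power |W(s, t)|^2 of a 1-D signal for each scale (in samples)."""
    n = len(signal)
    power = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Sample a complex Morlet wavelet on +/- 4 scale widths, capped so
        # the wavelet is never longer than the signal.
        half = min(int(4 * s), (n - 1) // 2)
        t = np.arange(-half, half + 1) / s
        wavelet = np.pi ** -0.25 * np.exp(1j * w0 * t - 0.5 * t ** 2) / np.sqrt(s)
        # Correlating with the wavelet equals convolving with its reversed conjugate.
        coeffs = np.convolve(signal, np.conj(wavelet)[::-1], mode="same")
        power[i] = np.abs(coeffs) ** 2
    return power

def pca_scores(curves, n_components=2):
    """Project the rows of `curves` onto their leading principal components."""
    centered = curves - curves.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in for EEG (assumption): noisy 5 Hz and 20 Hz sinusoids.
rng = np.random.default_rng(0)
fs, n_samples = 128, 256
time = np.arange(n_samples) / fs
signals = [np.sin(2 * np.pi * f * time) + 0.5 * rng.standard_normal(n_samples)
           for f in [5] * 10 + [20] * 10]

scales = np.geomspace(2, 32, 16)
# One curve per signal: time-averaged log power as a function of scale.
curves = np.array([np.log(morlet_power(sig, scales).mean(axis=1)) for sig in signals])
scores = pca_scores(curves)  # low-dimensional features, one row per signal
print(scores.shape)  # (20, 2)
```

A genuine functional PCA would first represent each power curve in a smooth basis; the discretized version above only conveys the shape of the pipeline.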


2012 ◽  
Vol 572 ◽  
pp. 7-12
Author(s):  
Fei He ◽  
Quan Yang ◽  
Bao Jian Wang

As more and more process data are acquired from manufacturing processes, extracting useful information to build empirical models of past successful operations is urgently required to achieve higher product quality. Clustering is an important data mining method, and feature extraction is a significant factor in ensuring the accuracy of clustering and classification. As a common nonlinear feature extraction method, kernel principal component analysis (KPCA) uses variance as the information metric, but variance is not always effective. Since information entropy is nonlinear and can effectively represent the dependencies among features, this paper uses the Renyi entropy as the information metric for feature extraction. Simulation data, Tennessee Eastman process data, and hot rolling process data are used for model validation. The results show that the proposed method performs better at feature extraction than traditional KPCA.
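One known way to use Renyi entropy instead of variance in a kernel method is to rank the kernel eigen-directions by their contribution to a Renyi quadratic entropy estimate, rather than by eigenvalue alone. The sketch below illustrates that ranking; the abstract does not state the paper's exact formulation, so this should be read as an assumed variant. The RBF kernel, the width `sigma`, the synthetic data, and the helper names are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(x, sigma):
    """Gaussian (RBF) kernel matrix for the rows of x."""
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def entropy_components(x, sigma, n_components):
    """Select kernel eigen-directions by their entropy contribution.

    The Renyi quadratic entropy estimate is -log((1/N^2) * 1^T K 1), and
    1^T K 1 = sum_i lambda_i * (1^T e_i)^2, so each term of that sum is
    the contribution of one eigenpair.
    """
    k = rbf_kernel(x, sigma)            # note: the kernel is NOT centered here
    eigvals, eigvecs = np.linalg.eigh(k)
    eigvals = np.clip(eigvals, 0.0, None)
    contrib = eigvals * eigvecs.sum(axis=0) ** 2
    order = np.argsort(contrib)[::-1][:n_components]
    features = eigvecs[:, order] * np.sqrt(eigvals[order])
    return features, order, contrib

# Synthetic two-cluster data (assumption, for illustration only).
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 0.5, (30, 3)), rng.normal(3.0, 0.5, (30, 3))])
features, order, contrib = entropy_components(x, sigma=1.0, n_components=2)
print(features.shape)  # (60, 2)
```

Standard KPCA would instead center the kernel matrix and keep the directions with the largest eigenvalues; the two rankings can select different components.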


2012 ◽  
Vol 2012 ◽  
pp. 1-13 ◽  
Author(s):  
Shengkun Xie ◽  
Anna T. Lawniczak ◽  
Sridhar Krishnan ◽  
Pietro Lio

We introduce multiscale wavelet kernels to kernel principal component analysis (KPCA) to narrow the search for the parameters required to calculate a kernel matrix. This new methodology incorporates multiscale methods into KPCA for transforming multiscale data. To illustrate the application of the proposed method, and to investigate the robustness of the wavelet kernel in KPCA under different signal-to-noise ratios and different types of wavelet kernel, we study a set of two-class clustered simulation data. We show that wavelet KPCA (WKPCA) is an effective feature extraction method for transforming a variety of multidimensional clustered data into data with a higher level of linearity among the data attributes, which improves the accuracy of simple linear classifiers. Based on the analysis of the simulation data sets, we observe that multiscale translation-invariant wavelet kernels for KPCA have enhanced feature extraction performance. The application of the proposed method to real data is also addressed.
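To make the idea of a wavelet kernel inside KPCA concrete, here is a minimal single-scale sketch. It uses a commonly cited Morlet-type translation-invariant wavelet kernel, k(x, y) = prod_i cos(1.75 u_i) exp(-u_i^2 / 2) with u_i = (x_i - y_i) / a, which is an assumption here: the abstract does not give the paper's multiscale kernel, and the dilation `a`, the synthetic data, and the helper names are illustrative only.

```python
import numpy as np

def wavelet_kernel(x, y, a=1.0):
    """Translation-invariant wavelet kernel between the rows of x and y."""
    u = (x[:, None, :] - y[None, :, :]) / a
    return np.prod(np.cos(1.75 * u) * np.exp(-0.5 * u ** 2), axis=2)

def kpca_scores(k, n_components=2):
    """KPCA scores of the training points from a precomputed kernel matrix."""
    n = k.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    kc = k - one @ k - k @ one + one @ k @ one
    eigvals, eigvecs = np.linalg.eigh(kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[idx], 0.0, None)  # guard against tiny negative values
    return eigvecs[:, idx] * np.sqrt(lam)

# Synthetic two-class clustered data (assumption, for illustration only).
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(0.0, 0.3, (25, 2)), rng.normal(1.5, 0.3, (25, 2))])
k = wavelet_kernel(x, x, a=2.0)
scores = kpca_scores(k, n_components=2)
print(scores.shape)  # (50, 2)
```

A multiscale version would combine kernels computed at several dilations `a`; how the paper combines them is not stated in the abstract, so it is left out here.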


2020 ◽  
Vol 9 (2) ◽  
pp. 72-79
Author(s):  
Sari Ayu Wulandari ◽  
Sutikno Madnasri ◽  
Ratih Pramitasari ◽  
Susilo Susilo

The need for aroma recognition devices, often known as enose (electronic nose) devices, is increasing. In the health field, an enose can detect early type 2 diabetes mellitus (DM) from the aroma of urine. An enose is an aroma recognition tool that uses a pattern recognition algorithm to recognize the urine aroma of diabetics based on input signals from an array of gas sensors. Demand for portable enose devices is growing because of the increasing need for real-time measurement. The choice of the gas sensor array has an enormous impact on an enose device. This article discusses the effect of the number of sensors in the array on the recognition results. The enose uses a maximum of 4 sensors, which gives the maximum feature matrix. The feature matrix then undergoes PCA (Principal Component Analysis) feature extraction and clustering with the FCM (Fuzzy C Means) method. The number of sensors indicates the number of features. Feature selection is performed by varying which of the 4 sensors are used: experiment 1 uses all 4 sensors, experiment 2 uses combinations of 3 sensors, and experiment 3 uses combinations of 2 sensors. For the 3-sensor and 4-sensor variations, PCA is used as the feature extraction method to reduce the features to the 2 best features. The 2-sensor variations use the primary feature matrix directly. After feature selection, each of the 11 variations has 2 features. Grouping is then performed with the FCM method. The results show that using two sensors achieves a high accuracy of 92.5%.
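The pipeline described here (reduce a 4-sensor feature matrix to 2 features with PCA, then group the samples with fuzzy c-means) can be sketched as below. This is a minimal illustration, not the authors' code: the sensor readings are synthetic, and the fuzzifier `m = 2`, the iteration limits, and the helper names are assumptions.

```python
import numpy as np

def pca_reduce(x, n_components=2):
    """Project the rows of x onto the leading principal components."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def fuzzy_c_means(x, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns cluster centers and memberships."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(x), n_clusters))
    u /= u.sum(axis=1, keepdims=True)        # each row of memberships sums to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)       # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

# Synthetic readings from 4 hypothetical gas sensors, two groups of samples.
rng = np.random.default_rng(0)
readings = np.vstack([rng.normal(0.0, 0.5, (20, 4)), rng.normal(3.0, 0.5, (20, 4))])

features = pca_reduce(readings, n_components=2)   # 4 features -> 2 features
centers, memberships = fuzzy_c_means(features, n_clusters=2)
labels = memberships.argmax(axis=1)               # hard labels from soft memberships
print(features.shape, centers.shape, memberships.shape)  # (40, 2) (2, 2) (40, 2)
```

For the 2-sensor variations, the PCA step would be skipped and the two raw sensor columns passed to `fuzzy_c_means` directly.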

