Adaptive dimensionality reduction for neural network-based online principal component analysis

PLoS ONE, 2021, Vol 16 (3), pp. e0248896
Author(s): Nico Migenda, Ralf Möller, Wolfram Schenck

“Principal Component Analysis” (PCA) is an established linear technique for dimensionality reduction. It performs an orthonormal transformation to replace possibly correlated variables with a smaller set of linearly independent variables, the so-called principal components, which capture a large portion of the data variance. The problem of finding the optimal number of principal components has been widely studied for offline PCA. However, when working with streaming data, the optimal number changes continuously, which requires updating both the principal components and the dimensionality in every time step. While the continuous update of the principal components is widely studied, the available algorithms for dimensionality adjustment in neural network-based and incremental PCA are limited to increments of one. Therefore, existing approaches cannot account for abrupt changes in the presented data. The contribution of this work is to enable continuous dimensionality adjustment by an arbitrary number in neural network-based PCA, without the necessity of learning all principal components. A novel algorithm is presented that utilizes several PCA characteristics to adaptively update the optimal number of principal components for neural network-based PCA. A precise estimation of the required dimensionality reduces the computational effort while ensuring that the desired amount of variance is kept. The computational complexity of the proposed algorithm is investigated, and it is benchmarked in an experimental study against other neural network-based and incremental PCA approaches, where it produces highly competitive results.
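A minimal sketch of the variance-retention criterion described above (not the authors' algorithm; the eigenvalue estimates and the 95% target are illustrative assumptions): pick the smallest number of principal components whose estimated eigenvalues keep a target fraction of the total variance, as one might do after each streaming update.

```python
# Illustrative sketch: choose the smallest k whose leading eigenvalues retain
# a target fraction of the (estimated) total data variance.
import numpy as np

def choose_dimensionality(eigenvalues, total_variance, target=0.95):
    """Return the smallest k such that the first k eigenvalues capture
    `target` of the estimated total variance."""
    order = np.argsort(eigenvalues)[::-1]          # largest eigenvalues first
    cumulative = np.cumsum(eigenvalues[order])
    k = int(np.searchsorted(cumulative, target * total_variance) + 1)
    return min(k, len(eigenvalues))

# Example: eigenvalue estimates after one streaming update (hypothetical numbers)
eigvals = np.array([4.2, 2.1, 0.9, 0.4, 0.2, 0.1])
print(choose_dimensionality(eigvals, total_variance=eigvals.sum(), target=0.95))
```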

2019, Vol 11 (10), pp. 1219
Author(s): Lan Zhang, Hongjun Su, Jingwei Shen

Dimensionality reduction (DR) is an important preprocessing step in hyperspectral image applications. In this paper, a superpixelwise kernel principal component analysis (SuperKPCA) method for DR is proposed, which performs kernel principal component analysis (KPCA) on each homogeneous region to fully utilize KPCA’s ability to acquire nonlinear features. Moreover, for the proposed method, the DR results obtained from different fundamental images (the first principal components obtained by principal component analysis (PCA), KPCA, and minimum noise fraction (MNF)) are compared. Extensive experiments on the Indian Pines, Pavia University, and Salinas datasets, with 5, 10, 20, and 30 samples selected from each class, show that: (1) when the most suitable fundamental image is selected, the classification accuracy obtained by SuperKPCA can be increased by 0.06%–0.74%, 3.88%–4.37%, and 0.39%–4.85%, respectively, compared with SuperPCA, which performs PCA on each homogeneous region; (2) the DR results obtained from different first principal components are different and complementary. By fusing the multiscale classification results obtained from different first principal components, the classification accuracy can be increased by 0.54%–2.68%, 0.12%–1.10%, and 0.01%–0.08%, respectively, compared with the method based only on the most suitable fundamental image.
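A minimal sketch of the per-region idea, assuming precomputed superpixel labels and synthetic spectra (the function name, toy cube, and RBF parameters below are illustrative assumptions, not the authors' exact SuperKPCA pipeline): run an RBF kernel PCA independently inside each homogeneous region of a hyperspectral cube.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

def superpixel_kpca(cube, segments, n_components=3, gamma=1.0):
    """cube: (H, W, B) hyperspectral image; segments: (H, W) superpixel labels.
    Returns an (H, W, n_components) array of per-region nonlinear features."""
    h, w, b = cube.shape
    out = np.zeros((h, w, n_components))
    for label in np.unique(segments):
        mask = segments == label
        pixels = cube[mask]                          # (n_pixels, B) spectra of one region
        kpca = KernelPCA(n_components=n_components, kernel="rbf", gamma=gamma)
        out[mask] = kpca.fit_transform(pixels)       # nonlinear features for this region
    return out

# Toy example: a 10x10 cube with 8 bands and 4 arbitrary superpixels
rng = np.random.default_rng(0)
cube = rng.normal(size=(10, 10, 8))
segments = np.repeat(np.arange(4), 25).reshape(10, 10)
features = superpixel_kpca(cube, segments)
print(features.shape)  # (10, 10, 3)
```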


2010, Vol 10 (03), pp. 343-363
Author(s): Ulrik Söderström, Haibo Li

In this paper, we examine how much information is needed to represent the facial mimic, based on Paul Ekman's assumption that the facial mimic can be represented with a few basic emotions. Principal component analysis is used to compactly represent the important facial expressions. Theoretical bounds for facial mimic representation are presented both for a certain number of principal components and for a certain number of bits. When 10 principal components are used to reconstruct color video at a resolution of 240 × 176 pixels, the representation bound is on average 36.8 dB, measured in peak signal-to-noise ratio. Practical confirmation of the theoretical bounds is demonstrated. Quantization of the projection coefficients affects the representation, but quantization with approximately 7-8 bits is found to match the exact representation, measured in mean square error.
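A short illustrative sketch of the quality measure involved (toy random frames stand in for the paper's video data; the frame size and component count are assumptions): reconstruct flattened frames from the first k principal components and report the reconstruction quality as PSNR.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    mse = np.mean((original - reconstructed) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def pca_reconstruct(frames, k):
    """frames: (n_frames, n_pixels) matrix of flattened video frames."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Principal components = right singular vectors of the centered data
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                               # (k, n_pixels)
    coefficients = centered @ components.T            # projection coefficients
    return coefficients @ components + mean

# Toy example with random "frames" (hypothetical stand-in for 240x176 video)
rng = np.random.default_rng(1)
frames = rng.uniform(0, 255, size=(64, 32 * 32))
reconstruction = pca_reconstruct(frames, k=10)
print(f"PSNR with 10 components: {psnr(frames, reconstruction):.1f} dB")
```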


2012, Vol 2012, pp. 1-9
Author(s): Manoj Tripathy

This paper describes a new approach for power transformer differential protection based on the wave-shape recognition technique. An algorithm based on neural network principal component analysis (NNPCA) with back-propagation learning is proposed for digital differential protection of power transformers. Principal component analysis is used to preprocess the data from the power system in order to eliminate redundant information and enhance the hidden pattern of the differential current, so as to discriminate internal faults from inrush and overexcitation conditions. The algorithm has been developed by considering the optimal number of neurons in the hidden layer and at the output layer. The proposed algorithm makes use of the ratio of voltage to frequency and the amplitude of the differential current to detect the transformer operating condition. This paper presents a comparative study of power transformer differential protection algorithms based on the harmonic restraint method, NNPCA, a feed-forward back-propagation neural network (FFBPNN), space vector analysis of the differential signal, and their time characteristic shapes in Park’s plane. The algorithms are compared with respect to their speed of response, computational burden, and ability to distinguish between a magnetizing inrush and a power transformer internal fault. The mathematical basis for each algorithm is briefly described. All the algorithms are evaluated using simulations performed with PSCAD/EMTDC and MATLAB.
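An illustrative sketch of the preprocessing step only, under stated assumptions (randomly generated one-cycle windows of differential current rather than PSCAD/EMTDC data, and not the full NNPCA relay logic): PCA strips redundant information from the sampled windows before they reach the neural classifier.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical data: 200 one-cycle windows, each with 16 samples of differential current
windows = rng.normal(size=(200, 16))

pca = PCA(n_components=4)
features = pca.fit_transform(windows)            # compact patterns fed to the neural network
print("retained variance:", pca.explained_variance_ratio_.sum())
print("feature vector for one window:", np.round(features[0], 3))
```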


Author(s): A. Kallepalli, A. Kumar, K. Khoshelham

Hyperspectral data find wide application in the domain of remote sensing. However, with the increase in the amount of information and the associated advantages comes the "curse" of dimensionality and additional computational load. The question most often remains as to which subset of the data best represents the information in the imagery. The present work is an attempt to establish entropy, a statistical measure for quantifying uncertainty, as a formidable measure for determining the optimal number of principal components (PCs) for improved identification of land cover classes. Feature extraction from the Airborne Prism EXperiment (APEX) data was achieved using Principal Component Analysis (PCA). However, determining the optimal number of PCs is vital, as it avoids adding computational load to the classification algorithm with no significant improvement in accuracy. Since a soft classification approach is applied in this work, the entropy of the classification outputs is analyzed. Comparison of these entropy measures with a traditional accuracy assessment of the corresponding "hardened" outputs confirmed the objective. The present work concentrates on entropy as a tool for optimal feature extraction during pre-processing, rather than on the accuracy obtained from principal component analysis and possibilistic c-means classification. Results show that 7 PCs of the APEX dataset are the optimal choice, as they show lower entropy and higher accuracy, along with better identification, compared to other combinations.
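A hedged sketch of the selection criterion (not the authors' full APEX workflow; the toy membership values below are made up): for each candidate number of PCs, the soft classification output can be scored by its mean per-pixel Shannon entropy, with lower entropy indicating less classification uncertainty.

```python
import numpy as np

def mean_shannon_entropy(memberships, eps=1e-12):
    """memberships: (n_pixels, n_classes) soft memberships that sum to 1 per pixel."""
    p = np.clip(memberships, eps, 1.0)
    return float(np.mean(-np.sum(p * np.log2(p), axis=1)))

# Toy comparison: a confident soft classification vs. a nearly uniform one
confident = np.array([[0.9, 0.05, 0.05], [0.85, 0.1, 0.05]])
uncertain = np.array([[0.4, 0.3, 0.3], [0.34, 0.33, 0.33]])
print(mean_shannon_entropy(confident))  # low entropy -> preferred PC subset
print(mean_shannon_entropy(uncertain))  # high entropy -> more classification uncertainty
```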


2020, Vol 14, pp. 174830262097353
Author(s): Xiaowei Zhang, Zhongming Teng

Principal component analysis (PCA) has been a powerful tool for high-dimensional data analysis. For streaming data, it is usually recast as an incremental PCA algorithm. In this paper, we propose a subspace-type incremental two-dimensional PCA algorithm (SI2DPCA), derived from an incremental update of the eigenspace, which computes several principal eigenvectors at the same time for online feature extraction. The algorithm overcomes the problem that the approximate eigenvectors extracted by the traditional incremental two-dimensional PCA algorithm (I2DPCA) are not mutually orthogonal, and it is more efficient. In numerical experiments, we compare the proposed SI2DPCA with the traditional I2DPCA in terms of the accuracy of the computed approximations, orthogonality errors, and execution time on widely used datasets such as FERET, Yale, and ORL, to confirm the superiority of SI2DPCA.
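A minimal sketch of the quantities discussed, under an assumed batch formulation (this is the standard 2DPCA baseline and an orthogonality-error check, not the SI2DPCA update itself; the toy image array is an assumption):

```python
import numpy as np

def two_dimensional_pca(images, k):
    """images: (n, h, w) array. Returns the top-k projection directions, shape (w, k)."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Image covariance matrix G = (1/n) * sum_i (A_i - mean)^T (A_i - mean)
    g = np.einsum("nhw,nhv->wv", centered, centered) / len(images)
    eigvals, eigvecs = np.linalg.eigh(g)
    return eigvecs[:, ::-1][:, :k]                 # eigenvectors for the k largest eigenvalues

def orthogonality_error(q):
    return np.linalg.norm(q.T @ q - np.eye(q.shape[1]))

rng = np.random.default_rng(2)
images = rng.normal(size=(50, 32, 24))             # toy stand-in for face images
directions = two_dimensional_pca(images, k=5)
features = images @ directions                      # (50, 32, 5) projected features
print(features.shape)
print(orthogonality_error(directions))              # ~0 for an exact eigendecomposition
```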


1992, Vol 75 (3), pp. 929-930
Author(s): Oliver C. S. Tzeng

This note summarizes my remarks on the application of principal component reliability and the eigenvalue-greater-than-1 rule for determining the number of factors in a principal component analysis of a correlation matrix. Due to the unpredictability and uselessness of the reliability approach and the Kaiser-Guttman rule, research workers are encouraged to use other methods such as the scree test.
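A simple illustrative sketch of the two alternatives mentioned (synthetic data, not from the note itself): count how many eigenvalues of a correlation matrix exceed 1 (the Kaiser-Guttman rule) and print the sorted eigenvalue profile one would inspect for a scree test.

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(size=(300, 8))
data[:, 4:] += data[:, :4]                       # induce some correlation structure
corr = np.corrcoef(data, rowvar=False)           # 8x8 correlation matrix

eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
kaiser_guttman_count = int(np.sum(eigenvalues > 1.0))

print("eigenvalues:", np.round(eigenvalues, 2))  # look for the 'elbow' (scree test)
print("components retained by the eigenvalue-greater-than-1 rule:", kaiser_guttman_count)
```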


2020, Vol 62 (5), pp. 517-524
Author(s): Yan Wang, G. Jie, W. Na, Y. Chao, Z. Li, ...

Abstract This paper aims to improve the calculation efficiency and accuracy of concrete damage degree identification and to analyze the damage mechanism of concrete. First, correlation analysis and principal component analysis are performed on 15 characteristic parameters of the acoustic emission signals accompanying the uniaxial compression and splitting damage processes of concrete, reducing the dimension to 5 uncorrelated principal components. Then, based on an analysis of the relationship between each principal component and the damage and cracking mechanism of concrete, the principal components are used as input variables of a BP neural network to identify the damage degree of the concrete. The results show that the 5 principal components effectively eliminate redundant information while retaining information on the failure mechanism and damage process of the concrete. Principal component analysis and the neural network together achieve accurate recognition of the degree of concrete damage from the acoustic emission parameters.
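A hedged sketch of that pipeline with hypothetical data and labels (the array sizes, number of damage levels, and network size are assumptions): reduce 15 acoustic-emission parameters to 5 principal components and train a back-propagation (BP) network on them to recognize the damage degree.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)
ae_parameters = rng.normal(size=(500, 15))          # 500 AE hits, 15 characteristic parameters
damage_degree = rng.integers(0, 4, size=500)        # hypothetical damage-degree labels (4 levels)

pca = PCA(n_components=5)
components = pca.fit_transform(ae_parameters)        # 5 uncorrelated principal components

bp_net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=3000, random_state=0)
bp_net.fit(components, damage_degree)                 # back-propagation training
print(bp_net.predict(pca.transform(ae_parameters[:5])))
```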


Author(s): А.О. Алексанян, С.О. Старков, К.В. Моисеев

The study objective is face recognition for identification purposes. The input data to be classified are feature vectors generated by a deep learning neural network. Few existing algorithms can perform sufficiently reliable open-set classification. The common approach to classification is to use a classification threshold, which has several disadvantages leading to the low quality of open-set classification. The key disadvantages are as follows. First, there is no fixed threshold: it is impossible to find a common threshold suitable for every face. Second, the higher the threshold, the lower the quality of classification. Third, with threshold classification, more than one class can match a face. For this reason, we propose to apply principal component analysis as an additional dimensionality reduction step, on top of the key face features extracted by the deep learning network, before classifying the feature vectors. In geometric terms, applying principal component analysis to the feature vectors and then classifying them amounts to searching for a lower-dimensional space in which the projections of the source vectors are well separated. The dimensionality reduction concept is based on the assumption that not all components of the N-dimensional feature vectors contribute significantly to the description of a human face, and that only some components account for the larger part of the variance. Therefore, by selecting only the relevant components of the feature vectors we can separate the classes using the most variable features, while skipping the less informative data and avoiding vector comparisons in a high-dimensional space.
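A minimal sketch of the idea under stated assumptions (random 512-dimensional embeddings stand in for the deep network's output; the nearest-centroid classifier and the 32-component projection are illustrative choices, not the authors' setup): project deep feature vectors onto their leading principal components and classify in the lower-dimensional space.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(5)
# Hypothetical 512-dimensional embeddings for 20 identities, 10 images each
identities = np.repeat(np.arange(20), 10)
embeddings = rng.normal(size=(200, 512)) + identities[:, None] * 0.2

pca = PCA(n_components=32)                      # keep only the most variable directions
reduced = pca.fit_transform(embeddings)

classifier = NearestCentroid()
classifier.fit(reduced, identities)
query = pca.transform(embeddings[:3])            # new faces would be projected the same way
print(classifier.predict(query))
```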


Author(s): Guang-Ho Cha

Principal component analysis (PCA) is an important tool in many areas, including data reduction and interpretation, information retrieval, image processing, and so on. Kernel PCA has recently been proposed as a nonlinear extension of the popular PCA. The basic idea is to first map the input space into a feature space via a nonlinear map and then compute the principal components in that feature space. This paper illustrates the potential of kernel PCA for dimensionality reduction and feature extraction in multimedia retrieval. Using Gaussian kernels, the principal components are computed in the feature space of an image data set and used as new dimensions to approximate image features. Extensive experimental results show that kernel PCA performs better than linear PCA with respect to retrieval quality as well as retrieval precision in content-based image retrieval.
Keywords: principal component analysis, kernel principal component analysis, multimedia retrieval, dimensionality reduction, image retrieval
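A hedged sketch of that retrieval setting (toy feature vectors rather than the paper's image database; the component count and kernel width are assumptions): map database features with a Gaussian-kernel (RBF) kernel PCA, then retrieve the nearest items to a query in the reduced space.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(6)
database_features = rng.normal(size=(500, 64))        # hypothetical image feature vectors

kpca = KernelPCA(n_components=16, kernel="rbf", gamma=0.05)
database_reduced = kpca.fit_transform(database_features)

index = NearestNeighbors(n_neighbors=5).fit(database_reduced)
query_reduced = kpca.transform(database_features[:1])  # a query image, mapped the same way
distances, neighbors = index.kneighbors(query_reduced)
print("retrieved items:", neighbors[0])
```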

