A Face Categorization Algorithm Based on Convolutional Neural Networks and Principal Component Analysis

Author(s):  
А.О. Алексанян ◽  
С.О. Старков ◽  
К.В. Моисеев

This article addresses face recognition for identification tasks, where the input to the classifier is a set of feature vectors produced by a deep learning network. Few existing algorithms can perform open-set classification with sufficient reliability. The common approach is to classify against a threshold value. This approach has several significant drawbacks, which account for the low quality of open-set classification. First, there is no fixed threshold: no single universal threshold suits every face. Second, raising the threshold lowers classification quality. Third, with threshold-based classification a single face may match a large number of classes at once. We therefore propose using principal component analysis as an additional dimensionality reduction step, on top of the key facial feature extraction performed by the deep learning network, before the feature vectors are classified. Geometrically, applying principal component analysis to the feature vectors and then classifying them is equivalent to searching for a lower-dimensional space in which the projections of the original vectors are well separated. The idea of dimensionality reduction follows from the assumption that not all components of the N-dimensional feature vectors contribute meaningfully to describing a human face, and that only some components account for most of the variance. Selecting only the significant components of the feature vectors thus makes it possible to separate classes using the most variable features, without examining less informative data and without comparing vectors in a high-dimensional space.
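The geometric idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 4-dimensional toy vectors, the labels "alice" and "bob", the power-iteration PCA, and the nearest-centroid rule in the reduced space are all assumptions made for illustration, since the abstract does not state which classifier follows the projection.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pca(vectors, k, iters=200):
    """Return (mean, components) holding the top-k principal directions."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    centered = [[v[i] - mean[i] for i in range(dim)] for v in vectors]
    # Sample covariance matrix of the centered feature vectors.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    rng = random.Random(0)
    components = []
    for _ in range(k):
        # Power iteration converges to the dominant eigenvector of cov.
        v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            w = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(row, v) for row in cov])
        components.append(v)
        # Deflation: remove the found direction before the next search.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(dim)]
               for i in range(dim)]
    return mean, components

def project(vector, mean, components):
    """Coordinates of a feature vector in the reduced space."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [dot(c, centered) for c in components]

def fit_centroids(labelled, mean, components):
    """Mean projected point per class (hypothetical classifier choice)."""
    grouped = {}
    for label, vec in labelled:
        grouped.setdefault(label, []).append(project(vec, mean, components))
    return {label: [sum(col) / len(pts) for col in zip(*pts)]
            for label, pts in grouped.items()}

def classify(vector, mean, components, centroids):
    """Assign the class whose centroid is nearest in the reduced space."""
    p = project(vector, mean, components)
    return min(centroids, key=lambda label: math.dist(p, centroids[label]))

# Toy feature vectors: the first two coordinates vary between identities,
# the last two carry almost no variance.
labelled = [
    ("alice", [1.0, 1.1, 0.01, 0.00]),
    ("alice", [1.1, 0.9, 0.00, 0.02]),
    ("alice", [0.9, 1.0, 0.02, 0.01]),
    ("bob", [-1.0, -1.0, 0.00, 0.01]),
    ("bob", [-1.1, -0.9, 0.01, 0.00]),
    ("bob", [-0.9, -1.1, 0.02, 0.02]),
]
mean, components = fit_pca([vec for _, vec in labelled], k=1)
centroids = fit_centroids(labelled, mean, components)
```

In this toy case a single principal component is enough to separate the two classes, which mirrors the claim that only a few components carry most of the variance. Handling unknown faces in an open-set setting would need an additional rejection rule that this sketch does not attempt.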

2021 ◽  
Vol 25 (2) ◽  
pp. 169-178
Author(s):  
Changro Lee

Despite the popularity deep learning has been gaining, measuring the uncertainty within the result has not met expectations in many deep learning applications and this includes property valuation. In real-world tasks, however, rather than simply requiring predictions, assurance of the certainty of the predictions is also demanded. In this study, supervised learning is combined with unsupervised learning to bridge this gap. A method based on principal component analysis, a popular tool of unsupervised learning, was developed and used to represent the uncertainty in property valuation. Then, a neural network, a representative algorithm to implement supervised learning, was constructed, and trained to predict land prices. Finally, the uncertainty that was measured using principal component analysis was incorporated into the price predicted by the neural network. This hybrid approach is shown to be likely to improve the credibility of the valuation work. The findings of this study are expected to generate interest in the integration of the two learning approaches, thereby promoting the rapid adoption of deep learning tools in the property valuation industry.


2017 ◽  
Vol 23 (1) ◽  
pp. 67
Author(s):  
Luis E. Huamanchumo de la Cuba ◽  
Luis A. Sánchez Alvarado

The purpose of this research is to study technical aspects of implementing a Principal Component Analysis (PCA) neural network in terms of its predictive capacity, generalization, and accuracy, in order to establish optimal criteria for its validation, performance evaluation, and implementation. The hypothesis is that the statistical structure of the data significantly influences the optimal performance of a PCA neural network in the unsupervised setting. It was demonstrated that the Hebbian algorithm of the learning phase ensures the quality of the network's representation because it makes efficient use of the information in scenarios with large generalized variance.

Keywords: principal component analysis, Hebbian algorithm, dimensionality reduction.
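As a rough illustration of Hebbian learning for PCA, the sketch below uses Oja's rule to estimate the first principal component with a single linear neuron. The abstract does not say which Hebbian variant the authors used, so Oja's rule, the learning rate, the number of epochs, and the synthetic two-dimensional data are all assumptions.

```python
import math
import random

def oja_first_component(samples, learning_rate=0.01, epochs=5):
    """Estimate the first principal direction of zero-mean samples."""
    dim = len(samples[0])
    w = [1.0] + [0.0] * (dim - 1)  # arbitrary unit-length starting weights
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
            # Hebbian term y*x plus a decay term -y^2*w that keeps |w| near 1.
            w = [wi + learning_rate * y * (xi - y * wi)
                 for wi, xi in zip(w, x)]
    return w

def make_toy_data(n=1000, seed=0):
    """Zero-mean 2-D data with most variance along the (1, 1) diagonal."""
    rng = random.Random(seed)
    s = 1.0 / math.sqrt(2.0)
    data = []
    for _ in range(n):
        a = rng.gauss(0.0, 2.0)  # large spread along (1, 1)
        b = rng.gauss(0.0, 0.3)  # small spread along (1, -1)
        data.append([a * s + b * s, a * s - b * s])
    return data

weights = oja_first_component(make_toy_data())
weight_norm = math.sqrt(sum(wi * wi for wi in weights))
# Cosine between the learned weights and the true dominant direction.
alignment = abs(weights[0] + weights[1]) / (math.sqrt(2.0) * weight_norm)
```

With a large gap between the two variances, as in this toy data, the weight vector settles close to the dominant direction with a norm near one. Extracting several components would require an extension such as the generalized Hebbian algorithm, which is not shown here.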


2020 ◽  
Author(s):  
Kristiina Ausmees ◽  
Carl Nettelblad

Dimensionality reduction is a data transformation technique widely used in various fields of genomics research, with principal component analysis one of the most frequently employed methods. Application of principal component analysis to genotype data is known to capture genetic similarity between individuals, and is used for visualization of genetic variation, identification of population structure as well as ancestry mapping. However, the method is based on a linear model that is sensitive to characteristics of data such as correlation of single-nucleotide polymorphisms due to linkage disequilibrium, resulting in limitations in its ability to capture complex population structure. Deep learning models are a type of nonlinear machine learning method in which the features used in data transformation are decided by the model in a data-driven manner, rather than by the researcher, and have been shown to present a promising alternative to traditional statistical methods for various applications in omics research. In this paper, we propose a deep learning model based on a convolutional autoencoder architecture for dimensionality reduction of genotype data. Using a highly diverse cohort of human samples, we demonstrate that the model can identify population clusters and provide richer visual information in comparison to principal component analysis, and also yield a more accurate population classification model. We also discuss the use of the methodology for more general characterization of genotype data, showing that models of a similar architecture can be used as a genetic clustering method, comparing results to the ADMIXTURE software frequently used in population genetic studies.


PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0248896
Author(s):  
Nico Migenda ◽  
Ralf Möller ◽  
Wolfram Schenck

“Principal Component Analysis” (PCA) is an established linear technique for dimensionality reduction. It performs an orthonormal transformation to replace possibly correlated variables with a smaller set of linearly independent variables, the so-called principal components, which capture a large portion of the data variance. The problem of finding the optimal number of principal components has been widely studied for offline PCA. However, when working with streaming data, the optimal number changes continuously. This requires updating both the principal components and the dimensionality at every timestep. While the continuous update of the principal components is widely studied, the available algorithms for dimensionality adjustment in neural network-based and incremental PCA are limited to changes of one component at a time. Existing approaches therefore cannot account for abrupt changes in the presented data. The contribution of this work is to enable continuous dimensionality adjustment by an arbitrary number in neural network-based PCA, without the need to learn all principal components. A novel algorithm is presented that utilizes several PCA characteristics to adaptively update the optimal number of principal components for neural network-based PCA. A precise estimate of the required dimensionality reduces the computational effort while ensuring that the desired amount of variance is kept. The computational complexity of the proposed algorithm is investigated, and the algorithm is benchmarked in an experimental study against other neural network-based and incremental PCA approaches, where it produces highly competitive results.
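For context, the offline criterion that the abstract contrasts with streaming PCA can be sketched as follows: keep the smallest number of components whose cumulative eigenvalue share reaches a target fraction of the total variance. This is only the standard offline rule, not the adaptive algorithm proposed in the paper; the function name, the default target, and the example eigenvalues are assumptions for illustration.

```python
def components_for_variance(eigenvalues, target=0.95):
    """Smallest k whose top-k eigenvalues explain at least `target` variance."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must lie in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0.0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return k
    return len(ordered)  # reached only if rounding keeps the ratio below target

# Hypothetical eigenvalue spectrum of a covariance matrix (total variance 10).
example_eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.5]
k_needed = components_for_variance(example_eigenvalues, target=0.85)
```

In a streaming setting the eigenvalue spectrum changes over time, so the value returned by such a rule can jump by more than one between timesteps, which is the situation the abstract says existing one-step adjustment schemes cannot follow.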

