Impact of Kernel-PCA on Different Features for Person Re-Identification

In the driving field of computer vision, re-identification of an individual in a camera network is very challenging task. Existing methods mainly focus on strategies based on feature learning, which provide feature space and force the same person to be closer than separate individuals. These methods rely to a large extent on high-dimensional feature vectors to achieve high re-identification accuracy. Due to computational cost and efficiency, they are difficult to achieve in practical applications. We comprehensively analyzed the effect of kernel-based principal component analysis (PCA) on some existing high-dimensional person re-identification feature extractors to solve these problems. We initially formulate a kernel function on the extracted features and then apply PCA, significantly reducing the feature dimension. After that, we have proved that the kernel is very effective on different state-of-the-art high-dimensional feature descriptors. Finally, a thorough experimental evaluation of the reference person re-identification data set determined that the prediction method was significantly superior to more advanced techniques and computationally feasible.

Download Full-text

A Novel Density-based Technique for Outlier Detection of High Dimensional Data Utilizing Full Feature Space

Information Technology And Control ◽

10.5755/j01.itc.50.1.25588 ◽

2021 ◽

Vol 50 (1) ◽

pp. 138-152

Author(s):

Mujeeb Ur Rehman ◽

Dost Muhammad Khan

Keyword(s):

Data Mining ◽

Outlier Detection ◽

High Dimensional Data ◽

Research Work ◽

Feature Space ◽

High Dimensional ◽

Data Set ◽

Data Points ◽

Low Dimensional ◽

Intrinsic Feature

Recently, anomaly detection has acquired a realistic response from data mining scientists as a graph of its reputation has increased smoothly in various practical domains like product marketing, fraud detection, medical diagnosis, fault detection and so many other fields. High dimensional data subjected to outlier detection poses exceptional challenges for data mining experts and it is because of natural problems of the curse of dimensionality and resemblance of distant and adjoining points. Traditional algorithms and techniques were experimented on full feature space regarding outlier detection. Customary methodologies concentrate largely on low dimensional data and hence show ineffectiveness while discovering anomalies in a data set comprised of a high number of dimensions. It becomes a very difficult and tiresome job to dig out anomalies present in high dimensional data set when all subsets of projections need to be explored. All data points in high dimensional data behave like similar observations because of its intrinsic feature i.e., the distance between observations approaches to zero as the number of dimensions extends towards infinity. This research work proposes a novel technique that explores deviation among all data points and embeds its findings inside well established density-based techniques. This is a state of art technique as it gives a new breadth of research towards resolving inherent problems of high dimensional data where outliers reside within clusters having different densities. A high dimensional dataset from UCI Machine Learning Repository is chosen to test the proposed technique and then its results are compared with that of density-based techniques to evaluate its efficiency.

Download Full-text

Dimensionality and Its Reduction

Statistics, Data Mining, and Machine Learning in Astronomy ◽

10.23943/princeton/9780691151687.003.0007 ◽

2014 ◽

Author(s):

Andrew J. Connolly ◽

Jacob T. VanderPlas ◽

Alexander Gray ◽

Andrew J. Connolly ◽

Jacob T. VanderPlas ◽

...

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Reduction Technique ◽

High Dimensional ◽

Data Sets ◽

Data Set ◽

Gaussian Distributions ◽

Dimensionality Reduction Technique ◽

Alternative Techniques ◽

New Generation

With the dramatic increase in data available from a new generation of astronomical telescopes and instruments, many analyses must address the question of the complexity as well as size of the data set. This chapter deals with how we can learn which measurements, properties, or combinations thereof carry the most information within a data set. It describes techniques that are related to concepts discussed when describing Gaussian distributions, density estimation, and the concepts of information content. The chapter begins with an exploration of the problems posed by high-dimensional data. It then describes the data sets used in this chapter, and introduces perhaps the most important and widely used dimensionality reduction technique, principal component analysis (PCA). The remainder of the chapter discusses several alternative techniques which address some of the weaknesses of PCA.

Download Full-text

Non-Intrusive Load Disaggregation by Linear Classifier Group Considering Multi-Feature Integration

Applied Sciences ◽

10.3390/app9173558 ◽

2019 ◽

Vol 9 (17) ◽

pp. 3558 ◽

Cited By ~ 3

Author(s):

Jinying Yu ◽

Yuchen Gao ◽

Yuxin Wu ◽

Dian Jiao ◽

Chang Su ◽

...

Keyword(s):

Identification Accuracy ◽

Data Set ◽

Linear Discriminant ◽

Practical Applications ◽

Core Technology ◽

Open Source Data ◽

Source Data ◽

Load Monitoring ◽

Global Similarity ◽

Linear Discriminant Classifier

Non-intrusive load monitoring (NILM) is a core technology for demand response (DR) and energy conservation services. Traditional NILM methods are rarely combined with practical applications, and most studies aim to disaggregate the whole loads in a household, which leads to low identification accuracy. In this method, the event detection method is used to obtain the switching event sets of all loads, and the power consumption curves of independent unknown electrical appliances in a period are disaggregated by utilizing comprehensive features. A linear discriminant classifier group based on multi-feature global similarity is used for load identification. The uniqueness of our algorithm is that it designs an event detector based on steady-state segmentation and a linear discriminant classifier group based on multi-feature global similarity. The simulation is carried out on an open source data set. The results demonstrate the effectiveness and high accuracy of the multi-feature integrated classification (MFIC) algorithm by using the state-of-the-art NILM methods as benchmarks.

Download Full-text

Principal component of explained variance: An efficient and optimal data dimension reduction framework for association studies

Statistical Methods in Medical Research ◽

10.1177/0962280216660128 ◽

2016 ◽

Vol 27 (5) ◽

pp. 1331-1350 ◽

Cited By ~ 4

Author(s):

Maxime Turgeon ◽

Karim Oualkacha ◽

Antonio Ciampi ◽

Hanane Miftah ◽

Golsa Dehghan ◽

...

Keyword(s):

Dimension Reduction ◽

Association Studies ◽

Computational Cost ◽

Principal Component ◽

Original Method ◽

High Dimensional ◽

Testing Procedures ◽

Simple Strategy ◽

Reduction Techniques ◽

Explained Variance

The genomics era has led to an increase in the dimensionality of data collected in the investigation of biological questions. In this context, dimension-reduction techniques can be used to summarise high-dimensional signals into low-dimensional ones, to further test for association with one or more covariates of interest. This paper revisits one such approach, previously known as principal component of heritability and renamed here as principal component of explained variance (PCEV). As its name suggests, the PCEV seeks a linear combination of outcomes in an optimal manner, by maximising the proportion of variance explained by one or several covariates of interest. By construction, this method optimises power; however, due to its computational complexity, it has unfortunately received little attention in the past. Here, we propose a general analytical PCEV framework that builds on the assets of the original method, i.e. conceptually simple and free of tuning parameters. Moreover, our framework extends the range of applications of the original procedure by providing a computationally simple strategy for high-dimensional outcomes, along with exact and asymptotic testing procedures that drastically reduce its computational cost. We investigate the merits of the PCEV using an extensive set of simulations. Furthermore, the use of the PCEV approach is illustrated using three examples taken from the fields of epigenetics and brain imaging.

Download Full-text

EVALUATING UNIFORM MANIFOLD APPROXIMATION AND PROJECTION FOR DIMENSION REDUCTION AND VISUALIZATION OF POLINSAR FEATURES

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-1-2021-39-2021 ◽

2021 ◽

Vol V-1-2021 ◽

pp. 39-46

Author(s):

S. Schmitz ◽

U. Weidner ◽

H. Hammer ◽

A. Thiele

Keyword(s):

Dimension Reduction ◽

Visual Analysis ◽

Principal Component ◽

Feature Space ◽

Decomposition Methods ◽

High Dimensional ◽

Feature Representations ◽

Wide Range ◽

Nonlinear Dimension ◽

Low Dimensional

Abstract. In this paper, the nonlinear dimension reduction algorithm Uniform Manifold Approximation and Projection (UMAP) is investigated to visualize information contained in high dimensional feature representations of Polarimetric Interferometric Synthetic Aperture Radar (PolInSAR) data. Based on polarimetric parameters, target decomposition methods and interferometric coherences a wide range of features is extracted that spans the high dimensional feature space. UMAP is applied to determine a representation of the data in 2D and 3D euclidean space, preserving local and global structures of the data and still suited for classification. The performance of UMAP in terms of generating expressive visualizations is evaluated on PolInSAR data acquired by the F-SAR sensor and compared to that of Principal Component Analysis (PCA), Laplacian Eigenmaps (LE) and t-distributed Stochastic Neighbor embedding (t-SNE). For this purpose, a visual analysis of 2D embeddings is performed. In addition, a quantitative analysis is provided for evaluating the preservation of information in low dimensional representations with respect to separability of different land cover classes. The results show that UMAP exceeds the capability of PCA and LE in these regards and is competitive with t-SNE.

Download Full-text

Nonlinear Model for Condition Monitoring and Fault Detection Based on Nonlocal Kernel Orthogonal Preserving Embedding

Shock and Vibration ◽

10.1155/2018/5794513 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16

Author(s):

Bo She ◽

Fuqing Tian ◽

Weige Liang ◽

Gang Zhang

Keyword(s):

Fault Detection ◽

Dimension Reduction ◽

Condition Monitoring ◽

Data Structures ◽

Principal Component ◽

Feature Space ◽

Kernel Principal Component Analysis ◽

Local Data ◽

Data Set ◽

Global And Local

The dimension reduction methods have been proved powerful and practical to extract latent features in the signal for process monitoring. A linear dimension reduction method called nonlocal orthogonal preserving embedding (NLOPE) and its nonlinear form named nonlocal kernel orthogonal preserving embedding (NLKOPE) are proposed and applied for condition monitoring and fault detection. Different from kernel orthogonal neighborhood preserving embedding (KONPE) and kernel principal component analysis (KPCA), the NLOPE and NLKOPE models aim at preserving global and local data structures simultaneously by constructing a dual-objective optimization function. In order to adjust the trade-off between global and local data structures, a weighted parameter is introduced to balance the objective function. Compared with KONPE and KPCA, NLKOPE combines both the advantages of KONPE and KPCA, and NLKOPE is also more powerful in extracting potential useful features in nonlinear data set than NLOPE. For the purpose of condition monitoring and fault detection, monitoring statistics are constructed in feature space. Finally, three case studies on the gearbox and bearing test rig are carried out to demonstrate the effectiveness of the proposed nonlinear fault detection method.

Download Full-text

Geometric Algebra Neuron for SAR Automation Target Recognition

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.187.319 ◽

2011 ◽

Vol 187 ◽

pp. 319-325

Author(s):

Wen Ming Cao ◽

Xiong Feng Li ◽

Li Juan Pu

Keyword(s):

Geometric Algebra ◽

Target Recognition ◽

Dimensional Space ◽

Feature Space ◽

Automatic Target Recognition ◽

High Dimensional ◽

Small Data ◽

Data Set ◽

High Dimensional Space ◽

Stationary Target

Biometric Pattern Recognition aim at finding the best coverage of per kind of sample’s distribution in the feature space. This paper employed geometric algebra to determine local continuum (connected) direction and connected path of same kind of target of SAR images of the complex geometrical body in high dimensional space. We researched the property of the GA Neuron of the coverage body in high dimensional space and studied a kind of SAR ATR(SAR automatic target recognition) technique which works with small data amount and result to high recognizing rate. Finally, we verified our algorithm with MSTAR (Moving and Stationary Target Acquisition and Recognition) [1] data set.

Download Full-text

A SURVEY ON THE CURES FOR THE CURSE OF DIMENSIONALITY IN BIG DATA

Asian Journal of Pharmaceutical and Clinical Research ◽

10.22159/ajpcr.2017.v10s1.19755 ◽

2017 ◽

Vol 10 (13) ◽

pp. 355 ◽

Cited By ~ 1

Author(s):

Reshma Remesh ◽

Pattabiraman. V

Keyword(s):

Dimensionality Reduction ◽

Input Data ◽

Principal Component ◽

Kernel Principal Component Analysis ◽

High Dimensional ◽

Data Sets ◽

Learning Approaches ◽

Data Set ◽

Reduction Techniques ◽

Dimensionality Reduction Techniques

Dimensionality reduction techniques are used to reduce the complexity for analysis of high dimensional data sets. The raw input data set may have large dimensions and it might consume time and lead to wrong predictions if unnecessary data attributes are been considered for analysis. So using dimensionality reduction techniques one can reduce the dimensions of input data towards accurate prediction with less cost. In this paper the different machine learning approaches used for dimensionality reductions such as PCA, SVD, LDA, Kernel Principal Component Analysis and Artificial Neural Network have been studied.

Download Full-text

An Error Minimizing Partitioning Method for the Nearest Neighbor Search

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.321-324.2165 ◽

2013 ◽

Vol 321-324 ◽

pp. 2165-2170

Author(s):

Seung Hoon Lee ◽

Jaek Wang Kim ◽

Jae Dong Lee ◽

Jee Hyong Lee

Keyword(s):

Nearest Neighbor ◽

Dimensional Space ◽

Computational Cost ◽

Nearest Neighbor Search ◽

High Dimensional ◽

Index Structures ◽

Cost Index ◽

Data Set ◽

High Dimensional Space ◽

Neighbor Search

The nearest neighbor search in high-dimensional space is an important operation in many applications, such as data mining and multimedia databases. Evaluating similarity in high-dimensional space requires high computational cost; index-structures are frequently used for reducing computational cost. Most of these index-structures are built by partitioning the data set. However, the partitioning approaches potentially have the problem of failing to find the nearest neighbor that is caused by partitions. In this paper, we propose the Error Minimizing Partitioning (EMP) method with a novel tree structure that minimizes the failures of finding the nearest neighbors. EMP divides the data into subsets with considering the distribution of data sets. For partitioning a data set, the proposed method finds the line that minimizes the summation of distance to data points. The method then finds the median of the data set. Finally, our proposed method determines the partitioning hyper-plane that passes the median and is perpendicular to the line. We also make a comparative study between existing methods and the proposed method to verify the effectiveness of our method.

Download Full-text

Prewhitening High-Dimensional fMRI Data Sets Without Eigendecomposition

Neural Computation ◽

10.1162/neco_a_00578 ◽

2014 ◽

Vol 26 (5) ◽

pp. 907-919 ◽

Cited By ~ 8

Author(s):

Abd-Krim Seghouane ◽

Yousef Saad

Keyword(s):

Covariance Matrix ◽

Mean Squared Error ◽

Computational Cost ◽

Original Data ◽

Fmri Data ◽

High Dimensional ◽

Data Sets ◽

Data Set ◽

Low Computational Cost ◽

Eigenvectors And Eigenvalues

This letter proposes an algorithm for linear whitening that minimizes the mean squared error between the original and whitened data without using the truncated eigendecomposition (ED) of the covariance matrix of the original data. This algorithm uses Lanczos vectors to accurately approximate the major eigenvectors and eigenvalues of the covariance matrix of the original data. The major advantage of the proposed whitening approach is its low computational cost when compared with that of the truncated ED. This gain comes without sacrificing accuracy, as illustrated with an experiment of whitening a high-dimensional fMRI data set.

Download Full-text