Semi-Supervised Cross-Modal Retrieval Based on Discriminative Comapping

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-13 ◽  
Author(s):  
Li Liu ◽  
Xiao Dong ◽  
Tianshi Wang

Most cross-modal retrieval methods based on subspace learning just focus on learning the projection matrices that map different modalities to a common subspace and pay less attention to the retrieval task specificity and class information. To address the two limitations and make full use of unlabelled data, we propose a novel semi-supervised method for cross-modal retrieval named modal-related retrieval based on discriminative comapping (MRRDC). The projection matrices are obtained to map multimodal data into a common subspace for different tasks. In the process of projection matrix learning, a linear discriminant constraint is introduced to preserve the original class information in different modal spaces. An iterative optimization algorithm based on label propagation is presented to solve the proposed joint learning formulations. The experimental results on several datasets demonstrate the superiority of our method compared with state-of-the-art subspace methods.
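The abstract does not state the update rule of its label-propagation step, so the following is only a minimal sketch of generic graph-based label propagation, not the MRRDC solver. The function name `propagate_labels`, the Gaussian affinity, and the parameters `alpha`, `sigma` and `n_iter` are all assumptions made for illustration.

```python
import numpy as np

def propagate_labels(X, Y, alpha=0.9, sigma=1.0, n_iter=50):
    """Generic graph label propagation (illustrative; not the MRRDC algorithm).

    X: (n, d) features. Y: (n, c) one-hot rows for labelled samples,
    all-zero rows for unlabelled samples. Returns a predicted class per sample.
    """
    # Gaussian affinities between all pairs of samples, without self-loops.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Repeatedly mix neighbours' scores with the known labels.
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return F.argmax(axis=1)

# Two well-separated clusters with one labelled sample each (toy data).
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
Y = np.zeros((6, 2))
Y[0, 0] = 1.0  # sample 0 is labelled as class 0
Y[3, 1] = 1.0  # sample 3 is labelled as class 1
labels = propagate_labels(X, Y)
```

In a semi-supervised setting of this kind, the propagated labels of the unlabelled samples would typically be fed back into the projection-learning step on the next iteration.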

2006 ◽  
Vol 03 (01) ◽  
pp. 45-51
Author(s):  
YANWEI PANG ◽  
ZHENGKAI LIU ◽  
YUEFANG SUN

Subspace-based face recognition methods aim to find a low-dimensional subspace of face appearance embedded in a high-dimensional image space. The methods differ in their motivations and objective functions. The objective function of the proposed method is formed by combining the ideas of linear Laplacian eigenmaps and linear discriminant analysis. The actual computation of the subspace reduces to a maximum eigenvalue problem. A major advantage of the proposed method over traditional methods is that it utilizes both the local manifold structure and the discriminant information of the training data. Experimental results on the AR face database demonstrate the effectiveness of the proposed method.
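The abstract does not give the exact objective, so the sketch below assumes one plausible form: maximise between-class scatter minus a weighted locality-preserving (graph Laplacian) penalty, w^T (S_b - mu X^T L X) w with unit-norm w, which is solved by the leading eigenvectors of a symmetric matrix. The function name, the difference form of the objective, the 0/1 k-nearest-neighbour graph, and the parameters `mu` and `k` are all assumptions.

```python
import numpy as np

def discriminant_locality_subspace(X, y, n_components=1, mu=0.1, k=2):
    """Leading eigenvectors of S_b - mu * X^T L X (assumed objective).

    X: (n, d) samples in rows. y: (n,) integer class labels.
    """
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    # Between-class scatter S_b built from class means of the centred data.
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xk = Xc[y == c]
        m = Xk.mean(axis=0)
        Sb += len(Xk) * np.outer(m, m)
    # Symmetrised k-nearest-neighbour graph with 0/1 weights.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)
    nn = np.argsort(sq, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    W = np.maximum(W, W.T)
    L = np.diag(W.sum(axis=1)) - W  # graph Laplacian
    # Maximum eigenvalue problem on a symmetric matrix.
    M = Sb - mu * (Xc.T @ L @ Xc)
    M = (M + M.T) / 2.0
    evals, evecs = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1][:n_components]
    return evecs[:, order]

# Toy data: classes differ along the first axis only.
X = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0],
              [4.0, 0.0], [4.0, 1.0], [4.0, 2.0]])
y = np.array([0, 0, 0, 1, 1, 1])
P = discriminant_locality_subspace(X, y)
```

On this toy data the recovered direction aligns with the first axis, which separates the classes while keeping neighbouring samples close.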


2018 ◽  
Vol 27 (08) ◽  
pp. 1850121 ◽  
Author(s):  
Zhe Sun ◽  
Zheng-Ping Hu ◽  
Raymond Chiong ◽  
Meng Wang ◽  
Wei He

Recent research has demonstrated the effectiveness of deep subspace learning networks, including the principal component analysis network (PCANet) and linear discriminant analysis network (LDANet), since they can extract high-level features and better represent abstract semantics of given data. However, their representation does not consider the nonlinear relationship of data and limits the use of features with nonlinear metrics. In this paper, we propose a novel architecture combining the kernel collaboration representation with deep subspace learning based on the PCANet and LDANet for facial expression recognition. First, the PCANet and LDANet are employed to learn abstract features. These features are then mapped to the kernel space to effectively capture their nonlinear similarities. Finally, we develop a simple yet effective classification method with squared [Formula: see text]-regularization, which improves the recognition accuracy and reduces time complexity. Comprehensive experimental results on the JAFFE, CK[Formula: see text], KDEF and CMU Multi-PIE datasets confirm that the proposed approach not only achieves superior accuracy but is also robust against block occlusion and varying parameter configurations.


2015 ◽  
Vol 7 (7) ◽  
pp. 9253-9268 ◽  
Author(s):  
Chun Liu ◽  
Junjun Yin ◽  
Jian Yang ◽  
Wei Gao

One key problem for the classification of multi-frequency polarimetric SAR images is to extract target features simultaneously in the aspects of frequency, polarization and spatial texture. This paper proposes a new classification method for multi-frequency polarimetric SAR data based on tensor representation and multi-linear subspace learning (MLS). Firstly, each cell of the SAR images is represented by a third-order tensor in the frequency, polarization and spatial domains, with each order of the tensor corresponding to one domain. Then, two main MLS methods, i.e., multi-linear principal component analysis (MPCA) and the multi-linear extension of linear discriminant analysis (MLDA), are used to learn the third-order tensors. MPCA is used to analyze the principal components of the tensors. MLDA is applied to improve the discrimination between different land covers. Finally, the lower-dimensional subtensor features extracted by the MPCA and MLDA algorithms are classified with a neural network (NN) classifier. The classification scheme is assessed using multi-band polarimetric SAR images (C-, L- and P-band) acquired by the Airborne Synthetic Aperture Radar (AIRSAR) sensor of the Jet Propulsion Laboratory (JPL) over the Flevoland area. Experimental results demonstrate that the proposed method has good classification performance in comparison with the classic multi-band Wishart classifier. The overall classification accuracy is close to 99%, even when the number of training samples is small.


2021 ◽  
Vol 13 (4) ◽  
pp. 755
Author(s):  
Jianqiao Luo ◽  
Yihan Wang ◽  
Yang Ou ◽  
Biao He ◽  
Bailin Li

Many aerial images with similar appearances have different but correlated scene labels, which causes label ambiguity. Label distribution learning (LDL) can express label ambiguity by giving each sample a label distribution. Thus, a sample contributes to the learning of its ground-truth label as well as correlated labels, which improves data utilization. LDL has gained success in many fields, such as age estimation, in which label ambiguity can be easily modeled on the basis of the prior knowledge about local sample similarity and global label correlations. However, LDL has never been applied to scene classification, because there is no knowledge about the local similarity and label correlations and thus it is hard to model label ambiguity. In this paper, we uncover the sample neighbors that cause label ambiguity by jointly capturing the local similarity and label correlations and propose neighbor-based LDL (N-LDL) for aerial scene classification. We define a subspace learning problem, which formulates the neighboring relations as a coefficient matrix that is regularized by a sparse constraint and label correlations. The sparse constraint provides a few nearest neighbors, which captures local similarity. The label correlations are predefined according to the confusion matrices on validation sets. During subspace learning, the neighboring relations are encouraged to agree with the label correlations, which ensures that the uncovered neighbors have correlated labels. Finally, the label propagation among the neighbors forms the label distributions, which leads to label smoothing in terms of label ambiguity. The label distributions are used to train convolutional neural networks (CNNs). Experiments on the aerial image dataset (AID) and NWPU_RESISC45 (NR) datasets demonstrate that using the label distributions clearly improves the classification performance by assisting feature learning and mitigating over-fitting problems, and our method achieves state-of-the-art performance.
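The final propagation step described above can be illustrated with a minimal sketch. It does not reproduce N-LDL: the learned sparse coefficient matrix is replaced here by a plain k-nearest-neighbour matrix with uniform weights, and the function name, `k`, and the mixing weight `beta` are assumptions.

```python
import numpy as np

def neighbour_label_distributions(features, labels, n_classes, k=2, beta=0.3):
    """Turn hard labels into label distributions by mixing in neighbours' labels.

    A uniform k-NN matrix stands in for the learned sparse coefficient matrix
    (an assumption; N-LDL learns this matrix via subspace learning).
    """
    n = len(labels)
    onehot = np.eye(n_classes)[labels]
    # Pairwise squared distances; a sample is never its own neighbour.
    sq = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)
    nn = np.argsort(sq, axis=1)[:, :k]
    # Row-stochastic neighbour coefficients: 1/k for each of the k neighbours.
    C = np.zeros((n, n))
    C[np.repeat(np.arange(n), k), nn.ravel()] = 1.0 / k
    # Each row stays a valid distribution: (1 - beta) + beta * 1 = 1.
    return (1.0 - beta) * onehot + beta * (C @ onehot)

# Four samples on a line; the two classes meet in the middle.
features = np.array([[0.0], [1.0], [2.0], [3.0]])
labels = np.array([0, 0, 1, 1])
dists = neighbour_label_distributions(features, labels, n_classes=2)
```

The resulting rows act as smoothed training targets: each sample keeps most of its mass on the ground-truth label and shares the rest with the labels of its neighbours.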


Author(s):  
Yudan Qi ◽  
Huaxiang Zhang

The heterogeneity of multimodal data is the main challenge in cross-media retrieval; many methods have already been developed to address the problem. At present, subspace learning is one of the mainstream approaches for cross-media retrieval; its aim is to learn a latent shared subspace so that similarities within cross-modal data can be measured in this subspace. However, most existing subspace learning algorithms only focus on supervised information, using labeled data for training to obtain one pair of mapping matrices. In this paper, we propose joint graph regularization based on semi-supervised learning cross-media retrieval (JGRHS), which makes full use of labeled and unlabeled data. We jointly consider correlation analysis and semantic information when learning projection matrices to maintain the closeness of pairwise data and semantic consistency; graph regularization is used to make the learned transformations consistent with similarity constraints in both modalities. The retrieval results on three datasets indicate that the proposed method achieves good efficiency in theoretical research and practical applications.


Symmetry ◽  
2018 ◽  
Vol 10 (10) ◽  
pp. 487 ◽  
Author(s):  
Yeong-Hyeon Byeon ◽  
Jae-Neung Lee ◽  
Sung-Bum Pan ◽  
Keun-Chang Kwak

In this study, we present a third-order tensor-based multilinear eigenECG (MEECG) and multilinear Fisher ECG (MFECG) for individual identification based on the information obtained by an electrocardiogram (ECG) sensor. MEECG and MFECG are based on multilinear principal component analysis (MPCA) and multilinear linear discriminant analysis (MLDA), respectively, in the field of multilinear subspace learning (MSL). MSL extracts features directly, without vectorizing the input data, while maintaining most of the correlations present in the original structure. In contrast with linear subspace learning (LSL) techniques such as PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis), it is less susceptible to small-data problems because it learns more compact and potentially useful representations, and it can efficiently handle large tensors. Here, the third-order tensor is formed by reordering the one-dimensional ECG signal into a two-dimensional matrix, considering the time frame. The MSL consists of four steps. The first step is preprocessing, in which input samples are centered. The second step is initialization, in which eigen decomposition is performed and the most significant eigenvectors are selected. The third step is local optimization, in which the input data are projected using the eigenvectors from the second step, and new eigenvectors are calculated from the projected data. The final step is projection, in which the resultant feature tensors after projection are obtained. The experiments are performed on two databases for performance evaluation. The Physikalisch-Technische Bundesanstalt (PTB)-ECG is a well-known database, and Chosun University (CU)-ECG was built for this study using the developed ECG sensor. The experimental results revealed that the tensor-based MEECG and MFECG showed good identification performance in comparison to PCA and LDA of LSL.
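The four steps listed in the abstract can be sketched for the MPCA case as follows. This is a generic MPCA outline under stated assumptions, not the authors' implementation: the function names, the fixed iteration count `n_iter`, and the random toy data are all hypothetical, and the construction of ECG tensors is not shown.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding of a single tensor into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, U, mode):
    """Project axis `mode` of T with U^T, where U has shape (I_mode, P_mode)."""
    out = np.tensordot(U.T, T, axes=(1, mode))  # projected axis comes first
    return np.moveaxis(out, 0, mode)

def top_eigvecs(S, p):
    """The p eigenvectors of symmetric S with the largest eigenvalues."""
    vals, vecs = np.linalg.eigh(S)
    return vecs[:, np.argsort(vals)[::-1][:p]]

def scatter(data, mode):
    """Mode-wise scatter matrix summed over all samples."""
    return sum(unfold(x, mode) @ unfold(x, mode).T for x in data)

def mpca(samples, ranks, n_iter=5):
    """samples: (N, I_1, ..., I_M); ranks: target size for each mode."""
    # Step 1, preprocessing: centre the samples.
    mean = samples.mean(axis=0)
    X = samples - mean
    n_modes = X.ndim - 1
    # Step 2, initialisation: eigen decomposition of each mode's scatter.
    U = [top_eigvecs(scatter(X, m), ranks[m]) for m in range(n_modes)]
    # Step 3, local optimisation: project on all other modes, then update.
    for _ in range(n_iter):
        for m in range(n_modes):
            partial = X
            for j in range(n_modes):
                if j != m:
                    # j + 1 skips the leading sample axis.
                    partial = mode_product(partial, U[j], j + 1)
            U[m] = top_eigvecs(scatter(partial, m), ranks[m])
    # Step 4, projection: compute the low-dimensional feature tensors.
    feats = X
    for j in range(n_modes):
        feats = mode_product(feats, U[j], j + 1)
    return feats, U, mean

rng = np.random.default_rng(0)
samples = rng.normal(size=(10, 4, 5))  # 10 toy samples, each a 4 x 5 matrix
feats, U, mean = mpca(samples, ranks=(2, 3))
```

Each sample keeps its matrix structure throughout; only the per-mode scatter matrices, which are small, are ever eigen-decomposed.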


Author(s):  
Xingrui Zhang ◽  
Yulian Zhu ◽  
Xiaohong Chen

Face recognition, as a hot research topic, still faces many challenges. This paper proposes a new face recognition method by fusing the advantages of fuzzy set theory, the sub-image method and the random sampling technique. In this method, we partition an original image into sub-images to improve the robustness to different facial variations, and extract local features from each sub-image by using fuzzy 2D linear discriminant analysis (2D-LDA), which makes use of the class information hidden in neighbor samples. In order to increase the diversity of component classifiers and retain as much of the structural information of the row vectors as possible, we further randomly sample row vectors from each sub-image before performing fuzzy 2D-LDA. Experimental results on the Yale A, ORL, AR and Extended Yale B face databases show its superiority to other related state-of-the-art methods under variations such as illumination, occlusion and facial expression. Furthermore, we analyze the diversity of the proposed method by means of Kappa diversity-error analysis and frequency histograms, and the results show that the proposed method can construct more diverse component classifiers than other methods.

