Cross Modal Few-Shot Contextual Transfer for Heterogenous Image Classification

2021 ◽  
Vol 15 ◽  
Author(s):  
Zhikui Chen ◽  
Xu Zhang ◽  
Wei Huang ◽  
Jing Gao ◽  
Suhua Zhang

Deep transfer learning aims to address new tasks with insufficient samples. However, in few-shot learning scenarios, the few known training samples have low diversity and are prone to be dominated by sample specificity, which leads to one-sided local features instead of reliable global features of the categories they belong to. To alleviate this difficulty, we propose a cross-modal few-shot contextual transfer method that leverages contextual information as a supplement and learns context-aware transfer in few-shot image classification scenes, making full use of the information in heterogeneous data. The similarity measure in the image classification task is reformulated by fusing textual semantic modal information with visual semantic modal information extracted from images. This serves as a supplement and helps to inhibit sample specificity. In addition, to better extract local visual features and reorganize the recognition pattern, a deep transfer scheme is used to reuse a powerful extractor from a pre-trained model. Simulation experiments show that introducing cross-modal and intra-modal contextual information can effectively suppress the deviation in defining category features from few samples and improve the accuracy of few-shot image classification tasks.
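
The fused similarity measure described in this abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not give the fusion rule, so the weighted sum of cosine similarities, the weight `alpha`, the class prototypes, and the function names are all hypothetical.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between rows of a (n, d) and rows of b (m, d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def fused_predict(q_vis, q_txt, proto_vis, proto_txt, alpha=0.5):
    """Assign each query to the class with the highest fused similarity.

    q_vis, q_txt:         visual / textual embeddings of the queries
    proto_vis, proto_txt: per-class visual / textual prototypes
    alpha:                hypothetical weight on the visual similarity
    """
    sim = (alpha * cosine_sim(q_vis, proto_vis)
           + (1.0 - alpha) * cosine_sim(q_txt, proto_txt))
    return sim.argmax(axis=1)

# Toy case: the visual embedding alone points to class 1,
# but the textual context pulls the decision back to class 0.
proto_vis = np.array([[1.0, 0.0], [0.0, 1.0]])
proto_txt = np.array([[1.0, 0.0], [0.0, 1.0]])
q_vis = np.array([[0.6, 0.8]])
q_txt = np.array([[1.0, 0.0]])
visual_only = fused_predict(q_vis, q_txt, proto_vis, proto_txt, alpha=1.0)
fused = fused_predict(q_vis, q_txt, proto_vis, proto_txt, alpha=0.5)
```

In the toy case the textual term acts as the supplementary context: it offsets a visual embedding that is dominated by the specifics of a single sample.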

Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2609 ◽  
Author(s):  
Jianhang Zhou ◽  
Bob Zhang

Collaborative representation based classification (CRC) is an efficient classifier in image classification. By using l2 regularization, the collaborative representation based classifier achieves performance competitive with the sparse representation based classifier while using less computational time. However, every element calculated from the training samples is used for representation without selection, which can lead to poor performance in some classification tasks. To resolve this issue, in this paper, we propose a novel collaborative representation that directly uses non-negative representations to represent a test sample collaboratively, termed Non-negative Collaborative Representation-based Classifier (NCRC). To collect all non-negative collaborative representations, we introduce a Rectified Linear Unit (ReLU) function to filter the coefficients obtained by l2 minimization according to CRC’s objective function. Next, we represent the test sample by a linear combination of these representations. Lastly, the nearest subspace classifier is used to classify the test samples. Experiments on four different databases, including face and palmprint databases, showed promising results for the proposed method. Accuracy comparisons with other state-of-the-art sparse representation-based classifiers demonstrated the effectiveness of NCRC at image classification. In addition, the proposed NCRC consumes less computational time, further illustrating its efficiency.
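
The three steps named in this abstract (l2-regularized coding, ReLU filtering, nearest-subspace decision) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the closed-form ridge solution, the regularization value `lam`, and the function name `ncrc_predict` are assumptions.

```python
import numpy as np

def ncrc_predict(X, labels, y, lam=0.01):
    """Sketch of a non-negative collaborative representation classifier.

    X:      (d, n) matrix whose columns are training samples
    labels: length-n class labels for the columns of X
    y:      (d,) test sample
    lam:    assumed l2 regularization strength
    """
    labels = np.asarray(labels)
    n = X.shape[1]
    # Step 1: CRC coding, the closed-form solution of
    # min ||y - X a||^2 + lam * ||a||^2.
    coef = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    # Step 2: ReLU filter keeps only non-negative coefficients.
    coef = np.maximum(coef, 0.0)
    # Step 3: nearest subspace, picking the class whose samples
    # reconstruct y with the smallest residual.
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - X[:, labels == c] @ coef[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

# Toy data: two samples per class in a 3-dimensional feature space.
X = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.1]])
labels = [0, 0, 1, 1]
pred_a = ncrc_predict(X, labels, np.array([1.0, 0.05, 0.0]))
pred_b = ncrc_predict(X, labels, np.array([0.0, 1.0, 0.0]))
```

The coding step is a single linear solve, which is why this family of classifiers is cheaper than solving an l1 problem for every test sample.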


2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Chaohui Tang ◽  
Qingxin Zhu ◽  
Wenjun Wu ◽  
Wenlin Huang ◽  
Chaoqun Hong ◽  
...  

In the past few years, deep learning has become a research hotspot and has had a profound impact on computer vision. Deep CNNs have proven to be the most important and effective models for image processing, but because of the lack of training samples and the huge number of learnable parameters, they tend to overfit. In this work, we propose a new two-stage CNN image classification network, named “Improved Convolutional Neural Networks with Image Enhancement for Image Classification” and abbreviated as PLANET, which uses a new image data enhancement method called InnerMove to enhance images and augment the number of training samples. InnerMove is inspired by the “object movement” scene in computer vision and can improve the generalization ability of deep CNN models for image classification tasks. Extensive experimental results show that PLANET with InnerMove for image enhancement outperforms the comparative algorithms, and that InnerMove has a more significant effect than the comparative data enhancement methods for image classification tasks.


2017 ◽  
Vol 17 (02) ◽  
pp. 1750007 ◽  
Author(s):  
Chunwei Tian ◽  
Guanglu Sun ◽  
Qi Zhang ◽  
Weibing Wang ◽  
Teng Chen ◽  
...  

Collaborative representation classification (CRC) is an important sparse method that is easy to carry out and uses a linear combination of training samples to represent a test sample. The CRC method uses the offset between the representation result of each class and the test sample to perform classification. However, the offset usually cannot express well the difference between each class and the test sample. In this paper, we propose a novel representation method for image recognition to address this problem. This method not only fuses sparse representation and the CRC method to improve the accuracy of image recognition, but also has a novel fusion mechanism to classify images. The proposed method proceeds in the following steps. First, it produces the collaborative representation of the test sample; that is, a linear combination of all the training samples is determined to represent the test sample. Then, it obtains the sparse representation classification (SRC) of the test sample. Finally, the proposed method uses the CRC and SRC representations to obtain two kinds of scores for the test sample and fuses them to recognize the image. Face recognition experiments show that the combination of CRC and SRC has satisfactory performance for image classification.
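
The steps listed in this abstract can be sketched as below. The abstract does not specify the fusion mechanism, so this sketch assumes a weighted sum of sum-normalized class residuals; the ISTA solver for the sparse code, the parameters `lam_l2`, `lam_l1`, and `weight`, and all function names are likewise assumptions made for illustration.

```python
import numpy as np

def ista(X, y, lam=0.01, n_iter=500):
    """Minimize 0.5*||X w - y||^2 + lam*||w||_1 by iterative soft-thresholding."""
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = w - step * (X.T @ (X @ w - y))
        w = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)
    return w

def class_residuals(X, labels, y, coef, classes):
    """Reconstruction error of y using only the samples of each class."""
    return np.array([np.linalg.norm(y - X[:, labels == c] @ coef[labels == c])
                     for c in classes])

def fused_crc_src_predict(X, labels, y, lam_l2=0.01, lam_l1=0.01, weight=0.5):
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n = X.shape[1]
    # Collaborative (l2-regularized) representation of the test sample.
    crc = np.linalg.solve(X.T @ X + lam_l2 * np.eye(n), X.T @ y)
    # Sparse (l1-regularized) representation of the test sample.
    src = ista(X, y, lam_l1)
    r_crc = class_residuals(X, labels, y, crc, classes)
    r_src = class_residuals(X, labels, y, src, classes)
    # Assumed fusion rule: weighted sum of normalized residuals (lower is better).
    score = (weight * r_crc / (r_crc.sum() + 1e-12)
             + (1.0 - weight) * r_src / (r_src.sum() + 1e-12))
    return classes[int(np.argmin(score))]

# Toy data: columns of X are training samples, two per class.
X = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.1]])
labels = [0, 0, 1, 1]
pred_a = fused_crc_src_predict(X, labels, np.array([1.0, 0.05, 0.0]))
pred_b = fused_crc_src_predict(X, labels, np.array([0.0, 1.0, 0.0]))
```

Normalizing each residual vector before fusing keeps one representation from dominating the score only because its residuals are on a larger scale.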


2013 ◽  
Vol 2013 ◽  
pp. 1-8
Author(s):  
Teng Li ◽  
Huan Chang ◽  
Jun Wu

This paper presents a novel algorithm to numerically decompose mixed signals in a collaborative way, given supervision of the labels that each signal contains. The decomposition is formulated as an optimization problem incorporating a nonnegative constraint. A nonnegative data factorization solution is presented to yield the decomposed results. It is shown that the optimization is efficient and decreases the objective function monotonically. Such a decomposition algorithm can be applied to multilabel training samples for pattern classification. The real-data experimental results show that the proposed algorithm can significantly improve multilabel image classification performance with weak supervision.
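
To illustrate a nonnegative factorization whose objective decreases monotonically, the sketch below uses the standard multiplicative updates for nonnegative matrix factorization, which is a swapped-in technique and not the algorithm of this paper. The label mask is one assumed way to inject the supervision the abstract mentions: a signal may only use the components of the labels it contains. The names `masked_nmf` and `label_mask` are hypothetical.

```python
import numpy as np

def masked_nmf(V, label_mask, n_iter=200, seed=0, eps=1e-10):
    """Factor V (d, n) as W @ H with W, H >= 0.

    label_mask: (k, n) 0/1 matrix; entry (c, j) is 1 if signal j carries
                label c. Masked-out entries of H start at zero, and
                multiplicative updates keep them at zero.
    Returns W (d, k), H (k, n), and the squared error after each iteration.
    """
    rng = np.random.default_rng(seed)
    k = label_mask.shape[0]
    W = rng.random((V.shape[0], k)) + eps
    H = (rng.random((k, V.shape[1])) + eps) * label_mask
    history = []
    for _ in range(n_iter):
        # Multiplicative updates: factors stay nonnegative and the
        # squared reconstruction error does not increase.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        history.append(float(np.linalg.norm(V - W @ H) ** 2))
    return W, H, history

# Toy mixed signals: 5 signals of dimension 6, built from 2 label components.
rng = np.random.default_rng(1)
label_mask = np.array([[1, 1, 0, 1, 0],
                       [0, 1, 1, 1, 1]], dtype=float)
V = rng.random((6, 2)) @ (rng.random((2, 5)) * label_mask)
W, H, history = masked_nmf(V, label_mask)
```

Each column of `W` can then be read as the component associated with one label, and `history` can be inspected to check that the objective never increases.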


Author(s):  
P. Zhong ◽  
Z. Q. Gong ◽  
C. Schönlieb

In recent years, research in remote sensing has demonstrated that deep architectures with multiple layers can potentially extract abstract and invariant features for better hyperspectral image classification. Since real-world hyperspectral image classification tasks usually cannot provide enough training samples for a supervised deep model, such as convolutional neural networks (CNNs), this work investigates deep belief networks (DBNs), which allow unsupervised training. A DBN trained on limited training samples usually has many “dead” (never responding) or “potential over-tolerant” (always responding) latent factors (neurons), which decrease the DBN’s description ability and thus the hyperspectral image classification performance. This work proposes a new diversified DBN by introducing a diversity promoting prior over the latent factors during the DBN pre-training and fine-tuning procedures. The diversity promoting prior in the training procedures encourages the latent factors to be uncorrelated, such that each latent factor focuses on modelling unique information, and all factors together capture a large proportion of the information, thus increasing the description ability and classification performance of the diversified DBNs. The proposed method was evaluated on a well-known real-world hyperspectral image dataset. The experiments demonstrate that the diversified DBNs obtain much better results than the original DBNs and comparable or even better performance than other recent hyperspectral image classification methods.
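
A diversity-promoting term of the kind described here can be illustrated with a simple penalty on the similarity between latent factors. The abstract does not give the exact form of the prior, so the mean squared pairwise cosine similarity below, and the name `diversity_penalty`, are assumptions chosen only to show the idea.

```python
import numpy as np

def diversity_penalty(W):
    """Mean squared cosine similarity between distinct latent factors.

    W: (n_visible, n_hidden) weight matrix; each column holds the weights
       of one latent factor. The value is 0 when all factors are mutually
       orthogonal and 1 when they are all parallel.
    """
    Wn = W / (np.linalg.norm(W, axis=0, keepdims=True) + 1e-12)
    gram = Wn.T @ Wn                       # pairwise cosine similarities
    off_diag = gram - np.diag(np.diag(gram))
    k = W.shape[1]
    return float((off_diag ** 2).sum() / (k * (k - 1)))

orthogonal = diversity_penalty(np.eye(3))        # fully diverse factors
redundant = diversity_penalty(np.ones((3, 3)))   # identical factors
```

Adding such a term to the training loss with a small weight would push the factors apart; how the paper folds its prior into pre-training and fine-tuning is not stated in the abstract.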


2021 ◽  
Vol 1 (1) ◽  
pp. 71-84
Author(s):  
Mohamed Fawzy ◽  
Farag Khodary ◽  
Yasser Mostafa

Author(s):  
P. Burai ◽  
T. Tomor ◽  
L. Bekő ◽  
B. Deák

In our study we classified grassland vegetation types of an alkali landscape (Eastern Hungary), using different image classification methods for hyperspectral data. Our aim was to test the applicability of hyperspectral data in this complex system using various image classification methods. To reach the highest classification accuracy, we compared the performance of traditional image classifiers, machine learning algorithms, feature extraction (MNF-transformation) and various sizes of training dataset. Hyperspectral images were acquired by an AISA EAGLE II hyperspectral sensor with 128 contiguous bands (400–1000 nm), a spectral sampling of 5 nm bandwidth and a ground pixel size of 1 m. We used twenty vegetation classes, which were compiled based on the characteristic dominant species, canopy height, and total vegetation cover. Image classification was applied to the original and MNF (minimum noise fraction) transformed datasets using various training sample sizes between 10 and 30 pixels. In the case of the original bands, both SVM and RF classifiers provided high accuracy for almost all classes irrespective of the number of training pixels. We found that SVM and RF produced the best accuracy with the first nine MNF transformed bands. Our results suggest that in complex open landscapes, application of SVM can be a feasible solution, as this method provides higher accuracies compared to RF and MLC. SVM was not sensitive to the size of the training samples, which makes it an adequate tool for cases when the available number of training pixels is limited for some classes.

