TAE-Net: Task-Adaptive Embedding Network for Few-Shot Remote Sensing Scene Classification

2021, Vol. 14 (1), pp. 111
Author(s): Wendong Huang, Zhengwu Yuan, Aixia Yang, Chan Tang, Xiaobo Luo

Recently, approaches based on deep learning have become prevalent in remote sensing scene classification. Although significant success has been achieved, these approaches still suffer from an excess of parameters and depend heavily on large quantities of labeled data. In this study, few-shot learning is applied to remote sensing scene classification tasks. The goal of few-shot learning is to recognize unseen scene categories given extremely limited labeled samples. For this purpose, a novel task-adaptive embedding network, referred to as TAE-Net, is proposed to facilitate few-shot scene classification of remote sensing images. In the pre-training phase, a feature encoder is first trained on the base set to learn embedding features of input images. Then, in the meta-training phase, a new task-adaptive attention module is designed to yield task-specific attention, which can adaptively select informative embedding features across the whole task. Finally, in the meta-testing phase, query images drawn from the novel set are predicted by the meta-trained model with limited support images. Extensive experiments are carried out on three public remote sensing scene datasets: UC Merced, WHU-RS19, and NWPU-RESISC45. The experimental results show that the proposed TAE-Net achieves new state-of-the-art performance for few-shot remote sensing scene classification.
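As a concrete illustration of the meta-training idea, the sketch below reweights embedding channels using statistics pooled over the whole task. It is a minimal PyTorch reading of a task-adaptive attention module, not the authors' implementation; the gating architecture, reduction ratio, and class name are assumptions.

```python
import torch
import torch.nn as nn

class TaskAdaptiveAttention(nn.Module):
    """Hypothetical task-adaptive attention: channel gates from task statistics."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, support: torch.Tensor, query: torch.Tensor):
        # support: (n_way * k_shot, dim), query: (n_query, dim)
        task_repr = support.mean(dim=0)       # summary vector for the whole task
        attn = self.gate(task_repr)           # per-channel attention weights in (0, 1)
        return support * attn, query * attn   # task-specific embeddings

# Toy 5-way 1-shot episode with 64-dimensional embeddings
attn = TaskAdaptiveAttention(dim=64)
s, q = attn(torch.randn(5, 64), torch.randn(15, 64))
```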

2021, Vol. 13 (20), pp. 4143
Author(s): Jianrong Zhang, Hongwei Zhao, Jiao Li

Remote sensing scene classification remains challenging due to the complexity and variety of scenes. With the development of attention-based methods, convolutional neural networks (CNNs) have achieved competitive performance in remote sensing scene classification tasks. As an important attention-based model, the Transformer has achieved great success in natural language processing and has recently been applied to computer vision tasks. However, most existing methods divide the original image into multiple patches and encode the patches as the input of the Transformer, which limits the model’s ability to learn the overall features of the image. In this paper, we propose a new remote sensing scene classification method, the Remote Sensing Transformer (TRS), a powerful “pure CNNs → Convolution + Transformer → pure Transformers” structure. First, we integrate self-attention into ResNet in a novel way, using our proposed Multi-Head Self-Attention layer instead of the 3 × 3 spatial convolutions in the bottleneck. Then, we connect multiple pure Transformer encoders to further improve representation learning, relying entirely on attention. Finally, we use a linear classifier for classification. We train our model on four public remote sensing scene datasets: UC-Merced, AID, NWPU-RESISC45, and OPTIMAL-31. The experimental results show that TRS exceeds state-of-the-art methods and achieves higher accuracy.
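The core architectural move, swapping the bottleneck's 3 × 3 convolution for multi-head self-attention over spatial positions, can be sketched as follows. Head count, channel widths, and the omission of positional encodings are simplifying assumptions; this is not the paper's exact block.

```python
import torch
import torch.nn as nn

class MHSABottleneck(nn.Module):
    """Bottleneck where self-attention stands in for the 3x3 spatial convolution."""
    def __init__(self, in_ch: int, mid_ch: int, heads: int = 4):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, mid_ch, kernel_size=1)   # 1x1 channel reduction
        self.attn = nn.MultiheadAttention(mid_ch, heads, batch_first=True)
        self.expand = nn.Conv2d(mid_ch, in_ch, kernel_size=1)   # 1x1 channel expansion

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        z = self.reduce(x)
        seq = z.flatten(2).transpose(1, 2)       # (B, H*W, C): treat pixels as tokens
        seq, _ = self.attn(seq, seq, seq)        # self-attention replaces the 3x3 conv
        z = seq.transpose(1, 2).reshape(b, -1, h, w)
        return x + self.expand(z)                # residual connection

block = MHSABottleneck(in_ch=256, mid_ch=64)
out = block(torch.randn(2, 256, 14, 14))         # shape preserved: (2, 256, 14, 14)
```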


Author(s): Pei Zhang, Guoliang Fan, Chanyue Wu, Dong Wang, Ying Li

The central goal of few-shot scene classification is to learn a model that generalizes well to a novel (unseen) scene category from only one or a few labeled examples. Recent works in the remote sensing (RS) community tackle this challenge by developing algorithms in a meta-learning manner. However, most prior approaches have either focused on rapidly optimizing a meta-learner or aimed at finding good similarity metrics, while overlooking the embedding power. Here we propose a novel Task-Adaptive Embedding Learning (TAEL) framework that complements existing methods by fully exploiting feature embedding’s dual roles in few-shot scene classification: representing images and constructing classifiers in the embedding space. First, we design a lightweight network that enriches the diversity and expressive capacity of embeddings by dynamically fusing information from multiple kernels. Second, we present a task-adaptive strategy that helps to generate more discriminative representations by transforming the universal embeddings into task-specific embeddings via a self-attention mechanism. We evaluate our model in the standard few-shot learning setting on two challenging datasets: NWPU-RESISC45 and RSD46-WHU. Experimental results demonstrate that, on all tasks, our method achieves state-of-the-art performance by a significant margin.
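A minimal sketch of the task-adaptive step, under the assumption of a single-head, projection-free self-attention with a residual connection; the paper's actual transformation may differ.

```python
import torch
import torch.nn.functional as F

def task_adaptive_transform(embeddings: torch.Tensor) -> torch.Tensor:
    """embeddings: (n, dim) -- all embeddings belonging to one few-shot task."""
    d = embeddings.size(-1)
    scores = embeddings @ embeddings.t() / d ** 0.5   # scaled dot-product similarities
    weights = F.softmax(scores, dim=-1)               # attention over the task set
    return embeddings + weights @ embeddings          # residual task-specific refinement

# Refine 20 universal embeddings (e.g., 5-way support + queries) of dimension 128
refined = task_adaptive_transform(torch.randn(20, 128))
```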


2021, Vol. 13 (10), pp. 1950
Author(s): Cuiping Shi, Xin Zhao, Liguo Wang

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve classification performance, many studies have increased the depth of convolutional neural networks (CNNs) and expanded the width of the network to extract deeper features, thereby increasing model complexity. To address this problem, in this paper we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. First, we propose two convolution combination modules for feature extraction, through which deep image features can be fully extracted by the cooperation of multiple convolutions. Then, feature weights are calculated and the extracted deep features are passed to an attention mechanism for further feature extraction. Next, all of the extracted features are fused across multiple branches. Finally, depthwise separable convolution and asymmetric convolution are employed to greatly reduce the number of parameters. The experimental results show that, compared with several state-of-the-art methods, the proposed method retains a clear advantage in classification accuracy while using very few parameters.
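The parameter savings from the last step are easy to verify in isolation. The snippet below compares a standard 3 × 3 convolution against depthwise separable and asymmetric factorizations; the 64-channel width is an arbitrary assumption, and this is not the AMB-CNN architecture itself.

```python
import torch.nn as nn

def count_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

standard = nn.Conv2d(64, 64, kernel_size=3, padding=1)
depthwise_separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),  # depthwise: one filter per channel
    nn.Conv2d(64, 64, kernel_size=1),                        # pointwise: mixes channels
)
asymmetric = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 3), padding=(0, 1)),   # 1x3 followed by 3x1
    nn.Conv2d(64, 64, kernel_size=(3, 1), padding=(1, 0)),
)

print(count_params(standard))             # 36,928
print(count_params(depthwise_separable))  # 4,800
print(count_params(asymmetric))           # 24,704
```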


2021
Author(s): Anh Nguyen, Khoa Pham, Dat Ngo, Thanh Ngo, Lam Pham

This paper provides an analysis of state-of-the-art activation functions with respect to supervised classification with deep neural networks. These activation functions comprise the Rectified Linear Unit (ReLU), Exponential Linear Unit (ELU), Scaled Exponential Linear Unit (SELU), Gaussian Error Linear Unit (GELU), and the Inverse Square Root Linear Unit (ISRLU). To evaluate them, experiments are conducted over two deep learning network architectures that integrate these activation functions. The first model, based on a Multilayer Perceptron (MLP), is evaluated on the MNIST dataset. Meanwhile, the second model, a VGGish-based architecture, is applied to Acoustic Scene Classification (ASC) Task 1A of the DCASE 2018 challenge, evaluating whether these activation functions perform well across different datasets and different network architectures.
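Of the functions compared, ISRLU is the least commonly bundled with deep learning frameworks. Below is a sketch of it alongside the stock activations, with alpha = 1.0 assumed as the default:

```python
import torch
import torch.nn.functional as F

def isrlu(x: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Inverse Square Root Linear Unit: identity for x >= 0,
    # x / sqrt(1 + alpha * x^2) for x < 0 (saturates smoothly toward -1/sqrt(alpha))
    return torch.where(x >= 0, x, x / torch.sqrt(1 + alpha * x * x))

x = torch.linspace(-3, 3, 7)
for name, fn in [("relu", F.relu), ("elu", F.elu), ("selu", F.selu),
                 ("gelu", F.gelu), ("isrlu", isrlu)]:
    print(name, fn(x))
```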


2019, Vol. 11 (5), pp. 518
Author(s): Bao-Di Liu, Jie Meng, Wen-Yang Xie, Shuai Shao, Ye Li, et al.

At present, nonparametric subspace classifiers, such as collaborative representation-based classification (CRC) and sparse representation-based classification (SRC), are widely used in many pattern classification and recognition tasks. Meanwhile, the spatial pyramid matching (SPM) scheme, which incorporates spatial information when representing an image, is efficient for image classification. However, in SPM the weights used to evaluate the representations of different subregions are fixed. In this paper, we first introduce the spatial pyramid matching scheme to remote sensing (RS) image scene classification tasks to improve performance. Then, we propose a weighted spatial pyramid matching collaborative representation-based classification method, combining the CRC method with a weighted spatial pyramid matching scheme. The proposed method is capable of learning the weights of different subregions in representing an image. Finally, extensive experiments were conducted on several benchmark remote sensing image datasets; the results clearly demonstrate the superior performance of our proposed algorithm compared with state-of-the-art approaches.
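For reference, plain CRC, the base on which the weighted-SPM variant builds, reduces to a ridge-regularized coding step followed by class-wise residual comparison. The sketch below uses NumPy; the regularization weight and the synthetic data are illustrative assumptions, and the SPM weighting itself is not shown.

```python
import numpy as np

def crc_predict(X: np.ndarray, labels: np.ndarray, y: np.ndarray, lam: float = 0.01) -> int:
    """X: (d, n) training dictionary, labels: (n,), y: (d,) test sample."""
    # Collaborative coding: alpha = (X^T X + lam * I)^{-1} X^T y
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    residuals = []
    for c in np.unique(labels):
        mask = labels == c
        r = np.linalg.norm(y - X[:, mask] @ alpha[mask])  # class-wise reconstruction error
        residuals.append((r, c))
    return min(residuals)[1]   # label of the class with the smallest residual

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 30))               # 3 classes x 10 samples, 50-dim features
labels = np.repeat(np.arange(3), 10)
print(crc_predict(X, labels, X[:, 4] + 0.05 * rng.normal(size=50)))  # expect class 0
```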


2018, Vol. 2018 (16), pp. 1650-1657
Author(s): Xu Jiaqing, Lv Qi, Liu Hongjun, He Jie
