A Discriminative Neighborhood-Based Collaborative Learning for Remote Sensing Scene Classification

2021 ◽  
Author(s):  
Usman Muhammad ◽  
Md. Ziaul Hoque ◽  
Weiqiang Wang ◽  
Mourad Oussalah

The bag-of-words (BoW) model is one of the most popular representation methods for image classification. However, the lack of spatial information, changes in illumination, and inter-class similarity among scene categories impair its performance in the remote sensing domain. To alleviate these issues, this paper explores the spatial dependencies between different image regions and introduces neighborhood-based collaborative learning (NBCL) for remote sensing scene classification. In particular, the proposed method employs multilevel feature learning based on small, medium, and large neighborhood regions to enhance the discriminative power of the image representation. To achieve this, image patches are selected through a fixed-size sliding window, and each image is represented by four independent image region sequences. Beyond multilevel learning, Gaussian pyramids are explicitly imposed to magnify the visual information of the scene images, and their position and scale parameters are optimized locally. A local descriptor is then used to extract multilevel and multiscale features, which are represented as codeword histograms obtained by k-means clustering. Finally, a simple fusion strategy is proposed to balance the contributions of these features, and the fused features are fed into a bidirectional long short-term memory (BiLSTM) network to construct the final representation for classification. Experimental results on the NWPU-RESISC45, AID, UC-Merced, and WHU-RS datasets demonstrate that the proposed approach not only surpasses conventional bag-of-words approaches but also yields significantly higher classification performance than existing state-of-the-art deep learning methods.
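
For orientation, the sketch below shows the patch-based bag-of-words stage of such a pipeline under stated assumptions: the patch size, stride, and 64-word codebook are illustrative choices rather than the authors' settings, and the Gaussian pyramid, multilevel neighborhood regions, and BiLSTM fusion are omitted.

```python
# Minimal sketch of a sliding-window bag-of-words stage (not the authors'
# exact configuration): extract fixed-size patches, quantize them against a
# k-means codebook, and build a normalized codeword histogram per image.
import numpy as np
from sklearn.cluster import KMeans

def extract_patches(image, patch_size=16, stride=8):
    """Collect fixed-size patches with a sliding window, flattened as vectors."""
    h, w = image.shape
    patches = [
        image[y:y + patch_size, x:x + patch_size].ravel()
        for y in range(0, h - patch_size + 1, stride)
        for x in range(0, w - patch_size + 1, stride)
    ]
    return np.asarray(patches, dtype=np.float64)

def bow_histogram(patches, kmeans):
    """Quantize patches against the learned codebook and build a histogram."""
    words = kmeans.predict(patches)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(np.float64)
    return hist / hist.sum()  # L1-normalize so image size does not matter

# Toy usage: learn a 64-word codebook from random "images".
rng = np.random.default_rng(0)
train_patches = np.vstack([extract_patches(rng.random((128, 128))) for _ in range(4)])
kmeans = KMeans(n_clusters=64, n_init=4, random_state=0).fit(train_patches)
print(bow_histogram(extract_patches(rng.random((128, 128))), kmeans).shape)  # (64,)
```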


2019 ◽  
Vol 11 (5) ◽  
pp. 494 ◽  
Author(s):  
Wei Zhang ◽  
Ping Tang ◽  
Lijun Zhao

Remote sensing image scene classification is one of the most challenging problems in understanding high-resolution remote sensing images. Deep learning techniques, especially the convolutional neural network (CNN), have improved the performance of remote sensing image scene classification due to their powerful capacity for feature learning and reasoning. However, several fully connected layers are usually added to the end of CNN models, which is inefficient in capturing the hierarchical structure of the entities in the images and does not fully exploit the spatial information that is important to classification. Fortunately, the capsule network (CapsNet), a novel architecture that replaces the individual neurons of a traditional neural network with groups of neurons (capsules, i.e., vectors) and can encode the properties and spatial information of features in an image to achieve equivariance, has become an active research area in classification over the past two years. Motivated by this idea, this paper proposes an effective remote sensing image scene classification architecture named CNN-CapsNet that exploits the merits of both models. First, a CNN without fully connected layers is used as an initial feature-map extractor; specifically, a deep CNN model pretrained on the ImageNet dataset serves as the feature extractor in this paper. Then, the initial feature maps are fed into a newly designed CapsNet to obtain the final classification result. The proposed architecture is extensively evaluated on three public and challenging benchmark remote sensing image datasets: the UC Merced Land-Use dataset with 21 scene categories, the AID dataset with 30 scene categories, and the NWPU-RESISC45 dataset with 45 challenging scene categories. The experimental results demonstrate that the proposed method achieves competitive classification performance compared with state-of-the-art methods.
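
The sketch below illustrates the general CNN-plus-capsule idea under stated assumptions: a truncated VGG16 is used as the feature extractor (the paper selects a pretrained ImageNet CNN but the backbone here is our stand-in), and only the primary-capsule grouping with the squash nonlinearity of Sabour et al. is shown; routing and the classification capsules are omitted.

```python
# Hedged sketch of the CNN-CapsNet idea: a CNN truncated before its fully
# connected layers feeds a capsule-style layer. Capsule dimensions (8-D) and
# the VGG16 backbone are illustrative assumptions, not the paper's settings.
import torch
import torch.nn as nn
from torchvision import models

def squash(s, dim=-1, eps=1e-8):
    """Capsule squash: keep vector direction, shrink norm into [0, 1)."""
    n2 = (s * s).sum(dim=dim, keepdim=True)
    return (n2 / (1.0 + n2)) * s / torch.sqrt(n2 + eps)

backbone = models.vgg16(weights=None).features  # pretrained weights in practice
x = torch.randn(2, 3, 224, 224)                 # a batch of scene images
fmap = backbone(x)                              # (2, 512, 7, 7) feature maps
caps = fmap.permute(0, 2, 3, 1).reshape(2, -1, 8)  # group activations into 8-D capsules
caps = squash(caps)                             # capsules encode pose + presence
print(caps.shape)                               # torch.Size([2, 3136, 8])
```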


2021 ◽  
Vol 13 (10) ◽  
pp. 1950 ◽  
Author(s):  
Cuiping Shi ◽  
Xin Zhao ◽  
Liguo Wang

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve classification performance, many studies have increased the depth of convolutional neural networks (CNNs) or expanded the width of the network to extract deeper features, thereby increasing the complexity of the model. To address this problem, this paper proposes a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. First, two convolution combination modules are proposed for feature extraction, through which the deep features of images can be fully extracted by the cooperation of multiple convolutions. Then, feature weights are calculated and the extracted deep features are passed to the attention mechanism for further refinement. Next, all of the extracted features are fused across multiple branches. Finally, depthwise separable convolution and asymmetric convolution are employed to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method retains a clear advantage in classification accuracy while using very few parameters.
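
As a grounding aid, here is a minimal sketch of the two parameter-saving convolutions the abstract names, a depthwise separable 3x3 and an asymmetric 3x1 + 1x3 pair; the channel sizes are hypothetical and the full AMB-CNN wiring and attention modules are not shown.

```python
# Illustrative building blocks only, not the authors' AMB-CNN configuration.
import torch
import torch.nn as nn

def depthwise_separable(cin, cout):
    # One 3x3 filter per input channel, then a 1x1 pointwise channel mix:
    # far fewer parameters than a dense 3x3 convolution.
    return nn.Sequential(
        nn.Conv2d(cin, cin, kernel_size=3, padding=1, groups=cin, bias=False),
        nn.Conv2d(cin, cout, kernel_size=1, bias=False),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )

def asymmetric(cin, cout):
    # A 3x3 receptive field factorized into 3x1 and 1x3 convolutions.
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=(3, 1), padding=(1, 0), bias=False),
        nn.Conv2d(cout, cout, kernel_size=(1, 3), padding=(0, 1), bias=False),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )

x = torch.randn(1, 32, 56, 56)
print(depthwise_separable(32, 64)(x).shape, asymmetric(32, 64)(x).shape)
```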


2019 ◽  
Vol 11 (5) ◽  
pp. 518 ◽  
Author(s):  
Bao-Di Liu ◽  
Jie Meng ◽  
Wen-Yang Xie ◽  
Shuai Shao ◽  
Ye Li ◽  
...  

At present, nonparametric subspace classifiers, such as collaborative representation-based classification (CRC) and sparse representation-based classification (SRC), are widely used in many pattern classification and recognition tasks. Meanwhile, the spatial pyramid matching (SPM) scheme, which incorporates spatial information into the image representation, is efficient for image classification. However, in SPM the weights used to evaluate the representations of different subregions are fixed. In this paper, we first introduce the spatial pyramid matching scheme to remote sensing (RS) image scene classification tasks to improve performance. We then propose a weighted spatial pyramid matching collaborative-representation-based classification method, combining the CRC method with a weighted spatial pyramid matching scheme. The proposed method is capable of learning the weights of different subregions in representing an image. Finally, extensive experiments were conducted on several benchmark remote sensing image datasets; the results clearly demonstrate the superior performance of the proposed algorithm compared with state-of-the-art approaches.
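
For reference, below is a minimal sketch of the plain CRC decision rule in its standard ridge-regression form; the paper's learned per-subregion SPM weights are not reproduced, and the toy data is purely illustrative.

```python
# Collaborative representation-based classification (CRC), baseline form:
# code the query over ALL training samples jointly (ridge regression), then
# assign the class whose coefficients give the smallest reconstruction residual.
import numpy as np

def crc_predict(X, labels, y, lam=0.01):
    """X: (dim, n) training features as columns; labels: (n,); y: (dim,) query."""
    n = X.shape[1]
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)  # collaborative code
    best_class, best_residual = None, np.inf
    for c in np.unique(labels):
        a_c = np.where(labels == c, alpha, 0.0)          # keep class-c coefficients
        r = np.linalg.norm(y - X @ a_c) / (np.linalg.norm(a_c) + 1e-12)
        if r < best_residual:
            best_class, best_residual = c, r
    return best_class

# Toy usage with random two-class "features" (columns are training samples).
rng = np.random.default_rng(1)
X = rng.random((50, 20))
labels = np.repeat([0, 1], 10)
print(crc_predict(X, labels, X[:, 3]))  # classify a query drawn from class 0
```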


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1999 ◽  
Author(s):  
Donghang Yu ◽  
Qing Xu ◽  
Haitao Guo ◽  
Chuan Zhao ◽  
Yuzhun Lin ◽  
...  

Classifying remote sensing images is vital for interpreting image content. Presently, remote sensing image scene classification methods based on convolutional neural networks (CNNs) have drawbacks, including excessive parameters and heavy computational costs. More efficient and lightweight CNNs have fewer parameters and computations, but their classification performance is generally weaker. We propose a more efficient and lightweight convolutional neural network method to improve classification accuracy with a small training dataset. Inspired by fine-grained visual recognition, this study introduces a bilinear convolutional neural network model for scene classification. First, the lightweight CNN MobileNetV2 is used to extract deep and abstract image features. Each feature is then transformed into two features with two different convolutional layers. The transformed features are combined by a Hadamard product to obtain an enhanced bilinear feature. Finally, the bilinear feature, after pooling and normalization, is used for classification. Experiments are performed on three widely used datasets: UC Merced, AID, and NWPU-RESISC45. Compared with other state-of-the-art methods, the proposed method has fewer parameters and computations while achieving higher accuracy. By including feature fusion with bilinear pooling, performance and accuracy for remote sensing scene classification can be greatly improved, and the approach can be applied to any remote sensing image classification task.
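
The sketch below shows the Hadamard-product bilinear pooling step under stated assumptions: MobileNetV2 features of shape (N, 1280, 7, 7), two 1x1 convolutions as the "transform" layers, a projected dimension of 512, and the customary signed square-root plus L2 normalization standing in for the abstract's "pooling and normalization".

```python
# Illustrative bilinear-pooling head; dimensions and normalizations are our
# assumptions, not necessarily the paper's exact choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HadamardBilinear(nn.Module):
    def __init__(self, cin=1280, cmid=512):
        super().__init__()
        self.a = nn.Conv2d(cin, cmid, kernel_size=1)  # first transformed feature
        self.b = nn.Conv2d(cin, cmid, kernel_size=1)  # second transformed feature

    def forward(self, fmap):
        z = self.a(fmap) * self.b(fmap)                 # elementwise (Hadamard) product
        z = z.mean(dim=(2, 3))                          # global average pooling
        z = torch.sign(z) * torch.sqrt(z.abs() + 1e-8)  # signed square-root
        return F.normalize(z, dim=1)                    # L2 normalization

fmap = torch.randn(2, 1280, 7, 7)                       # e.g., MobileNetV2 output
print(HadamardBilinear()(fmap).shape)                   # torch.Size([2, 512])
```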


2020 ◽  
Vol 12 (9) ◽  
pp. 1366 ◽  
Author(s):  
Jun Li ◽  
Daoyu Lin ◽  
Yang Wang ◽  
Guangluan Xu ◽  
Yunyan Zhang ◽  
...  

In recent years, convolutional neural networks (CNNs) have shown great success in the scene classification of computer vision images. Although these CNNs can achieve excellent classification accuracy, the discriminative ability of the feature representations they extract remains limited in distinguishing more complex remote sensing images. Therefore, this paper proposes a unified feature fusion framework based on an attention mechanism, called Deep Discriminative Representation Learning with Attention Map (DDRL-AM). First, the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm is applied to generate attention maps associated with the predicted results, making the CNN focus on the most salient parts of the image. Second, a spatial feature transformer (SFT) is designed to extract discriminative features from the attention maps. A novel two-channel CNN architecture is then proposed that fuses the features extracted from the attention maps with the RGB (red, green, blue) stream. A new objective function that considers both center loss and cross-entropy loss is optimized to jointly account for inter-class dispersion and within-class variance. To show its effectiveness in classifying remote sensing images, the proposed DDRL-AM method is evaluated on four public benchmark datasets. The experimental results demonstrate the competitive scene classification performance of the DDRL-AM approach. Moreover, visualizations of the features extracted by the proposed method confirm that their discriminative ability has been increased.
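 
Since Grad-CAM is the named attention-map generator, here is a minimal sketch of that step: weight each feature map by the spatially averaged gradient of the class score, then ReLU the weighted sum. The ResNet18 backbone and layer choice are placeholders; the SFT and two-channel fusion are not shown.

```python
# Hedged Grad-CAM sketch; backbone and target layer are stand-ins.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None).eval()  # pretrained weights in practice
feats, grads = {}, {}
layer = model.layer4
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)
score = model(x)[0].max()                            # score of the predicted class
score.backward()

w = grads["a"].mean(dim=(2, 3), keepdim=True)        # per-channel gradient weights
cam = F.relu((w * feats["a"]).sum(dim=1))            # (1, 7, 7) attention map
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # rescale to [0, 1]
print(cam.shape)
```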


2021 ◽  
Vol 14 (1) ◽  
pp. 161 ◽  
Author(s):  
Cuiping Shi ◽  
Xinlei Zhang ◽  
Jingwei Sun ◽  
Liguo Wang

With the development of computer vision, attention mechanisms have been widely studied. Although introducing an attention module into a network model can help to improve classification performance on remote sensing scene images, doing so directly increases the number of model parameters and the amount of computation, resulting in slower model operation. To solve this problem, we carried out the following work. First, a channel attention module and a spatial attention module were constructed. The input features were enhanced through channel attention and spatial attention separately, and the features recalibrated by the two attention modules were fused to obtain features with hybrid attention. Then, to reduce the parameter increase caused by the attention module, a group-wise hybrid attention module was constructed. This module divides the input features into four groups along the channel dimension, uses the hybrid attention mechanism to enhance the features of each group in the channel and spatial dimensions, and then fuses the features of the four groups along the channel dimension. Through the use of the group-wise hybrid attention module, the number of parameters and the computational burden of the network are greatly reduced, and the running time of the network is shortened. Finally, a lightweight convolutional neural network based on group-wise hybrid attention (LCNN-GWHA) was constructed for remote sensing scene image classification. Experiments on four open and challenging remote sensing scene datasets demonstrated that the proposed method offers great advantages in classification accuracy, even with a very small number of parameters.
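
The sketch below captures the group-wise hybrid attention idea: split the channels into four groups, apply channel and spatial attention within each group, and concatenate. The four-way split follows the abstract; the reduction ratio, 7x7 spatial convolution, and sharing of the attention submodules across groups are illustrative guesses.

```python
# Illustrative group-wise hybrid attention; internal details are assumptions.
import torch
import torch.nn as nn

class GroupHybridAttention(nn.Module):
    def __init__(self, channels, groups=4, reduction=8):
        super().__init__()
        self.groups = groups
        g = channels // groups
        self.channel_fc = nn.Sequential(               # per-group channel attention
            nn.Linear(g, g // reduction), nn.ReLU(inplace=True),
            nn.Linear(g // reduction, g), nn.Sigmoid())
        self.spatial = nn.Sequential(                  # per-group spatial attention
            nn.Conv2d(1, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        outs = []
        for xg in torch.chunk(x, self.groups, dim=1):  # split along channels
            ca = self.channel_fc(xg.mean(dim=(2, 3)))  # squeeze -> excite
            xg = xg * ca[:, :, None, None]
            sa = self.spatial(xg.mean(dim=1, keepdim=True))
            outs.append(xg * sa)                       # modules shared across groups
        return torch.cat(outs, dim=1)                  # fuse groups back together

x = torch.randn(2, 64, 32, 32)
print(GroupHybridAttention(64)(x).shape)               # torch.Size([2, 64, 32, 32])
```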


Author(s):  
Grigorios Tsagkatakis ◽  
Panagiotis Tsakalides

State-of-the-art remote sensing scene classification methods employ different convolutional neural network architectures to achieve very high classification performance. A trait shared by the majority of these methods is that the class of each example is determined by examining the activations of the last fully connected layer, and the networks are trained to minimize the cross-entropy between predictions extracted from this layer and the ground-truth annotations. In this work, we extend this paradigm by introducing an additional output branch that maps the inputs to low-dimensional representations, effectively extracting additional feature representations of the inputs. The proposed model imposes distance constraints on these representations with respect to identified class representatives, in addition to the traditional categorical cross-entropy between predictions and ground truth. By extending the typical cross-entropy loss with a distance learning function, the proposed approach achieves significant gains in classification across a wide set of benchmark datasets, while providing additional evidence of class membership and classification confidence.
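
A minimal sketch of such a combined objective follows: categorical cross-entropy on the logits plus a center-loss-style squared distance between each embedding and its class representative. The specific distance form, the weighting factor, and the embedding size are our assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a cross-entropy-plus-distance objective.
import torch
import torch.nn.functional as F

def combined_loss(logits, embeddings, targets, centers, weight=0.1):
    """Cross-entropy plus squared distance to each sample's class representative."""
    ce = F.cross_entropy(logits, targets)
    dist = ((embeddings - centers[targets]) ** 2).sum(dim=1).mean()
    return ce + weight * dist

logits = torch.randn(8, 45)          # e.g., the 45 NWPU-RESISC45 classes
embeddings = torch.randn(8, 64)      # low-dimensional branch output (size assumed)
targets = torch.randint(0, 45, (8,))
centers = torch.randn(45, 64)        # identified class representatives
print(combined_loss(logits, embeddings, targets, centers))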


2014 ◽  
Vol 527 ◽  
pp. 339-342 ◽  
Author(s):  
Zhi Yuan Liu ◽  
Jin He ◽  
Jin Long Wang ◽  
Fei Zhao

To make full use of the spatial information of images in natural scene classification, we use a spatial partition model. However, mechanical space division misuses spatial information, so the spatial partition model must be suitably improved to make the different categories of images more distinguishable and thereby improve classification performance. In addition, to further improve performance, we use FAN-SIFT as the local image feature. Experiments on an 8-scene image dataset and the Caltech101 dataset show that the improved model obtains better classification performance.
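
For concreteness, here is a toy sketch of the basic spatial partition step: divide the image into a 2x2 grid, build one bag-of-words histogram per cell, and concatenate. FAN-SIFT is replaced by precomputed (word, x, y) assignments for brevity, and the grid size is illustrative.

```python
# Toy spatial-partition histogram; inputs stand in for quantized local features.
import numpy as np

def spatial_histogram(words, xs, ys, w, h, vocab=64, grid=2):
    """words[i] is the codeword of a local feature located at (xs[i], ys[i])."""
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            mask = ((xs * grid // w) == gx) & ((ys * grid // h) == gy)
            cells.append(np.bincount(words[mask], minlength=vocab))
    hist = np.concatenate(cells).astype(np.float64)
    return hist / (hist.sum() + 1e-12)   # one normalized histogram per image

rng = np.random.default_rng(2)
n = 500
print(spatial_histogram(rng.integers(0, 64, n), rng.integers(0, 256, n),
                        rng.integers(0, 256, n), 256, 256).shape)  # (256,)
```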


2012 ◽  
Vol 47 (9) ◽  
pp. 1185-1208 ◽  
Author(s):  
Dengsheng Lu ◽  
Mateus Batistella ◽  
Guiying Li ◽  
Emilio Moran ◽  
Scott Hetrick ◽  
...  

Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distributions is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and the limitations of remote sensing data per se. This paper reviews a decade of experiments related to land use/cover classification in the Brazilian Amazon. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporating suitable textural images into multispectral bands and using segmentation-based methods are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when a proper data fusion method was used. Among the available classification algorithms, the maximum likelihood classifier remains an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, have the potential to provide better results, although they often require more time for parameter optimization. Proper use of hierarchical methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data.
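
Since the review singles out the maximum likelihood classifier, here is a minimal sketch of its standard Gaussian form: model each class over the spectral bands and assign each pixel to the class with the highest log-likelihood. Band and class counts are toy values.

```python
# Gaussian maximum likelihood classifier over spectral bands (textbook form).
import numpy as np

def ml_classify(pixels, means, covs):
    """pixels: (n, bands); means/covs: per-class Gaussian parameters."""
    scores = []
    for mu, cov in zip(means, covs):
        d = pixels - mu
        inv, (_, logdet) = np.linalg.inv(cov), np.linalg.slogdet(cov)
        # Log-likelihood up to a constant: -0.5 * (Mahalanobis^2 + log|cov|)
        scores.append(-0.5 * (np.einsum('ni,ij,nj->n', d, inv, d) + logdet))
    return np.argmax(np.stack(scores, axis=1), axis=1)

rng = np.random.default_rng(3)
pixels = rng.random((10, 6))                  # ten pixels, six spectral bands
means = [rng.random(6) for _ in range(3)]     # three land-cover classes
covs = [np.eye(6) * 0.1 for _ in range(3)]
print(ml_classify(pixels, means, covs))
```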

