Discriminative high-level representations for scene classification

Author(s):  
Lei Zhang ◽  
Shouzhi Xie ◽  
Xiantong Zhen
IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 4629-4640 ◽  
Author(s):  
Wenhua Liu ◽  
Yidong Li ◽  
Qi Wu

2021 ◽  
Vol 13 (16) ◽  
pp. 3113
Author(s):  
Ming Li ◽  
Lin Lei ◽  
Yuqi Tang ◽  
Yuli Sun ◽  
Gangyao Kuang

Remote sensing image scene classification (RSISC) has broad application prospects, but significant challenges remain to be addressed. One of the most important is how to learn a strongly discriminative scene representation. Recently, convolutional neural networks (CNNs) have shown great potential for RSISC due to their powerful feature learning ability; however, their performance may be restricted by the complexity of remote sensing images, such as spatial layout, varying scales, complex backgrounds, and category diversity. In this paper, we propose an attention-guided multilayer feature aggregation network (AGMFA-Net) that improves scene classification performance by effectively aggregating features from different layers. Specifically, to reduce the discrepancies between different layers, we employed channel–spatial attention on multiple high-level convolutional feature maps to more accurately capture the semantic regions that correspond to the content of the given scene. Then, we utilized the learned semantic regions as guidance to aggregate the valuable information from multilayer convolutional features, so as to obtain stronger scene features for classification. Experimental results on three remote sensing scene datasets indicated that our approach achieved competitive classification performance in comparison to the baselines and other state-of-the-art methods.
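The attention-guided aggregation described in the abstract above can be sketched in plain NumPy. This is a toy illustration under stated assumptions, not the authors' AGMFA-Net: `channel_spatial_attention` and `aggregate_layers` are hypothetical stand-ins that weight channels by a softmax over global average pooling, weight spatial positions by a normalized energy map, pool each layer's map to a common resolution, and sum the refined maps.

```python
import numpy as np

def channel_spatial_attention(feat):
    """Toy channel-spatial attention on a (C, H, W) feature map:
    channels weighted by softmaxed global average pooling, then
    positions weighted by a normalized channel-summed energy map."""
    chan = feat.mean(axis=(1, 2))                    # (C,) global average pool
    chan = np.exp(chan) / np.exp(chan).sum()         # softmax channel weights
    feat = feat * chan[:, None, None]
    spat = feat.sum(axis=0)                          # (H, W) spatial energy
    spat = spat / (spat.sum() + 1e-8)
    return feat * spat[None, :, :]

def aggregate_layers(features):
    """Pool every layer's map down to the smallest spatial size by
    block averaging, refine each with attention, and sum them."""
    target = min(f.shape[1] for f in features)
    pooled = []
    for f in features:
        c, h, _ = f.shape
        s = h // target
        pooled.append(f.reshape(c, target, s, target, s).mean(axis=(2, 4)))
    return sum(channel_spatial_attention(p) for p in pooled)

# Two fake "convolutional layers" with matching channels, different sizes.
rng = np.random.default_rng(0)
layers = [rng.random((8, 16, 16)), rng.random((8, 8, 8))]
agg = aggregate_layers(layers)
print(agg.shape)  # (8, 8, 8)
```

The summed map keeps the channel count of the inputs at the coarsest spatial resolution, which is the usual shape fed to a classification head.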


Scene classification is a basic problem in robotics and computer vision applications. It considers the complete view or event, which contains both low- and high-level features. The main purpose of scene classification is to narrow the semantic gap between everyday visual experience and computer systems. A central difficulty is distinguishing categories such as tall buildings, mountains, open country, and inside-city views. We applied a combination of feature extraction algorithms to the training datasets. Our proposed algorithm is a hybrid combination of SIFT and HOG features, named HFCNN. Compared with existing CNN architectures, HFCNN performs better, achieving an accuracy rate above 96% with lower time consumption and cost.
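The hybrid SIFT+HOG idea above amounts to concatenating two descriptor vectors into one feature. A minimal NumPy sketch, assuming a real pipeline would use actual SIFT (e.g. from OpenCV): `toy_hog` is a simplified, hypothetical HOG-style histogram, and the 128-dimensional SIFT part is passed in precomputed.

```python
import numpy as np

def toy_hog(img, bins=9):
    """Simplified HOG-style descriptor: a single histogram of unsigned
    gradient orientations weighted by gradient magnitude (real HOG adds
    cells, blocks, and block normalization)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-8)

def hybrid_descriptor(img, sift_like):
    """Concatenate a precomputed SIFT-style vector with the HOG-style
    histogram, mirroring a hybrid SIFT+HOG feature."""
    return np.concatenate([sift_like, toy_hog(img)])

img = np.arange(64, dtype=float).reshape(8, 8)       # tiny synthetic patch
desc = hybrid_descriptor(img, np.zeros(128))         # 128-D SIFT stand-in
print(desc.shape)  # (137,)
```

The concatenated vector would then be fed to whatever classifier the method trains; the exact HFCNN architecture is not specified here.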


2012 ◽  
Vol 220-223 ◽  
pp. 2188-2191
Author(s):  
Wen Gang Feng ◽  
Xue Chen

In this paper, the problem of scene representation is modeled by simultaneously considering stimulus-driven and instance-related factors in a probabilistic framework. In this framework, a stimulus-driven component simulates the low-level processes of the human visual system using semantic constraints, while an instance-related component simulates the high-level processes that bias the competition among input features. We interpret the synergetic multi-semantic multi-instance learning on five scene databases from the LabelMe benchmark, and validate scene classification on the fifteen-scene database via SVM inference, with comparison to state-of-the-art methods.


2014 ◽  
Vol 678 ◽  
pp. 147-150
Author(s):  
Yu Liang Du ◽  
Ling Feng Yuan ◽  
Wei Bing Wan

Natural scene classification is a fundamental problem in image understanding. Humans can recognize a scene instantly after only a glance, mainly because visual attention is easily attracted by the salient objects in the scene, and these objects are usually representative of it. It remains unclear how humans achieve such rapid scene categorization, but this high-level cognitive behavior is reflected in eye movements. To model this ability, we propose an approach guided by eye-movement data. It combines the bag-of-words (BoW) and spatial pyramid matching (SPM) methods and trains and tests the model with a support vector machine (SVM). Eye-movement experiments were conducted to validate the model. We found that subjects could recognize scenes correctly even when shown only a few saliency patches for less than one second. These results suggest that eye-tracking saliency patches play an important role in human scene categorization.
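The BoW+SPM representation used above has a standard construction: visual-word histograms are computed over increasingly fine spatial grids and concatenated. A minimal NumPy sketch, assuming words are already quantized codebook indices and patch positions are normalized to [0, 1); `spm_histogram` is a hypothetical helper name.

```python
import numpy as np

def spm_histogram(words, positions, vocab_size=4, levels=2):
    """Spatial pyramid matching: concatenate visual-word histograms
    over a 1x1 grid, 2x2 grid, ... and L1-normalize the result.
    `words` are codebook indices; `positions` are (x, y) in [0, 1)."""
    hists = []
    for level in range(levels):
        cells = 2 ** level
        cell_idx = (positions * cells).astype(int).clip(0, cells - 1)
        for cx in range(cells):
            for cy in range(cells):
                mask = (cell_idx[:, 0] == cx) & (cell_idx[:, 1] == cy)
                hists.append(np.bincount(words[mask], minlength=vocab_size))
    h = np.concatenate(hists).astype(float)
    return h / (h.sum() + 1e-8)

rng = np.random.default_rng(1)
words = rng.integers(0, 4, size=50)     # 50 quantized patch descriptors
pos = rng.random((50, 2))               # their normalized positions
h = spm_histogram(words, pos)
print(h.shape)  # levels 0 and 1 give (1 + 4) cells x 4 words = (20,)
```

The resulting fixed-length vector is what an SVM (typically with a histogram-intersection or chi-squared kernel in the SPM literature) is trained on.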


Author(s):  
S. Jiang ◽  
H. Zhao ◽  
W. Wu ◽  
Q. Tan

High-resolution remote sensing (HRRS) image scene classification aims to label an image with a specific semantic category. HRRS images contain more detail about ground objects and their spatial distribution patterns than low-spatial-resolution images. Scene classification can bridge the gap between low-level features and high-level semantics, and can be applied in urban planning, target detection, and other fields. This paper proposes a novel framework for HRRS image scene classification that combines a convolutional neural network (CNN) and XGBoost, using the CNN as a feature extractor and XGBoost as the classifier. The framework is evaluated on two different HRRS image datasets, UC-Merced and NWPU-RESISC45, where it achieves satisfying accuracies of 95.57% and 83.35%, respectively. These experimental results show that the framework is effective for remote sensing image classification. Furthermore, we believe the framework will be practical for further HRRS scene classification, since it costs less time in the training stage.
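The two-stage pattern described above (deep features in, separate classifier on top) can be sketched without the real libraries. This is a hedged stand-in, not the paper's implementation: `cnn_features` fakes the CNN with a fixed random projection plus ReLU, and `CentroidClassifier` stands in for XGBoost just to show the fit/predict split on extracted features.

```python
import numpy as np

def cnn_features(images, rng):
    """Stand-in for a pretrained CNN feature extractor: a fixed random
    projection followed by ReLU. A real framework would run the images
    through a pretrained network and take an intermediate layer."""
    w = rng.standard_normal((images.shape[1], 16))
    return np.maximum(images @ w, 0.0)

class CentroidClassifier:
    """Stand-in for the XGBoost stage: nearest class centroid."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

# Two well-separated synthetic "scene" classes as flat vectors.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (20, 32)), rng.normal(5, 1, (20, 32))])
y = np.array([0] * 20 + [1] * 20)
feats = cnn_features(X, rng)                  # stage 1: extract features
clf = CentroidClassifier().fit(feats, y)      # stage 2: fit the classifier
acc = (clf.predict(feats) == y).mean()
print("train accuracy:", acc)
```

Swapping the stand-ins for a real pretrained CNN and `xgboost.XGBClassifier` preserves exactly this structure; the cited speed benefit comes from only training the second stage.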


2021 ◽  
Vol 14 (1) ◽  
pp. 53
Author(s):  
Weining An ◽  
Xinqi Zhang ◽  
Hang Wu ◽  
Wenchang Zhang ◽  
Yaohua Du ◽  
...  

At present, the classification accuracy of high-resolution Remote Sensing Image Scene Classification (RSISC) has reached a quite high level on standard datasets. In practical applications, however, the intrinsic noise of satellite sensors and atmospheric disturbance often degrade real Remote Sensing (RS) images, introducing defects that hurt the performance and reduce the robustness of RSISC methods. Moreover, due to restrictions on memory and power consumption, the methods also need a small number of parameters and fast computing speed to be implemented on small portable systems such as unmanned aerial vehicles. In this paper, a Lightweight Progressive Inpainting Network (LPIN) and a novel approach combining LPIN with existing RSISC methods are proposed to improve the robustness of RSISC tasks and satisfy the requirements of portable systems. Defects in real RS images are inpainted by LPIN to provide a purified input for classification. With the combined approach, the classification accuracy on RS images with defects can be restored to the original level of defect-free images. LPIN is designed as a lightweight model: measures are adopted to ensure high gradient transmission efficiency while reducing the number of network parameters, and multiple loss functions are used to obtain reasonable and realistic inpainting results. Extensive image inpainting tests of LPIN and classification tests with the combined approach on the NWPU-RESISC45, UC Merced Land-Use, and AID datasets indicate that LPIN achieves state-of-the-art inpainting quality with fewer parameters and a faster inpainting speed. Furthermore, the combined approach keeps the classification accuracy on RS images with defects comparable to that on defect-free images, which will improve the robustness of high-resolution RSISC tasks.
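The inpaint-then-classify idea above can be illustrated with the simplest possible inpainter. This is a toy diffusion fill, not the LPIN architecture: `inpaint_mean` is a hypothetical helper that iteratively replaces defective pixels with the average of their four neighbours, which suffices to show how a restored image becomes the "purified input" for a downstream classifier.

```python
import numpy as np

def inpaint_mean(img, mask, iters=50):
    """Toy diffusion inpainting: initialize defective pixels to the mean
    of the valid pixels, then repeatedly replace each defective pixel
    with the average of its 4-neighbours (Jacobi iteration)."""
    out = img.copy()
    out[mask] = out[~mask].mean()                 # crude initialization
    for _ in range(iters):
        up    = np.roll(out,  1, axis=0)
        down  = np.roll(out, -1, axis=0)
        left  = np.roll(out,  1, axis=1)
        right = np.roll(out, -1, axis=1)
        avg = (up + down + left + right) / 4.0
        out[mask] = avg[mask]                     # only defects are updated
    return out

# A smooth synthetic "RS image" (horizontal gradient) with a square defect.
img = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))
mask = np.zeros_like(img, dtype=bool)
mask[6:10, 6:10] = True                           # simulated sensor defect
corrupted = img.copy()
corrupted[mask] = 0.0
restored = inpaint_mean(corrupted, mask)
print(abs(restored - img).max() < 1e-3)           # defect region recovered
```

In the combined approach of the paper, the restored image (rather than this toy output) is what gets passed to the unchanged RSISC classifier, which is why accuracy on defective inputs can recover toward the clean-image level.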

