A New Benchmark and an Attribute-Guided Multilevel Feature Representation Network for Fine-Grained Ship Classification in Optical Remote Sensing Images

Author(s):  
Xiaohan Zhang ◽  
Yafei Lv ◽  
Libo Yao ◽  
Wei Xiong ◽  
Chunlong Fu
2021 ◽  
Vol 13 (4) ◽  
pp. 747
Author(s):  
Yanghua Di ◽  
Zhiguo Jiang ◽  
Haopeng Zhang

Fine-grained visual categorization (FGVC) is an important and challenging problem due to large intra-class differences and small inter-class differences caused by deformation, illumination, angles, etc. Although major advances have been achieved for natural images in the past few years thanks to the release of popular datasets such as CUB-200-2011, Stanford Cars and Aircraft, fine-grained ship classification in remote sensing images has rarely been studied because of the relative scarcity of publicly available datasets. In this paper, we investigate a large amount of remote sensing image data of sea ships and determine the 42 most common categories for fine-grained visual categorization. Based on our previous DSCR dataset, a dataset for ship classification in remote sensing images, we collect more remote sensing images containing warships and civilian ships of various scales from Google Earth and other popular remote sensing image datasets, including DOTA, HRSC2016 and NWPU VHR-10. We call our dataset FGSCR-42, meaning a dataset for Fine-Grained Ship Classification in Remote sensing images with 42 categories. The whole FGSCR-42 dataset contains 9320 images of the most common types of ships. We evaluate popular object classification algorithms and fine-grained visual categorization algorithms to build a benchmark. Our FGSCR-42 dataset is publicly available at our webpages.


2019 ◽  
Vol 11 (18) ◽  
pp. 2095
Author(s):  
Kun Fu ◽  
Zhuo Chen ◽  
Yue Zhang ◽  
Xian Sun

In recent years, deep learning has led to a remarkable breakthrough in object detection in remote sensing images. In practice, two-stage detectors achieve high detection accuracy but are slow. One-stage detectors, on the other hand, integrate the detection pipeline of two-stage detectors to simplify the detection process; they are faster but less accurate. Enhancing the capability of feature representation may be a way to improve the detection accuracy of one-stage detectors. To this end, this paper proposes a novel one-stage detector with an enhanced capability of feature representation. The enhanced capability comes from two proposed structures: a dual top-down module and a dense-connected inception module. The former efficiently utilizes multi-scale features from multiple layers of the backbone network. The latter both widens and deepens the network to enhance the ability of feature representation at limited extra computational cost. To evaluate the effectiveness of the proposed structures, we conducted experiments on horizontal bounding box detection tasks on the challenging DOTA dataset and achieved a mean Average Precision (mAP) of 73.49%, which is state-of-the-art performance. Furthermore, our method ran significantly faster than the best public two-stage detector on the DOTA dataset.
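For readers unfamiliar with the mAP figure quoted above, the following is a minimal sketch of how average precision is commonly computed for one class and then averaged over classes. It assumes all-point interpolation and assumes that each detection has already been matched to ground truth at some IoU threshold; the abstract does not state which evaluation protocol was used, and the function names `average_precision` and `mean_average_precision` are hypothetical.

```python
from typing import Dict, List


def average_precision(scores: List[float],
                      is_true_positive: List[bool],
                      num_ground_truth: int) -> float:
    """All-point interpolated AP for one class (an assumed protocol)."""
    if num_ground_truth == 0 or not scores:
        return 0.0

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = fp = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Interpolate: precision at each rank becomes the best precision
    # achieved at that rank or any later (higher-recall) rank.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum the area under the interpolated precision-recall curve.
    ap = 0.0
    previous_recall = 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(ap_per_class: Dict[str, float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    if not ap_per_class:
        return 0.0
    return sum(ap_per_class.values()) / len(ap_per_class)


# Toy example: three detections, two ground-truth objects.
toy_ap = average_precision([0.9, 0.8, 0.7], [True, False, True], 2)
print(round(toy_ap, 4))  # 0.8333
```

In the toy example the ranked detections are a hit, a miss and a hit, so the interpolated precision is 1 up to recall 0.5 and 2/3 from there to recall 1, giving an AP of 5/6.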


2020 ◽  
Vol 12 (24) ◽  
pp. 4187
Author(s):  
Wei Liang ◽  
Jihao Li ◽  
Wenhui Diao ◽  
Xian Sun ◽  
Kun Fu ◽  
...  

Fine-grained aircraft type recognition in remote sensing images, which aims to distinguish different aircraft types within the same parent category, is a significant task. In recent decades, with the development of deep learning, the solution scheme for this problem has shifted from handcrafted feature design to model architecture design. Although great progress has been achieved, this paradigm generally needs strong expert knowledge and rich expert experience. It is still extremely laborious work, and the level of automation is relatively low. In this paper, inspired by Neural Architecture Search (NAS), we explore a novel differentiable automatic architecture design framework for fine-grained aircraft type recognition in remote sensing images. In our framework, the search process is divided into several phases. The network architecture deepens at each phase while the number of candidate functions gradually decreases. To achieve this, we adopt different pruning strategies. Then, the network architecture is determined through a potentiality judgment after an architecture heating process. This approach can not only search deeper networks but also reduce the computational complexity, especially for relatively large remote sensing images. When all differentiable search phases are finished, the searched model, called Fine-Grained Aircraft Type Recognition Net (FGATR-Net), is obtained. Compared with previous NAS methods, ours is more suitable for relatively large and complex remote sensing images. Experiments on Multitype Aircraft Remote Sensing Images (MTARSI) and Aircraft17 validate that FGATR-Net possesses a strong capability of feature extraction and feature representation. It is also compact, i.e., its number of parameters is relatively small. This indicates the feasibility and effectiveness of the proposed automatic network architecture design method.
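The idea of a differentiable search whose candidate set shrinks from phase to phase can be illustrated with a toy sketch. The code below assumes a DARTS-style continuous relaxation, in which each candidate function is weighted by a softmax over learnable architecture parameters, and a simple "keep the highest-weighted candidates" pruning rule. The abstract does not describe the authors' pruning strategies, potentiality judgment or heating process, so none of those are modeled here; all names (`mixed_output`, `prune_candidates`, the scalar operations) are hypothetical.

```python
import math
from typing import Callable, Dict, List


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(x: float,
                 ops: Dict[str, Callable[[float], float]],
                 alphas: Dict[str, float]) -> float:
    """Continuous relaxation: softmax-weighted sum of all candidate ops."""
    names = list(ops)
    weights = softmax([alphas[name] for name in names])
    return sum(w * ops[name](x) for name, w in zip(names, weights))


def prune_candidates(alphas: Dict[str, float], keep: int) -> Dict[str, float]:
    """Assumed pruning rule: keep the `keep` candidates with largest alpha."""
    ranked = sorted(alphas, key=lambda name: alphas[name], reverse=True)
    return {name: alphas[name] for name in ranked[:keep]}


# Toy scalar "operations" standing in for convolution, pooling, skip, etc.
candidate_ops: Dict[str, Callable[[float], float]] = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

# With equal architecture parameters every candidate gets weight 1/3.
uniform_alphas = {"identity": 0.0, "double": 0.0, "zero": 0.0}
print(mixed_output(3.0, candidate_ops, uniform_alphas))

# After (hypothetical) training, the weakest candidate is dropped
# before the next, deeper search phase.
trained_alphas = {"identity": 0.2, "double": 1.5, "zero": -0.7}
survivors = prune_candidates(trained_alphas, keep=2)
print(sorted(survivors))
```

In a real search the architecture parameters would be optimized by gradient descent alongside the network weights; here they are fixed by hand only to show how the weighting and pruning steps fit together.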


2021 ◽  
Vol 13 (3) ◽  
pp. 441
Author(s):  
Han Fu ◽  
Bihong Fu ◽  
Pilong Shi

The South China Karst, a United Nations Educational, Scientific and Cultural Organization (UNESCO) natural heritage site, is one of the world’s most spectacular examples of humid tropical to subtropical karst landscapes. The Libo cone karst in southern Guizhou Province is considered the world reference site for this type of karst, forming a distinctive and beautiful landscape. Geomorphic information on cone karst and its spatial distribution are essential for the conservation and management of the Libo heritage site. In this study, a deep learning (DL) method based on the DeepLab V3+ network was proposed to document the cone karst landscape in Libo using multi-source data, including optical remote sensing images and digital elevation model (DEM) data. The training samples were generated from Landsat remote sensing images and from their combination with satellite-derived DEM data. Each group of the training dataset contains 898 samples. The input module of the DeepLab V3+ network was modified to accept four-channel input data, i.e., a combination of Landsat RGB images and DEM data. Our results suggest that the mean intersection over union (MIoU) of the new DL-based pixel-level image segmentation approach is highest when the four-channel data are used as training samples, reaching 95.5%. The proposed method can accomplish automatic extraction of the cone karst landscape through the self-learning of a deep neural network, and it can therefore also provide a powerful automatic tool for documenting other types of geological landscapes worldwide.
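The MIoU metric reported above can be sketched as follows. This is a minimal illustration, assuming that the ground truth and the prediction are given as flat lists of integer class labels (one per pixel) and that classes absent from both are skipped when averaging; the abstract does not specify these details, and the function names are hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int],
                     y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows index the true class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def mean_iou(y_true: Sequence[int],
             y_pred: Sequence[int],
             num_classes: int) -> float:
    """Mean over classes of TP / (TP + FP + FN)."""
    matrix = confusion_matrix(y_true, y_pred, num_classes)
    ious: List[float] = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union == 0:
            continue  # class appears in neither map; skip it (assumption)
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes (0 = background, 1 = cone karst).
truth = [0, 0, 1, 1, 1, 0]
prediction = [0, 1, 1, 1, 0, 0]
print(mean_iou(truth, prediction, num_classes=2))  # 0.5
```

In the toy example each class has two correctly labeled pixels, one false positive and one false negative, so both per-class IoU values are 2/4 and the mean is 0.5.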

