Recognition of small targets in remote sensing images using a multi-scale feature fusion-based single-shot multi-box detector

2021, Vol 29 (11), pp. 2672-2682
Author(s): Xin CHEN, Min-jie WAN, Chao MA, Qian CHEN, ...

2021, Vol 2078 (1), pp. 012008
Author(s): Hui Liu, Keyang Cheng

Abstract: Aiming at the false and missed detections of small and occluded targets in pedestrian detection, a pedestrian detection algorithm based on improved multi-scale feature fusion is proposed. First, the YOLOv4 multi-scale feature fusion module PANet, which does not model the interaction between scales, is improved to reduce the semantic gap between scales, and an attention mechanism is introduced to learn the importance of different layers and strengthen feature fusion. Then, dilated convolution is introduced to reduce the information loss caused by downsampling. Finally, the K-means clustering algorithm is used to redesign the anchor boxes, and the loss function is modified for single-category detection. Experimental results show that on the INRIA and WiderPerson datasets, under different levels of crowding, the improved algorithm reaches an AP of 96.83% and 59.67%, respectively, an improvement of 2.41% and 1.03% over the YOLOv4 model. False and missed detections of small and occluded targets are significantly reduced.
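The anchor-box redesign step described in this abstract follows the standard YOLO practice of clustering ground-truth box sizes with K-means under a 1 − IoU distance. Below is a minimal sketch of that step, assuming width/height pairs collected from training annotations; the number of anchors (9, matching YOLOv4's three detection scales) and the synthetic pedestrian-like box sizes are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: K-means anchor clustering with a 1 - IoU distance (YOLO style).
import numpy as np

def iou_wh(boxes, centroids):
    """IoU between (width, height) pairs, assuming boxes share a top-left corner."""
    w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = w * h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes_wh, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iters):
        # Assign each ground-truth box to the centroid with the highest IoU
        # (equivalently, the smallest 1 - IoU distance).
        assign = np.argmax(iou_wh(boxes_wh, centroids), axis=1)
        new_centroids = np.array([boxes_wh[assign == i].mean(axis=0)
                                  if np.any(assign == i) else centroids[i]
                                  for i in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids[np.argsort(centroids.prod(axis=1))]  # sort anchors by area

# Usage with synthetic, pedestrian-like (tall, narrow) box sizes in pixels.
boxes = np.abs(np.random.default_rng(1).normal([40, 100], [15, 40], (500, 2)))
print(kmeans_anchors(boxes, k=9))
```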


2021, Vol 58 (2), pp. 0228001
Author(s): 马天浩 Ma Tianhao, 谭海 Tan Hai, 李天琪 Li Tianqi, 吴雅男 Wu Yanan, 刘祺 Liu Qi

2019, Vol 56 (12), pp. 121003
Author(s): 金秋含 Qiuhan Jin, 王阳萍 Yangping Wang, 杨景玉 Jingyu Yang

2021, Vol 13 (5), pp. 847
Author(s): Wei Huang, Guanyi Li, Qiqiang Chen, Ming Ju, Jiantao Qu

In the wake of developments in remote sensing, target detection in remote sensing images has attracted increasing interest. Unfortunately, unlike natural image processing, remote sensing image processing involves large variations in object size, which poses a great challenge to researchers. Although traditional multi-scale detection networks have been successful in handling such variations, they still have certain limitations: (1) Traditional multi-scale detection methods attend to the scale of features but ignore the correlation between feature levels. Each feature map is taken from a single layer of the backbone network, so the extracted features are not comprehensive enough; for example, the SSD network uses the features extracted from the backbone at different scales directly for detection, losing a large amount of contextual information. (2) These methods rely on backbone networks designed for classification to perform detection; RetinaNet, for instance, simply combines the ResNet-101 classification network with an FPN to perform detection, yet classification and detection tasks differ. To address these issues, a cross-scale feature fusion pyramid network (CF2PN) is proposed. First and foremost, a cross-scale fusion module (CSFM) is introduced to extract sufficiently comprehensive semantic information from features for multi-scale fusion. Moreover, a feature pyramid built from thinning U-shaped modules (TUMs) performs multi-level fusion of the features. Finally, a focal loss in the prediction section is used to control the large number of negative samples generated during the feature fusion process. The proposed architecture is verified on the DIOR and RSOD datasets. Experimental results show that the method improves performance by 2–12% on the DIOR and RSOD datasets compared with current state-of-the-art target detection methods.
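The focal loss used in the prediction section is the standard formulation from RetinaNet, FL(p_t) = −α_t (1 − p_t)^γ log(p_t), which down-weights the many easy negatives produced across the fused pyramid levels. Below is a minimal sketch of a binary focal loss; the α = 0.25 and γ = 2 values are the commonly used defaults and are assumptions here, not the CF2PN configuration.

```python
# Minimal sketch: binary focal loss over anchor logits (RetinaNet-style).
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss on raw logits; `targets` holds 0/1 labels."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)            # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

# Usage: logits and labels for anchors across all pyramid levels, flattened.
logits = torch.randn(4096)
labels = (torch.rand(4096) < 0.02).float()                 # mostly easy negatives
print(focal_loss(logits, labels).item())
```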


2019, Vol 11 (5), pp. 594
Author(s): Shuo Zhuang, Ping Wang, Boran Jiang, Gang Wang, Cong Wang

With the rapid advances in remote-sensing technologies and the growing number of satellite images, fast and effective object detection plays an important role in understanding and analyzing image information, with further applications in civilian and military fields. Recently, object detection methods based on region-based convolutional neural networks have shown excellent performance. However, these two-stage methods contain separate region-proposal generation and object detection procedures, resulting in low computation speed. Moreover, because manual annotation is expensive, well-annotated aerial images are scarce, which also limits the progress of geospatial object detection in remote sensing. In this paper, on the one hand, we construct and release a large-scale remote-sensing dataset for geospatial object detection (RSD-GOD) that consists of 5 different categories with 18,187 annotated images and 40,990 instances. On the other hand, we design a single-shot detection framework with multi-scale feature fusion. The feature maps from different layers are fused together through up-sampling and concatenation blocks to predict the detection results. High-level features with semantic information and low-level features with fine details are fully exploited for detection, especially for small objects. Meanwhile, a soft non-maximum suppression strategy is used to select the final detection results. Extensive experiments have been conducted on two datasets to evaluate the designed network. Results show that the proposed approach achieves good detection performance, obtaining a mean average precision of 89.0% on the newly constructed RSD-GOD dataset and 83.8% on the Northwestern Polytechnical University very-high-spatial-resolution-10 (NWPU VHR-10) dataset at 18 frames per second (FPS) on an NVIDIA GTX-1080Ti GPU.
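The soft non-maximum suppression step mentioned above replaces the hard removal of overlapping boxes with a score decay, so densely clustered or occluded objects are less likely to be suppressed entirely. A minimal sketch of Gaussian-decay Soft-NMS follows; the σ and score-threshold values are illustrative assumptions rather than the settings used in the paper.

```python
# Minimal sketch: Soft-NMS with Gaussian score decay.
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thr=0.001):
    boxes, scores = boxes.copy(), scores.copy()
    keep = []
    while len(boxes) > 0:
        i = int(np.argmax(scores))
        keep.append((boxes[i], scores[i]))
        ious = iou(boxes[i], boxes)
        # Decay the scores of overlapping boxes instead of discarding them outright.
        scores = scores * np.exp(-(ious ** 2) / sigma)
        mask = (scores > score_thr) & (np.arange(len(boxes)) != i)
        boxes, scores = boxes[mask], scores[mask]
    return keep
```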


Sensors, 2020, Vol 20 (4), pp. 1142
Author(s): Xinying Wang, Yingdan Wu, Yang Ming, Hui Lv

Due to increasingly complex factors of image degradation, inferring high-frequency details of remote sensing imagery is more difficult than for ordinary digital photos. This paper proposes an adaptive multi-scale feature fusion network (AMFFN) for remote sensing image super-resolution. Firstly, features are extracted from the original low-resolution image. Then, several adaptive multi-scale feature extraction (AMFE) modules, together with squeeze-and-excitation and adaptive gating mechanisms, are adopted for feature extraction and fusion. Finally, the sub-pixel convolution method is used to reconstruct the high-resolution image. Experiments are performed on three datasets; key design choices, such as the number of AMFE modules and the gating connection scheme, are studied, and super-resolution of remote sensing imagery at different scale factors is analyzed qualitatively and quantitatively. The results show that the method outperforms classic methods such as the Super-Resolution Convolutional Neural Network (SRCNN), the Efficient Sub-Pixel Convolutional Network (ESPCN), and the multi-scale residual CNN (MSRN).
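The sub-pixel convolution step used for reconstruction expands the channel dimension by a factor of r² with an ordinary convolution and then rearranges those channels into an r-times larger spatial grid (pixel shuffle). A minimal PyTorch sketch follows; the channel counts and the scale factor are illustrative assumptions, not the AMFFN configuration.

```python
# Minimal sketch: sub-pixel convolution (ESPCN-style) upsampling head.
import torch
import torch.nn as nn

class SubPixelUpsample(nn.Module):
    def __init__(self, in_channels=64, out_channels=3, scale=4):
        super().__init__()
        # Conv expands channels to out_channels * scale^2 ...
        self.conv = nn.Conv2d(in_channels, out_channels * scale ** 2,
                              kernel_size=3, padding=1)
        # ... and PixelShuffle rearranges (B, C*r^2, H, W) -> (B, C, H*r, W*r).
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.conv(x))

# Usage: fused low-resolution features -> high-resolution RGB image.
feats = torch.randn(1, 64, 32, 32)
print(SubPixelUpsample()(feats).shape)   # torch.Size([1, 3, 128, 128])
```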


2019, Vol 49, pp. 89-99
Author(s): Yanling Du, Wei Song, Qi He, Dongmei Huang, Antonio Liotta, ...
