Multi-scale Feature Fusion UAV Image Object Detection Method Based on Dilated Convolution and Attention Mechanism

Author(s):  
Yuanzhu Liu ◽  
Zhiming Ding ◽  
Yang Cao ◽  
Mengmeng Chang
IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Keyou Guo ◽  
Xue Li ◽  
Mo Zhang ◽  
Qichao Bao ◽  
Min Yang

2021 ◽  
Vol 2078 (1) ◽  
pp. 012008
Author(s):  
Hui Liu ◽  
Keyang Cheng

Abstract: To address false and missed detections of small and occluded targets in pedestrian detection, a pedestrian detection algorithm based on improved multi-scale feature fusion is proposed. First, because PANet, the multi-scale feature fusion module of YOLOv4, does not consider the interaction between scales, PANet is improved to reduce the semantic gap between scales, and an attention mechanism is introduced to learn the importance of different layers and strengthen feature fusion. Then, dilated convolution is introduced to reduce information loss during downsampling. Finally, the K-means clustering algorithm is used to redesign the anchor boxes, and the loss function is modified to detect a single category. Experimental results show that, on the INRIA and WiderPerson data sets under different congestion conditions, the improved algorithm reaches an AP of 96.83% and 59.67%, respectively. Compared with the YOLOv4 model, this is an improvement of 2.41% and 1.03%, respectively. False and missed detections of small and occluded targets are significantly reduced.
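The anchor redesign step can be illustrated with a minimal sketch. The abstract does not say which distance measure the authors cluster with; the sketch below assumes the 1 - IoU distance on (width, height) pairs that is common practice for YOLO-style detectors. The function names `iou_wh` and `kmeans_anchors` are hypothetical and are not taken from the paper.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors.

    Assumption: each box is assigned to the anchor with the highest IoU
    (equivalent to the smallest 1 - IoU distance), and each anchor is
    updated to the mean width and height of its cluster.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise from k distinct boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            clusters[best].append(b)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break  # assignments no longer change
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])  # smallest area first
```

Calling `kmeans_anchors(boxes, k=9)` on the (width, height) pairs of a training set's ground-truth boxes would yield nine anchors sorted by area; the choice of nine is only an illustration, since the abstract does not state the number of anchors used.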


2020 ◽  
Vol 16 (3) ◽  
pp. 132-145
Author(s):  
Gang Liu ◽  
Chuyi Wang

Neural network models have been widely used in the field of object detection. Region proposal methods are widely used in current object detection networks and have achieved good performance. Common region proposal methods search for objects by generating thousands of candidate boxes. Compared to other region proposal methods, the region proposal network (RPN) improves accuracy and detection speed while using only several hundred candidate boxes. However, because the feature maps contain insufficient information, RPN detects and localizes small objects poorly. This article proposes a novel multi-scale feature fusion method for the region proposal network to solve this problem. The proposed method, called the multi-scale region proposal network (MS-RPN), generates suitable feature maps for the region proposal network. In MS-RPN, the selected feature maps at multiple scales are fine-tuned separately and compressed into a uniform space. The resulting fused feature maps are called refined fusion features (RFFs). RFFs incorporate abundant detail and context information and are sent to the RPN to generate better region proposals. The proposed approach is evaluated on the PASCAL VOC 2007 and MS COCO benchmark tasks. MS-RPN obtains significant improvements over comparable state-of-the-art detection models.
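The idea of bringing feature maps from several scales into a uniform space can be sketched in a few lines. The abstract does not describe how MS-RPN resizes or compresses its maps, so everything below is an assumption made for illustration: single-channel maps stored as nested lists, nearest-neighbour resizing, and a per-pixel weighted sum standing in for a 1x1 convolution. The names `resize_nearest` and `fuse` are hypothetical.

```python
def resize_nearest(fmap, out_h, out_w):
    """Resize a 2D map (list of rows) to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse(fmaps, weights, out_h, out_w):
    """Resize every map to a common size, then mix them per pixel.

    The weighted sum plays the role of a 1x1 convolution with one
    output channel; in a real network the weights would be learned.
    """
    resized = [resize_nearest(f, out_h, out_w) for f in fmaps]
    return [
        [sum(w * m[r][c] for w, m in zip(weights, resized)) for c in range(out_w)]
        for r in range(out_h)
    ]


# A coarse 1x1 map (context) fused with a fine 2x2 map (detail).
coarse = [[1.0]]
fine = [[1.0, 2.0], [3.0, 4.0]]
fused = fuse([coarse, fine], weights=[0.5, 0.5], out_h=2, out_w=2)
```

Each pixel of `fused` combines the coarse map's context value with the fine map's local detail, which is the general intuition behind fusing scales before region proposal.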


2021 ◽  
Vol 66 (3) ◽  
pp. 2493-2507
Author(s):  
Hyun Kyu Shin ◽  
Si Woon Lee ◽  
Goo Pyo Hong ◽  
Sael Lee ◽  
Sang Hyo Lee ◽  
...  

Author(s):  
Zhenjian Yang ◽  
Jiamei Shang ◽  
Zhongwei Zhang ◽  
Yan Zhang ◽  
Shudong Liu

Traditional image dehazing algorithms based on prior knowledge and deep learning rely on the atmospheric scattering model and are prone to color distortion and incomplete dehazing. To solve these problems, this paper proposes an end-to-end image dehazing algorithm based on a residual attention mechanism. The network includes four modules: encoder, multi-scale feature extraction, feature fusion, and decoder. The encoder module encodes the input hazy image into a feature map, which simplifies subsequent feature extraction and reduces memory consumption. The multi-scale feature extraction module combines a residual smoothed dilated convolution module, a residual block, and efficient channel attention; it expands the receptive field and extracts features at different scales by filtering and weighting. The feature fusion module, with efficient channel attention, adjusts the channel weights dynamically, acquires rich context information, and suppresses redundant information, enhancing the network's ability to extract the haze density image. Finally, the decoder module maps the fused features nonlinearly to obtain the haze density image, from which the haze-free image is restored. Qualitative and quantitative tests on the SOTS test set and natural hazy images show good objective and subjective evaluation results. The algorithm effectively reduces color distortion and incomplete dehazing.
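How dilation expands the receptive field can be shown with a small one-dimensional sketch. This is plain dilated convolution, not the smoothed variant the abstract names, and the function names `dilated_conv1d` and `receptive_field` are hypothetical.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D cross-correlation with `dilation` steps between taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


signal = list(range(10))
dense = dilated_conv1d(signal, [1, 1, 1], dilation=1)   # each output sees 3 inputs
sparse = dilated_conv1d(signal, [1, 1, 1], dilation=2)  # each output sees a span of 5
```

With the same three weights, a dilation of 2 widens the span covered by each output from 3 to 5 input positions, so context grows without extra parameters or downsampling.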

