Adaptive feature fusion with attention mechanism for multi-scale target detection

Author(s):  
Moran Ju ◽  
Jiangning Luo ◽  
Zhongbo Wang ◽  
Haibo Luo
Author(s):  
Zhenjian Yang ◽  
Jiamei Shang ◽  
Zhongwei Zhang ◽  
Yan Zhang ◽  
Shudong Liu

Traditional image dehazing algorithms based on prior knowledge and deep learning rely on the atmospheric scattering model and tend to cause color distortion and incomplete dehazing. To address these problems, an end-to-end image dehazing algorithm based on a residual attention mechanism is proposed in this paper. The network consists of four modules: encoder, multi-scale feature extraction, feature fusion, and decoder. The encoder module encodes the input hazy image into feature maps, which facilitates subsequent feature extraction and reduces memory consumption. The multi-scale feature extraction module comprises a residual smoothed dilated convolution module, a residual block, and efficient channel attention; it expands the receptive field and extracts features at different scales by filtering and weighting. The feature fusion module with efficient channel attention adjusts channel weights dynamically, acquires rich context information, and suppresses redundant information, thereby enhancing the network's ability to extract the haze density image. Finally, the decoder module maps the fused features nonlinearly to obtain the haze density image and then restores the haze-free image. Qualitative and quantitative tests on the SOTS test set and on natural hazy images show good objective and subjective evaluation results. The algorithm effectively alleviates color distortion and incomplete dehazing.
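The efficient channel attention used in the feature extraction and fusion modules can be sketched as follows; this is a minimal illustrative version, assuming ECA's usual recipe of global average pooling followed by a 1-D convolution across channels and a sigmoid gate (the uniform kernel here is a placeholder for learned weights, not the paper's trained parameters):

```python
import numpy as np

def eca_channel_attention(feature_map, k=3):
    """ECA-style channel reweighting (illustrative sketch).

    feature_map: array of shape (C, H, W).
    k: size of the 1-D kernel mixing neighboring channels.
    """
    # Global average pooling: one descriptor per channel.
    desc = feature_map.mean(axis=(1, 2))            # shape (C,)
    # 1-D convolution across channels, with no dimensionality reduction.
    kernel = np.ones(k) / k                         # placeholder for learned weights
    mixed = np.convolve(desc, kernel, mode="same")  # shape (C,)
    # Sigmoid gate in (0, 1), then rescale each channel.
    weights = 1.0 / (1.0 + np.exp(-mixed))
    return feature_map * weights[:, None, None]
```

Because the gate lies strictly between 0 and 1, informative channels are attenuated less than redundant ones, which is how the module suppresses redundant information while keeping context.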


2021 ◽  
Vol 13 (5) ◽  
pp. 847
Author(s):  
Wei Huang ◽  
Guanyi Li ◽  
Qiqiang Chen ◽  
Ming Ju ◽  
Jiantao Qu

In the wake of developments in remote sensing, the application of target detection to remote sensing imagery is of increasing interest. Unfortunately, unlike natural image processing, remote sensing image processing involves large variations in object size, which poses a great challenge to researchers. Although traditional multi-scale detection networks have been successful in handling such large variations, they still have certain limitations: (1) Traditional multi-scale detection methods attend to the scale of features but ignore the correlation between feature levels. Each feature map is taken from a single layer of the backbone network, so the extracted features are not comprehensive enough. For example, the SSD network uses the features extracted from the backbone network at different scales directly for detection, losing a large amount of contextual information. (2) These methods pair detection tasks with backbone networks inherited from classification. RetinaNet, for instance, simply combines the ResNet-101 classification network with an FPN to perform detection; however, object classification and detection tasks differ. To address these issues, a cross-scale feature fusion pyramid network (CF2PN) is proposed. First, a cross-scale fusion module (CSFM) is introduced to extract sufficiently comprehensive semantic information from features for multi-scale fusion. Moreover, a feature pyramid built from thinning U-shaped modules (TUMs) performs multi-level fusion of the features. Finally, a focal loss in the prediction stage is used to control the large number of negative samples generated during the feature fusion process. The proposed architecture is verified on the DIOR and RSOD datasets.
The experimental results show that the performance of this method improves by 2–12% on the DIOR and RSOD datasets compared with current state-of-the-art target detection methods.
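The focal loss used in the prediction stage to tame the flood of easy negatives can be written compactly; a minimal binary sketch with the commonly used defaults gamma = 2 and alpha = 0.25 (the paper's exact settings are not given in the abstract):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss, elementwise.

    p: predicted probabilities of the positive class, in (0, 1).
    y: ground-truth labels, 0 or 1.
    The (1 - p_t)**gamma factor down-weights well-classified
    (easy) examples, so abundant easy negatives contribute little.
    """
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)
```

For a confident correct prediction, (1 - p_t)^gamma is tiny, so the loss is dominated by the hard, misclassified examples rather than by the sheer volume of easy negatives.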


2020 ◽  
Vol 40 (13) ◽  
pp. 1315002
Author(s):  
Ju Moran ◽  
Luo Jiangning ◽  
Wang Zhongbo ◽  
Luo Haibo

2019 ◽  
Vol 9 (18) ◽  
pp. 3775 ◽  
Author(s):  
Ju ◽  
Luo ◽  
Wang ◽  
Hui ◽  
Chang

Target detection is one of the most important research directions in computer vision. Recently, a variety of target detection algorithms have been proposed. Since targets in a scene have varying sizes, it is essential to detect them at different scales. To improve the detection of targets with different sizes, a multi-scale target detection algorithm based on an improved YOLO (You Only Look Once) V3 is proposed. The main contributions of this work are: (1) a mathematical derivation method based on Intersection over Union (IOU) is proposed to select the number and aspect-ratio dimensions of the candidate anchor boxes at each scale of the improved YOLO V3; (2) to further improve detection performance, the detection scales of YOLO V3 are extended from 3 to 4, and a feature fusion detection layer downsampled by 4× is added to detect small targets; (3) to avoid gradient fading and enhance feature reuse, the six convolutional layers in front of each output detection layer are transformed into two residual units. Experimental results on the PASCAL VOC and KITTI datasets show that the proposed method achieves better performance than other state-of-the-art target detection algorithms.
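The IOU criterion behind contribution (1) compares a ground-truth box against candidate anchors by width and height alone, with the boxes aligned at a common corner; a minimal sketch of that comparison (the paper's full derivation for choosing anchor counts and ratios is not reproduced here):

```python
import numpy as np

def wh_iou(box, anchors):
    """IoU between one box and candidate anchors, width/height only.

    box: array (w, h) of a ground-truth box.
    anchors: array of shape (N, 2) holding anchor (w, h) pairs.
    Boxes are assumed aligned at a shared corner, as in
    YOLO-style anchor selection, so only shape matters.
    """
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union
```

For example, a 2×2 box scores IoU 1.0 against a 2×2 anchor and 0.25 against a 4×4 anchor; each ground-truth box is assigned to the anchor (and hence the scale) with the highest such IoU.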


2021 ◽  
Vol 1873 (1) ◽  
pp. 012020
Author(s):  
Xiaofeng Zhao ◽  
Yebin Xu ◽  
Fei Wu ◽  
Wei Cai ◽  
Zhili Zhang
