Adaptive Weighted Multi-Level Fusion of Multi-Scale Features: A New Approach to Pedestrian Detection

Deep learning has driven major advances in pedestrian detection. For deep learning-based detectors, how well features are used largely determines detection quality. Current pedestrian detectors have worked to improve feature utilization, but it remains inadequate. To address this, we propose the Multi-Level Feature Fusion Module (MFFM) and its sub-module, the Multi-Scale Feature Fusion Unit (MFFU), which connect feature maps of the same scale and of different scales through horizontal and vertical connections and shortcut structures. Every connection carries a learnable weight, so the modules act as adaptive multi-level, multi-scale feature fusion modules that fuse the best features. We then build a complete pedestrian detector, the Adaptive Feature Fusion Detector (AFFDet), an anchor-free one-stage pedestrian detector that can make full use of features for detection. Compared with other methods, our method performs better on the challenging Caltech Pedestrian Detection Benchmark (Caltech) at quite competitive speed. It is the current state-of-the-art one-stage pedestrian detection method.
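
The idea of connections that carry learnable weights can be illustrated with a small sketch. It is not the MFFM/MFFU implementation: the function names `softmax` and `fuse`, the softmax normalisation of the weights, and the use of plain nested lists as feature maps are all assumptions made for illustration, since the abstract does not say how the weights are normalised or applied.

```python
import math


def softmax(raw_weights):
    """Turn unconstrained (learnable) scalars into positive weights that sum to 1."""
    peak = max(raw_weights)
    exps = [math.exp(w - peak) for w in raw_weights]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def fuse(feature_maps, raw_weights):
    """Weighted element-wise sum of same-shaped 2-D feature maps.

    Assumption: each incoming connection has one scalar weight, and the
    weights are softmax-normalised before the maps are summed.
    """
    if len(feature_maps) != len(raw_weights):
        raise ValueError("need exactly one weight per feature map")
    weights = softmax(raw_weights)
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for weight, fmap in zip(weights, feature_maps):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weight * fmap[r][c]
    return fused


# Two 1x2 maps with equal raw weights are simply averaged.
print(fuse([[[1.0, 2.0]], [[3.0, 4.0]]], [0.0, 0.0]))
```

In a trained network the raw weights would be updated by backpropagation together with the other parameters, so the fusion can learn to favour the more useful inputs.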

Deep learning-based tool wear prediction and its application for machining process using multi-scale feature fusion and channel attention mechanism

Measurement ◽ 10.1016/j.measurement.2021.109254 ◽ 2021 ◽ Vol 177 ◽ pp. 109254

Author(s): Xingwei Xu ◽ Jianwen Wang ◽ Bingfu Zhong ◽ Weiwei Ming ◽ Ming Chen

Keyword(s): Deep Learning ◽ Tool Wear ◽ Feature Fusion ◽ Attention Mechanism ◽ Machining Process ◽ Wear Prediction ◽ Scale Feature ◽ Multi Scale ◽ Tool Wear Prediction

MUFFIN: multi-scale feature fusion for drug–drug interaction prediction

Bioinformatics ◽ 10.1093/bioinformatics/btab169 ◽ 2021

Author(s): Yujie Chen ◽ Tengfei Ma ◽ Xixi Yang ◽ Jianmin Wang ◽ Bosheng Song ◽ ...

Keyword(s): Molecular Structure ◽ Deep Learning ◽ Medical Information ◽ Feature Fusion ◽ Molecular Graph ◽ Knowledge Graph ◽ Sequence Information ◽ Learning Models ◽ Scale Feature ◽ Multi Scale

Motivation: Adverse drug–drug interactions (DDIs) are a crucial concern in drug research and a major cause of morbidity and mortality. Identifying potential DDIs is therefore essential for doctors, patients and society. Traditional machine learning models rely heavily on handcrafted features and generalize poorly. Recently, deep learning approaches that automatically learn drug features from the molecular graph or a drug-related network have improved the ability of computational models to predict unknown DDIs. However, previous works used large amounts of labeled data and either considered only the structure or sequence information of drugs, ignoring the relations or topological information between drugs and other biomedical objects (e.g. gene, disease and pathway), or considered a knowledge graph (KG) while ignoring the information in the drug molecular structure.

Results: To explore the joint effect of drug molecular structure and the semantic information of drugs in a knowledge graph for DDI prediction, we propose a multi-scale feature fusion deep learning model named MUFFIN. MUFFIN jointly learns the drug representation from both the drug's own structure information and a KG with rich biomedical information. In MUFFIN, we designed a bi-level cross strategy that includes cross- and scalar-level components to fuse multi-modal features well. By crossing the features learned from the large-scale KG and the drug molecular graph, MUFFIN can alleviate the restriction that limited labeled data places on deep learning models. We evaluated our approach on three datasets and three different tasks: binary-class, multi-class and multi-label DDI prediction. The results showed that MUFFIN outperformed other state-of-the-art baselines.

Availability and implementation: The source code and data are available at https://github.com/xzenglab/MUFFIN.

Pedestrian detection algorithm based on improved muti-scale feature fusion

Journal of Physics Conference Series ◽ 10.1088/1742-6596/2078/1/012008 ◽ 2021 ◽ Vol 2078 (1) ◽ pp. 012008

Author(s): Hui Liu ◽ Keyang Cheng

Keyword(s): Clustering Algorithm ◽ Feature Fusion ◽ Pedestrian Detection ◽ Detection Algorithm ◽ Data Sets ◽ False Detection ◽ Scale Feature ◽ Multi Scale ◽ Dilated Convolution ◽ Small Targets

To address false and missed detections of small and occluded targets in pedestrian detection, a pedestrian detection algorithm based on improved multi-scale feature fusion is proposed. First, PANet, the multi-scale feature fusion module of YOLOv4, does not consider the interaction between scales, so it is improved to reduce the semantic gap between scales, and an attention mechanism is introduced to learn the importance of different layers and strengthen feature fusion. Second, dilated convolution is introduced to reduce the information loss that occurs during downsampling. Finally, the K-means clustering algorithm is used to redesign the anchor boxes, and the loss function is modified to detect a single category. Experimental results on the INRIA and WiderPerson data sets, which cover different congestion conditions, show that the improved algorithm reaches an AP of 96.83% and 59.67%, respectively. Compared with the pedestrian detection results of the YOLOv4 model, this is an improvement of 2.41% and 1.03%, respectively. False and missed detections of small and occluded targets are significantly reduced.
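
The anchor redesign step can be sketched as K-means over the widths and heights of the ground-truth boxes. This is a minimal illustration under stated assumptions, not the paper's procedure: the function names `iou_wh` and `kmeans_anchors`, the random initialisation, and the use of IoU as the similarity (a common choice for YOLO-style anchors) are all hypothetical, because the abstract does not give these details.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors.

    Assumption: each box joins the anchor with the highest IoU, and each
    anchor is then moved to the mean width and height of its members.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct ground-truth boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(anchors[i])  # keep an anchor that attracted no boxes
        if updated == anchors:  # assignments are stable, so stop early
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])  # smallest area first


# Toy data: three small and three large pedestrian-shaped boxes.
boxes = [(10, 20), (11, 21), (12, 22), (50, 100), (52, 98), (48, 102)]
print(kmeans_anchors(boxes, k=2))
```

For a single-category detector such as a pedestrian detector, the resulting anchors tend to be tall and narrow, which is the motivation for clustering on the training boxes instead of reusing generic defaults.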

Region-to-boundary deep learning model with multi-scale feature fusion for medical image segmentation

Biomedical Signal Processing and Control ◽ 10.1016/j.bspc.2021.103165 ◽ 2022 ◽ Vol 71 ◽ pp. 103165

Author(s): Xiaowei Liu ◽ Lei Yang ◽ Jianguo Chen ◽ Siyang Yu ◽ Keqin Li

Keyword(s): Image Segmentation ◽ Deep Learning ◽ Medical Image ◽ Feature Fusion ◽ Learning Model ◽ Medical Image Segmentation ◽ Scale Feature ◽ Multi Scale ◽ Deep Learning Model

A Novel Multi-Scale Feature Fusion Method for Region Proposal Network in Fast Object Detection

International Journal of Data Warehousing and Mining ◽ 10.4018/ijdwm.2020070107 ◽ 2020 ◽ Vol 16 (3) ◽ pp. 132-145

Author(s): Gang Liu ◽ Chuyi Wang

Keyword(s): Object Detection ◽ Multiple Scales ◽ Feature Fusion ◽ Uniform Space ◽ Fusion Method ◽ Well Performance ◽ Feature Maps ◽ Neural Network Models ◽ Scale Feature ◽ Multi Scale

Neural network models have been widely used in object detection. Region proposal methods are common in current object detection networks and perform well. Typical region proposal methods search for objects by generating thousands of candidate boxes. Compared with other region proposal methods, the region proposal network (RPN) improves accuracy and detection speed while using only several hundred candidate boxes. However, because its feature maps contain insufficient information, the RPN detects and locates small objects poorly. This article proposes a novel multi-scale feature fusion method for the region proposal network to solve this problem. The proposed method, called the multi-scale region proposal network (MS-RPN), generates suitable feature maps for the region proposal network. In MS-RPN, selected feature maps at multiple scales are each fine-tuned and compressed into a uniform space. The resulting fused feature maps are called refined fusion features (RFFs). RFFs incorporate abundant detail and context information and are sent to the RPN to generate better region proposals. The proposed approach is evaluated on the PASCAL VOC 2007 and MS COCO benchmark tasks. MS-RPN obtains significant improvements over comparable state-of-the-art detection models.
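
Bringing feature maps of different resolutions into one uniform space can be sketched as resizing every map to a shared spatial size and stacking the results as channels. This is a loose illustration only: the names `resize_nearest` and `fuse_to_uniform_space` are hypothetical, nearest-neighbour resizing is an assumed stand-in, and the per-scale fine-tuning and compression layers that MS-RPN applies are not modelled.

```python
def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2-D map given as a list of rows."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_to_uniform_space(fmaps, out_h, out_w):
    """Resize every map to (out_h, out_w) and stack them as channels.

    The result is indexed as fused[channel][row][col]; a real network would
    follow this with learned layers that mix the stacked channels.
    """
    return [resize_nearest(fmap, out_h, out_w) for fmap in fmaps]


coarse = [[1, 2],
          [3, 4]]                # low-resolution map, rich in context
fine = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]        # high-resolution map, rich in detail
fused = fuse_to_uniform_space([coarse, fine], 4, 4)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

Stacking a coarse and a fine map at the same resolution is what lets a later layer see detail and context information at each spatial position.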

Glassboxing Deep Learning to Enhance Aircraft Detection from SAR Imagery

Remote Sensing ◽ 10.3390/rs13183650 ◽ 2021 ◽ Vol 13 (18) ◽ pp. 3650

Author(s): Ru Luo ◽ Jin Xing ◽ Lifu Chen ◽ Zhouhao Pan ◽ Xingmin Cai ◽ ...

Keyword(s): Deep Learning ◽ Feature Fusion ◽ Detection Performance ◽ Great Success ◽ SAR Image ◽ Feature Maps ◽ Multi Scale ◽ Learning Techniques ◽ SAR Imagery ◽ Aircraft Detection

Although deep learning has achieved great success in aircraft detection from SAR imagery, its black-box behavior has been criticized for low comprehensibility and interpretability. Such challenges have impeded the trustworthiness and wide application of deep learning techniques in SAR image analytics. In this paper, we propose an innovative eXplainable Artificial Intelligence (XAI) framework to glassbox deep neural networks (DNNs), using aircraft detection as a case study. The framework is composed of three parts: hybrid global attribution mapping (HGAM) for backbone network selection, a path aggregation network (PANet), and class-specific confidence scores mapping (CCSM) for visualization of the detector. HGAM integrates local and global XAI techniques to evaluate the effectiveness of DNN feature extraction; PANet provides advanced feature fusion to generate multi-scale prediction feature maps; and CCSM relies on visualization methods to examine the detection performance for a given DNN and input SAR images. The framework can select the optimal backbone DNN for aircraft detection and map the detection performance for a better understanding of the DNN. We verify its effectiveness with experiments using Gaofen-3 imagery. Our XAI framework offers an explainable approach to designing, developing, and deploying DNNs for SAR image analytics.

Pedestrian detection algorithm based on multi-scale feature extraction and attention feature fusion

Digital Signal Processing ◽ 10.1016/j.dsp.2021.103311 ◽ 2021 ◽ pp. 103311

Author(s): Hao Xia ◽ Jun Ma ◽ Jiayu Ou ◽ Xinyao Lv ◽ Chengjie Bai

Keyword(s): Feature Extraction ◽ Feature Fusion ◽ Pedestrian Detection ◽ Detection Algorithm ◽ Scale Feature ◽ Multi Scale

Application of Multi-Scale Feature Fusion and Deep Learning in Detection of Steel Strip Surface Defect

2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM) ◽ 10.1109/aiam48774.2019.00136 ◽ 2019 ◽ Cited By ~ 2

Author(s): Kangyu Li ◽ Xifeng Wang ◽ Lijuan Ji

Keyword(s): Deep Learning ◽ Surface Defect ◽ Feature Fusion ◽ Steel Strip ◽ Strip Surface ◽ Scale Feature ◽ Multi Scale

Attention Fusion for One-Stage Multispectral Pedestrian Detection

Sensors ◽ 10.3390/s21124184 ◽ 2021 ◽ Vol 21 (12) ◽ pp. 4184

Author(s): Zhiwei Cao ◽ Huihua Yang ◽ Juan Zhao ◽ Shuhong Guo ◽ Lingqiao Li

Keyword(s): Feature Fusion ◽ State Of The Art ◽ Pedestrian Detection ◽ Complementary Information ◽ Deep Convolutional Neural Networks ◽ One Stage ◽ Current State ◽ Fusion Methods ◽ Feature Information ◽ Bounding Boxes

Multispectral pedestrian detection, which combines a color stream and a thermal stream, is essential under insufficient illumination because fusing the two streams provides complementary information for detecting pedestrians with deep convolutional neural networks (CNNs). In this paper, we introduced and adapted a simple and efficient one-stage YOLOv4 to replace the current state-of-the-art two-stage fast-RCNN for multispectral pedestrian detection and to directly predict bounding boxes with confidence scores. To further improve detection performance, we analyzed the existing multispectral fusion methods and proposed a novel multispectral channel feature fusion (MCFF) module that integrates the features from the color and thermal streams according to the illumination conditions. Moreover, several fusion architectures, such as Early Fusion, Halfway Fusion, Late Fusion, and Direct Fusion, were carefully designed based on the MCFF to transfer feature information from the bottom to the top at different stages. Finally, experimental results on the KAIST and Utokyo pedestrian benchmarks showed that Halfway Fusion obtained the best performance of all the architectures and that the MCFF could adapt the fused features in the two modalities. The log-average miss rates (MR) for the two modalities with reasonable settings were 4.91% and 23.14%, respectively.
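
The log-average miss rate reported here is not defined in the abstract. The sketch below assumes the convention commonly used for Caltech-style pedestrian benchmarks: the miss rate is sampled at nine false-positives-per-image (FPPI) reference points evenly spaced in log space between 10^-2 and 10^0, and the samples are combined with a geometric mean. The function name `log_average_miss_rate` and its inputs are hypothetical.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi must be sorted in ascending order, with miss_rate[i] being the miss
    rate at fppi[i]. Assumption: a reference point with no operating point at
    or below it is scored as a miss rate of 1.0.
    """
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    sampled = []
    for ref in refs:
        value = 1.0
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                value = m  # keep the last operating point not exceeding ref
            else:
                break
        sampled.append(max(value, 1e-10))  # guard against log(0)
    return math.exp(sum(math.log(v) for v in sampled) / len(sampled))


# A toy curve: the miss rate falls as more false positives are tolerated.
curve_fppi = [0.005, 0.05, 0.5, 1.0]
curve_miss = [0.40, 0.20, 0.10, 0.08]
print(round(log_average_miss_rate(curve_fppi, curve_miss), 4))
```

Because the mean is taken in log space, the metric weights low-FPPI operating points as heavily as high-FPPI ones, and lower values are better.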

MS-AFF: A Novel Semantic Segmentation Approach for Buried Object Based on Multi-scale Attentional Feature Fusion

10.21203/rs.3.rs-193757/v1 ◽ 2021

Author(s): Chao Lu ◽ Fansheng Chen ◽ Xiaofeng Su ◽ Dan Zeng

Keyword(s): Deep Learning ◽ Spatial Information ◽ Feature Fusion ◽ Infrared Image ◽ Semantic Segmentation ◽ Target Object ◽ Infrared Images ◽ Feature Maps ◽ Multi Scale ◽ Visible Images

Infrared technology is widely used in precision guidance and mine detection because it can capture the heat radiated by the target object. We use infrared (IR) thermography to obtain infrared images of buried objects. Compared with visible images, infrared images have poor resolution, low contrast, and fuzzy visual effect, which makes it difficult to segment the target object, especially against complex backgrounds. Under these conditions, traditional segmentation methods perform poorly on infrared images because they are easily disturbed by noise and non-target objects. With the advance of deep convolutional neural networks (CNNs), deep learning-based methods have made significant improvements in semantic segmentation. However, few of them address infrared image semantic segmentation, which is a more challenging scenario than visible images. Moreover, the lack of an infrared image dataset is also a problem for current deep learning methods. To solve these problems, we propose a multi-scale attentional feature fusion (MS-AFF) module for infrared image semantic segmentation. Specifically, we integrate a series of feature maps from different levels through an atrous spatial pyramid structure, which gives the model a rich representation ability on infrared images. In addition, a global spatial information attention module lets the model focus on the target region and reduces disturbance from the background of infrared images. We also introduce an infrared segmentation dataset built with an infrared thermal imaging system. Extensive experiments on this dataset show the superiority of our method.
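
An atrous spatial pyramid applies the same kind of filter at several dilation rates so that each branch sees a different amount of context. The sketch below is a 1-D illustration under stated assumptions, not the MS-AFF module: the names `dilated_conv1d` and `atrous_pyramid` are hypothetical, no padding is used, and the step that merges the branches is omitted.

```python
def dilated_conv1d(signal, kernel, rate):
    """'Valid' 1-D dilated convolution (no kernel flip, as in most CNN libraries).

    Neighbouring kernel taps are `rate` samples apart, so a kernel with
    k taps spans (k - 1) * rate + 1 input samples.
    """
    span = (len(kernel) - 1) * rate + 1
    return [
        sum(tap * signal[start + j * rate] for j, tap in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def atrous_pyramid(signal, kernel, rates):
    """Run the same kernel at several dilation rates, one branch per rate."""
    return {rate: dilated_conv1d(signal, kernel, rate) for rate in rates}


ramp = list(range(10))       # toy input: 0, 1, ..., 9
difference = [1, 0, -1]      # left tap minus right tap
for rate, branch in atrous_pyramid(ramp, difference, [1, 2, 3]).items():
    print(rate, branch)
```

On the ramp input each branch returns a constant whose magnitude grows with the rate, which shows that a larger dilation compares samples that are further apart without adding any kernel weights.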
