Real-time Robust Object Detection Using an Adjacent Feature Fusion-based Single Shot Multibox Detector

The high requirements for computing and memory are the biggest challenges in deploying existing object detection networks to embedded devices. Living lightweight object detectors directly use lightweight neural network architectures such as MobileNet or ShuffleNet pre-trained on large-scale classification datasets, which results in poor network structure flexibility and is not suitable for some specific scenarios. In this paper, we propose a lightweight object detection network Single-Shot MultiBox Detector (SSD)7-Feature Fusion and Attention Mechanism (FFAM), which saves storage space and reduces the amount of calculation by reducing the number of convolutional layers. We offer a novel Feature Fusion and Attention Mechanism (FFAM) method to improve detection accuracy. Firstly, the FFAM method fuses high-level semantic information-rich feature maps with low-level feature maps to improve small objects’ detection accuracy. The lightweight attention mechanism cascaded by channels and spatial attention modules is employed to enhance the target’s contextual information and guide the network to focus on its easy-to-recognize features. The SSD7-FFAM achieves 83.7% mean Average Precision (mAP), 1.66 MB parameters, and 0.033 s average running time on the NWPU VHR-10 dataset. The results indicate that the proposed SSD7-FFAM is more suitable for deployment to embedded devices for real-time object detection.

Download Full-text

An Approach to Improve SSD through Skip Connection of Multiscale Feature Maps

Computational Intelligence and Neuroscience ◽

10.1155/2020/2936920 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Xiaoguo Zhang ◽

Ye Gao ◽

Fei Ye ◽

Qihan Liu ◽

Kaixin Zhang

Keyword(s):

Object Detection ◽

Real Time ◽

Semantic Information ◽

Feature Fusion ◽

Poor Performance ◽

Detection Performance ◽

Single Shot ◽

Small Object ◽

Feature Maps ◽

Input Size

SSD (Single Shot MultiBox Detector) is one of the best object detection algorithms and is able to provide high accurate object detection performance in real time. However, SSD shows relatively poor performance on small object detection because its shallow prediction layer, which is responsible for detecting small objects, lacks enough semantic information. To overcome this problem, SKIPSSD, an improved SSD with a novel skip connection of multiscale feature maps, is proposed in this paper to enhance the semantic information and the details of the prediction layers through skippingly fusing high-level and low-level feature maps. For the detail of the fusion methods, we design two feature fusion modules and multiple fusion strategies to improve the SSD detector’s sensitivity and perception ability. Experimental results on the PASCAL VOC2007 test set demonstrate that SKIPSSD significantly improves the detection performance and outperforms lots of state-of-the-art object detectors. With an input size of 300 × 300, SKIPSSD achieves 79.0% mAP (mean average precision) at 38.7 FPS (frame per second) on a single 1080 GPU, 1.8% higher than the mAP of SSD while still keeping the real-time detection speed.

Download Full-text

Tiny SSD: A Tiny Single-Shot Detection Deep Convolutional Neural Network for Real-Time Embedded Object Detection

2018 15th Conference on Computer and Robot Vision (CRV) ◽

10.1109/crv.2018.00023 ◽

2018 ◽

Cited By ~ 23

Author(s):

Alexander Womg ◽

Mohammad Javad Shafiee ◽

Francis Li ◽

Brendan Chwyl

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Real Time ◽

Deep Convolutional Neural Network ◽

Single Shot ◽

Shot Detection

Download Full-text

Real Time Multi Object Detection for Blind Using Single Shot Multibox Detector

Wireless Personal Communications ◽

10.1007/s11277-019-06294-1 ◽

2019 ◽

Vol 107 (1) ◽

pp. 651-661 ◽

Cited By ~ 5

Author(s):

Adwitiya Arora ◽

Atul Grover ◽

Raksha Chugh ◽

S. Sofana Reka

Keyword(s):

Object Detection ◽

Real Time ◽

Single Shot

Download Full-text

Multi Scale Object Detection Based on Single Shot Multibox Detector with Feature Fusion and Inception Network

The Journal of Korean Institute of Information Technology ◽

10.14801/jkiit.2018.16.10.93 ◽

2018 ◽

Vol 16 (10) ◽

pp. 93-100 ◽

Cited By ~ 1

Author(s):

Md Foysal Haque ◽

Dae-Seong Kang

Keyword(s):

Object Detection ◽

Feature Fusion ◽

Single Shot ◽

Multi Scale

Download Full-text

Road Object Detection: A Comparative Study of Deep Learning-Based Algorithms

Electronics ◽

10.3390/electronics10161932 ◽

2021 ◽

Vol 10 (16) ◽

pp. 1932

Author(s):

Malik Haris ◽

Adam Glowacz

Keyword(s):

Image Processing ◽

Deep Learning ◽

Object Detection ◽

Real Time ◽

Large Scale ◽

Single Shot ◽

Automated Driving ◽

Convolutional Network ◽

Image Processing Algorithms ◽

Processing Algorithms

Automated driving and vehicle safety systems need object detection. It is important that object detection be accurate overall and robust to weather and environmental conditions and run in real-time. As a consequence of this approach, they require image processing algorithms to inspect the contents of images. This article compares the accuracy of five major image processing algorithms: Region-based Fully Convolutional Network (R-FCN), Mask Region-based Convolutional Neural Networks (Mask R-CNN), Single Shot Multi-Box Detector (SSD), RetinaNet, and You Only Look Once v4 (YOLOv4). In this comparative analysis, we used a large-scale Berkeley Deep Drive (BDD100K) dataset. Their strengths and limitations are analyzed based on parameters such as accuracy (with/without occlusion and truncation), computation time, precision-recall curve. The comparison is given in this article helpful in understanding the pros and cons of standard deep learning-based algorithms while operating under real-time deployment restrictions. We conclude that the YOLOv4 outperforms accurately in detecting difficult road target objects under complex road scenarios and weather conditions in an identical testing environment.

Download Full-text

Object detection in real time based on improved single shot multi-box detector algorithm

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-020-01826-x ◽

2020 ◽

Vol 2020 (1) ◽

Author(s):

Ashwani Kumar ◽

Zuopeng Justin Zhang ◽

Hongbo Lyu

Keyword(s):

Neural Networks ◽

Object Detection ◽

Real Time ◽

Convolutional Neural Networks ◽

Single Layer ◽

Single Shot ◽

Convolutional Network ◽

Detection Techniques ◽

Computational Performance ◽

Speed Up

Abstract In today’s scenario, the fastest algorithm which uses a single layer of convolutional network to detect the objects from the image is single shot multi-box detector (SSD) algorithm. This paper studies object detection techniques to detect objects in real time on any device running the proposed model in any environment. In this paper, we have increased the classification accuracy of detecting objects by improving the SSD algorithm while keeping the speed constant. These improvements have been done in their convolutional layers, by using depth-wise separable convolution along with spatial separable convolutions generally called multilayer convolutional neural networks. The proposed method uses these multilayer convolutional neural networks to develop a system model which consists of multilayers to classify the given objects into any of the defined classes. The schemes then use multiple images and detect the objects from these images, labeling them with their respective class label. To speed up the computational performance, the proposed algorithm is applied along with the multilayer convolutional neural network which uses a larger number of default boxes and results in more accurate detection. The accuracy in detecting the objects is checked by different parameters such as loss function, frames per second (FPS), mean average precision (mAP), and aspect ratio. Experimental results confirm that our proposed improved SSD algorithm has high accuracy.

Download Full-text

Ship Detection Based on YOLOv2 for SAR Imagery

Remote Sensing ◽

10.3390/rs11070786 ◽

2019 ◽

Vol 11 (7) ◽

pp. 786 ◽

Cited By ~ 41

Author(s):

Yang-Lang Chang ◽

Amare Anagaw ◽

Lena Chang ◽

Yi Wang ◽

Chih-Yu Hsiao ◽

...

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time ◽

Experimental Results ◽

Detection Methods ◽

Computational Time ◽

Detection Accuracy ◽

Single Shot ◽

Ship Detection ◽

Sar Imagery

Synthetic aperture radar (SAR) imagery has been used as a promising data source for monitoring maritime activities, and its application for oil and ship detection has been the focus of many previous research studies. Many object detection methods ranging from traditional to deep learning approaches have been proposed. However, majority of them are computationally intensive and have accuracy problems. The huge volume of the remote sensing data also brings a challenge for real time object detection. To mitigate this problem a high performance computing (HPC) method has been proposed to accelerate SAR imagery analysis, utilizing the GPU based computing methods. In this paper, we propose an enhanced GPU based deep learning method to detect ship from the SAR images. The You Only Look Once version 2 (YOLOv2) deep learning framework is proposed to model the architecture and training the model. YOLOv2 is a state-of-the-art real-time object detection system, which outperforms Faster Region-Based Convolutional Network (Faster R-CNN) and Single Shot Multibox Detector (SSD) methods. Additionally, in order to reduce computational time with relatively competitive detection accuracy, we develop a new architecture with less number of layers called YOLOv2-reduced. In the experiment, we use two types of datasets: A SAR ship detection dataset (SSDD) dataset and a Diversified SAR Ship Detection Dataset (DSSDD). These two datasets were used for training and testing purposes. YOLOv2 test results showed an increase in accuracy of ship detection as well as a noticeable reduction in computational time compared to Faster R-CNN. From the experimental results, the proposed YOLOv2 architecture achieves an accuracy of 90.05% and 89.13% on the SSDD and DSSDD datasets respectively. The proposed YOLOv2-reduced architecture has a similarly competent detection performance as YOLOv2, but with less computational time on a NVIDIA TITAN X GPU. The experimental results shows that the deep learning can make a big leap forward in improving the performance of SAR image ship detection.

Download Full-text

Application of Two New Feature Fusion Networks to Improve Real-time Prostate Capsula Detection

Current Medical Imaging Formerly Current Medical Imaging Reviews ◽

10.2174/1573405617666210129110832 ◽

2021 ◽

Vol 17 ◽

Author(s):

Shixiao Wu ◽

Chengcheng Guo ◽

Xinghuan Wang

Keyword(s):

Dimension Reduction ◽

Real Time ◽

Image Denoising ◽

Transient Stability ◽

Feature Fusion ◽

Median Filter ◽

Principal Component ◽

Single Shot ◽

Bilinear Interpolation ◽

New Feature

Background: Excess prostate tissue is trimmed near the prostate capsula boundary during transurethral plasma kinetic enucleation of prostate (PKEP) and transurethral bipolar plasmakinetic resection of prostate (PKRP) surgeries. If too much tissue is removed, a prostate capsula perforation can potentially occur. As such, real-time accurate prostate capsula (PC) detection is critical for the prevention of these perforations. Objective: This study investigated the potential for using image denoising, image dimension reduction and feature fusion to improve real-time prostate capsula detection with two objective. First, this paper mainly studied feature selection and input dimension reduction. Second, image denoising were evaluated, as they are of paramount importance to transient stability assessment based on neural networks. Method: Two new feature fusion techniques, maxpooling bilinear interpolation single-shot multibox detector (PBSSD) and bilinear interpolation single shot multibox detector (BSSD) were proposed. Before original images were sent to the neural network, they were processed by principal component analysis (PCA) and adaptive median filter (AMF) for dimension reduction and image denoising. Results: The results showed that application of PCA and AMF with PBSSD increased the mean average precision (mAP) for prostate capsula images by 8.55% and reached 80.15%, compared with single shot multibox detector (SSD) alone. Application of PCA with BSSD increased the mAP for prostate capsula images by 4.6% compared with SSD alone. Conclusion: Compared with other methods, ours were proven to be more accurate for real-time prostate capsula detection. The improved mAP results suggest that the proposed approaches are powerful tools for improving SSD networks.

Download Full-text

FM-Mnet Real-Time Object Detection Model for Video Feature Fusion Processing

Lecture Notes in Electrical Engineering - Genetic and Evolutionary Computing ◽

10.1007/978-981-16-8430-2_23 ◽

2022 ◽

pp. 249-258

Author(s):

Tiehua Zhou ◽

Yifan Zhang ◽

Yuan Li ◽

Ling Wang

Keyword(s):

Object Detection ◽

Real Time ◽

Feature Fusion ◽

Detection Model ◽

Video Feature

Download Full-text