Enhanced Single Shot Multiple Detection for Real-Time Object Detection in Multiple Scenes

Automated driving and vehicle safety systems need object detection. It is important that object detection be accurate overall and robust to weather and environmental conditions and run in real-time. As a consequence of this approach, they require image processing algorithms to inspect the contents of images. This article compares the accuracy of five major image processing algorithms: Region-based Fully Convolutional Network (R-FCN), Mask Region-based Convolutional Neural Networks (Mask R-CNN), Single Shot Multi-Box Detector (SSD), RetinaNet, and You Only Look Once v4 (YOLOv4). In this comparative analysis, we used a large-scale Berkeley Deep Drive (BDD100K) dataset. Their strengths and limitations are analyzed based on parameters such as accuracy (with/without occlusion and truncation), computation time, precision-recall curve. The comparison is given in this article helpful in understanding the pros and cons of standard deep learning-based algorithms while operating under real-time deployment restrictions. We conclude that the YOLOv4 outperforms accurately in detecting difficult road target objects under complex road scenarios and weather conditions in an identical testing environment.

Download Full-text

Object detection in real time based on improved single shot multi-box detector algorithm

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-020-01826-x ◽

2020 ◽

Vol 2020 (1) ◽

Author(s):

Ashwani Kumar ◽

Zuopeng Justin Zhang ◽

Hongbo Lyu

Keyword(s):

Neural Networks ◽

Object Detection ◽

Real Time ◽

Convolutional Neural Networks ◽

Single Layer ◽

Single Shot ◽

Convolutional Network ◽

Detection Techniques ◽

Computational Performance ◽

Speed Up

Abstract In today’s scenario, the fastest algorithm which uses a single layer of convolutional network to detect the objects from the image is single shot multi-box detector (SSD) algorithm. This paper studies object detection techniques to detect objects in real time on any device running the proposed model in any environment. In this paper, we have increased the classification accuracy of detecting objects by improving the SSD algorithm while keeping the speed constant. These improvements have been done in their convolutional layers, by using depth-wise separable convolution along with spatial separable convolutions generally called multilayer convolutional neural networks. The proposed method uses these multilayer convolutional neural networks to develop a system model which consists of multilayers to classify the given objects into any of the defined classes. The schemes then use multiple images and detect the objects from these images, labeling them with their respective class label. To speed up the computational performance, the proposed algorithm is applied along with the multilayer convolutional neural network which uses a larger number of default boxes and results in more accurate detection. The accuracy in detecting the objects is checked by different parameters such as loss function, frames per second (FPS), mean average precision (mAP), and aspect ratio. Experimental results confirm that our proposed improved SSD algorithm has high accuracy.

Download Full-text

Ship Detection Based on YOLOv2 for SAR Imagery

Remote Sensing ◽

10.3390/rs11070786 ◽

2019 ◽

Vol 11 (7) ◽

pp. 786 ◽

Cited By ~ 41

Author(s):

Yang-Lang Chang ◽

Amare Anagaw ◽

Lena Chang ◽

Yi Wang ◽

Chih-Yu Hsiao ◽

...

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time ◽

Experimental Results ◽

Detection Methods ◽

Computational Time ◽

Detection Accuracy ◽

Single Shot ◽

Ship Detection ◽

Sar Imagery

Synthetic aperture radar (SAR) imagery has been used as a promising data source for monitoring maritime activities, and its application for oil and ship detection has been the focus of many previous research studies. Many object detection methods ranging from traditional to deep learning approaches have been proposed. However, majority of them are computationally intensive and have accuracy problems. The huge volume of the remote sensing data also brings a challenge for real time object detection. To mitigate this problem a high performance computing (HPC) method has been proposed to accelerate SAR imagery analysis, utilizing the GPU based computing methods. In this paper, we propose an enhanced GPU based deep learning method to detect ship from the SAR images. The You Only Look Once version 2 (YOLOv2) deep learning framework is proposed to model the architecture and training the model. YOLOv2 is a state-of-the-art real-time object detection system, which outperforms Faster Region-Based Convolutional Network (Faster R-CNN) and Single Shot Multibox Detector (SSD) methods. Additionally, in order to reduce computational time with relatively competitive detection accuracy, we develop a new architecture with less number of layers called YOLOv2-reduced. In the experiment, we use two types of datasets: A SAR ship detection dataset (SSDD) dataset and a Diversified SAR Ship Detection Dataset (DSSDD). These two datasets were used for training and testing purposes. YOLOv2 test results showed an increase in accuracy of ship detection as well as a noticeable reduction in computational time compared to Faster R-CNN. From the experimental results, the proposed YOLOv2 architecture achieves an accuracy of 90.05% and 89.13% on the SSDD and DSSDD datasets respectively. The proposed YOLOv2-reduced architecture has a similarly competent detection performance as YOLOv2, but with less computational time on a NVIDIA TITAN X GPU. The experimental results shows that the deep learning can make a big leap forward in improving the performance of SAR image ship detection.

Download Full-text

Lightweight Feature Enhancement Network for Single-Shot Object Detection

Sensors ◽

10.3390/s21041066 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1066

Author(s):

Peng Jia ◽

Fuxiang Liu

Keyword(s):

Receptive Field ◽

Object Detection ◽

Real Time ◽

Detection System ◽

Field Enhancement ◽

Detection Performance ◽

Feature Representation ◽

Data Sets ◽

Single Shot ◽

Enhancement Method

At present, the one-stage detector based on the lightweight model can achieve real-time speed, but the detection performance is challenging. To enhance the discriminability and robustness of the model extraction features and improve the detector’s detection performance for small objects, we propose two modules in this work. First, we propose a receptive field enhancement method, referred to as adaptive receptive field fusion (ARFF). It enhances the model’s feature representation ability by adaptively learning the fusion weights of different receptive field branches in the receptive field module. Then, we propose an enhanced up-sampling (EU) module to reduce the information loss caused by up-sampling on the feature map. Finally, we assemble ARFF and EU modules on top of YOLO v3 to build a real-time, high-precision and lightweight object detection system referred to as the ARFF-EU network. We achieve a state-of-the-art speed and accuracy trade-off on both the Pascal VOC and MS COCO data sets, reporting 83.6% AP at 37.5 FPS and 42.5% AP at 33.7 FPS, respectively. The experimental results show that our proposed ARFF and EU modules improve the detection performance of the ARFF-EU network and achieve the development of advanced, very deep detectors while maintaining real-time speed.

Download Full-text

A modified Single-Shot multibox Detector for beyond Real-Time Object Detection

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9413300 ◽

2021 ◽

Author(s):

Georgios Orfanidis ◽

Konstantinos Ioannidis ◽

Stefanos Vrochidis ◽

Anastasios Tefas ◽

Ioannis Kompatsiaris

Keyword(s):

Object Detection ◽

Real Time ◽

Single Shot

Download Full-text

SSD7-FFAM: A Real-Time Object Detection Network Friendly to Embedded Devices from Scratch

Applied Sciences ◽

10.3390/app11031096 ◽

2021 ◽

Vol 11 (3) ◽

pp. 1096

Author(s):

Qing Li ◽

Yingcheng Lin ◽

Wei He

Keyword(s):

Object Detection ◽

Real Time ◽

Large Scale ◽

Feature Fusion ◽

Contextual Information ◽

Attention Mechanism ◽

Detection Accuracy ◽

Single Shot ◽

Feature Maps ◽

Embedded Devices

The high requirements for computing and memory are the biggest challenges in deploying existing object detection networks to embedded devices. Living lightweight object detectors directly use lightweight neural network architectures such as MobileNet or ShuffleNet pre-trained on large-scale classification datasets, which results in poor network structure flexibility and is not suitable for some specific scenarios. In this paper, we propose a lightweight object detection network Single-Shot MultiBox Detector (SSD)7-Feature Fusion and Attention Mechanism (FFAM), which saves storage space and reduces the amount of calculation by reducing the number of convolutional layers. We offer a novel Feature Fusion and Attention Mechanism (FFAM) method to improve detection accuracy. Firstly, the FFAM method fuses high-level semantic information-rich feature maps with low-level feature maps to improve small objects’ detection accuracy. The lightweight attention mechanism cascaded by channels and spatial attention modules is employed to enhance the target’s contextual information and guide the network to focus on its easy-to-recognize features. The SSD7-FFAM achieves 83.7% mean Average Precision (mAP), 1.66 MB parameters, and 0.033 s average running time on the NWPU VHR-10 dataset. The results indicate that the proposed SSD7-FFAM is more suitable for deployment to embedded devices for real-time object detection.

Download Full-text

Exploring RGBDepth Fusion for Real-Time Object Detection

Sensors ◽

10.3390/s19040866 ◽

2019 ◽

Vol 19 (4) ◽

pp. 866 ◽

Cited By ~ 8

Author(s):

Tanguy Ophoff ◽

Kristof Van Beeck ◽

Toon Goedemé

Keyword(s):

Object Detection ◽

Real Time ◽

Network Architecture ◽

Depth Information ◽

Special Focus ◽

Single Shot ◽

Depth Sensing ◽

Real Time Processing ◽

Depth Cameras ◽

Optimal Fusion

In this paper, we investigate whether fusing depth information on top of normal RGB data for camera-based object detection can help to increase the performance of current state-of-the-art single-shot detection networks. Indeed, depth sensing is easily acquired using depth cameras such as a Kinect or stereo setups. We investigate the optimal manner to perform this sensor fusion with a special focus on lightweight single-pass convolutional neural network (CNN) architectures, enabling real-time processing on limited hardware. For this, we implement a network architecture allowing us to parameterize at which network layer both information sources are fused together. We performed exhaustive experiments to determine the optimal fusion point in the network, from which we can conclude that fusing towards the mid to late layers provides the best results. Our best fusion models significantly outperform the baseline RGB network in both accuracy and localization of the detections.

Download Full-text

Object Detection using OpenCV and Deep Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35880 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3920-3923

Author(s):

Balaji G V

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time ◽

Classification Algorithm ◽

Video Data ◽

Entire Population ◽

Single Shot ◽

Bounding Box ◽

Real Time Detection ◽

Over Time

Object Detection using SSD (Single Shot Detector) and MobileNets are efficient because this technique detects objects quickly with less resourses without sacrificing performance. In this every class of item for which the classification algorithm has been trained generates a bounding box and an annotation describing that class of object. This provides the foundation for creating several types of analytical features such as the volume of traffic in a certain area over time or the entire population in an area is real-time detection and categorization of objects from video data.

Download Full-text