Single Shot Feature Aggregation Network for Underwater Object Detection

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to provide six types of haptic sensations set as line-symmetry to provide a better user experience. After the system considers the results of object detection via the scene recognition system, the system generates corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.

Download Full-text

Investigating the Potential of Network Optimization for a Constrained Object Detection Problem

Journal of Imaging ◽

10.3390/jimaging7040064 ◽

2021 ◽

Vol 7 (4) ◽

pp. 64

Author(s):

Tanguy Ophoff ◽

Cédric Gullentops ◽

Kristof Van Beeck ◽

Toon Goedemé

Keyword(s):

Computational Complexity ◽

Object Detection ◽

Network Optimization ◽

Real Life ◽

Optimization Techniques ◽

Training Data ◽

Single Shot ◽

Standard Object ◽

Number Of Classes

Object detection models are usually trained and evaluated on highly complicated, challenging academic datasets, which results in deep networks requiring lots of computations. However, a lot of operational use-cases consist of more constrained situations: they have a limited number of classes to be detected, less intra-class variance, less lighting and background variance, constrained or even fixed camera viewpoints, etc. In these cases, we hypothesize that smaller networks could be used without deteriorating the accuracy. However, there are multiple reasons why this does not happen in practice. Firstly, overparameterized networks tend to learn better, and secondly, transfer learning is usually used to reduce the necessary amount of training data. In this paper, we investigate how much we can reduce the computational complexity of a standard object detection network in such constrained object detection problems. As a case study, we focus on a well-known single-shot object detector, YoloV2, and combine three different techniques to reduce the computational complexity of the model without reducing its accuracy on our target dataset. To investigate the influence of the problem complexity, we compare two datasets: a prototypical academic (Pascal VOC) and a real-life operational (LWIR person detection) dataset. The three optimization steps we exploited are: swapping all the convolutions for depth-wise separable convolutions, perform pruning and use weight quantization. The results of our case study indeed substantiate our hypothesis that the more constrained a problem is, the more the network can be optimized. On the constrained operational dataset, combining these optimization techniques allowed us to reduce the computational complexity with a factor of 349, as compared to only a factor 9.8 on the academic dataset. When running a benchmark on an Nvidia Jetson AGX Xavier, our fastest model runs more than 15 times faster than the original YoloV2 model, whilst increasing the accuracy by 5% Average Precision (AP).

Download Full-text

Deep Sensor Fusion Based on Frustum Point Single Shot Multibox Detector for 3D Object Detection

10.1109/icip42928.2021.9506167 ◽

2021 ◽

Author(s):

Yu Wang ◽

Ye Zhang ◽

Shaohua Zhai ◽

Hao Chen ◽

Shaoqi Shi ◽

...

Keyword(s):

Object Detection ◽

Sensor Fusion ◽

Single Shot ◽

3D Object ◽

3D Object Detection

Download Full-text

Single-shot Weakly-supervised Object Detection Guided by Empirical Saliency Model

Neurocomputing ◽

10.1016/j.neucom.2021.03.047 ◽

2021 ◽

Author(s):

Danpei Zhao ◽

Zhichao Yuan ◽

Zhenwei Shi ◽

Fengying Xie

Keyword(s):

Object Detection ◽

Single Shot ◽

Saliency Model ◽

Weakly Supervised

Download Full-text

Small Object Detection in Remote Sensing Images with Residual Feature Aggregation-Based Super-Resolution and Object Detector Network

Remote Sensing ◽

10.3390/rs13091854 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1854

Author(s):

Syed Muhammad Arsalan Bashir ◽

Yi Wang

Keyword(s):

Remote Sensing ◽

Object Detection ◽

Resolution Enhancement ◽

Super Resolution ◽

Detection Performance ◽

Image Resolution ◽

Small Object ◽

Remote Sensing Images ◽

Feature Aggregation ◽

Image Super Resolution

This paper deals with detecting small objects in remote sensing images from satellites or any aerial vehicle by utilizing the concept of image super-resolution for image resolution enhancement using a deep-learning-based detection method. This paper provides a rationale for image super-resolution for small objects by improving the current super-resolution (SR) framework by incorporating a cyclic generative adversarial network (GAN) and residual feature aggregation (RFA) to improve detection performance. The novelty of the method is threefold: first, a framework is proposed, independent of the final object detector used in research, i.e., YOLOv3 could be replaced with Faster R-CNN or any object detector to perform object detection; second, a residual feature aggregation network was used in the generator, which significantly improved the detection performance as the RFA network detected complex features; and third, the whole network was transformed into a cyclic GAN. The image super-resolution cyclic GAN with RFA and YOLO as the detection network is termed as SRCGAN-RFA-YOLO, which is compared with the detection accuracies of other methods. Rigorous experiments on both satellite images and aerial images (ISPRS Potsdam, VAID, and Draper Satellite Image Chronology datasets) were performed, and the results showed that the detection performance increased by using super-resolution methods for spatial resolution enhancement; for an IoU of 0.10, AP of 0.7867 was achieved for a scale factor of 16.

Download Full-text