Hyperparameters Tuning of Faster R-CNN Deep Learning Transfer for Persistent Object Detection in Radar Images

Introduction:Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches was developed since 1990s. In recent years, the saliency detection has become one of actively studied topic in the theory of Convolutional Neural Network (CNN). Many original decisions using CNNs were proposed for salient object detection and, even, event detection.Purpose:A detailed survey of saliency detection methods in deep learning era allows to understand the current possibilities of CNN approach for visual analysis conducted by the human eyes’ tracking and digital image processing.Results:A survey reflects the recent advances in saliency detection using CNNs. Different models available in literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection are discussed in the chronological order. It is worth noting that automatic salient event detection in durable videos became possible using the recently appeared 3D CNN combining with 2D CNN for salient audio detection. Also in this article, we have presented a short description of public image and video datasets with annotated salient objects or events, as well as the often used metrics for the results’ evaluation.Practical relevance:This survey is considered as a contribution in the study of rapidly developed deep learning methods with respect to the saliency detection in the images and videos.

Download Full-text

COMPARATIVE ANALYSIS OF DEEP LEARNING METHODS FOR OBJECT DETECTION

Advances in Mathematics: Scientific Journal ◽

10.37418/amsj.9.6.54 ◽

2020 ◽

Vol 9 (6) ◽

pp. 3759-3775

Author(s):

K. Gill ◽

V. Mangat

Keyword(s):

Deep Learning ◽

Comparative Analysis ◽

Object Detection ◽

Learning Methods

Download Full-text

Design of Desktop Audiovisual Entertainment System with Deep Learning and Haptic Sensations

Symmetry ◽

10.3390/sym12101718 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1718

Author(s):

Chien-Hsing Chou ◽

Yu-Sheng Su ◽

Che-Ju Hsu ◽

Kong-Chang Lee ◽

Ping-Hsuan Han

Keyword(s):

Deep Learning ◽

Object Detection ◽

User Experience ◽

Recognition System ◽

Scene Recognition ◽

Single Shot ◽

Auditory Signals ◽

Hot Weather ◽

Viewing Experience ◽

At Home

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to provide six types of haptic sensations set as line-symmetry to provide a better user experience. After the system considers the results of object detection via the scene recognition system, the system generates corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.

Download Full-text

U-Infuse: Democratization of Customizable Deep Learning for Object Detection

Sensors ◽

10.3390/s21082611 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2611

Author(s):

Andrew Shepley ◽

Greg Falzon ◽

Christopher Lawson ◽

Paul Meek ◽

Paul Kwan

Keyword(s):

Deep Learning ◽

Intellectual Property ◽

Object Detection ◽

Image Data ◽

Learning Technologies ◽

Training Data ◽

Learning Models ◽

Ecological Data ◽

Single Class ◽

Large Numbers

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time and resource expensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report, and other useful statistics, (ii) custom train deep learning models using publicly available and custom training data, (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and use of transfer learning means domain-specific models can be trained rapidly, and frequently updated without the need for computer science expertise, or data sharing, protecting intellectual property and privacy.

Download Full-text

Realtime Object Detection via Deep Learning-based Pipelines

Proceedings of the 28th ACM International Conference on Information and Knowledge Management - CIKM '19 ◽

10.1145/3357384.3360320 ◽

2019 ◽

Cited By ~ 1

Author(s):

James G. Shanahan ◽

Liang Dai

Keyword(s):

Deep Learning ◽

Object Detection

Download Full-text

Real-Time Deep Learning-Based Object Detection Framework

2020 IEEE Symposium Series on Computational Intelligence (SSCI) ◽

10.1109/ssci47803.2020.9308493 ◽

2020 ◽

Author(s):

William Tarimo ◽

Moustafa M.Sabra ◽

Shonan Hendre

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time

Download Full-text

Research on Object Detection of Traffic Scene Based on Deep Learning

Proceedings of the 2020 Artificial Intelligence and Complex Systems Conference ◽

10.1145/3407703.3407728 ◽

2020 ◽

Author(s):

Zhou Yan ◽

Zhou Jun ◽

Gui Wei

Keyword(s):

Deep Learning ◽

Object Detection

Download Full-text

Foot Gesture Recognition Using High-Compression Radar Signature Image and Deep Learning

Sensors ◽

10.3390/s21113937 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3937

Author(s):

Seungeon Song ◽

Bongseok Kim ◽

Sangdong Kim ◽

Jonghun Lee

Keyword(s):

Deep Learning ◽

Gesture Recognition ◽

Smart Home ◽

Doppler Radar ◽

Radar Images ◽

Radar Signature ◽

Memory Efficiency ◽

High Compression ◽

Value Decomposition ◽

Deep Learning Model

Recently, Doppler radar-based foot gesture recognition has attracted attention as a hands-free tool. Doppler radar-based recognition for various foot gestures is still very challenging. So far, no studies have yet dealt deeply with recognition of various foot gestures based on Doppler radar and a deep learning model. In this paper, we propose a method of foot gesture recognition using a new high-compression radar signature image and deep learning. By means of a deep learning AlexNet model, a new high-compression radar signature is created by extracting dominant features via Singular Value Decomposition (SVD) processing; four different foot gestures including kicking, swinging, sliding, and tapping are recognized. Instead of using an original radar signature, the proposed method improves the memory efficiency required for deep learning training by using a high-compression radar signature. Original and reconstructed radar images with high compression values of 90%, 95%, and 99% were applied for the deep learning AlexNet model. As experimental results, movements of all four different foot gestures and of a rolling baseball were recognized with an accuracy of approximately 98.64%. In the future, due to the radar’s inherent robustness to the surrounding environment, this foot gesture recognition sensor using Doppler radar and deep learning will be widely useful in future automotive and smart home industry fields.

Download Full-text

A Survey on Deep Learning Based Methods and Datasets for Monocular 3D Object Detection

Electronics ◽

10.3390/electronics10040517 ◽

2021 ◽

Vol 10 (4) ◽

pp. 517

Author(s):

Seong-heum Kim ◽

Youngbae Hwang

Keyword(s):

Deep Learning ◽

Object Detection ◽

Low Cost ◽

Detection Methods ◽

Future Research ◽

3D Object ◽

Practical Applications ◽

Depth Sensors ◽

Significant Research ◽

3D Object Detection

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easier to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.

Download Full-text

Small Object Detection of Table Tennis Based on Deep Learning Network

2020 International Conference on Computer Science and Management Technology (ICCSMT) ◽

10.1109/iccsmt51754.2020.00036 ◽

2020 ◽

Author(s):

Weijian Li ◽

Xiaofeng Tan ◽

Zhijie Wang

Keyword(s):

Deep Learning ◽

Object Detection ◽

Small Object ◽

Table Tennis ◽

Learning Network ◽

Deep Learning Network ◽

Small Object Detection

Download Full-text