Deep Learning Approaches for Object Detection

In computer vision, object detection is a very important, exciting and mind-blowing study. Object detection work in numerous fields such as observing security, independently/autonomous driving and etc. Deep-learning based object detection techniques have developed at a very fast pace and have attracted the attention of many researchers. The main focus of the 21st century is the development of the object-detection framework, comprehensively and genuinely. In this investigation, we initially investigate and evaluate the various object detection approaches and designate the benchmark datasets. We also delivered the wide-ranging general idea of object detection approaches in an organized way. We covered the first and second stage detectors of object detection methods. And lastly, we consider the construction of these object detection approaches to give dimensions for further research.

Download Full-text

Challenges of Deep Learning-based Text Detection in the Wild

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3543 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-5

Author(s):

Zobeir Raisi ◽

Mohamed A. Naiel ◽

Paul Fieguth ◽

Steven Wardell ◽

John Zelek

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Text Detection ◽

Detection Methods ◽

Learning Approaches ◽

Plane Rotation ◽

The Past ◽

Perspective Distortion ◽

Benchmark Datasets ◽

In The Wild

The reported accuracy of recent state-of-the-art text detection methods, mostly deep learning approaches, is in the order of 80% to 90% on standard benchmark datasets. These methods have relaxed some of the restrictions of structured text and environment (i.e., "in the wild") which are usually required for classical OCR to properly function. Even with this relaxation, there are still circumstances where these state-of-the-art methods fail. Several remaining challenges in wild images, like in-plane-rotation, illumination reflection, partial occlusion, complex font styles, and perspective distortion, cause exciting methods to perform poorly. In order to evaluate current approaches in a formal way, we standardize the datasets and metrics for comparison which had made comparison between these methods difficult in the past. We use three benchmark datasets for our evaluations: ICDAR13, ICDAR15, and COCO-Text V2.0. The objective of the paper is to quantify the current shortcomings and to identify the challenges for future text detection research.

Download Full-text

Investigations of Object Detection in Images/Videos Using Various Deep Learning Techniques and Embedded Platforms—A Comprehensive Review

Applied Sciences ◽

10.3390/app10093280 ◽

2020 ◽

Vol 10 (9) ◽

pp. 3280 ◽

Cited By ~ 3

Author(s):

Chinthakindi Balaram Murthy ◽

Mohammad Farukh Hashmi ◽

Neeraj Dhanraj Bokde ◽

Zong Woo Geem

Keyword(s):

Deep Learning ◽

Object Detection ◽

Pedestrian Detection ◽

Detection Methods ◽

Art Object ◽

Current State ◽

Learning Techniques ◽

Specific Object ◽

Benchmark Datasets ◽

Speed Up

In recent years there has been remarkable progress in one computer vision application area: object detection. One of the most challenging and fundamental problems in object detection is locating a specific object from the multiple objects present in a scene. Earlier traditional detection methods were used for detecting the objects with the introduction of convolutional neural networks. From 2012 onward, deep learning-based techniques were used for feature extraction, and that led to remarkable breakthroughs in this area. This paper shows a detailed survey on recent advancements and achievements in object detection using various deep learning techniques. Several topics have been included, such as Viola–Jones (VJ), histogram of oriented gradient (HOG), one-shot and two-shot detectors, benchmark datasets, evaluation metrics, speed-up techniques, and current state-of-art object detectors. Detailed discussions on some important applications in object detection areas, including pedestrian detection, crowd detection, and real-time object detection on Gpu-based embedded systems have been presented. At last, we conclude by identifying promising future directions.

Download Full-text

A comprehensive survey of LIDAR-based 3D object detection methods with Deep learning for autonomous driving

Computers & Graphics ◽

10.1016/j.cag.2021.07.003 ◽

2021 ◽

Author(s):

Georgios Zamanakos ◽

Lazaros Tsochatzidis ◽

Angelos Amanatiadis ◽

Ioannis Pratikakis

Keyword(s):

Deep Learning ◽

Object Detection ◽

Autonomous Driving ◽

Detection Methods ◽

3D Object ◽

Comprehensive Survey ◽

3D Object Detection

Download Full-text

Saliency detection in deep learning era: trends of development

Information and Control Systems ◽

10.31799/1684-8853-2019-3-10-36 ◽

2019 ◽

pp. 10-36 ◽

Cited By ~ 2

Author(s):

M. N. Favorskaya ◽

L. C. Jain

Keyword(s):

Deep Learning ◽

Object Detection ◽

Event Detection ◽

Visual Analysis ◽

Saliency Detection ◽

Salient Object Detection ◽

Public Image ◽

Detection Methods ◽

Salient Object ◽

Salient Event

Introduction:Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches was developed since 1990s. In recent years, the saliency detection has become one of actively studied topic in the theory of Convolutional Neural Network (CNN). Many original decisions using CNNs were proposed for salient object detection and, even, event detection.Purpose:A detailed survey of saliency detection methods in deep learning era allows to understand the current possibilities of CNN approach for visual analysis conducted by the human eyes’ tracking and digital image processing.Results:A survey reflects the recent advances in saliency detection using CNNs. Different models available in literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection are discussed in the chronological order. It is worth noting that automatic salient event detection in durable videos became possible using the recently appeared 3D CNN combining with 2D CNN for salient audio detection. Also in this article, we have presented a short description of public image and video datasets with annotated salient objects or events, as well as the often used metrics for the results’ evaluation.Practical relevance:This survey is considered as a contribution in the study of rapidly developed deep learning methods with respect to the saliency detection in the images and videos.

Download Full-text

A Survey on Deep Learning Based Methods and Datasets for Monocular 3D Object Detection

Electronics ◽

10.3390/electronics10040517 ◽

2021 ◽

Vol 10 (4) ◽

pp. 517

Author(s):

Seong-heum Kim ◽

Youngbae Hwang

Keyword(s):

Deep Learning ◽

Object Detection ◽

Low Cost ◽

Detection Methods ◽

Future Research ◽

3D Object ◽

Practical Applications ◽

Depth Sensors ◽

Significant Research ◽

3D Object Detection

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easier to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.

Download Full-text

A Set of Single YOLO Modalities to Detect Occluded Entities via Viewpoint Conversion

Applied Sciences ◽

10.3390/app11136016 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6016

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Object Detection ◽

Autonomous Vehicles ◽

Autonomous Driving ◽

Detection Algorithm ◽

Detection Accuracy ◽

Cloud Data ◽

Detection Techniques ◽

Bounding Boxes ◽

Partially Occluded ◽

Rgb Image

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded by employing RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representations of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate the detection accuracy in the case of integration of PCD BEV representations is superior to when only an RGB camera is used. In addition, robustness is improved by significantly enhancing detection accuracy even when the target objects are partially occluded when viewed from the front, which demonstrates that the proposed algorithm outperforms the conventional RGB-based model.

Download Full-text

An Overview of Deep Learning Based Object Detection Techniques

2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT) ◽

10.1109/iciict1.2019.8741359 ◽

2019 ◽

Cited By ~ 3

Author(s):

C. Bhagya ◽

A. Shyna

Keyword(s):

Deep Learning ◽

Object Detection ◽

Detection Techniques

Download Full-text

A Survey of Graphical Page Object Detection with Deep Neural Networks

10.20944/preprints202104.0739.v1 ◽

2021 ◽

Author(s):

Jwalin Bhatt ◽

Khurram Azeem Hashmi ◽

Muhammad Zeshan Afzal ◽

Didier Stricker

Keyword(s):

Deep Learning ◽

Object Detection ◽

Conceptual Understanding ◽

Deep Neural Networks ◽

State Of The Art ◽

Learning Approaches ◽

Document Images ◽

Essential Information ◽

Current State ◽

High Level

In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. It leads to a high-level conceptual understanding of the documents that makes digitization of documents viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved many folds. In this work, we outline and summarize the deep learning approaches for detecting graphical page objects in the document images. Therefore, we discuss the most relevant deep learning-based approaches and state-of-the-art graphical page object detection in document images. This work provides a comprehensive understanding of the current state-of-the-art and related challenges. Furthermore, we discuss leading datasets along with the quantitative evaluation. Moreover, it discusses briefly the promising directions that can be utilized for further improvements.

Download Full-text

Design of robust deep learning-based object detection and classification model for autonomous driving applications

Soft Computing ◽

10.1007/s00500-021-06706-0 ◽

2022 ◽

Author(s):

Mesfer Al Duhayyim ◽

Fahd N. Al-Wesabi ◽

Anwer Mustafa Hilal ◽

Manar Ahmed Hamza ◽

Shalini Goel ◽

...

Keyword(s):

Deep Learning ◽

Object Detection ◽

Autonomous Driving ◽

Classification Model

Download Full-text

SuperVAE: Superpixelwise Variational Autoencoder for Salient Object Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018569 ◽

2019 ◽

Vol 33 ◽

pp. 8569-8576 ◽

Cited By ~ 2

Author(s):

Bo Li ◽

Zhengxing Sun ◽

Yuqi Guo

Keyword(s):

Deep Learning ◽

Object Detection ◽

Saliency Detection ◽

Salient Object Detection ◽

Salient Object ◽

Image Saliency ◽

Spatial Consistency ◽

Variational Autoencoder ◽

Benchmark Datasets ◽

Supervised Methods

Image saliency detection has recently witnessed rapid progress due to deep neural networks. However, there still exist many important problems in the existing deep learning based methods. Pixel-wise convolutional neural network (CNN) methods suffer from blurry boundaries due to the convolutional and pooling operations. While region-based deep learning methods lack spatial consistency since they deal with each region independently. In this paper, we propose a novel salient object detection framework using a superpixelwise variational autoencoder (SuperVAE) network. We first use VAE to model the image background and then separate salient objects from the background through the reconstruction residuals. To better capture semantic and spatial contexts information, we also propose a perceptual loss to take advantage from deep pre-trained CNNs to train our SuperVAE network. Without the supervision of mask-level annotated data, our method generates high quality saliency results which can better preserve object boundaries and maintain the spatial consistency. Extensive experiments on five wildly-used benchmark datasets show that the proposed method achieves superior or competitive performance compared to other algorithms including the very recent state-of-the-art supervised methods.

Download Full-text