Knowledge-Driven Network for Object Detection

Object detection is a challenging computer vision task with numerous real-world applications. In recent years, modeling object relationships has proved helpful for object detection and has been verified and realized in deep learning. Nonetheless, most approaches to modeling object relations are limited to anchor-based algorithms; they cannot be directly migrated to anchor-free frameworks. The reason is that anchor-free algorithms eliminate the complex design of anchors and predict heatmaps to represent the locations of keypoints of different object categories, without considering the relationship between keypoints. Therefore, to better fuse the information between the heatmap channels, it is important to model the visual relationship between keypoints. In this paper, we present a knowledge-driven network (KDNet), a new architecture that can aggregate and model keypoint relations to augment object features for detection. Specifically, it processes a set of keypoints simultaneously through interactions between their local and geometric features, thereby allowing the modeling of their relationship. Finally, the updated heatmaps are used to obtain the corners of the objects and determine their positions. Experiments conducted on the RIDER dataset confirm the effectiveness of the proposed KDNet, which significantly outperforms other state-of-the-art object detection methods.
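The abstract notes that anchor-free detectors predict heatmaps whose peaks mark keypoint locations. As a rough illustration of that generic idea only (not of KDNet's relation module, which the abstract does not specify), the sketch below extracts local maxima from a single heatmap channel. The function name `heatmap_peaks`, the 0.5 threshold, and the toy heatmap values are all assumptions made for this example.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # one heatmap channel as rows of scores


def heatmap_peaks(heatmap: Heatmap, threshold: float = 0.5) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for cells at or above the threshold
    that are not smaller than any of their (up to 8) neighbours."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the neighbours that lie inside the heatmap.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


# Hypothetical 4x4 heatmap with two keypoint-like peaks.
heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.1, 0.4, 0.7],
]
print(heatmap_peaks(heatmap))  # [(1, 1, 0.9), (3, 3, 0.7)]
```

Each channel is processed independently here, which is exactly the limitation the abstract points to: nothing in this step uses the relationship between keypoints.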


Optical Prior-Based Underwater Object Detection with Active Imaging

Complexity ◽

10.1155/2021/6656166 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Jie Shen ◽

Zhenxin Xu ◽

Zhe Chen ◽

Huibin Wang ◽

Xiaotao Shi

Keyword(s):

Object Detection ◽

Detection Methods ◽

Art Object ◽

Image Prior ◽

Detection Approach ◽

Training Samples ◽

Active Imaging ◽

Underwater Object ◽

Informative Content ◽

Object Features

Underwater object detection plays an important role in research and practice, as it provides condensed and informative content that represents underwater objects. However, detecting objects in underwater images is challenging because underwater environments significantly degrade image quality and distort the contrast between the object and the background. To address this problem, this paper proposes an optical prior-based underwater object detection approach that takes advantage of optical principles to identify optical collimation over underwater images, providing valuable guidance for extracting object features. Unlike data-driven knowledge, the prior in our method is independent of training samples. The fundamental novelty of our approach lies in the integration of an image prior with the object detection task. This integration underlies the performance of our approach in underwater environments, which is demonstrated through comparisons with state-of-the-art object detection methods.


Investigations of Object Detection in Images/Videos Using Various Deep Learning Techniques and Embedded Platforms—A Comprehensive Review

Applied Sciences ◽

10.3390/app10093280 ◽

2020 ◽

Vol 10 (9) ◽

pp. 3280 ◽

Cited By ~ 3

Author(s):

Chinthakindi Balaram Murthy ◽

Mohammad Farukh Hashmi ◽

Neeraj Dhanraj Bokde ◽

Zong Woo Geem

Keyword(s):

Deep Learning ◽

Object Detection ◽

Pedestrian Detection ◽

Detection Methods ◽

Art Object ◽

Current State ◽

Learning Techniques ◽

Specific Object ◽

Benchmark Datasets ◽

Speed Up

In recent years there has been remarkable progress in one computer vision application area: object detection. One of the most challenging and fundamental problems in object detection is locating a specific object among the multiple objects present in a scene. Earlier, traditional detection methods were used for detecting objects. With the introduction of convolutional neural networks from 2012 onward, deep learning-based techniques were used for feature extraction, which led to remarkable breakthroughs in this area. This paper presents a detailed survey of recent advancements and achievements in object detection using various deep learning techniques. Several topics are covered, such as Viola–Jones (VJ), histogram of oriented gradients (HOG), one-shot and two-shot detectors, benchmark datasets, evaluation metrics, speed-up techniques, and current state-of-the-art object detectors. Detailed discussions of some important applications of object detection, including pedestrian detection, crowd detection, and real-time object detection on GPU-based embedded systems, are presented. Finally, we conclude by identifying promising future directions.


Assessment of CNN-Based Methods for Individual Tree Detection on Images Captured by RGB Cameras Attached to UAVs

Sensors ◽

10.3390/s19163595 ◽

2019 ◽

Vol 19 (16) ◽

pp. 3595 ◽

Cited By ~ 19

Author(s):

Anderson Aparecido dos Santos ◽

José Marcato Junior ◽

Márcio Santos Araújo ◽

David Robledo Di Martini ◽

Everton Castelão Tetila ◽

...

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Spatial Resolution ◽

Tree Species ◽

Remote Sensing Data ◽

Target Object ◽

Detection Methods ◽

Individual Tree ◽

Art Object

Detection and classification of tree species from remote sensing data have mainly been performed using multispectral and hyperspectral images and Light Detection And Ranging (LiDAR) data. Despite their comparatively lower cost and higher spatial resolution, few studies have focused on images captured by Red-Green-Blue (RGB) sensors. In addition, recent years have witnessed impressive progress in deep learning methods for object detection. Motivated by this scenario, we proposed and evaluated the use of Convolutional Neural Network (CNN)-based methods combined with Unmanned Aerial Vehicle (UAV) high spatial resolution RGB imagery for the detection of law-protected tree species. Three state-of-the-art object detection methods were evaluated: Faster Region-based Convolutional Neural Network (Faster R-CNN), YOLOv3 and RetinaNet. A dataset was built to assess the selected methods, comprising 392 RGB images captured from August 2018 to February 2019 over a forested urban area in midwest Brazil. The target object is an important tree species threatened by extinction, known as Dipteryx alata Vogel (Fabaceae). The experimental analysis delivered an average precision of around 92% with associated processing times below 30 milliseconds.


Real-Time Small Drones Detection Based on Pruned YOLOv4

Sensors ◽

10.3390/s21103374 ◽

2021 ◽

Vol 21 (10) ◽

pp. 3374

Author(s):

Hansen Liu ◽

Kuangang Fan ◽

Qinghua Ouyang ◽

Na Li

Keyword(s):

Object Detection ◽

Real Time ◽

Processing Speed ◽

State Of The Art ◽

Detection Methods ◽

Detection Accuracy ◽

Small Object ◽

Art Object ◽

Real Time Detection ◽

Small Object Detection

To address the threat of drones intruding into high-security areas, real-time detection of drones is urgently required. There are two main difficulties in real-time drone detection. The first is that drones move quickly, which requires faster detectors. The second is that small drones are difficult to detect. In this paper, we first achieve high detection accuracy by evaluating four state-of-the-art object detection methods: RetinaNet, FCOS, YOLOv3 and YOLOv4. Then, to address the first problem, we prune the convolutional channels and shortcut layers of YOLOv4 to develop thinner and shallower models. Furthermore, to improve the accuracy of small drone detection, we implement a special augmentation for small object detection by copying and pasting small drones. Experimental results verify that, compared to YOLOv4, our pruned-YOLOv4 model, with a 0.8 channel prune rate and 24 layers pruned, achieves 90.5% mAP and increases processing speed by 60.4%. Additionally, after small object augmentation, the precision and recall of the pruned-YOLOv4 increase by almost 22.8% and 12.7%, respectively. These results verify that our pruned-YOLOv4 is an effective and accurate approach for drone detection.
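The copy-and-paste augmentation for small drones mentioned in this abstract can be pictured with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption for illustration: the `(x, y, width, height)` box format, the grayscale list-of-rows image, the hypothetical function `copy_paste_small_object`, and the uniformly random paste position. Overlap with existing objects is not checked.

```python
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels (assumed format)
Image = List[List[int]]          # grayscale image as rows of pixel values


def copy_paste_small_object(image: Image, boxes: List[Box], source_index: int,
                            rng: random.Random) -> Tuple[Image, List[Box]]:
    """Copy one annotated object to a random location and add its new box."""
    height, width = len(image), len(image[0])
    x, y, w, h = boxes[source_index]
    patch = [row[x:x + w] for row in image[y:y + h]]

    # Choose a top-left corner so the pasted patch stays inside the image.
    new_x = rng.randint(0, width - w)
    new_y = rng.randint(0, height - h)

    augmented = [row[:] for row in image]  # leave the input image untouched
    for dy in range(h):
        augmented[new_y + dy][new_x:new_x + w] = patch[dy]
    return augmented, boxes + [(new_x, new_y, w, h)]


# Toy 8x8 image containing one 2x2 bright "drone" at (1, 1).
image = [[0] * 8 for _ in range(8)]
for r in (1, 2):
    for c in (1, 2):
        image[r][c] = 255

augmented, new_boxes = copy_paste_small_object(image, [(1, 1, 2, 2)], 0, random.Random(0))
print(len(new_boxes))  # 2
```

A practical version would usually also avoid pasting over existing annotations and might paste several copies per image; those choices are left out here.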


Improving YOLOv5 with Attention Mechanism for Detecting Boulders from Planetary Images

Remote Sensing ◽

10.3390/rs13183776 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3776

Author(s):

Linlin Zhu ◽

Xun Geng ◽

Zheng Li ◽

Chun Liu

Keyword(s):

Object Detection ◽

Detection Method ◽

Feature Fusion ◽

Attention Mechanism ◽

Detection Methods ◽

Visual Object ◽

Art Object ◽

Geological Processes ◽

Landing Sites ◽

New Feature

It is of great significance to apply object detection methods to automatically detect boulders in planetary images and analyze their distribution. This contributes to the selection of candidate landing sites and the understanding of geological processes. This paper improves the state-of-the-art object detection method YOLOv5 with an attention mechanism and designs a pyramid-based approach to detect boulders in planetary images. A new feature fusion layer has been designed to capture more shallow features of small boulders. Attention modules implemented by combining the convolutional block attention module (CBAM) and the efficient channel attention network (ECA-Net) are also added to YOLOv5 to highlight the information that contributes to boulder detection. Based on the Pascal Visual Object Classes 2007 (VOC2007) dataset, which is widely used for object detection evaluations, and the boulder dataset that we constructed from images of the asteroid Bennu, the evaluation results show that the improvements increase the performance of YOLOv5 by 3.4% in precision. With the improved YOLOv5 detection method, the pyramid-based approach extracts several layers of images with different resolutions from the large planetary images and detects boulders of different scales in different layers. We have also applied the proposed approach to detect the boulders on Bennu. The distribution of the boulders on Bennu has been analyzed and presented.
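The pyramid-based step, which extracts several image layers at different resolutions from one large image, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the layers are produced, so 2x2 average pooling, the factor of two between levels, and the hypothetical helpers `downsample_2x` and `build_pyramid` are choices made for this example.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def downsample_2x(image: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block (odd edges are dropped)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(image: Image, levels: int) -> List[Image]:
    """Return up to `levels` images, from full resolution to coarsest."""
    pyramid = [image]
    for _ in range(levels - 1):
        last = pyramid[-1]
        if len(last) < 2 or len(last[0]) < 2:
            break  # too small to downsample further
        pyramid.append(downsample_2x(last))
    return pyramid


# Toy 8x8 image whose pixel value encodes its position.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, levels=3)
print([(len(level), len(level[0])) for level in pyramid])  # [(8, 8), (4, 4), (2, 2)]
```

In a detection pipeline, a detector would then be run on each level so that large boulders are found in coarse layers and small ones in fine layers, with detections mapped back to full-resolution coordinates.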


Self-Adaptive Feature Transformation Networks for Object Detection in Low Luminance Images

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3480973 ◽

2022 ◽

Vol 13 (1) ◽

pp. 1-11

Author(s):

Shih-Chia Huang ◽

Quoc-Viet Hoang ◽

Da-Wei Jaw

Keyword(s):

Object Detection ◽

Extraction Process ◽

The Self ◽

Detection Methods ◽

Feature Transformation ◽

Detection Techniques ◽

Art Object ◽

Low Luminance ◽

Net Framework ◽

Self Adaptive

Despite the recent improvement of object detection techniques, many of them fail to detect objects in low-luminance images. The blurry and dim nature of low-luminance images results in the extraction of vague features and failure to detect objects. In addition, many existing object detection methods are based on models trained on both sufficient- and low-luminance images, which also negatively affects the feature extraction process and detection results. In this article, we propose a framework called Self-adaptive Feature Transformation Network (SFT-Net) to effectively detect objects in low-luminance conditions. The proposed SFT-Net consists of the following three modules: (1) feature transformation module, (2) self-adaptive module, and (3) object detection module. The purpose of the feature transformation module is to enhance the extracted features through unsupervised learning of a feature domain projection procedure. The self-adaptive module is used as a probabilistic module that produces appropriate features from either the transformed or the original features to further boost the performance and generalization ability of the proposed framework. Finally, the object detection module is designed to accurately detect objects in both low- and sufficient-luminance images by using the appropriate features produced by the self-adaptive module. The experimental results demonstrate that the proposed SFT-Net framework significantly outperforms state-of-the-art object detection techniques, achieving an average precision (AP) up to 6.35 and 11.89 higher in the sufficient- and low-luminance domains, respectively.


Efficient Object Detection Algorithm in Kitchen Appliance Scene Images Based on Deep Learning

Mathematical Problems in Engineering ◽

10.1155/2020/6641491 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Manhuai Lu ◽

Liqin Chen

Keyword(s):

Object Detection ◽

Detection Method ◽

Specular Reflection ◽

Detection Algorithm ◽

Detection Methods ◽

External Disturbances ◽

Feature Enhancement ◽

Art Object ◽

Testing Dataset ◽

Pattern Information

The accuracy of object detection in kitchen appliance scene images can suffer severely from external disturbances such as various levels of specular reflection, uneven lighting, and spurious lighting, as well as internal scene-related disturbances such as invalid edges and pattern information unrelated to the object of interest. The present study addresses these unique challenges by proposing an object detection method based on an improved Faster R-CNN algorithm. The improved method can identify object regions scattered across various areas of complex appliance scenes quickly and automatically. In this paper, we put forward a feature enhancement framework named deeper region proposal network (D-RPN). In D-RPN, a feature enhancement module is designed to more effectively extract feature information of an object in a kitchen appliance scene. Then, we reconstruct a U-shaped network structure using a series of feature enhancement modules. We have evaluated the proposed D-RPN on a dataset we created, which includes all kinds of kitchen appliance control panels captured in natural scenes by an image collector. In our experiments, the best-performing object detection method obtained a mean average precision (mAP) of 89.84% on the testing dataset. The test results show that the proposed improved algorithm achieves higher detection accuracy than state-of-the-art object detection methods. Finally, our proposed detection method can further be used in text recognition.


EMPIRICAL EVALUATION OF STATE-OF-THE-ART OBJECT DETECTION METHODS FOR DOCUMENT IMAGE UNDERSTANDING

FAIR - NGHIÊN CỨU CƠ BẢN VÀ ỨNG DỤNG CÔNG NGHỆ THÔNG TIN - 2017 ◽

10.15625/vap.2017.00022 ◽

2017 ◽

Author(s):

Nguyen D. Vo ◽

Khanh Nguyen ◽

Tam V. Nguyen ◽

Khang Nguyen

Keyword(s):

Object Detection ◽

State Of The Art ◽

Empirical Evaluation ◽

Image Understanding ◽

Document Image ◽

Detection Methods ◽

Art Object ◽

Document Image Understanding


SM-NAS: Structural-to-Modular Neural Architecture Search for Object Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6958 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12661-12668 ◽

Cited By ~ 3

Author(s):

Lewei Yao ◽

Hang Xu ◽

Wei Zhang ◽

Xiaodan Liang ◽

Zhenguo Li

Keyword(s):

Object Detection ◽

Feature Fusion ◽

State Of The Art ◽

Computational Cost ◽

Search Space ◽

Detection Methods ◽

Structural Level ◽

Neural Architecture ◽

Art Object ◽

Searching Strategy

The state-of-the-art object detection method is complicated, with various modules such as backbone, RPN, feature fusion neck and RCNN head, where each module may have different designs and structures. How can the trade-off between computational cost and accuracy be exploited for both the structural combination and the modular selection of multiple modules? Neural architecture search (NAS) has shown great potential in finding an optimal solution. Existing NAS works for object detection focus only on searching for a better design of a single module, such as the backbone or feature fusion neck, while neglecting the balance of the whole system. In this paper, we present a two-stage coarse-to-fine searching strategy named Structural-to-Modular NAS (SM-NAS) for searching a GPU-friendly design of both an efficient combination of modules and a better modular-level architecture for object detection. Specifically, the Structural-level searching stage first aims to find an efficient combination of different modules; the Modular-level searching stage then evolves each specific module and pushes the Pareto front forward to a faster task-specific network. We consider a multi-objective search where the search space covers many popular designs of detection methods. We directly search a detection backbone without pre-trained models or any proxy task by exploring a fast training-from-scratch strategy. The resulting architectures dominate state-of-the-art object detection systems in both inference time and accuracy and demonstrate their effectiveness on multiple detection datasets, e.g. halving the inference time with an additional 1% mAP improvement compared to FPN and reaching 46% mAP with an inference time similar to that of MaskRCNN.
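The notion of a Pareto front over inference time and accuracy, which the abstract uses to describe its multi-objective search, can be made concrete with a small sketch. This is not SM-NAS's search procedure: the candidate names, latencies, and mAP values below are invented for illustration, and `pareto_front` is a hypothetical helper that simply filters out dominated candidates.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (name, inference_time_ms, mAP); hypothetical values


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate dominates.

    A candidate dominates another if it is at least as fast and at least
    as accurate, and strictly better in at least one of the two.
    """
    front = []
    for name, time_ms, acc in candidates:
        dominated = any(
            (t2 <= time_ms and a2 >= acc) and (t2 < time_ms or a2 > acc)
            for _, t2, a2 in candidates
        )
        if not dominated:
            front.append((name, time_ms, acc))
    return sorted(front, key=lambda cand: cand[1])  # fastest first


# Made-up architectures: lower time and higher mAP are both preferred.
candidates = [
    ("A", 50.0, 36.0),
    ("B", 25.0, 37.0),
    ("C", 40.0, 35.0),
    ("D", 90.0, 46.0),
]
print([cand[0] for cand in pareto_front(candidates)])  # ['B', 'D']
```

Here B dominates both A and C (faster and more accurate), while D survives because nothing else matches its accuracy. "Pushing the Pareto front forward" then means finding new candidates that dominate members of the current front.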


A Slimmer Network with Polymorphic and Group Attention Modules for More Efficient Object Detection in Aerial Images

Remote Sensing ◽

10.3390/rs12223750 ◽

2020 ◽

Vol 12 (22) ◽

pp. 3750

Author(s):

Wei Guo ◽

Weihong Li ◽

Zhenghao Li ◽

Weiguo Gong ◽

Jinkai Cui ◽

...

Keyword(s):

Object Detection ◽

Detection Efficiency ◽

Aerial Images ◽

Aerial Image ◽

Detection Methods ◽

Detection Accuracy ◽

Practical Applications ◽

Multi Scale ◽

High Detection Efficiency ◽

Object Features

Object detection is one of the core technologies in aerial image processing and analysis. Although existing aerial image object detection methods based on deep learning have made some progress, some problems remain: (1) most existing methods fail to simultaneously consider multi-scale and multi-shape object characteristics in aerial images, which may lead to missed or false detections; (2) high-precision detection generally requires a large and complex network structure, which usually makes it difficult to achieve high detection efficiency and to deploy the network on resource-constrained devices for practical applications. To solve these problems, we propose a slimmer network for more efficient object detection in aerial images. First, we design a polymorphic module (PM) for simultaneously learning multi-scale and multi-shape object features, so as to better detect the hugely different objects in aerial images. Then, we design a group attention module (GAM) to better utilize the diverse concatenated features in the network. By designing multiple detection headers with adaptive anchors and the two modules above, we propose a one-stage network called PG-YOLO to achieve higher detection accuracy. Based on the proposed network, we further propose a more efficient channel pruning method, which can slim the network parameters from 63.7 million (M) to 3.3M, decreasing the parameter size by 94.8%, so it can significantly improve the detection efficiency for real-time detection. Finally, we conduct comparative experiments on three public aerial datasets, and the experimental results show that the proposed method outperforms state-of-the-art methods.
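The reported 94.8% reduction follows directly from the two parameter counts given in the abstract. A short check, using a hypothetical helper name `parameter_reduction`:

```python
def parameter_reduction(before_millions: float, after_millions: float) -> float:
    """Return the percentage of parameters removed by pruning."""
    if before_millions <= 0:
        raise ValueError("before_millions must be positive")
    return (before_millions - after_millions) / before_millions * 100.0


# Parameter counts quoted in the abstract: 63.7M before pruning, 3.3M after.
reduction = parameter_reduction(63.7, 3.3)
print(f"{reduction:.1f}%")  # 94.8%
```

Put differently, the pruned network keeps roughly one parameter in nineteen (3.3 / 63.7 is about 5.2%).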
