Survey and Performance Analysis of Object Detection in Challenging Environments

Recent progress in deep learning has led to accurate and efficient generic object detection networks. Training of highly reliable models depends on large datasets with highly textured and rich images. However, in real-world scenarios, the performance of the generic object detection system decreases when (i) occlusions hide the objects, (ii) objects are present in low-light images, or (iii) they are merged with background information. In this paper, we refer to all these situations as challenging environments. With the recent rapid development in generic object detection algorithms, notable progress has been observed in the field of object detection in challenging environments. However, there is no consolidated reference covering the state of the art in this domain. To the best of our knowledge, this paper presents the first comprehensive overview, covering recent approaches that have tackled the problem of object detection in challenging environments. Furthermore, we present a quantitative and qualitative performance analysis of these approaches and discuss the currently available challenging datasets. Moreover, this paper investigates the performance of current state-of-the-art generic object detection algorithms by benchmarking results on three well-known challenging datasets. Finally, we highlight several current shortcomings and outline future directions.

Survey and Performance Analysis of Deep Learning Based Object Detection in Challenging Environments

Sensors ◽

10.3390/s21155116 ◽

2021 ◽

Vol 21 (15) ◽

pp. 5116

Author(s):

Muhammad Ahmed ◽

Khurram Azeem Hashmi ◽

Alain Pagani ◽

Marcus Liwicki ◽

Didier Stricker ◽

...

Keyword(s):

Deep Learning ◽

Performance Analysis ◽

Object Detection ◽

State Of The Art ◽

Detection System ◽

Rapid Development ◽

Background Information ◽

Comprehensive Overview ◽

Detection Algorithms ◽

And Performance

Recent progress in deep learning has led to accurate and efficient generic object detection networks. Training of highly reliable models depends on large datasets with highly textured and rich images. However, in real-world scenarios, the performance of the generic object detection system decreases when (i) occlusions hide the objects, (ii) objects are present in low-light images, or (iii) they are merged with background information. In this paper, we refer to all these situations as challenging environments. With the recent rapid development in generic object detection algorithms, notable progress has been observed in the field of deep learning-based object detection in challenging environments. However, there is no consolidated reference to cover the state of the art in this domain. To the best of our knowledge, this paper presents the first comprehensive overview, covering recent approaches that have tackled the problem of object detection in challenging environments. Furthermore, we present a quantitative and qualitative performance analysis of these approaches and discuss the currently available challenging datasets. Moreover, this paper investigates the performance of current state-of-the-art generic object detection algorithms by benchmarking results on the three well-known challenging datasets. Finally, we highlight several current shortcomings and outline future directions.

Parameter Optimization and Performance Analysis of State-of-the-Art Machine Learning Techniques for Intrusion Detection System (IDS)

2020 23rd International Conference on Computer and Information Technology (ICCIT) ◽

10.1109/iccit51783.2020.9392683 ◽

2020 ◽

Author(s):

Rashedun Nobi Chowdhury ◽

Maliha M. Chowdhury ◽

Sujoy Chowdhury ◽

Mohammed Rashedul Islam ◽

Md. Ahsan Ayub ◽

...

Keyword(s):

Machine Learning ◽

Performance Analysis ◽

Parameter Optimization ◽

Intrusion Detection System ◽

State Of The Art ◽

Detection System ◽

Machine Learning Techniques ◽

Learning Techniques ◽

And Performance ◽

Optimization And Performance

Transcription Alignment of Historical Vietnamese Manuscripts without Human-Annotated Learning Samples

Applied Sciences ◽

10.3390/app11114894 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4894

Author(s):

Anna Scius-Bertrand ◽

Michael Jungo ◽

Beat Wolf ◽

Andreas Fischer ◽

Marc Bui

Keyword(s):

Object Detection ◽

State Of The Art ◽

Positive Impact ◽

Detection System ◽

Training Data ◽

Detection Accuracy ◽

Current State ◽

Alignment Task ◽

Scanned Image ◽

Automatic Transcription

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.

Quantitative performance analysis of object detection algorithms on underwater video footage

Proceedings of the 1st ACM international workshop on Multimedia analysis for ecological data - MAED '12 ◽

10.1145/2390832.2390847 ◽

2012 ◽

Cited By ~ 9

Author(s):

Isaak Kavasidis ◽

Simone Palazzo

Keyword(s):

Performance Analysis ◽

Object Detection ◽

Underwater Video ◽

Video Footage ◽

Detection Algorithms ◽

Quantitative Performance

A Hard Example Mining Approach for Concealed Multi-Object Detection of Active Terahertz Image

Applied Sciences ◽

10.3390/app112311241 ◽

2021 ◽

Vol 11 (23) ◽

pp. 11241

Author(s):

Ling Li ◽

Fei Xue ◽

Dong Liang ◽

Xiaofei Chen

Keyword(s):

Computer Vision ◽

Object Detection ◽

State Of The Art ◽

Terahertz Imaging ◽

Public Security ◽

Counter Terrorism ◽

Detection Algorithms ◽

Public Dataset ◽

The One ◽

Objects Detection

Concealed object detection in terahertz imaging is urgently needed for public security and counter-terrorism. So far, there is no public terahertz imaging dataset for the evaluation of object detection algorithms. This paper provides a public dataset for evaluating multi-object detection algorithms in active terahertz imaging. Due to high sample similarity and poor imaging quality, object detection on this dataset is much more difficult than on the public object detection datasets commonly used in the computer vision field. Since the traditional hard example mining approach is designed for two-stage detectors and cannot be directly applied to one-stage detectors, this paper designs an image-based Hard Example Mining (HEM) scheme based on RetinaNet. Several state-of-the-art detectors, including YOLOv3, YOLOv4, FRCN-OHEM, and RetinaNet, are evaluated on this dataset. Experimental results show that RetinaNet achieves the best mAP and that HEM further enhances the performance of the model. The parameters affecting the detection metrics of individual images are summarized and analyzed in the experiments.
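
The abstract does not spell out how the image-based HEM scheme selects its examples, so the following is only a generic sketch of the idea of mining at the image level rather than the proposal level: rank training images by their most recent loss and return the hardest ones for extra training. The function name `select_hard_images`, the `fraction` parameter, and the per-image loss dictionary are hypothetical and are not taken from the paper.

```python
def select_hard_images(image_losses, fraction=0.25):
    """Return the ids of the highest-loss images, hardest first.

    image_losses maps an image id to its latest training loss.
    This is an assumed selection rule, not the paper's HEM scheme.
    """
    if not image_losses:
        return []
    # Always keep at least one image so a small dataset still yields a sample.
    n_hard = max(1, int(len(image_losses) * fraction))
    ranked = sorted(image_losses, key=image_losses.get, reverse=True)
    return ranked[:n_hard]


if __name__ == "__main__":
    losses = {"a": 0.2, "b": 1.5, "c": 0.7, "d": 0.1}
    print(select_hard_images(losses, fraction=0.5))
```

In a training loop, the returned ids could be oversampled in the next epoch; how often to refresh the ranking is a design choice this sketch leaves open.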

An Experimental Analysis of Model Compression Techniques for Object Detection

10.5753/kdmile.2020.11958 ◽

2020 ◽

Author(s):

Andrey De Aguiar Salvi ◽

Rodrigo Coelho Barros

Keyword(s):

Object Detection ◽

Experimental Analysis ◽

State Of The Art ◽

Neural Architecture ◽

Model Compression ◽

Processing Power ◽

Benchmark Datasets ◽

The Difference ◽

And Performance ◽

Consumption Constraints

Recent research on Convolutional Neural Networks focuses on how to create models with a reduced number of parameters and a smaller storage size while keeping the model’s ability to perform its task, allowing the use of the best CNN for automating tasks on constrained devices with limited processing power, memory, or energy. There are many different approaches in the literature: removing parameters, reducing floating-point precision, creating smaller models that mimic larger models, neural architecture search (NAS), etc. With all those possibilities, it is challenging to say which approach provides a better trade-off between model reduction and performance, due to the difference between the approaches, their respective models, the benchmark datasets, or variations in training details. Therefore, this article contributes to the literature by comparing three state-of-the-art model compression approaches to reduce a well-known convolutional approach for object detection, namely YOLOv3. Our experimental analysis shows that, by pruning parameters, it is possible to create a reduced version of YOLOv3 with 90% fewer parameters that still outperforms the original model. We also create models that require only 0.43% of the original model’s inference effort.
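
The abstract does not say which pruning criterion was used. As an illustration of what "removing parameters" can mean, here is a minimal sketch of magnitude pruning, which zeroes the fraction of weights with the smallest absolute value. The function name `magnitude_prune` and the flat weight list are assumptions made for this example, not details from the article.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value.

    Magnitude pruning is an assumed criterion; the article may use another.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune weights closest to zero.
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


if __name__ == "__main__":
    w = [0.5, -0.01, 0.2, -0.9, 0.03, 0.0004, 1.2, -0.07, 0.3, 0.006]
    print(magnitude_prune(w, sparsity=0.5))
```

In practice a pruned network is usually fine-tuned afterwards to recover accuracy; that step is outside this sketch.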

EHSOD: CAM-Guided End-to-End Hybrid-Supervised Object Detection with Cascade Refinement

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6707 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10778-10785

Author(s):

Linpu Fang ◽

Hang Xu ◽

Zhili Liu ◽

Sarah Parisot ◽

Zhenguo Li

Keyword(s):

Object Detection ◽

State Of The Art ◽

Detection System ◽

Parameter Tuning ◽

Heat Map ◽

Level Data ◽

End To End ◽

Weakly Supervised ◽

Yield State ◽

Activation Heat

Object detectors trained on fully-annotated data currently yield state-of-the-art performance but require expensive manual annotations. On the other hand, weakly-supervised detectors have much lower performance and cannot be used reliably in a realistic setting. In this paper, we study the hybrid-supervised object detection problem, aiming to train a high-quality detector with only a limited amount of fully-annotated data while fully exploiting cheap data with image-level labels. State-of-the-art methods typically propose an iterative approach, alternating between generating pseudo-labels and updating a detector. This paradigm requires careful manual hyper-parameter tuning for mining good pseudo-labels at each round and is quite time-consuming. To address these issues, we present EHSOD, an end-to-end hybrid-supervised object detection system which can be trained in one shot on both fully and weakly-annotated data. Specifically, based on a two-stage detector, we propose two modules to fully utilize the information from both kinds of labels: 1) the CAM-RPN module aims at finding foreground proposals guided by a class activation heat-map; 2) the hybrid-supervised cascade module further refines the bounding-box position and classification with the help of an auxiliary head compatible with image-level data. Extensive experiments demonstrate the effectiveness of the proposed method, which achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data, e.g. 37.5% mAP on COCO. We will release the code and the trained models.
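
For readers unfamiliar with class activation heat-maps, the sketch below computes one in its usual textbook form: a weighted sum of the final convolutional feature maps, using the classifier weights of a single class. It shows only that general definition; how the CAM-RPN module builds and uses its heat-map is not described in the abstract, and the function name and nested-list layout here are assumptions.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) using one class's K weights."""
    if len(feature_maps) != len(class_weights):
        raise ValueError("need one weight per feature map")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                # Each location accumulates the contribution of every channel.
                cam[y][x] += weight * fmap[y][x]
    return cam


if __name__ == "__main__":
    maps = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]]
    print(class_activation_map(maps, [2.0, -1.0]))
```

High values in the resulting map mark locations that support the class, which is why such a map can guide where foreground proposals are likely to be.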

Malaria Parasite Detection in Thick Blood Smear Microscopic Images Using Modified YOLOV3 and YOLOV4 Models

10.21203/rs.3.rs-74079/v1 ◽

2020 ◽

Author(s):

Fetulhak Abdurahman ◽

Kinde Fante Anlay ◽

Mohammed Aliy

Keyword(s):

Object Detection ◽

Malaria Parasite ◽

Clustering Algorithm ◽

State Of The Art ◽

Malaria Diagnosis ◽

Learning Ability ◽

Background Information ◽

Small Object ◽

Microscopic Images ◽

Resource Setting

Background: Manual microscopic examination is still the "golden standard" for malaria diagnosis. The challenge in manual microscopy is that its accuracy, consistency, and speed of diagnosis depend on the skill of the laboratory technician. It is difficult to find highly skilled laboratory technicians in the remote areas of developing countries. To alleviate this problem, in this paper we propose and investigate state-of-the-art one-stage and two-stage object detection algorithms for automated malaria parasite screening from thick blood slides.
Methods: YOLOV3 and YOLOV4 are state-of-the-art object detectors in terms of both accuracy and speed; however, they are not optimized for the detection of small objects such as malaria parasites in microscopic images. To deal with these challenges, we have modified the YOLOV3 and YOLOV4 models by increasing the feature scale and by adding more detection layers, without notably decreasing their detection speed. We have proposed one modified YOLOV4 model, called YOLOV4-MOD, and two modified YOLOV3 models, called YOLOV3-MOD1 and YOLOV3-MOD2. In addition, we have generated new anchor box scales and sizes by using the K-means clustering algorithm to exploit the small object detection learning ability of the models.
Results: The proposed modified YOLOV3 and YOLOV4 algorithms are evaluated on a publicly available malaria dataset and achieve state-of-the-art accuracy, exceeding the performance of their original versions, Faster R-CNN, and SSD in terms of mean average precision (mAP), recall, precision, F1 score, and average IOU. For 608 x 608 input resolution, YOLOV4-MOD achieves the best detection performance among all the models, with an mAP of 96.32%. For the same input resolution, YOLOV3-MOD2 and YOLOV3-MOD1 achieved mAPs of 96.14% and 95.46%, respectively.
Conclusions: The experimental results demonstrate that the proposed modified YOLOV3 and YOLOV4 models are reliable enough to be applied to the detection of malaria parasites in images captured by a smartphone camera over the microscope eyepiece. The proposed system can be easily deployed in low-resource settings and can save lives.
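
Anchor generation with K-means can be sketched as follows. The abstract only states that K-means was used; the 1 - IoU distance between (width, height) pairs is the convention popularized by the YOLO family and is assumed here, as are the function names and the deterministic initialization.

```python
def wh_iou(box, anchor):
    """IoU of two boxes that share the same top-left corner, given as (w, h)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (width, height) pairs into k anchors, assigning by highest IoU."""
    # Deterministic start: spread the initial anchors across boxes sorted by area.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda j: wh_iou(box, anchors[j]))
            clusters[best].append(box)
        updated = []
        for j, members in enumerate(clusters):
            if not members:  # keep an anchor that attracted no boxes
                updated.append(anchors[j])
                continue
            mean_w = sum(b[0] for b in members) / len(members)
            mean_h = sum(b[1] for b in members) / len(members)
            updated.append((mean_w, mean_h))
        if updated == anchors:  # assignments have stabilized
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])


if __name__ == "__main__":
    sizes = [(10, 10), (12, 12), (11, 11), (50, 60), (52, 58), (48, 62)]
    print(kmeans_anchors(sizes, k=2))
```

For a dataset dominated by small objects, the resulting anchors shift toward small widths and heights, which is the effect the abstract attributes to regenerating the anchor boxes.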

Deep Learning Object Detector Using a Combination of Convolutional Neural Network (CNN) Architecture (MiniVGGNet) and Classic Object Detection Algorithm

Pertanika Journal of Science and Technology ◽

10.47836/pjst.28.s2.13 ◽

2020 ◽

Vol 28 (S2) ◽

Author(s):

Asmida Ismail ◽

Siti Anom Ahmad ◽

Azura Che Soh ◽

Mohd Khair Hassan ◽

Hazreen Haizi Harith

Keyword(s):

Neural Network ◽

Deep Learning ◽

Object Detection ◽

Convolutional Neural Network ◽

Detection System ◽

Object Classification ◽

Detection Algorithm ◽

Learning Object ◽

Sliding Windows ◽

Detection Algorithms

The object detection system is a computer technology related to image processing and computer vision that detects instances of semantic objects of a certain class in digital images and videos. The system consists of two main processes: classification and detection. Once an object instance has been classified and detected, it is possible to obtain further information, including recognizing the specific instance, tracking the object over an image sequence, and extracting further information about the object and the scene. This paper presents a performance analysis of a deep learning object detector that combines a deep learning Convolutional Neural Network (CNN) for object classification with classic object detection algorithms. MiniVGGNet is the network architecture used to train the object classifier, and the data used for this purpose was collected from a specific indoor building environment. For object detection, sliding windows and image pyramids were used to localize and detect objects at different locations, and non-maxima suppression (NMS) was used to obtain the final bounding box for each object. Based on the experimental results, the classification accuracy of the network is 80% to 90%, and the system takes less than 15 s per frame to detect the object. The experimental results show that it is reasonable and efficient to combine a classic object detection method with a deep learning classification approach. The method can work in some specific use cases and effectively addresses the problem of inaccurate classification and detection of typical features.
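
The classic pipeline named above can be illustrated with a short sketch: a generator that enumerates sliding-window positions and a greedy non-maxima suppression routine. The function names, the (x1, y1, x2, y2) box format, and the 0.5 overlap threshold are assumptions for illustration; the paper's own settings are not given in the abstract, and the image pyramid and classifier are omitted.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sliding_windows(width, height, window, step):
    """Yield square (x1, y1, x2, y2) windows that fit inside the image."""
    for y in range(0, height - window + 1, step):
        for x in range(0, width - window + 1, step):
            yield (x, y, x + window, y + window)


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep  # indices into boxes, highest score first


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    print(non_max_suppression(candidates, [0.9, 0.8, 0.7]))
```

In a full detector, each window (at each pyramid level) would be scored by the classifier, and only windows above a confidence threshold would be passed to `non_max_suppression`.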

PI-RCNN: An Efficient Multi-Sensor 3D Object Detector with Point-Based Attentive Cont-Conv Fusion Module

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6933 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12460-12467

Author(s):

Liang Xie ◽

Chao Xiang ◽

Zhengxu Yu ◽

Guodong Xu ◽

Zheng Yang ◽

...

Keyword(s):

Object Detection ◽

State Of The Art ◽

Point Clouds ◽

Semantic Features ◽

Feature Maps ◽

3D Object ◽

Detection Algorithms ◽

Full Resolution ◽

Fusion Methods ◽

3D Object Detection

LIDAR point clouds and RGB images are both essential for 3D object detection, so many state-of-the-art 3D detection algorithms are dedicated to fusing these two types of data effectively. However, their fusion methods based on Bird's Eye View (BEV) or voxel format are not accurate. In this paper, we propose a novel fusion approach named the Point-based Attentive Cont-conv Fusion (PACF) module, which fuses multi-sensor features directly on 3D points. In addition to continuous convolution, we add Point-Pooling and Attentive Aggregation operations to make the fused features more expressive. Moreover, based on the PACF module, we propose a 3D multi-sensor multi-task network called Pointcloud-Image RCNN (PI-RCNN for short), which handles the image segmentation and 3D object detection tasks. PI-RCNN employs a segmentation sub-network to extract full-resolution semantic feature maps from images and then fuses the multi-sensor features via the PACF module. Benefiting from the effectiveness of the PACF module and the expressive semantic features from the segmentation module, PI-RCNN improves considerably in 3D object detection. We demonstrate the effectiveness of the PACF module and PI-RCNN on the KITTI 3D Detection benchmark, where our method achieves state-of-the-art results on the 3D AP metric.
