A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit

Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 279
Author(s):  
Rafael Padilla ◽  
Wesley L. Passos ◽  
Thadeu L. B. Dias ◽  
Sergio L. Netto ◽  
Eduardo A. B. da Silva

Recent outstanding results of supervised object detection in competitions and challenges are often associated with specific metrics and datasets. The evaluation of such methods applied in different contexts has increased the demand for annotated datasets. Annotation tools represent the location and size of objects in distinct formats, leading to a lack of consensus on the representation. Such a scenario often complicates the comparison of object detection methods. This work alleviates this problem along the following lines: (i) It provides an overview of the most relevant evaluation methods used in object detection competitions, highlighting their peculiarities, differences, and advantages; (ii) it examines the most used annotation formats, showing how different implementations may influence the assessment results; and (iii) it provides a novel open-source toolkit supporting different annotation formats and 15 performance metrics, making it easy for researchers to evaluate the performance of their detection algorithms on most known datasets. In addition, this work proposes a new metric, also included in the toolkit, for evaluating object detection in videos, based on the spatio-temporal overlap between the ground-truth and detected bounding boxes.
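The toolkit's metrics build on the standard intersection-over-union (IoU) between boxes, and the proposed video metric extends it to spatio-temporal overlap between tubes of per-frame boxes. A minimal sketch in Python; the function names and the dict-of-frames tube representation are illustrative, not the toolkit's actual API:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def spatio_temporal_iou(gt_tube, det_tube):
    """gt_tube / det_tube: dicts mapping frame index -> box.

    Overlap is accumulated over the union of frames: a frame where only
    one tube exists contributes zero intersection but non-zero union, so
    temporal misalignment is penalized along with spatial misalignment.
    """
    inter_vol, union_vol = 0.0, 0.0
    for f in set(gt_tube) | set(det_tube):
        a, b = gt_tube.get(f), det_tube.get(f)
        if a and b:
            ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
            ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
            inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
            area_a = (a[2] - a[0]) * (a[3] - a[1])
            area_b = (b[2] - b[0]) * (b[3] - b[1])
            inter_vol += inter
            union_vol += area_a + area_b - inter
        else:
            box = a or b
            union_vol += (box[2] - box[0]) * (box[3] - box[1])
    return inter_vol / union_vol if union_vol > 0 else 0.0
```

A detection tube that matches the ground truth on every frame scores 1.0; one present only on frames where the ground truth is absent scores 0.0.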

2018 ◽  
Vol 232 ◽  
pp. 04036
Author(s):  
Jun Yin ◽  
Huadong Pan ◽  
Hui Su ◽  
Zhonggeng Liu ◽  
Zhirong Peng

We propose an object detection method that predicts oriented bounding boxes (OBBs) to estimate objects' locations, scales, and orientations, based on YOLO (You Only Look Once), one of the top detection algorithms, performing well in both accuracy and speed. Horizontal bounding boxes (HBBs), which are not robust to orientation variance, are used in existing object detection methods to detect targets. The proposed orientation-invariant YOLO (OIYOLO) detector can effectively deal with bird's-eye-view images, where the orientation angles of objects are arbitrary. To estimate the rotated angle of objects, we design a new angle loss function. The training of OIYOLO therefore forces the network to learn the annotated orientation angles of objects, making OIYOLO orientation invariant. The proposed approach of predicting OBBs can be applied in other detection frameworks. In addition, to evaluate the proposed OIYOLO detector, we create a UAV-DAHUA dataset accurately annotated with objects' locations, scales, and orientation angles. Extensive experiments conducted on the UAV-DAHUA and DOTA datasets demonstrate that OIYOLO achieves state-of-the-art detection performance with high efficiency compared with the baseline YOLO algorithms.
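The abstract does not give the form of the angle loss, so the following is only a hypothetical sketch of one common choice for orientation regression: penalize the angle residual after wrapping it to the 180° symmetry of a rectangle, so that geometrically identical orientations incur no penalty. The function name and squared-error form are assumptions, not the paper's formulation:

```python
def angle_loss(pred_deg, gt_deg):
    """Squared error on the angle residual, wrapped into [-90, 90).

    A rectangle rotated by 180 degrees is geometrically identical, so a
    prediction off by exactly 180 degrees should incur zero loss.
    """
    diff = (pred_deg - gt_deg + 90.0) % 180.0 - 90.0
    return diff * diff
```

Without the wrapping step, a prediction of 95° against a ground truth of −85° (the same rectangle) would be penalized as a 180° error, which destabilizes training near the angle boundary.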


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1686 ◽  
Author(s):  
Feng Yang ◽  
Wentong Li ◽  
Haiwei Hu ◽  
Wanyi Li ◽  
Peng Wang

Accurate and robust detection of multi-class objects in very high resolution (VHR) aerial images plays a significant role in many real-world applications. Traditional detection methods have made remarkable progress with horizontal bounding boxes (HBBs) thanks to CNNs. However, HBB detection methods still exhibit limitations, including missed detections and redundant detection regions, especially for densely distributed and strip-like objects. Besides, large scale variations and diverse backgrounds also bring in many challenges. To address these problems, an effective region-based object detection framework named Multi-scale Feature Integration Attention Rotation Network (MFIAR-Net) is proposed for aerial images with oriented bounding boxes (OBBs), which promotes the integration of the inherent multi-scale pyramid features to generate a discriminative feature map. Meanwhile, a double-path feature attention network, supervised by the mask information of the ground truth, is introduced to guide the network to focus on object regions and suppress irrelevant noise. To boost rotation regression and classification performance, we present a robust Rotation Detection Network, which generates an efficient OBB representation. Extensive experiments and comprehensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed framework.


Author(s):  
Prof. Pradnya Kasture ◽  
Aishwarya Kumkar ◽  
Yash Jagtap ◽  
Akshay Tangade ◽  
Aditya Pole

Vision is one of the most important human senses, and it plays a very important role in human interaction with surrounding objects. Many papers have been published on these topics, presenting various computer vision products and services that support visually impaired people through new electronic devices. The aim is to study different object detection methods. Compared with other object detection methods, YOLO has multiple advantages. Alternative algorithms such as CNN and Fast R-CNN do not examine the image as a whole, whereas YOLO examines the full image at once, using a convolutional network to predict bounding boxes and their class probabilities, and thus detects objects faster than the alternative algorithms.


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 537 ◽  
Author(s):  
Liquan Zhao ◽  
Shuaiyang Li

The ‘You Only Look Once’ v3 (YOLOv3) method is among the most widely used deep-learning-based object detection methods. It uses the k-means clustering method to estimate the initial widths and heights of the predicted bounding boxes. With this method, the estimated widths and heights are sensitive to the initial cluster centers, and the processing of large-scale datasets is time-consuming. To address these problems, a new clustering method for estimating the initial widths and heights of the predicted bounding boxes has been developed. Firstly, it randomly selects a width–height pair from the ground-truth boxes as the first initial cluster center. Secondly, it constructs Markov chains based on the selected initial cluster and uses the final points of every Markov chain as the other initial centers. In the construction of the Markov chains, the intersection-over-union method is used to compute the distance between the selected initial clusters and each candidate point, instead of the square-root (Euclidean) method. Finally, the method continually updates the cluster centers with each new set of width and height values, which are only a subset of the data selected from the datasets. Our simulation results show that the new method converges faster when initializing the widths and heights of the predicted bounding boxes and that it selects more representative initial widths and heights. Our proposed method achieves better performance than the YOLOv3 method in terms of recall, mean average precision, and F1-score.
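The IoU distance used here in place of the Euclidean (square-root) distance is the standard formulation for anchor clustering: width–height pairs are compared as boxes anchored at a common corner, so large boxes are not penalized merely for large absolute size differences. A minimal sketch, with illustrative function names:

```python
def iou_wh(wh_a, wh_b):
    """IoU of two boxes aligned at a common top-left corner, compared by
    width and height only (the usual convention for anchor clustering)."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union

def anchor_distance(wh, center):
    # d = 1 - IoU: identical shapes are at distance 0, and the distance is
    # scale-invariant, unlike a Euclidean distance on raw (w, h) values.
    return 1.0 - iou_wh(wh, center)
```

Under this distance, a 2x2 box and a 4x4 box are as far apart as a 20x20 box and a 40x40 box, which is what an anchor-shape clustering should express.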


Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3415 ◽  
Author(s):  
Jinpeng Zhang ◽  
Jinming Zhang ◽  
Shan Yu

In the image object detection task, a huge number of candidate boxes are generated to match a relatively small number of ground-truth boxes, and through this process the learning samples are created. In fact, however, the vast majority of candidate boxes do not contain valid object instances and should be recognized and rejected during training and evaluation of the network. This leads to an extra-high computational burden and a serious imbalance between object and non-object samples, thereby impeding the algorithm's performance. Here we propose a new heuristic sampling method to generate candidate boxes for two-stage detection algorithms. It is generally applicable to current two-stage detection algorithms to improve their detection performance. Experiments on the COCO dataset showed that, relative to the baseline model, the new method could significantly increase detection accuracy and efficiency.


Author(s):  
Jiajia Liao ◽  
Yujun Liu ◽  
Yingchao Piao ◽  
Jinhe Su ◽  
Guorong Cai ◽  
...  

Abstract Recent advances in camera-equipped drone applications have increased the demand for visual object detection algorithms with deep learning for aerial images. A single deep learning model has several limitations in accuracy. Inspired by the fact that ensemble learning can significantly improve a model's generalization ability in the machine learning field, we introduce a novel integration strategy to combine the inference results of two different methods without non-maximum suppression. In this paper, a global and local ensemble network (GLE-Net) is proposed to increase the quality of predictions by considering global weights for different models and adjusting local weights for bounding boxes. Specifically, the global module assigns different weights to models. In the local module, we group the bounding boxes corresponding to the same object into a cluster. Each cluster generates a final predicted box and assigns the highest score in the cluster as the score of that box. Experiments on the VisDrone2019 benchmark show the promising performance of GLE-Net compared with the baseline network.
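The local module's grouping step can be sketched as a greedy clustering of boxes by overlap, keeping one box per cluster with the cluster's highest score. This is only an illustrative sketch under an assumed IoU threshold; GLE-Net's exact grouping rule and its local weight adjustment may differ:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def fuse_boxes(detections, iou_thr=0.5):
    """detections: list of (box, score) pairs, possibly from several models.

    Greedily assign each box (highest score first) to the first cluster
    whose seed it overlaps by at least iou_thr, then emit one box per
    cluster carrying the cluster's highest score -- an NMS-free fusion.
    """
    clusters = []
    for box, score in sorted(detections, key=lambda d: -d[1]):
        for cluster in clusters:
            if iou(cluster[0][0], box) >= iou_thr:
                cluster.append((box, score))
                break
        else:
            clusters.append([(box, score)])
    # The seed of each cluster is its highest-scoring member.
    return [cluster[0] for cluster in clusters]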


2021 ◽  
Vol 13 (13) ◽  
pp. 2459
Author(s):  
Yangyang Li ◽  
Heting Mao ◽  
Ruijiao Liu ◽  
Xuan Pei ◽  
Licheng Jiao ◽  
...  

Object detection in remote sensing images has been widely used in military and civilian fields and is a challenging task due to the complex backgrounds, large scale variations, and dense arrangements of objects in arbitrary orientations. In addition, existing object detection methods rely on increasingly deeper networks, which add considerable computational overhead and parameters and are unfavorable for deployment on edge devices. In this paper, we propose a lightweight keypoint-based oriented object detector for remote sensing images. First, we propose a semantic transfer block (STB) for merging shallow and deep features, which reduces noise and restores semantic information. Then, the proposed adaptive Gaussian kernel (AGK) adapts to objects of different scales and further improves detection performance. Finally, we propose a distillation loss associated with object detection to obtain a lightweight student network. Experiments on the HRSC2016 and UCAS-AOD datasets show that the proposed method adapts to objects of different scales, obtains accurate bounding boxes, and reduces the influence of complex backgrounds. Comparison with mainstream methods shows that our method achieves comparable performance while remaining lightweight.


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Mehta ◽  
J Avila ◽  
S Niklitschek ◽  
F Fernandez ◽  
C Villagran ◽  
...  

Abstract Background As EKG interpretation paradigms shift to a physician-free milieu, accumulating massive quantities of distilled, pre-processed data becomes a must for machine learning techniques. In our pursuit of reducing ischemic times in STEMI management, we have improved our Artificial Intelligence (AI)-guided diagnostic tool by following a three-step approach: 1) Increase accuracy by adding larger clusters of data. 2) Increase the breadth of EKG classifications to provide more precise feedback and further refine the inputs, which ultimately yields better and more accurate outputs. 3) Improve the algorithm's ability to discern between cardiovascular entities reflected in the EKG records. Purpose To bolster our algorithm's accuracy and reliability for electrocardiographic STEMI recognition. Methods Dataset: A total of 7,286 12-lead EKG records of 10-second length with a sampling frequency of 500 Hz, obtained from the Latin America Telemedicine Infarct Network from April 2014 to December 2019. This included the following balanced classes: angiographically confirmed STEMI, branch blocks, non-specific ST-T abnormalities, normal, and abnormal (200+ CPT codes, excluding the ones included in other classes). Labels of each record were manually checked by cardiologists to ensure precision (ground truth). Pre-processing: The first and last 250 samples were discarded to avoid a standardization pulse. An order-5 digital low-pass filter with a 35 Hz cut-off was applied. For each record, the mean was subtracted from each individual lead. Classification: The determined classes were “STEMI” and “Not-STEMI” (a combination of randomly sampled normal, branch block, non-specific ST-T abnormality, and abnormal records – 25% of each subclass). Training & Testing: A 1-D Convolutional Neural Network was trained and tested with a dataset proportion of 90/10, respectively. The last dense layer outputs a probability for each record of being STEMI or Not-STEMI.
Additional testing was performed with a subset of the original complete dataset of unconfirmed STEMI. Performance indicators (accuracy, sensitivity, and specificity) were calculated for each model, and the results were compared with our previous findings from past experiments. Results Complete STEMI data: Accuracy: 95.9%, Sensitivity: 95.7%, Specificity: 96.5%; Confirmed STEMI: Accuracy: 98.1%, Sensitivity: 98.1%, Specificity: 98.1%; prior data obtained in our previous experiments are shown below for comparison. Conclusion(s) After the addition of clustered pre-processed data, all performance indicators for STEMI detection increased considerably between both confirmed STEMI datasets. On the other hand, the complete STEMI dataset kept a strong and steady set of performance metrics when compared with past results. These findings not only validate the consistency and reliability of our algorithm but also underscore the importance of creating a pristine dataset for this and any other AI-derived medical tool. Funding Acknowledgement Type of funding source: None
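The pre-processing pipeline described above (discard 250 samples at each end, order-5 low-pass filter at 35 Hz for 500 Hz records, per-lead mean subtraction) can be sketched as follows. The Butterworth filter type and the function name are assumptions, since the abstract specifies only the filter order and cut-off:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess_ekg(record, fs=500, trim=250, cutoff=35.0, order=5):
    """record: array of shape (n_samples, 12), one column per lead.

    Mirrors the steps in the abstract: trim the standardization pulse,
    apply an order-5 low-pass filter with a 35 Hz cut-off, and subtract
    the mean from each individual lead.
    """
    x = np.asarray(record, dtype=float)
    x = x[trim:-trim]                       # drop first/last 250 samples
    b, a = butter(order, cutoff, btype="low", fs=fs)
    x = filtfilt(b, a, x, axis=0)           # zero-phase low-pass filtering
    return x - x.mean(axis=0)               # per-lead mean subtraction
```

A 10-second record at 500 Hz (5,000 samples) thus yields a 4,500-sample, zero-mean, band-limited input for the 1-D convolutional network.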


2021 ◽  
Vol 12 (1) ◽  
pp. 281
Author(s):  
Jaesung Jang ◽  
Hyeongyu Lee ◽  
Jong-Chan Kim

For safe autonomous driving, deep neural network (DNN)-based perception systems play essential roles, and a vast amount of driving images must be manually collected and labeled with ground truth (GT) for training and validation purposes. After observing the manual GT generation's high cost and unavoidable human errors, this study presents an open-source automatic GT generation tool, CarFree, based on the Carla autonomous driving simulator. In doing so, we aim to democratize the daunting task of object detection dataset generation in particular, which has been feasible only for big companies or institutes due to its high cost. CarFree comprises (i) a data extraction client that automatically collects relevant information from the Carla simulator's server and (ii) post-processing software that produces precise 2D bounding boxes of vehicles and pedestrians on the gathered driving images. Our evaluation results show that CarFree can generate a considerable amount of realistic driving images along with their GTs in a reasonable time. Moreover, using synthesized training images with artificially made unusual weather and lighting conditions, which are difficult to obtain in real-world driving scenarios, CarFree significantly improves object detection accuracy in the real world, particularly in harsh environments. With CarFree, we expect users to generate a variety of object detection datasets in hassle-free ways.


2018 ◽  
Vol 84 ◽  
pp. 68-81 ◽  
Author(s):  
Yongqiang Zhang ◽  
Yancheng Bai ◽  
Mingli Ding ◽  
Yongqiang Li ◽  
Bernard Ghanem
