Confidence Score: The Forgotten Dimension of Object Detection Performance Evaluation

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4350
Author(s):  
Simon Wenkel ◽  
Khaled Alhazmi ◽  
Tanel Liiv ◽  
Saud Alrshoud ◽  
Martin Simon

When deploying a model for object detection, a confidence score threshold is chosen to filter out false positives and ensure that a predicted bounding box has a certain minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold, since a high number of false positives is not penalized by standard evaluation metrics. However, in Artificial Intelligence (AI) applications that require high confidence scores (e.g., due to legal requirements, or because the consequences of incorrect detections are severe) or a certain level of model robustness, it is unclear which base model to use, since existing models were mainly optimized for benchmark scores. In this paper, we propose a method to find the optimum performance point of a model as a basis for fairer comparisons and deeper insights into the trade-offs caused by selecting a confidence score threshold.
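The threshold trade-off described above can be sketched as a simple sweep: evaluate a detection-quality score at each candidate threshold and keep the best one. This is an illustrative stand-in for the paper's "optimum performance point" (here F1 is used as the criterion; the function and data names are assumptions, not the authors' exact procedure).

```python
# Hypothetical sketch: sweep confidence thresholds and pick the one that
# maximizes F1 over a set of detections. Each detection is a
# (confidence, is_true_positive) pair; num_ground_truth is the number of
# annotated objects. Names and the F1 criterion are illustrative.

def f1_at_threshold(detections, num_ground_truth, threshold):
    kept = [tp for conf, tp in detections if conf >= threshold]
    tp = sum(kept)                       # true positives that survive the filter
    fp = len(kept) - tp                  # false positives that survive
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / num_ground_truth
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def best_threshold(detections, num_ground_truth, step=0.05):
    # round() keeps candidate thresholds at clean values like 0.30, not
    # floating-point artifacts like 0.30000000000000004.
    candidates = [round(i * step, 10) for i in range(int(1 / step) + 1)]
    return max(candidates, key=lambda t: f1_at_threshold(detections, num_ground_truth, t))
```

A very low threshold keeps every false positive (hurting precision), while a very high one discards true positives (hurting recall); the sweep exposes that trade-off explicitly.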

2020 ◽  
Author(s):  
Andrey De Aguiar Salvi ◽  
Rodrigo Coelho Barros

Recent research on Convolutional Neural Networks (CNNs) focuses on how to create models with fewer parameters and a smaller storage size while keeping the model’s ability to perform its task, allowing the best CNNs to automate tasks on limited devices with constraints on processing power, memory, or energy consumption. There are many different approaches in the literature: removing parameters, reducing floating-point precision, creating smaller models that mimic larger models, neural architecture search (NAS), etc. With all these possibilities, it is challenging to say which approach provides the better trade-off between model reduction and performance, due to the differences between the approaches, their respective models, the benchmark datasets, and variations in training details. Therefore, this article contributes to the literature by comparing three state-of-the-art model compression approaches for reducing a well-known convolutional object detector, namely YOLOv3. Our experimental analysis shows that, by pruning parameters, it is possible to create a reduced version of YOLOv3 with 90% fewer parameters that still outperforms the original model. We also create models that require only 0.43% of the original model’s inference effort.
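The parameter-removal family of approaches mentioned above can be illustrated with the simplest variant, global magnitude pruning: zero out the fraction of weights with the smallest absolute value. This is a generic sketch of the idea, not the specific pruning procedure the article evaluates.

```python
# Illustrative magnitude-pruning sketch (not the authors' exact method):
# set the `sparsity` fraction of weights with the smallest magnitude to zero,
# on the assumption that low-magnitude weights contribute least to the output.

def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-|w| fraction zeroed."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold is the magnitude of the n_prune-th smallest weight;
    # ties at the threshold may prune slightly more than requested.
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

In practice such pruning is applied per layer or per channel and followed by fine-tuning to recover accuracy; the flat list here stands in for a layer's weight tensor.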


Author(s):  
Bo Li ◽  
Zhengxing Sun ◽  
Yuqi Guo

Image saliency detection has recently witnessed rapid progress due to deep neural networks. However, many important problems remain in existing deep learning based methods. Pixel-wise convolutional neural network (CNN) methods suffer from blurry boundaries due to the convolutional and pooling operations, while region-based deep learning methods lack spatial consistency since they deal with each region independently. In this paper, we propose a novel salient object detection framework using a superpixel-wise variational autoencoder (SuperVAE) network. We first use the VAE to model the image background and then separate salient objects from the background through the reconstruction residuals. To better capture semantic and spatial context information, we also propose a perceptual loss that takes advantage of deep pre-trained CNNs to train our SuperVAE network. Without the supervision of mask-level annotated data, our method generates high-quality saliency results that better preserve object boundaries and maintain spatial consistency. Extensive experiments on five widely used benchmark datasets show that the proposed method achieves superior or competitive performance compared to other algorithms, including very recent state-of-the-art supervised methods.
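The core reconstruction-residual idea can be sketched in a few lines: if a background model reconstructs background regions well, large residuals mark salient (foreground) pixels. The sketch below assumes the background reconstruction is already available (the paper obtains it from the SuperVAE; here it is just an input), and normalizes the residual map to a saliency map.

```python
# Minimal sketch of background-reconstruction saliency, assuming a background
# model (the paper's SuperVAE; here an opaque input) reconstructs background
# pixels accurately, so large reconstruction residuals indicate salient regions.

def residual_saliency(image, reconstruction):
    """Both inputs are flat lists of pixel intensities of equal length."""
    residuals = [abs(p - r) for p, r in zip(image, reconstruction)]
    peak = max(residuals) or 1.0      # avoid division by zero on a perfect fit
    return [res / peak for res in residuals]  # normalize to [0, 1]
```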


2020 ◽  
pp. 123-145
Author(s):  
Sushma Jaiswal ◽  
Tarun Jaiswal

In computer vision, object detection is a very important and exciting area of study. Object detection is applied in numerous fields such as security monitoring, autonomous driving, etc. Deep-learning based object detection techniques have developed at a very fast pace and have attracted the attention of many researchers. The main focus of the 21st century is the comprehensive and genuine development of the object-detection framework. In this investigation, we first examine and evaluate the various object detection approaches and designate the benchmark datasets. We also deliver a wide-ranging overview of object detection approaches in an organized way, covering both first-stage and second-stage detectors. Lastly, we consider the construction of these object detection approaches to suggest dimensions for further research.


2021 ◽  
Vol 39 (4) ◽  
pp. 1-29
Author(s):  
Shijun Li ◽  
Wenqiang Lei ◽  
Qingyun Wu ◽  
Xiangnan He ◽  
Peng Jiang ◽  
...  

Static recommendation methods like collaborative filtering suffer from the inherent limitation of performing real-time personalization for cold-start users. Online recommendation, e.g., the multi-armed bandit approach, addresses this limitation by interactively exploring user preference online and pursuing the exploration-exploitation (EE) trade-off. However, existing bandit-based methods model recommendation actions homogeneously. Specifically, they only consider the items as the arms, being incapable of handling the item attributes, which naturally provide interpretable information about a user’s current demands and can effectively filter out undesired items. In this work, we consider conversational recommendation for cold-start users, where a system can both ask about attributes and recommend items to a user interactively. This important scenario was studied in a recent work [54]. However, it employs a hand-crafted function to decide when to ask about attributes or make recommendations. Such separate modeling of attributes and items makes the effectiveness of the system rely heavily on the choice of the hand-crafted function, thus introducing fragility to the system. To address this limitation, we seamlessly unify attributes and items in the same arm space and achieve their EE trade-offs automatically using the framework of Thompson Sampling. Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play. Extensive experiments on three benchmark datasets show that ConTS outperforms the state-of-the-art methods Conversational UCB (ConUCB) [54] and the Estimation-Action-Reflection model [27] in both metrics of success rate and average number of conversation turns.
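The unified-arm-space idea can be sketched with the textbook form of Thompson Sampling: items and attributes live in one dictionary of arms, each arm keeps a reward posterior, and the arm with the highest sampled reward is played. The Beta-Bernoulli posterior below is a common simplification for illustration; ConTS itself uses a contextual (linear) reward model, and all names here are assumptions.

```python
import random

# Hedged sketch of the core ConTS idea: items ("item_*") and attributes
# ("attr_*") share one arm space, and Thompson Sampling picks the arm with
# the highest sampled reward. Beta posteriors over Bernoulli rewards are a
# simplification; the actual model uses contextual linear rewards.

def thompson_select(arms, rng=random):
    """arms: dict arm_name -> [successes, failures]; returns the chosen arm."""
    # Sample one plausible reward per arm from its Beta(successes+1, failures+1)
    # posterior, then exploit the arm whose sample is largest.
    samples = {a: rng.betavariate(s + 1, f + 1) for a, (s, f) in arms.items()}
    return max(samples, key=samples.get)

def update(arms, arm, reward):
    """Record observed feedback (reward=True/False) for the played arm."""
    arms[arm][0 if reward else 1] += 1
```

Because each arm's posterior sharpens as feedback accumulates, exploration of uncertain arms and exploitation of good ones are balanced automatically, with no hand-crafted ask-vs-recommend rule.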


2018 ◽  
pp. 1726-1745
Author(s):  
Dawei Li ◽  
Mooi Choo Chuah

Many state-of-the-art image retrieval systems include a re-ranking step to refine the suggested initial ranking list and thereby improve retrieval accuracy. In this paper, we present a novel 2-stage k-NN re-ranking algorithm. In stage one, we generate an expanded list of candidate database images for re-ranking, so that lower-ranked ground truth images will be included and re-ranked. In stage two, we re-rank the list of candidate images using a confidence score calculated from rRBO, a newly proposed ranking-list similarity measure. In addition, we propose the rLoCATe image feature, which captures robust color and texture information on salient image patches and shows superior performance in the image retrieval task. We evaluate the proposed re-ranking algorithm on various initial ranking lists created using both SIFT and rLoCATe on two popular benchmark datasets, along with a large-scale one-million-image distraction dataset. The results show that our proposed algorithm is not sensitive to different parameter configurations, and it outperforms existing k-NN re-ranking methods.
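For intuition on the similarity measure underlying the confidence score, the sketch below implements plain truncated rank-biased overlap (RBO), the ranking-list similarity family that rRBO builds on: overlap between the two lists' top-d prefixes, geometrically discounted by depth so that agreement at top ranks counts most. This is generic RBO for illustration, not the authors' rRBO variant.

```python
# Sketch of truncated rank-biased overlap (RBO) between two ranked lists.
# The persistence parameter p in (0, 1) weights agreement at shallow depths
# more heavily. Plain RBO, not the paper's rRBO variant.

def rbo(list_a, list_b, p=0.9):
    depth = min(len(list_a), len(list_b))
    score = 0.0
    for d in range(1, depth + 1):
        # Fraction of the top-d prefixes that the two rankings share.
        overlap = len(set(list_a[:d]) & set(list_b[:d]))
        score += (p ** (d - 1)) * overlap / d
    return (1 - p) * score
```

Two identical length-n lists score 1 - p**n (approaching 1 as n grows), while disjoint lists score 0, which makes the measure a natural ingredient for a re-ranking confidence score.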


2016 ◽  
Vol 208 ◽  
pp. 325-332 ◽  
Author(s):  
Lei Wang ◽  
Xu Zhao ◽  
Yuncai Liu

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
WeiYi Wei ◽  
Hui Chen

Recently, salient object detection based on the graph model has attracted extensive research interest in computer vision, because a graph model can better represent the relationships between regions. However, it is difficult to capture high-level relationships among multiple regions. In the proposed algorithm, the input image is first segmented into superpixels. Then, a weighted hypergraph model is established using the fuzzy C-means clustering algorithm and a new weighting strategy. Finally, a random walk algorithm is used to rank all superpixels on the weighted hypergraph model to obtain the salient object. Experimental results on three benchmark datasets demonstrate that the proposed method performs better than other state-of-the-art methods.
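The final ranking step can be illustrated with a basic random walk over a superpixel affinity graph: row-normalize the affinities into transition probabilities and iterate to the walk's stationary distribution, which serves as the saliency score. This sketch uses a plain pairwise graph for simplicity, whereas the paper operates on a weighted hypergraph; all names are illustrative.

```python
# Minimal random-walk ranking sketch over a superpixel affinity graph
# (pairwise affinities here; the paper uses a weighted hypergraph).
# The stationary distribution of the walk serves as the saliency score.

def random_walk_rank(affinity, iters=100):
    """affinity: square list-of-lists of nonnegative edge weights."""
    n = len(affinity)
    # Row-normalize affinities into transition probabilities; a node with
    # no edges transitions uniformly to avoid a dead end.
    trans = []
    for row in affinity:
        total = sum(row)
        trans.append([w / total for w in row] if total else [1.0 / n] * n)
    rank = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        rank = [sum(rank[j] * trans[j][i] for j in range(n)) for i in range(n)]
    return rank
```

With symmetric affinities the walk settles on nodes that are strongly connected to the rest of the graph, which is the intuition behind scoring superpixels this way.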


Author(s):  
Tao Zhou ◽  
Deng-Ping Fan ◽  
Ming-Ming Cheng ◽  
Jianbing Shen ◽  
Ling Shao

Salient object detection, which simulates human visual perception in locating the most significant object(s) in a scene, has been widely applied to various computer vision tasks. Now, the advent of depth sensors means that depth maps can easily be captured; this additional spatial information can boost the performance of salient object detection. Although various RGB-D based salient object detection models with promising performance have been proposed over the past several years, an in-depth understanding of these models and the challenges in this field remains lacking. In this paper, we provide a comprehensive survey of RGB-D based salient object detection models from various perspectives, and review related benchmark datasets in detail. Further, as light fields can also provide depth maps, we review salient object detection models and popular benchmark datasets from this domain too. Moreover, to investigate the ability of existing models to detect salient objects, we have carried out a comprehensive attribute-based evaluation of several representative RGB-D based salient object detection models. Finally, we discuss several challenges and open directions of RGB-D based salient object detection for future research. All collected models, benchmark datasets, datasets constructed for attribute-based evaluation, and related code are publicly available at https://github.com/taozh2017/RGBD-SODsurvey.


2020 ◽  
Vol 18 (2) ◽  
pp. 079
Author(s):  
Stevica Cvetković ◽  
Nemanja Grujić ◽  
Slobodan Ilić ◽  
Goran Stančić

This paper proposes a method for tackling the problem of scalable object instance detection in the presence of clutter and occlusions. It combines the advantages of state-of-the-art object detection approaches while scaling favorably with the number of models, remaining computationally efficient, and being suited to texture-less objects as well. The proposed method has the following advantages: a) generality, as it works for both texture-less and textured objects; b) scalability, as it scales sub-linearly with the number of objects stored in the object database; and c) computational efficiency, as it runs in near real-time. In contrast to the traditional affine-invariant detectors/descriptors, which are local and not discriminative for texture-less objects, our method is based on line segments, around which it computes a semi-global descriptor by encoding gradient information in a scale- and rotation-invariant manner. It relies on both texture and shape information and is, therefore, suited for both textured and texture-less objects. The descriptor is integrated into an efficient object detection procedure that exploits the fact that a line segment determines the scale, orientation, and position of an object by its two endpoints. This is used to construct several effective techniques for object hypothesis generation, scoring, and multiple-object reasoning, which are integrated into the proposed object detection procedure. Thanks to its ability to detect objects even if only one correct line match is found, our method allows detection of objects under heavy clutter and occlusions. Extensive evaluation on several public benchmark datasets for texture-less and textured object detection demonstrates its scalability and high effectiveness.
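The geometric fact that a single matched line segment pins down an object hypothesis can be sketched directly: the two endpoint pairs determine the similarity transform (scale, rotation, translation) mapping the model into the scene. This is a hypothetical illustration of that one idea, not the paper's full hypothesis-generation pipeline.

```python
import math

# Hedged sketch of hypothesis generation from a single line-segment match:
# the matched endpoints determine the similarity transform that maps the
# model segment onto the scene segment. Names are illustrative.

def hypothesis_from_match(model_seg, scene_seg):
    """Each segment is ((x1, y1), (x2, y2)); returns (scale, angle, (tx, ty))."""
    (mx1, my1), (mx2, my2) = model_seg
    (sx1, sy1), (sx2, sy2) = scene_seg
    mdx, mdy = mx2 - mx1, my2 - my1
    sdx, sdy = sx2 - sx1, sy2 - sy1
    scale = math.hypot(sdx, sdy) / math.hypot(mdx, mdy)   # length ratio
    angle = math.atan2(sdy, sdx) - math.atan2(mdy, mdx)   # rotation between segments
    # Translation places the rotated, scaled first model endpoint onto the
    # first scene endpoint: t = s1 - scale * R(angle) * m1.
    c, s = math.cos(angle), math.sin(angle)
    tx = sx1 - scale * (c * mx1 - s * my1)
    ty = sy1 - scale * (s * mx1 + c * my1)
    return scale, angle, (tx, ty)
```

Because one match already yields a full pose hypothesis, detection can succeed even when clutter and occlusion leave only a single correct line correspondence, which is the property the paper highlights.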

