Deep Multiple Instance Hashing for Object-based Image Retrieval

Author(s):  
Wanqing Zhao ◽  
Ziyu Guan ◽  
Hangzai Luo ◽  
Jinye Peng ◽  
Jianping Fan

Multi-keyword queries are widely supported in text search engines. However, the analogue in image retrieval systems, the multi-object query, is rarely studied. Meanwhile, traditional object-based image retrieval methods often involve multiple separate steps and need expensive location labeling for detecting objects. In this work, we propose a weakly-supervised Deep Multiple Instance Hashing (DMIH) framework for object-based image retrieval. DMIH integrates object detection and hash learning on the basis of a popular CNN model to build an end-to-end relation between a raw image and the binary hash codes of the multiple objects in it. Specifically, we cast the detection of each object class as a binary multiple instance learning problem in which the instances are object proposals extracted from multi-scale convolutional feature maps. For hash training, we sample image pairs and learn their semantic relationships from the hash codes of the most probable proposals for the labels each image owns, as guided by the object predictors. The two objectives benefit each other during learning. DMIH outperforms state-of-the-art methods on public benchmarks for object-based image retrieval and achieves promising results for multi-object queries.
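The two ingredients named in this abstract, a binary multiple instance learning objective per class and hash codes taken from the most probable proposal, can be illustrated with a minimal sketch. It assumes the common max-over-instances MIL rule and sign thresholding for binarization; neither choice is confirmed by the abstract, and all function names are hypothetical.

```python
import math

def mil_image_score(proposal_scores):
    # Assumed MIL rule: an image is positive for a class if at least
    # one proposal is, so the image score is the max proposal score.
    return max(proposal_scores)

def mil_bce_loss(proposal_scores, label, eps=1e-12):
    # Binary cross-entropy between the image-level score and the
    # image-level label (1 = class present, 0 = absent).
    p = mil_image_score(proposal_scores)
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))

def top_proposal_index(proposal_scores):
    # Index of the most probable proposal for this class.
    return max(range(len(proposal_scores)), key=lambda i: proposal_scores[i])

def binarize(real_valued_code):
    # Assumed binarization: threshold each real-valued entry at zero.
    return [1 if v >= 0 else 0 for v in real_valued_code]

def hamming(a, b):
    # Number of differing bits between two equal-length binary codes.
    return sum(x != y for x, y in zip(a, b))

# Toy data: three proposals in one image, scored for a single class,
# each with a 3-dimensional real-valued code (illustrative values only).
scores = [0.1, 0.8, 0.3]
codes = [[0.2, 0.4, -0.1], [0.5, -0.2, 0.0], [-0.3, -0.6, 0.9]]
best = top_proposal_index(scores)
object_code = binarize(codes[best])
```

At retrieval time, a query object's code would be compared with stored codes by Hamming distance, which is why the codes are binarized.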

Author(s):  
Baisheng Lai ◽  
Xiaojin Gong

Weakly supervised object detection (WSOD), the problem of learning detectors from image-level labels only, has been attracting growing interest. However, the problem is challenging because location supervision is missing. To address this issue, this paper integrates saliency into a deep architecture in which location information is explored both explicitly and implicitly. Specifically, we select highly confident object proposals under the guidance of class-specific saliency maps. The location information of the selected proposals, together with their semantic and saliency information, is then used to explicitly supervise the network by imposing two additional losses. Meanwhile, a saliency prediction sub-network is built into the architecture, and its predictions are used to implicitly guide the localization procedure. The entire network is trained end-to-end. Experiments on PASCAL VOC demonstrate that our approach outperforms all state-of-the-art methods.
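As a rough illustration of selecting confident proposals under the guidance of a saliency map, the sketch below scores each box by the mean saliency it covers and keeps boxes above a threshold. The scoring rule, the threshold value, and the function names are assumptions made for illustration, not the paper's actual selection criterion.

```python
def mean_saliency(saliency, box):
    # saliency: 2D list indexed as saliency[y][x], values in [0, 1].
    # box: (x1, y1, x2, y2) with exclusive right and bottom edges.
    x1, y1, x2, y2 = box
    vals = [saliency[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return sum(vals) / len(vals) if vals else 0.0

def select_confident(saliency, boxes, thr=0.7):
    # Keep proposals whose covered region is, on average, salient.
    return [b for b in boxes if mean_saliency(saliency, b) >= thr]

# Toy 4x4 saliency map for one class: only the top-left 2x2 block is salient.
saliency_map = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
proposals = [(0, 0, 2, 2), (0, 0, 4, 4), (2, 2, 4, 4)]
selected = select_confident(saliency_map, proposals)
```

Note that the whole-image box is rejected here because its mean saliency is diluted by background, which is one reason a mean-based score favors tight boxes.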


2021 ◽  
Author(s):  
Zhiheng Zhou ◽  
Yongfan Guo ◽  
Ming Dai ◽  
Junchu Huang ◽  
Xiangwei Li

2013 ◽  
Vol 321-324 ◽  
pp. 1030-1034
Author(s):  
Jian Zhai Wu ◽  
De Wen Hu

In this paper we propose a visual event pattern learning method to address the problem of high-level video understanding. We first model the deformable temporal structure of an action event in video as a temporal composition of several primitive motions. We then describe each action class by multiple temporal models to handle the significant intra-class variability. The models are trained with a multiple instance learning method in the weakly supervised setting. We have conducted experiments on three major benchmarks, and the results are comparable to the state of the art.


2021 ◽  
pp. 108233
Author(s):  
Wei Gao ◽  
Fang Wan ◽  
Jun Yue ◽  
Songcen Xu ◽  
Qixiang Ye

2020 ◽  
Vol 34 (07) ◽  
pp. 11482-11489
Author(s):  
Chenhao Lin ◽  
Siwen Wang ◽  
Dongqi Xu ◽  
Yu Lu ◽  
Wayne Zhang

Weakly supervised object detection (WSOD) using only image-level annotations has attracted growing attention over the past few years. Existing approaches based on multiple instance learning easily fall into local optima, because such a mechanism tends to learn from the most discriminative object in an image for each category. As a result, these methods miss object instances, which degrades WSOD performance. To address this problem, this paper introduces an end-to-end object instance mining (OIM) framework for weakly supervised object detection. OIM attempts to detect all possible object instances in each image by propagating information on spatial and appearance graphs, without any additional annotations. During the iterative learning process, the less discriminative object instances of the same class can be gradually detected and used for training. In addition, we design an object instance reweighted loss that learns a larger portion of each object instance to further improve performance. Experimental results on two publicly available databases, VOC 2007 and 2012, demonstrate the efficacy of the proposed approach.
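One way to picture propagation on spatial and appearance graphs is sketched below: starting from the top-scoring proposal, boxes that overlap it strongly are linked spatially, and non-linked boxes with similar features are linked by appearance. The IoU and cosine-similarity criteria, both thresholds, and all names are assumptions for illustration; the abstract does not specify how the graphs are built.

```python
import math

def iou(a, b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def mine_instances(boxes, scores, feats, iou_thr=0.5, sim_thr=0.9):
    # Seed: the most discriminative (top-scoring) proposal.
    seed = max(range(len(scores)), key=lambda i: scores[i])
    # Spatial graph: proposals that overlap the seed strongly.
    spatial = [i for i in range(len(boxes))
               if i != seed and iou(boxes[i], boxes[seed]) >= iou_thr]
    # Appearance graph: remaining proposals that look like the seed,
    # treated here as candidate additional instances of the same class.
    appearance = [i for i in range(len(boxes))
                  if i != seed and i not in spatial
                  and cosine(feats[i], feats[seed]) >= sim_thr]
    return seed, spatial, appearance

# Toy proposals (illustrative values only).
boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
scores = [0.9, 0.5, 0.4, 0.3]
feats = [(1.0, 0.0), (1.0, 0.1), (0.99, 0.05), (0.0, 1.0)]
seed, spatial, appearance = mine_instances(boxes, scores, feats)
```

In an iterative scheme, the mined instances would be fed back as pseudo-labels for the next training round, so that less discriminative instances are picked up gradually.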

