Weakly Supervised Localization with Patch Detector for Fine-Grained Image Retrieval

Author(s):  
Rong Wang ◽  
Wei Zou ◽  
Jiacheng Pu ◽  
Jiajun Wang


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Hongwei Zhao ◽  
Danyang Zhang ◽  
Jiaxin Wu ◽  
Pingping Liu

Fine-grained retrieval is one of the more complex problems in computer vision. Compared with general content-based image retrieval, fine-grained image retrieval faces more difficult challenges: all classes are subclasses of a single meta-class, which leads to small interclass variance and large intraclass variance. To address this problem, we propose a fine-grained retrieval method that improves both the loss function and feature aggregation and achieves better retrieval results under a unified framework. Firstly, we propose a novel multiproxies adaptive distribution loss, which better characterizes the intraclass variations and the degree of dispersion of each cluster center. Secondly, we propose a weakly supervised feature aggregation method based on channel weighting, which distinguishes the importance of different feature channels to obtain more representative image feature descriptors. We verify the performance of the proposed method on universal benchmark datasets such as CUB200-2011 and Stanford Dogs. Higher Recall@K scores demonstrate the advantage of our method over the state of the art.
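As a rough illustration of the channel-weighting idea described in this abstract, the sketch below weights CNN feature channels before pooling them into a global descriptor. The abstract does not give the exact weighting rule; the sparsity-based weighting used here is only an assumption for illustration, not the paper's method.

```python
# Minimal sketch of channel-weighted feature aggregation (hypothetical weighting rule).
import torch

def channel_weighted_descriptor(feat_map: torch.Tensor) -> torch.Tensor:
    """feat_map: (C, H, W) CNN activations for one image."""
    c, h, w = feat_map.shape
    # Per-channel "importance": fraction of spatial positions with a non-zero response.
    nonzero_frac = (feat_map > 0).float().view(c, -1).mean(dim=1)
    # Assumption: sparsely firing channels are treated as more discriminative.
    weights = torch.log((nonzero_frac.sum() + 1e-6) / (nonzero_frac + 1e-6))
    # Weighted sum-pooling followed by L2 normalisation.
    pooled = feat_map.view(c, -1).sum(dim=1) * weights
    return pooled / (pooled.norm() + 1e-12)

# Example: a fake 512-channel, 14x14 response map.
descriptor = channel_weighted_descriptor(torch.relu(torch.randn(512, 14, 14)))
print(descriptor.shape)  # torch.Size([512])
```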


2021 ◽  
Vol 30 ◽  
pp. 2826-2836 ◽  
Author(s):  
Yifeng Ding ◽  
Zhanyu Ma ◽  
Shaoguo Wen ◽  
Jiyang Xie ◽  
Dongliang Chang ◽  
...  

Author(s):  
Xiawu Zheng ◽  
Rongrong Ji ◽  
Xiaoshuai Sun ◽  
Yongjian Wu ◽  
Feiyue Huang ◽  
...  

Fine-grained object retrieval has attracted extensive research focus recently. Its state-of-the-art schemes are typically based upon convolutional neural network (CNN) features. Despite the extensive progress, two issues remain open. On the one hand, deep features are coarsely extracted at the image level rather than precisely at the object level, and are therefore disrupted by background clutter. On the other hand, training CNN features with a standard triplet loss is time consuming and often fails to learn discriminative features. In this paper, we present a novel fine-grained object retrieval scheme that addresses both issues in a unified framework. Firstly, we introduce a novel centralized ranking loss (CRL), which achieves very efficient (a 1,000x training speedup compared with the triplet loss) and discriminative feature learning through a "centralized" global pooling. Secondly, a weakly supervised attractive feature extraction is proposed, which segments object contours with top-down saliency. The contours are then integrated into the CNN response map to precisely extract features "within" the target object. Interestingly, we have discovered that CRL and weakly supervised learning reinforce each other. We evaluate the proposed scheme on widely used benchmarks including CUB200-2011 and CARS196 and report significant gains over state-of-the-art schemes, e.g., 5.4% over SCDA [Wei et al., 2017] on CARS196 and 3.7% on CUB200-2011.
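The centralized idea can be sketched as pulling each embedding toward its in-batch class centroid while pushing different centroids apart. The exact CRL formulation in the paper may differ; the loss below is only an illustrative approximation, and the margin value is a placeholder.

```python
# Hedged sketch of a centralized ranking-style loss (not the paper's exact CRL).
import torch
import torch.nn.functional as F

def centralized_ranking_loss(embeddings, labels, margin=0.5):
    """embeddings: (N, D) features; labels: (N,) class ids."""
    embeddings = F.normalize(embeddings, dim=1)
    classes = labels.unique()
    centers = torch.stack([embeddings[labels == c].mean(dim=0) for c in classes])
    centers = F.normalize(centers, dim=1)

    # Attraction: squared distance of each sample to its own class center.
    own_idx = (labels.unsqueeze(1) == classes.unsqueeze(0)).float().argmax(dim=1)
    pull = (embeddings - centers[own_idx]).pow(2).sum(dim=1).mean()

    # Repulsion: hinge on pairwise distances between different class centers.
    dists = torch.cdist(centers, centers)
    mask = ~torch.eye(len(classes), dtype=torch.bool)
    push = F.relu(margin - dists[mask]).mean() if mask.any() else dists.new_tensor(0.0)
    return pull + push

loss = centralized_ranking_loss(torch.randn(32, 128), torch.randint(0, 8, (32,)))
```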


PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0254054
Author(s):  
Gaihua Wang ◽  
Lei Cheng ◽  
Jinheng Lin ◽  
Yingying Dai ◽  
Tianlun Zhang

Large intra-class variance and small inter-class variance are the key factors affecting fine-grained image classification. Recent algorithms have become more accurate and efficient; however, they ignore multi-scale information in the network, which limits their ability to capture subtle changes. To solve this problem, a weakly supervised fine-grained classification network based on a multi-scale pyramid is proposed in this paper. It replaces the ordinary convolution kernels in the residual network with pyramid convolution kernels, which expands the receptive field and exploits complementary information at different scales. Meanwhile, the weakly supervised data augmentation network (WS-DAN) is used to prevent overfitting and improve the performance of the model. In addition, a new attention module, which includes spatial attention and channel attention, is introduced to focus more on the object regions in the image. Comprehensive experiments are carried out on three public benchmarks, showing that the proposed method extracts subtle features and achieves effective classification.
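A minimal sketch of the pyramid-convolution idea is given below: several parallel branches with different kernel sizes see the same input and their outputs are concatenated, so one block covers multiple receptive fields. The kernel sizes and channel split are assumptions for illustration; the paper's exact grouping may differ.

```python
# Minimal sketch of a pyramidal convolution block (kernel sizes 3/5/7 assumed).
import torch
import torch.nn as nn

class PyConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5, 7)):
        super().__init__()
        split = out_ch // len(kernel_sizes)
        chunks = [split] * (len(kernel_sizes) - 1) + [out_ch - split * (len(kernel_sizes) - 1)]
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, c, k, padding=k // 2, bias=False)
            for c, k in zip(chunks, kernel_sizes)
        )
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        # Each branch sees the full input with a different receptive field;
        # outputs are concatenated along the channel dimension.
        return torch.relu(self.bn(torch.cat([b(x) for b in self.branches], dim=1)))

y = PyConvBlock(64, 128)(torch.randn(2, 64, 56, 56))
print(y.shape)  # torch.Size([2, 128, 56, 56])
```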


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Haopeng Lei ◽  
Simin Chen ◽  
Mingwen Wang ◽  
Xiangjian He ◽  
Wenjing Jia ◽  
...  

Due to the rise of e-commerce platforms, online shopping has become a trend. However, the current mainstream retrieval methods are still limited to using text or exemplar images as input, and for huge commodity databases it remains a long-standing unsolved problem for users to quickly find the products they are interested in. Different from traditional text-based and exemplar-based image retrieval techniques, sketch-based image retrieval (SBIR) provides a more intuitive and natural way for users to specify their search needs. Due to the large cross-domain discrepancy between free-hand sketches and fashion images, retrieving fashion images by sketches is a significantly challenging task. In this work, we propose a new algorithm for sketch-based fashion image retrieval based on cross-domain transformation. In our approach, the sketch and the photo are first transformed into the same domain. Then, the sketch-domain similarity and the photo-domain similarity are calculated separately and fused to improve the retrieval accuracy of fashion images. Moreover, existing fashion image datasets mostly contain photos only and rarely contain sketch-photo pairs. Thus, we contribute a fine-grained sketch-based fashion image retrieval dataset, which includes 36,074 sketch-photo pairs. When retrieving on our Fashion Image dataset, our model ranks the correct match at top-1 with an accuracy of 96.6%, 92.1%, 91.0%, and 90.5% for clothes, pants, skirts, and shoes, respectively. Extensive experiments conducted on our dataset and two fine-grained instance-level datasets, i.e., QMUL-shoes and QMUL-chairs, show that our model achieves better performance than existing methods.
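The fusion step described here can be sketched as combining two similarity scores, one computed in the sketch domain and one in the photo domain, with a mixing weight. The domain-translation networks are assumed to have already produced the embeddings, and the fusion weight alpha is a placeholder, not a value from the paper.

```python
# Hedged sketch of fused cross-domain similarity scoring for retrieval.
import torch
import torch.nn.functional as F

def fused_similarity(query_sketch_feat, query_photo_feat,
                     gallery_sketch_feats, gallery_photo_feats, alpha=0.5):
    """query_*: (D,) embeddings of the query after domain transformation;
    gallery_*: (N, D) embeddings of the gallery images. Returns (N,) fused scores."""
    s_sketch = F.cosine_similarity(query_sketch_feat.unsqueeze(0), gallery_sketch_feats)
    s_photo = F.cosine_similarity(query_photo_feat.unsqueeze(0), gallery_photo_feats)
    return alpha * s_sketch + (1 - alpha) * s_photo

scores = fused_similarity(torch.randn(256), torch.randn(256),
                          torch.randn(100, 256), torch.randn(100, 256))
top1 = scores.argmax()  # index of the best-matching fashion image
```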


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 129469-129477
Author(s):  
Hanning Zhang ◽  
Bo Dong ◽  
Boqin Feng ◽  
Fang Yang ◽  
Bo Xu
