Understanding Center Loss Based Network for Image Retrieval with Few Training Data

Author(s):  
Pallabi Ghosh ◽  
Larry S. Davis


Author(s):  
Ziyu Guan ◽  
Fei Xie ◽  
Wanqing Zhao ◽  
Xiaopeng Wang ◽  
Long Chen ◽  
...  

We are concerned with using user-tagged images to learn proper hashing functions for image retrieval. The benefits are two-fold: (1) we can obtain abundant training data for deep hashing models; (2) tagging data possesses richer semantic information, which helps better characterize similarity relationships between images. However, tagging data suffers from noise, vagueness and incompleteness. Different from previous unsupervised or supervised hashing learning, we propose a novel weakly-supervised deep hashing framework which consists of two stages: weakly-supervised pre-training and supervised fine-tuning. The second stage follows common supervised hashing practice. In the first stage, rather than supervising directly on tags, the framework introduces a semantic embedding vector (sem-vector) for each image and learns the hashing functions and sem-vectors jointly. By carefully designing the optimization problem, it can well leverage both tagging information and image content for hashing learning. The framework is general and does not depend on specific deep hashing methods. Empirical results on real-world datasets show that when it is integrated with state-of-the-art deep hashing methods, retrieval performance increases by 8-10%.
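The retrieval step shared by deep hashing models such as the one above can be sketched independently of how the network is trained: continuous embeddings are thresholded to ±1 codes and ranked by Hamming distance. Below is a minimal NumPy sketch with toy 4-bit codes (an illustration of the common mechanism, not the paper's actual framework):

```python
import numpy as np

def binarize(embeddings):
    # sign thresholding turns continuous network outputs into +/-1 hash codes
    return np.where(embeddings >= 0, 1, -1)

def hamming_distance(a, b):
    # for +/-1 codes of length n, Hamming distance = (n - <a, b>) / 2
    return (a.shape[-1] - a @ b) // 2

# toy "database" of continuous embeddings and one query
db = binarize(np.array([[0.9, -0.2, 0.4, -0.7],
                        [0.8, -0.1, 0.5, -0.9],
                        [-0.6, 0.3, -0.2, 0.8]]))
query = binarize(np.array([1.0, -0.5, 0.3, -0.2]))
dists = [hamming_distance(query, code) for code in db]
ranking = np.argsort(dists)  # nearest codes first
```

With ±1 codes the Hamming distance reduces to a dot product, which is why binary codes make large-scale retrieval cheap.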


Author(s):  
В’ячеслав Васильович Москаленко ◽  
Микола Олександрович Зарецький ◽  
Альона Сергіївна Москаленко ◽  
Артем Геннадійович Коробов ◽  
Ярослав Юрійович Ковальський

A semi-supervised machine learning method was developed for the classification analysis of defects on the surface of sewer pipes based on CCTV video inspection images. The object of the research is the process of defect detection on the surface of sewer pipes. The subject of the research is a machine learning method for the classification analysis of sewer pipe defects on video inspection images under conditions of a limited and unbalanced set of labeled training data. A five-stage algorithm for classifier training is proposed. In the first stage, contrastive training is performed using an instance-prototype contrast loss function, where the normalized Euclidean distance is used to measure the similarity of the encoded samples. The second stage considers two variants of regularized loss functions: a triplet NCA function and a contrast-center loss function. The regularizing component in the second stage penalizes the rounding error incurred when the output feature vector is discretized, and implements the information bottleneck principle. The next stage calculates a binary code for each class to implement error-correcting codes, while taking into account the structure of the classes and the relationships between their features. The resulting prototype vector of each class is then used as an image label for training with the cross-entropy loss function. The last stage optimizes the parameters of the decision rules using an information criterion that accounts for the variance of the class distribution in binary Hamming space. The micro-averaged F1 metric, calculated on test data, is used to compare learning outcomes across the stages and approaches. The results obtained on the open Sewer-ML dataset confirm the suitability of the training method for practical use, with an F1 value of 0.977.
The proposed method provides a 9% increase in the micro-averaged F1 metric compared to the results obtained with the traditional method.
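As an illustration of the first stage, an instance-prototype contrastive loss over normalized Euclidean distances can be sketched as follows (a minimal NumPy sketch under our own assumptions, including the temperature scaling; not the authors' exact formulation):

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def instance_prototype_contrast_loss(z, prototypes, labels, temperature=0.1):
    # for unit vectors, squared Euclidean distance d^2 = 2 - 2*cosine,
    # so a smaller distance means a more similar pair
    z, p = l2_normalize(z), l2_normalize(prototypes)
    d2 = ((z[:, None, :] - p[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # cross-entropy pulls each encoded sample toward its own class prototype
    return -log_prob[np.arange(len(labels)), labels].mean()
```

Samples close to their own class prototype yield a small loss; mismatched labels raise it.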


Plant Methods ◽  
2021 ◽  
Vol 17 (1) ◽  
Author(s):  
Ruisong Zhang ◽  
Ye Tian ◽  
Junmei Zhang ◽  
Silan Dai ◽  
Xiaogai Hou ◽  
...  

Abstract Background The study of plant phenotype by deep learning has received increased interest in recent years, and impressive progress has been made in the field of plant breeding. Deep learning relies heavily on large amounts of training data to extract and recognize target features in plant phenotype classification and recognition tasks. However, for flower cultivar identification tasks with a large number of cultivars, it is difficult for traditional deep learning methods to achieve good recognition results with limited sample data. Thus, a method based on metric learning for flower cultivar identification is proposed to solve this problem. Results We added center loss to the classification network to make inter-class samples disperse and intra-class samples compact; ResNet18, ResNet50, and DenseNet121 were used for feature extraction. To evaluate the effectiveness of the proposed method, the public Oxford 102 Flowers dataset and two novel datasets constructed by us were chosen. For joint supervision with center loss and L2-softmax loss, the test accuracy rates are 91.88%, 97.34%, and 99.82% on the three datasets, respectively. The feature distributions observed with t-distributed stochastic neighbor embedding (t-SNE) verify the effectiveness of the method presented above. Conclusions An efficient metric learning method has been described for the flower cultivar identification task, which not only provides high recognition rates but also makes the features extracted from the recognition network interpretable. This study demonstrates that the proposed method provides new ideas for applying small amounts of data in identification tasks, and it has important reference significance for flower cultivar identification research.
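The center loss mentioned above penalizes the distance between each deep feature and its class center, which is what makes intra-class samples compact. A minimal NumPy sketch of the loss and the usual moving-average center update, following Wen et al.'s original center loss formulation (variable names are ours):

```python
import numpy as np

def center_loss(features, labels, centers):
    # 0.5 * mean squared distance between features and their class centers
    diffs = features - centers[labels]
    return 0.5 * (diffs ** 2).sum(axis=1).mean()

def update_centers(features, labels, centers, alpha=0.5):
    # moving-average update: each center drifts toward the mean of its class
    new_centers = centers.copy()
    for c in np.unique(labels):
        mask = labels == c
        delta = (centers[c] - features[mask]).sum(axis=0) / (1 + mask.sum())
        new_centers[c] = centers[c] - alpha * delta
    return new_centers
```

In training, this term is added to a softmax-style loss (here L2-softmax), so the network is pushed to separate classes while the centers keep each class tight.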


2020 ◽  
Author(s):  
Lucas Pascotti Valem ◽  
Daniel Carlos Guimarães Pedronette

CBIR (Content-Based Image Retrieval) systems are one of the main solutions for image retrieval tasks. These systems are mainly supported by the use of different visual features and machine learning methods. Since distinct features produce complementary ranking results with different effectiveness, a promising solution consists in combining them. However, deciding which visual features to combine is a very challenging task, especially when no training data is available. This work proposes three novel methods for selecting and combining ranked lists by estimating their effectiveness in an unsupervised way. The approaches were evaluated on five different image collections with several descriptors, achieving results comparable or superior to the state-of-the-art in most scenarios.
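The paper's own unsupervised effectiveness estimators are not detailed here, but the combination step itself can be illustrated with a simple training-free rank aggregation such as reciprocal rank fusion (our choice of illustration, not the authors' method):

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    # each ranked list votes 1/(k + rank) for its items;
    # k dampens the dominance of top positions in any single list
    scores = {}
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Items ranked highly by several complementary descriptors accumulate the largest fused score, which is the intuition behind combining ranked lists without training data.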


Author(s):  
M. S. Mueller ◽  
T. Sattler ◽  
M. Pollefeys ◽  
B. Jutzi

Abstract. The performance of machine learning and deep learning algorithms for image analysis depends significantly on the quantity and quality of the training data. The generation of annotated training data is often costly, time-consuming and laborious. Data augmentation is a powerful option to overcome these drawbacks. Therefore, we augment training data by rendering images with arbitrary poses from 3D models to increase the quantity of training images. These training images usually show artifacts and are of limited use for advanced image analysis. Therefore, we propose to use image-to-image translation to transform images from a "rendered" domain to a "captured" domain. We show that translated images in the "captured" domain are of higher quality than the rendered images. Moreover, we demonstrate that image-to-image translation based on rendered 3D models enhances the performance of common computer vision tasks, namely feature matching, image retrieval and visual localization. The experimental results clearly show the improvement of translated images over rendered images for all investigated tasks. In addition, we present the advantages of utilizing translated images over exclusively captured images for visual localization.


2020 ◽  
Vol 2 (1) ◽  
pp. 121-129
Author(s):  
Ramlah Nurlaeli ◽  
I Gede Pasek Suta Wijaya ◽  
Fitri Bimantoro

Image retrieval is an image search method that compares a query image with the images contained in a database based on the available information. This study proposes to preserve the characteristics of Indonesian batik, so the system can help prevent claims from other countries. This study discusses content-based image retrieval using the Multi Texton Histogram (MTH) and Invariant Moments (IM). MTH is a method that describes the characteristics of surface texture, and IM is a method that produces geometric characteristics of an object that are invariant to translation, rotation, and scaling. This study used 10,000 images each from the Batik and Corel datasets. The system takes a random sample of 7,000 images as training data, and the rest is used as testing data. As a result, the Batik dataset produces a precision of 99.75% and a recall of 14.25%, while the Corel dataset produces a precision of 36.63% and a recall of 5.23%. The system performs better on the Batik dataset because batik texture is monotonous, while the Corel dataset shows more diversity in shape and texture.

Keywords: Batik, Image Retrieval, multi texton histogram, invariant moment
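The precision and recall figures reported above follow the standard retrieval definitions, sketched here for clarity (a toy helper written by us, not code from the paper):

```python
def precision_recall(retrieved, relevant):
    # precision: fraction of retrieved images that are actually relevant
    # recall: fraction of relevant images that were retrieved
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

A high precision with low recall, as on the Batik dataset, means the returned images are almost always correct but only a small share of all relevant images is found.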


2010 ◽  
Vol 44-47 ◽  
pp. 3757-3761 ◽  
Author(s):  
Wang Ming Xu ◽  
Kang Ling Fang ◽  
Hai Ru Zhang

Clustering is an efficient and fundamental unsupervised learning algorithm for many vision-based applications. This paper addresses the problems of quickly indexing high-dimensional local invariant image features (e.g., SIFT features) and of fast similarity search over a scalable image database using a hierarchical clustering algorithm. We adopt the hierarchical k-means (HKM) clustering method to efficiently build a visual vocabulary tree on given training data and represent each image as a "bag of visual words", which are the leaf nodes of the visual vocabulary tree. For image retrieval, we adopt a commonly used indexing structure called an "inverted file" to record the mapping of each visual word to the database images containing that visual word, along with the number of times it appears in each image. We propose a weighted voting strategy for content-based image retrieval and achieve desirable performance in experiments.
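The inverted-file lookup described above can be sketched in a few lines; the tf-idf weighting shown is the scoring commonly paired with vocabulary trees (a simplified sketch, not the paper's exact weighted voting scheme):

```python
import math
from collections import Counter, defaultdict

def build_inverted_file(db_words):
    # visual word -> {image id: number of occurrences in that image}
    inverted = defaultdict(dict)
    for img_id, words in db_words.items():
        for word, tf in Counter(words).items():
            inverted[word][img_id] = tf
    return inverted

def retrieve(query_words, inverted, n_images):
    # tf-idf weighted voting: rare visual words contribute more to the score
    scores = defaultdict(float)
    for word, q_tf in Counter(query_words).items():
        postings = inverted.get(word, {})
        if not postings:
            continue
        idf = math.log(n_images / len(postings))
        for img_id, tf in postings.items():
            scores[img_id] += q_tf * tf * idf
    return sorted(scores, key=scores.get, reverse=True)
```

Only images sharing at least one visual word with the query are touched, which is what makes inverted files scale to large databases.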


2009 ◽  
Author(s):  
Christopher Layne ◽  
Virginia Strand ◽  
Robert Abramovitz ◽  
Glenn Saxe
