Personalized Content-Based Image Retrieval

Author(s):  
Iker Gondra

In content-based image retrieval (CBIR), a set of low-level features is extracted from an image to represent its visual content. Retrieval is performed by image example, where a query image is given as input by the user and an appropriate similarity measure is used to find the best matches in the corresponding feature space. This approach suffers from the fact that there is a large discrepancy between the low-level visual features that one can extract from an image and the semantic interpretation of the image's content that a particular user may have in a given situation. That is, users seek semantic similarity, but we can only provide similarity based on low-level visual features extracted from the raw pixel data, a situation known as the semantic gap. The selection of an appropriate similarity measure is thus an important problem. Since visual content can be represented by different attributes, the combination and importance of each set of features varies according to the user's semantic intent. Thus, the retrieval strategy should be adaptive so that it can accommodate the preferences of different users. Relevance feedback (RF) learning has been proposed as a technique aimed at reducing the semantic gap. It works by gathering semantic information from user interaction: based on the user's feedback on the retrieval results, the retrieval scheme is adjusted. By providing an image similarity measure grounded in human perception, RF learning can be seen as a form of supervised learning that finds relations between high-level semantic interpretations and low-level visual properties. That is, the feedback obtained within a single query session is used to personalize the retrieval strategy and thus enhance retrieval performance. In this chapter we present an overview of CBIR and related work on RF learning. We also present our own previous work on an RF-learning-based probabilistic region relevance learning algorithm for automatically estimating the importance of each region in an image based on the user's semantic intent.
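The abstract does not spell out how feedback adjusts the retrieval scheme; a minimal illustration (not the chapter's region relevance algorithm) is the classic Rocchio update, which moves the query's feature vector toward images the user marked relevant and away from those marked non-relevant. The weights `alpha`, `beta`, and `gamma` below are conventional illustrative values, not taken from the source.

```python
import numpy as np

def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio relevance-feedback update: shift the query vector
    toward the centroid of relevant examples and away from the centroid
    of non-relevant ones."""
    q = alpha * query
    if len(relevant) > 0:
        q = q + beta * np.mean(relevant, axis=0)
    if len(non_relevant) > 0:
        q = q - gamma * np.mean(non_relevant, axis=0)
    return q

# One feedback round in a toy 3-D feature space: the user liked two
# images strong in the first feature and rejected one strong in the third.
query = np.array([0.5, 0.5, 0.5])
relevant = np.array([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]])
non_relevant = np.array([[0.0, 0.0, 1.0]])
new_query = rocchio_update(query, relevant, non_relevant)
```

After the update, the query moves toward the feature dimensions shared by the relevant images, so the next retrieval round favors semantically closer matches.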


2019 ◽  
Vol 45 (1) ◽  
pp. 15-19
Author(s):  
Sarmad Abdul-samad

In the last two decades, Content-Based Image Retrieval (CBIR) has been considered a topic of interest for researchers. It depends on analysis of the image's visual content, which can be done by extracting color, texture, and shape features. Feature extraction is therefore one of the important steps in a CBIR system for representing the image completely. The color feature is the most widely used and most reliable among the visual features of an image. This paper reviews different methods used to extract color features while taking the spatial information of the image into consideration, namely the Local Color Histogram, Color Correlogram, Row Sum and Column Sum, and Color Coherence Vectors.
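Of the methods this paper reviews, the Local Color Histogram is the simplest way to add spatial information to a color feature: the image is divided into a grid and a separate histogram is computed per cell. The sketch below is an illustrative implementation under common conventions (grid size and bin count are arbitrary choices, not from the paper).

```python
import numpy as np

def local_color_histogram(image, grid=(2, 2), bins=4):
    """Divide the image into grid cells and compute a normalized colour
    histogram per cell, so the descriptor retains spatial layout."""
    h, w, _ = image.shape
    gh, gw = grid
    feats = []
    for i in range(gh):
        for j in range(gw):
            cell = image[i * h // gh:(i + 1) * h // gh,
                         j * w // gw:(j + 1) * w // gw]
            hist, _ = np.histogramdd(cell.reshape(-1, 3),
                                     bins=(bins,) * 3,
                                     range=((0, 256),) * 3)
            # Normalize by the number of pixels in the cell.
            feats.append(hist.ravel() / (cell.shape[0] * cell.shape[1]))
    return np.concatenate(feats)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
feat = local_color_histogram(img)  # 2x2 cells x 4^3 bins = 256 dims
```

A plain (global) color histogram would lose the per-cell layout; the Color Correlogram and Color Coherence Vector encode spatial structure in other ways.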


2018 ◽  
Vol 6 (9) ◽  
pp. 259-273
Author(s):  
Priyanka Saxena ◽  
Shefali

A Content-Based Image Retrieval system automatically retrieves the images most relevant to a query image by extracting visual features instead of keywords from images. Over the years, several studies have been conducted in this field, but such systems still face the challenges of the semantic gap and the subjectivity of human perception. This paper proposes the extraction of low-level visual features by employing color moments, Local Binary Patterns, and Canny edge detection for color, texture, and edge features respectively. The combination of these features is used in conjunction with a Support Vector Machine to reduce retrieval time and improve overall precision. The challenge of the semantic gap between low- and high-level features is also addressed by incorporating Relevance Feedback. An average precision of 0.782 was obtained by combining the color, texture, and edge features; 0.896 by using the combined features with SVM; and 0.882 by using the combined features with Relevance Feedback to overcome the semantic gap. Experimental results exhibit improved performance over other state-of-the-art techniques.
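The color-moment descriptor this abstract mentions is conventionally the first three statistical moments (mean, standard deviation, skewness) of each color channel, giving a compact 9-D vector. A minimal sketch of that convention (the paper's exact formulation is not given in the abstract):

```python
import numpy as np

def color_moments(image):
    """First three colour moments per channel: mean, standard deviation,
    and (signed cube root of) skewness -- a compact 9-D colour descriptor."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    centred = pixels - mean
    # Cube root keeps the skewness term on the same scale as the others.
    skew = np.cbrt((centred ** 3).mean(axis=0))
    return np.concatenate([mean, std, skew])

# A uniform grey image: every moment beyond the mean collapses to zero.
img = np.full((8, 8, 3), 128, dtype=np.uint8)
moments = color_moments(img)
```

In the paper's pipeline this 9-D color vector would be concatenated with LBP texture and Canny edge features before being fed to the SVM.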


Author(s):  
Silvester Tena ◽  
Rudy Hartanto ◽  
Igi Ardiyanto

In recent years, a great deal of research has been conducted in the area of fabric image retrieval, especially the identification and classification of visual features. One of the challenges associated with the domain of content-based image retrieval (CBIR) is the semantic gap between low-level visual features and high-level human perceptions. Generally, CBIR includes two main components, namely feature extraction and similarity measurement. Therefore, this research aims to determine the content-based image retrieval for fabric using feature extraction techniques grouped into traditional methods and convolutional neural networks (CNN). Traditional descriptors deal with low-level features, while CNN addresses the high-level, called semantic features. Traditional descriptors have the advantage of shorter computation time and reduced system requirements. Meanwhile, CNN descriptors, which handle high-level features tailored to human perceptions, deal with large amounts of data and require a great deal of computation time. In general, the features of a CNN's fully connected layers are used for matching query and database images. In several studies, the extracted features of the CNN's convolutional layer were used for image retrieval. At the end of the CNN layer, hash codes are added to reduce search time.
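The two steps the survey describes, matching CNN feature vectors and shortening search with hash codes, can be sketched without any specific network. Below, random vectors stand in for precomputed CNN features, and random-hyperplane hashing stands in for the learned hash layers the survey mentions; both substitutions are illustrative assumptions, not methods from the surveyed papers.

```python
import numpy as np

def cosine_rank(query_feat, db_feats):
    """Rank database feature vectors by cosine similarity to the query,
    the usual way fully-connected-layer CNN features are matched."""
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = db @ q
    return np.argsort(-sims), sims

def lsh_codes(feats, n_bits=16, seed=0):
    """Binary codes via random hyperplane projection (locality-sensitive
    hashing) -- a simple stand-in for a learned hash layer: comparing
    short bit strings is much cheaper than full-vector similarity."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((feats.shape[1], n_bits))
    return (feats @ planes > 0).astype(np.uint8)

rng = np.random.default_rng(1)
db = rng.standard_normal((100, 128))             # stand-in for CNN features
query = db[7] + 0.01 * rng.standard_normal(128)  # near-duplicate of item 7
order, _ = cosine_rank(query, db)                # item 7 ranks first
codes = lsh_codes(db)
```

In practice the hash codes prune the database to a small candidate set, after which full cosine ranking is applied only to the candidates.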


Author(s):  
Gangavarapu Venkata Satya Kumar ◽  
Pillutla Gopala Krishna Mohan

In diverse computer applications, the analysis of image content plays a key role. This content may be textual (such as text appearing in the images) or visual (such as shape, color, and texture). These two kinds of content comprise an image's basic features and are therefore a major asset for any implementation. Many state-of-the-art models are based on visual search or annotated text for Content-Based Image Retrieval (CBIR). With the growing demand for multitasking, a new method combining both textual and visual features is needed. This paper develops an intelligent CBIR system for a collection of different benchmark texture datasets. A new descriptor named Information Oriented Angle-based Local Tri-directional Weber Patterns (IOA-LTriWPs) is adopted. The pattern operates not only on the tri-directions and eight neighborhood pixels but also on four angles [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text]. Once the patterns for the tri-directions, eight neighborhood pixels, and four angles are computed, the best patterns are selected based on maximum mutual information. The histogram computation of the patterns then provides the final feature vector, from which a new weighted feature extraction is performed. As a new contribution, the novel weight function is optimized by the Improved MVO on random basis (IMVO-RB) so that the precision and recall of the retrieved images are high. Further, the proposed model uses a logarithmic similarity, the Mean Square Logarithmic Error (MSLE), between the features of the query image and trained images for retrieving the relevant images. Analyses on diverse texture image datasets have validated the accuracy and efficiency of the developed pattern over existing methods.
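The MSLE similarity this abstract names has a standard form: the mean squared difference of log-transformed feature values, so large histogram bins do not dominate small ones. A minimal sketch under that standard definition (the paper's exact variant is not given in the abstract):

```python
import numpy as np

def msle(a, b):
    """Mean Square Logarithmic Error between two non-negative feature
    vectors; log1p compresses large values, and a smaller score means
    the vectors are more similar."""
    return float(np.mean((np.log1p(a) - np.log1p(b)) ** 2))

# Retrieve by picking the database feature with the smallest MSLE.
database = [np.array([1.0, 2.0, 3.0]),
            np.array([10.0, 0.0, 1.0])]
query = np.array([1.1, 2.0, 2.9])
best = min(range(len(database)), key=lambda i: msle(query, database[i]))
```

Under this distance, `best` selects the first database entry, whose feature profile is closest to the query on the log scale.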


2021 ◽  
Author(s):  
Maryam Nematollahi Arani

Object recognition has become a central topic in computer vision applications such as image search, robotics, and vehicle safety systems. However, it is a challenging task due to the limited discriminative power of low-level visual features in describing the considerably diverse range of high-level visual semantics of objects. The semantic gap between low-level visual features and high-level concepts is a bottleneck in most systems, and new content analysis models need to be developed to bridge it. In this thesis, algorithms based on conditional random fields (CRF), from the class of probabilistic graphical models, are developed to tackle the problem of multiclass image labeling for object recognition. Image labeling assigns a specific semantic category from a predefined set of object classes to each pixel in the image. By capturing spatial interactions of visual concepts well, CRF modeling has proved to be a successful tool for image labeling. This thesis proposes novel approaches to empowering CRF modeling for robust image labeling. Our primary contributions are twofold. To better represent the feature distributions of CRF potentials, new feature functions based on generalized Gaussian mixture models (GGMM) are designed and their efficacy is investigated. Due to its shape parameter, a GGMM can provide a proper fit to the multi-modal and skewed distributions of data in natural images. The new model proves more successful than Gaussian and Laplacian mixture models; it also outperforms a deep neural network model on the Corel image set by 1% in accuracy. Further in this thesis, we apply scene-level contextual information to integrate the global visual semantics of the image with the pixel-wise dense inference of a fully-connected CRF, to preserve small objects of foreground classes and to make dense inference robust to initial misclassifications of the unary classifier.
The proposed inference algorithm factorizes the joint probability of the labeling configuration and the image scene type to obtain prediction update equations for labeling individual image pixels as well as the overall scene type of the image. The proposed context-based dense CRF model outperforms the conventional dense CRF model by about 2% in labeling accuracy on the MSRC image set and by 4% on the SIFT Flow image set. The proposed model also obtains the highest scene classification rate, 86%, on the MSRC dataset.
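The shape parameter that lets a GGMM fit skewed, heavy-tailed image statistics is easiest to see in the single-component generalized Gaussian density: with shape beta = 2 it reduces to the Gaussian, with beta = 1 to the Laplacian, and smaller beta gives heavier tails. A sketch of that standard density (the thesis's mixture fitting procedure is not reproduced here):

```python
import math

def gen_gaussian_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Generalized Gaussian density with location mu, scale alpha, and
    shape beta: beta=2 recovers the Gaussian, beta=1 the Laplacian,
    and beta<1 gives heavier-than-Laplacian tails."""
    coef = beta / (2 * alpha * math.gamma(1 / beta))
    return coef * math.exp(-((abs(x - mu) / alpha) ** beta))

# Sanity check: beta=2 with alpha=sqrt(2) is the standard normal density,
# so its value at x=0 should be 1/sqrt(2*pi).
p_gauss = gen_gaussian_pdf(0.0, alpha=math.sqrt(2), beta=2.0)
p_heavy = gen_gaussian_pdf(3.0, beta=0.8)  # heavy tail: more mass far out
```

In a GGMM, several such components with individually fitted mu, alpha, and beta are mixed, which is what gives the model its flexibility over Gaussian or Laplacian mixtures.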

