A Bayesian Image Retrieval Framework

Author(s):  
Rui Zhang ◽  
Ling Guan

Conventional approaches to content-based image retrieval exploit low-level visual information to represent images and relevance feedback techniques to incorporate human knowledge into the retrieval process, which can only alleviate the semantic gap to some extent. To further boost performance, a Bayesian framework is proposed in which information independent of the visual content of images is utilized and integrated with the visual information. Two particular instances of the general framework are studied. First, context, defined as the statistical relation across images, is integrated with visual content so that the framework can extract information from both the images and past retrieval results. Second, the characteristic sounds made by different objects are utilized along with their visual appearance. Based on various performance evaluation criteria, the proposed framework is evaluated on two databases, one for each example. The results demonstrate the advantage of integrating information from multiple sources.
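
A minimal sketch in Python of the kind of Bayesian score fusion the abstract describes, combining a visual likelihood with a context prior derived from past retrieval sessions. The Gaussian likelihood, the co-relevance count matrix, and all names are illustrative assumptions, not the authors' model:

# Minimal sketch of Bayesian score fusion for image retrieval.
# The likelihood form and the co-relevance counts are assumptions
# for illustration, not the authors' exact formulation.
import numpy as np

def visual_likelihood(query_feat, db_feats, sigma=1.0):
    """P(visual evidence | image is relevant), modelled here as a
    Gaussian of the Euclidean distance in feature space."""
    d = np.linalg.norm(db_feats - query_feat, axis=1)
    return np.exp(-d**2 / (2 * sigma**2))

def context_prior(cooccurrence, query_idx):
    """Prior relevance from past retrieval results: how often each
    database image was marked relevant together with the query image."""
    counts = cooccurrence[query_idx].astype(float)
    return (counts + 1.0) / (counts.sum() + len(counts))  # Laplace smoothing

def bayesian_rank(query_feat, query_idx, db_feats, cooccurrence):
    # Posterior is proportional to likelihood (visual) times prior (context).
    posterior = visual_likelihood(query_feat, db_feats) * context_prior(cooccurrence, query_idx)
    return np.argsort(-posterior)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(100, 16))          # toy visual features
    cooc = rng.integers(0, 5, size=(100, 100))  # toy co-relevance counts
    print(bayesian_rank(feats[0], 0, feats, cooc)[:10])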

2003 ◽  
Vol 03 (01) ◽  
pp. 171-208 ◽  
Author(s):  
ANASTASIOS DOULAMIS ◽  
NIKOLAOS DOULAMIS ◽  
THEODORA VARVARIGOU

The performance of a Content-Based Image Retrieval (CBIR) system depends on (a) the system's adaptability to the user's information needs, which permits different types of indexing and simultaneously reduces the subjectivity of human perception in interpreting the image's visual content, and (b) the efficient organization of the extracted descriptors, which represent the rich visual information. Both issues are addressed in this paper. Descriptor organization is performed using a fuzzy classification scheme fragmented into multidimensional classes, in contrast to previous works where fuzzy histograms were created in one dimension using, for example, the feature vector norm. Multidimensionality relates the descriptors to one another and thus allows a compact and meaningful visual representation by mapping the elements of the resulting feature vectors to a physical visual interpretation. Furthermore, fuzzy classification is applied to all visual content descriptors, in contrast to previous approaches where only color information is exploited. Two kinds of content descriptors are extracted in our case: global-based and region-based. The first refers to global image characteristics, while the second exploits region-based properties. Regions are obtained by applying a multiresolution implementation of the Recursive Shortest Spanning Tree (RSST) algorithm, called M-RSST in this paper. The second issue is addressed by proposing a computationally efficient relevance feedback mechanism based on an optimal weight-updating strategy. The scheme relies on the cross-correlation measure instead of the Euclidean distance used in most relevance feedback algorithms. Cross-correlation is a normalized measure that expresses how similar two feature vectors are and thus serves as a metric of their content similarity. Unlike previous approaches, the proposed scheme can be implemented recursively in the case of multiple feedback iterations. Furthermore, it provides reliable results regardless of the number of selected samples and the feature vector size, improving relevance feedback performance compared to other approaches.
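
The following Python sketch illustrates the normalized cross-correlation similarity used in place of the Euclidean distance; the element-wise weighting and the toy weight update are simple assumptions for illustration, not the paper's optimal weight-updating rule:

# Illustrative sketch of normalized cross-correlation as a similarity
# measure, with a heuristic (assumed) relevance-feedback weight update.
import numpy as np

def cross_correlation(x, y, w=None):
    """Normalized cross-correlation between two feature vectors,
    optionally element-wise weighted; result lies in [-1, 1]."""
    if w is None:
        w = np.ones_like(x, dtype=float)
    xw, yw = w * x, w * y
    num = float(np.dot(xw, yw))
    den = np.linalg.norm(xw) * np.linalg.norm(yw)
    return num / den if den > 0 else 0.0

def update_weights(relevant_feats, eps=1e-8):
    """Toy update: emphasize dimensions with low variance across the
    images the user marked relevant (a common RF heuristic)."""
    var = np.var(relevant_feats, axis=0)
    w = 1.0 / (var + eps)
    return w / w.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    query = rng.random(32)
    relevant = query + 0.05 * rng.normal(size=(5, 32))
    w = update_weights(relevant)
    print(round(cross_correlation(query, relevant[0], w), 3))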


2017 ◽  
Vol 1 (4) ◽  
pp. 165
Author(s):  
M. Premkumar ◽  
R. Sowmya

Retrieving images from large databases is a difficult task. Content-based image retrieval (CBIR) deals with retrieving images based on the similarity in content (features) between the query image and the target images. But these similarities do not vary equally in all directions of the feature space. Furthermore, CBIR efforts have largely ignored two distinct characteristics of CBIR systems: 1) the gap between high-level concepts and low-level features; 2) the subjectivity of human perception of visual content. Hence an interactive technique called relevance feedback was introduced. These techniques use the user's feedback about the retrieved images to reformulate the query so that more relevant images are retrieved in subsequent iterations. Such techniques are called hard relevance feedback techniques, as they use only two-level user annotation, and it is difficult for the user to state whether each retrieved image is relevant to the query image or not. To better capture the user's intention, a soft relevance feedback technique is proposed. This technique uses multilevel user annotation, but it makes use of only a single user's feedback. Hence a soft association rule mining technique is also proposed to infer image relevance from collective feedback: feedback from multiple users is used to retrieve more relevant images, improving the performance of the system. Here, soft relevance feedback and association rule mining are combined. During the first iteration, prior association rules about the given query image are retrieved to find relevant images; during subsequent iterations, the feedback is inserted into the database and the relevance feedback technique is activated to retrieve more relevant images. The number of association rules is kept to a minimum based on redundancy detection.
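
A minimal Python sketch of how multilevel (soft) feedback from several users might be aggregated and mined for soft associations; the mean aggregation, the min-based joint membership, and the confidence formula are simple assumptions meant only to illustrate the idea, not the paper's method:

# Sketch of combining multilevel (soft) feedback from several users.
# feedback[u, i] in [0, 1] is the graded relevance user u assigned to
# image i (e.g. 0 = irrelevant, 0.5 = partly relevant, 1 = relevant).
import numpy as np

def soft_relevance(feedback):
    """Aggregate multilevel feedback from multiple users per image."""
    return feedback.mean(axis=0)

def soft_support(feedback, i, j):
    """Soft support of the rule 'i relevant -> j relevant': average
    joint membership (min of the two grades) over all users."""
    return np.minimum(feedback[:, i], feedback[:, j]).mean()

def soft_confidence(feedback, i, j, eps=1e-8):
    return soft_support(feedback, i, j) / (feedback[:, i].mean() + eps)

if __name__ == "__main__":
    fb = np.array([[1.0, 0.5, 0.0],
                   [0.8, 0.7, 0.1],
                   [1.0, 0.4, 0.0]])
    print(soft_relevance(fb))
    print(round(soft_confidence(fb, 0, 1), 3))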


2020 ◽  
Vol 79 (37-38) ◽  
pp. 26995-27021
Author(s):  
Lorenzo Putzu ◽  
Luca Piras ◽  
Giorgio Giacinto

Given the great success of Convolutional Neural Networks (CNNs) for image representation and classification tasks, we argue that Content-Based Image Retrieval (CBIR) systems could also leverage CNN capabilities, mainly when Relevance Feedback (RF) mechanisms are employed. On the one hand, to improve the performance of CBIR systems, which is strictly related to the effectiveness of the descriptors used to represent an image, as such systems aim at providing the user with images similar to an initial query image. On the other hand, to reduce the semantic gap between the similarity perceived by the user and the similarity computed by the machine, by exploiting an RF mechanism where the user labels the returned images as being relevant or not with respect to her interests. Consequently, in this work, we propose a CBIR system based on transfer learning from a CNN trained on a vast image database, thus exploiting the generic image representation that it has already learned. The pre-trained CNN is then fine-tuned using the RF supplied by the user to reduce the semantic gap. In particular, after the user's feedback, we propose to tune and then re-train the CNN according to the labelled set of relevant and non-relevant images. We then suggest different strategies to exploit the updated CNN for returning a novel set of images that are expected to be relevant to the user's needs. Experimental results on different data sets show the effectiveness of the proposed mechanisms in improving the representation power of the CNN with respect to the user's concept of image similarity. Moreover, the pros and cons of the different approaches can be clearly pointed out, thus providing clear guidelines for the implementation in production environments.
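
A sketch in Python (PyTorch) of fine-tuning a pre-trained CNN with relevance-feedback labels, as the abstract outlines. The backbone choice (ResNet-18), the frozen layers, the hyper-parameters, and the training loop are assumptions for illustration, not the authors' exact configuration:

# Sketch: fine-tune a pre-trained CNN on relevant / non-relevant labels
# obtained from the user's relevance feedback. Illustrative settings only.
import torch
import torch.nn as nn
import torchvision.models as models

def build_rf_model(num_classes=2):
    """Load an ImageNet-pretrained backbone and replace its classifier
    with a small head for relevant / non-relevant prediction."""
    net = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    for p in net.parameters():          # freeze the generic representation
        p.requires_grad = False
    net.fc = nn.Linear(net.fc.in_features, num_classes)  # trainable head
    return net

def finetune_on_feedback(net, images, labels, epochs=5, lr=1e-3):
    """images: (N, 3, 224, 224) float tensor; labels: (N,) long tensor of
    0 (non-relevant) / 1 (relevant) from the user's feedback."""
    opt = torch.optim.Adam(net.fc.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    net.train()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(net(images), labels)
        loss.backward()
        opt.step()
    return net

def relevance_scores(net, images):
    """Rank candidate images by the predicted probability of relevance."""
    net.eval()
    with torch.no_grad():
        return torch.softmax(net(images), dim=1)[:, 1]

In a retrieval loop, the scores returned by relevance_scores would be used to re-rank the database after each feedback round.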


Author(s):  
Roberto Tronci ◽  
Luca Piras ◽  
Giorgio Giacinto

Anyone who has ever tried to describe a picture in words is aware that it is not an easy task to find a word, a concept, or a category that characterizes it completely. Most images in real life represent more than one concept; therefore, it is natural that images available to users over the Internet (e.g., FLICKR) are associated with multiple tags. By the term ‘tag’, the authors refer to a concept represented in the image. The purpose of this paper is to evaluate the performance of relevance feedback techniques in content-based image retrieval scenarios with multi-tag datasets, as performance is typically assessed on single-tag datasets. Thus, the authors show how relevance feedback mechanisms are able to adapt the search to the user’s needs, either when an image is used as an example for retrieving images that each bear different concepts, or when the sample image is used to retrieve images containing the same set of concepts. In this paper, the authors also propose two novel performance measures aimed at comparing the accuracy of retrieval results when an image is used as a prototype for a number of different concepts.
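
A small Python sketch of a per-concept precision@k evaluation for a multi-tag query image. This is a generic illustration of multi-tag assessment, not the two measures the paper proposes, whose definitions are not given in the abstract:

# Illustrative per-concept precision@k for a multi-tag query image.
def precision_at_k_per_tag(ranked_ids, image_tags, query_tags, k=10):
    """ranked_ids: retrieval order; image_tags: dict id -> set of tags;
    query_tags: tags of the query image. Returns precision@k per tag."""
    top = ranked_ids[:k]
    return {t: sum(t in image_tags[i] for i in top) / k for t in query_tags}

if __name__ == "__main__":
    tags = {1: {"dog", "grass"}, 2: {"dog"}, 3: {"cat"}, 4: {"grass"}}
    print(precision_at_k_per_tag([2, 1, 4, 3], tags, {"dog", "grass"}, k=3))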


2018 ◽  
Vol 6 (9) ◽  
pp. 259-273
Author(s):  
Priyanka Saxena ◽  
Shefali

A Content-Based Image Retrieval system automatically retrieves the images most relevant to the query image by extracting visual features from images instead of keywords. Over the years, considerable research has been conducted in this field, but such systems still face the challenges of the semantic gap and the subjectivity of human perception. This paper proposes the extraction of low-level visual features by employing color moments, Local Binary Patterns and Canny edge detection for extracting color, texture and edge features, respectively. The combination of these features is used in conjunction with a Support Vector Machine (SVM) to reduce the retrieval time and improve the overall precision. The challenge of the semantic gap between low- and high-level features is also addressed by incorporating Relevance Feedback. An average precision of 0.782 was obtained by combining the color, texture and edge features, 0.896 by using the combined features with SVM, and 0.882 by using the combined features with Relevance Feedback to overcome the semantic gap. Experimental results exhibit improved performance over other state-of-the-art techniques.
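
A Python sketch of the kind of combined low-level descriptor the abstract describes (color moments, LBP texture, Canny edge information) feeding an SVM. The LBP parameters, the edge-density summary, and the SVM settings are illustrative assumptions, not the paper's exact configuration:

# Sketch: color moments + LBP histogram + Canny edge density, then SVM.
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import local_binary_pattern, canny
from sklearn.svm import SVC

def color_moments(img):
    """Mean, standard deviation and skewness of each RGB channel."""
    feats = []
    for c in range(3):
        ch = img[:, :, c].astype(float)
        mean, std = ch.mean(), ch.std()
        skew = np.cbrt(((ch - mean) ** 3).mean())
        feats += [mean, std, skew]
    return np.array(feats)

def lbp_histogram(gray_u8, P=8, R=1.0):
    """Histogram of uniform LBP codes as a texture descriptor."""
    lbp = local_binary_pattern(gray_u8, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def edge_density(gray):
    """Fraction of pixels marked as edges by the Canny detector."""
    return np.array([canny(gray).mean()])

def extract_features(img):
    gray = rgb2gray(img)                         # float image in [0, 1]
    gray_u8 = (gray * 255).astype(np.uint8)
    return np.concatenate([color_moments(img), lbp_histogram(gray_u8), edge_density(gray)])

# Given labelled training images, an SVM classifier would then be fitted on
# the stacked feature vectors, e.g.:
# clf = SVC(kernel="rbf", probability=True).fit(X_train, y_train)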


Author(s):  
Iker Gondra

In content-based image retrieval (CBIR), a set of low-level features is extracted from an image to represent its visual content. Retrieval is performed by image example, where a query image is given as input by the user and an appropriate similarity measure is used to find the best matches in the corresponding feature space. This approach suffers from the fact that there is a large discrepancy between the low-level visual features that one can extract from an image and the semantic interpretation of the image’s content that a particular user may have in a given situation. That is, users seek semantic similarity, but we can only provide similarity based on low-level visual features extracted from the raw pixel data, a situation known as the semantic gap. The selection of an appropriate similarity measure is thus an important problem. Since visual content can be represented by different attributes, the combination and importance of each set of features vary according to the user’s semantic intent. Thus, the retrieval strategy should be adaptive so that it can accommodate the preferences of different users. Relevance feedback (RF) learning has been proposed as a technique aimed at reducing the semantic gap. It works by gathering semantic information from user interaction: based on the user’s feedback on the retrieval results, the retrieval scheme is adjusted. By providing an image similarity measure under human perception, RF learning can be seen as a form of supervised learning that finds relations between high-level semantic interpretations and low-level visual properties. That is, the feedback obtained within a single query session is used to personalize the retrieval strategy and thus enhance retrieval performance. In this chapter we present an overview of CBIR and related work on RF learning. We also present our own previous work on an RF learning-based probabilistic region relevance learning algorithm for automatically estimating the importance of each region in an image based on the user’s semantic intent.
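
A simplified Python sketch of region-weighted matching with region weights adjusted from user feedback, in the spirit of the chapter's topic. The distance and the heuristic weight update are assumptions for illustration and not the probabilistic region relevance learning algorithm itself:

# Sketch: region-weighted image distance with feedback-driven weights.
import numpy as np

def region_distance(query_regions, image_regions, weights):
    """Weighted sum of each query region's distance to its closest
    region in the candidate image."""
    d = 0.0
    for w, qr in zip(weights, query_regions):
        d += w * min(np.linalg.norm(qr - ir) for ir in image_regions)
    return d

def update_region_weights(weights, query_regions, relevant_images, eps=1e-8):
    """Heuristic update: boost query regions that match the user-marked
    relevant images closely, then renormalize."""
    scores = np.zeros(len(weights))
    for k, qr in enumerate(query_regions):
        dists = [min(np.linalg.norm(qr - ir) for ir in regions)
                 for regions in relevant_images]
        scores[k] = 1.0 / (np.mean(dists) + eps)
    new_w = np.asarray(weights) * scores
    return new_w / new_w.sum()

# Typical use: rank the database with region_distance, show results,
# collect relevant images from the user, call update_region_weights,
# and re-rank with the new weights.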

