Extended CBIR via Learning Semantics of Query Image

Author(s):  
Chuanghua Gui ◽  
Jing Liu ◽  
Changsheng Xu ◽  
Hanqing Lu
2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Chen Zhang ◽  
Bin Hu ◽  
Yucong Suo ◽  
Zhiqiang Zou ◽  
Yimu Ji

In this paper, we study the challenge of image-to-video retrieval, which uses a query image to search for relevant frames in a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of a key-frame extraction algorithm and a feature aggregation strategy. Specifically, the key-frame extraction algorithm uses clustering to remove redundant information from the video data, which greatly reduces storage cost. The feature aggregation strategy adopts average pooling to encode deep local convolutional features, followed by coarse-to-fine retrieval, which allows rapid retrieval in a large-scale video database. The results of extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency as well as accuracy over other state-of-the-art visual search methods.
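As a rough illustration of the aggregation and coarse-to-fine search described in the abstract above, the following Python sketch averages feature vectors and then searches in two stages. The abstract does not say which descriptors are used or what the coarse stage compares, so the cosine similarity, the video-level pooled descriptor, and the names `average_pool` and `coarse_to_fine` are assumptions made for illustration only. The clustering-based key-frame extraction is not shown.

```python
import math

def average_pool(vectors):
    """Average a list of equal-length feature vectors into a single vector."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def coarse_to_fine(query, videos, top_k=2):
    """videos maps a video id to its list of key-frame descriptors (assumed layout)."""
    # Coarse stage: shortlist videos by their pooled, video-level descriptor.
    shortlist = sorted(
        videos,
        key=lambda vid: cosine(query, average_pool(videos[vid])),
        reverse=True,
    )[:top_k]
    # Fine stage: compare the query with every key frame of the shortlisted videos.
    candidates = [(vid, idx) for vid in shortlist for idx in range(len(videos[vid]))]
    return max(candidates, key=lambda c: cosine(query, videos[c[0]][c[1]]))
```

The coarse stage touches one vector per video, so the more expensive per-frame comparison only runs on the `top_k` shortlisted videos.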


Author(s):  
Gangavarapu Venkata Satya Kumar ◽  
Pillutla Gopala Krishna Mohan

The analysis of image content plays a key role in diverse computer applications. Image content may be either textual (such as text appearing in the image) or visual (such as shape, color, and texture). Both kinds of content capture an image's basic features and are therefore a major advantage for any implementation. Many state-of-the-art Content-Based Image Retrieval (CBIR) models are based on either visual search or annotated text. Because demand for multitasking is growing, a new method that combines textual and visual features is needed. This paper develops an intelligent CBIR system for a collection of benchmark texture datasets. A new descriptor named Information Oriented Angle-based Local Tri-directional Weber Patterns (IOA-LTriWPs) is adopted. The pattern is computed not only from the tri-direction and eight neighborhood pixels but also from four angles: [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text]. Once the patterns for the tri-direction, eight neighborhood pixels, and four angles are obtained, the best patterns are selected based on maximum mutual information. The histogram of the selected patterns provides the final feature vector, on which the new weighted feature extraction is performed. As a new contribution, the weight function is optimized by the Improved MVO on random basis (IMVO-RB) so that the precision and recall of the retrieved images are high. The proposed model then uses a logarithmic similarity measure, the Mean Square Logarithmic Error (MSLE), between the features of the query image and those of the trained images to retrieve the relevant images. Analyses on diverse texture image datasets validate the accuracy and efficiency of the developed pattern over existing ones.
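The MSLE-based matching step mentioned above can be sketched as follows. This is a minimal illustration that assumes the common definition of MSLE, the mean of squared differences between log(1 + x) terms, and non-negative histogram features. The abstract does not give the exact formula, and the function names `msle` and `rank_by_msle` are hypothetical.

```python
import math

def msle(query_features, candidate_features):
    """Mean squared logarithmic error between two non-negative feature vectors."""
    if len(query_features) != len(candidate_features):
        raise ValueError("feature vectors must have the same length")
    total = sum(
        (math.log1p(q) - math.log1p(c)) ** 2
        for q, c in zip(query_features, candidate_features)
    )
    return total / len(query_features)

def rank_by_msle(query_features, database):
    """Return database keys ordered from most to least similar (lowest MSLE first)."""
    return sorted(database, key=lambda name: msle(query_features, database[name]))
```

A lower MSLE means a closer match, so retrieval amounts to taking the first few keys returned by `rank_by_msle`.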


Algorithms ◽  
2018 ◽  
Vol 11 (8) ◽  
pp. 115 ◽  
Author(s):  
Jing Wang ◽  
Lidong Wang ◽  
Xiaodong Liu ◽  
Yan Ren ◽  
Ye Yuan

The goal of object retrieval is to rank a set of images by their similarity to a query image. Content-based image retrieval is currently a hot research topic, and color features play an important role in it. However, a measure of image similarity must be established in advance. The contributions of this paper are as follows. First, proximity space theory is used to retrieve the images in the database that are relevant to the query image, and the color histogram of an image is used to obtain its top-ranked colors, which can be regarded as the object set. Second, the similarity is calculated with an improved dominance granule structure similarity method. Thus, we propose a color-based image retrieval method that uses proximity space theory. To test the feasibility of this method, we conducted experiments on the COIL-20 image database and the Corel-1000 database. Experimental results demonstrate the effectiveness of the proposed framework and its applications.
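The idea of reducing an image to its top-ranked histogram colors can be sketched as below. The quantization into 4 levels per channel and the function name `top_colors` are assumptions. The Jaccard overlap is only a simple stand-in for comparison and is not the improved dominance granule structure similarity that the paper proposes.

```python
from collections import Counter

def top_colors(pixels, k=3, levels=4):
    """Quantize (r, g, b) pixels into levels**3 bins and return the k most frequent bins."""
    step = 256 // levels
    histogram = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    return [color for color, _count in histogram.most_common(k)]

def jaccard(colors_a, colors_b):
    """Stand-in similarity: overlap between two sets of top-ranked colors."""
    a, b = set(colors_a), set(colors_b)
    return len(a & b) / len(a | b) if a | b else 1.0
```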


2017 ◽  
Vol 14 (1) ◽  
pp. 172988141668695 ◽  
Author(s):  
Yi Hou ◽  
Hong Zhang ◽  
Shilin Zhou

Recent impressive studies on using ConvNet landmarks for visual place recognition take an approach that involves three steps: (a) detection of landmarks, (b) description of the landmarks by ConvNet features using a convolutional neural network, and (c) matching of the landmarks in the current view with those in the database views. Such an approach has been shown to achieve state-of-the-art accuracy even under significant viewpoint and environmental changes. However, the computational burden of step (c) keeps this approach from being applied in practice, because linear search in the high-dimensional space of ConvNet features is expensive. In this article, we propose two simple and efficient search methods to tackle this issue. Both methods are built upon tree-based indexing. Given a set of ConvNet features of a query image, the first method directly searches for the features' approximate nearest neighbors in a tree structure that is constructed from the ConvNet features of the database images. The database images are voted on by the features in the query image, according to a lookup table that maps each ConvNet feature to its corresponding database image. The database image with the most votes is considered the solution. Our second method uses a coarse-to-fine procedure: the coarse step uses the first method to find the top-N database images, and the fine step performs a linear search in the Hamming space of the hash codes of the ConvNet features to determine the best match. Experimental results demonstrate that our methods achieve real-time search performance on five data sets of different sizes and under various conditions. Most notably, with an average search time of 0.035 seconds per query, our second method improves the matching efficiency by three orders of magnitude over a linear search baseline on a database with 20,688 images, with negligible loss in place recognition accuracy.
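The voting and coarse-to-fine steps described above can be sketched in a few lines of Python. Two simplifications are assumed here: a brute-force nearest-neighbor search replaces the tree-based approximate search, and hash codes are plain integers whose scoring rule (sum of best Hamming distances) is a guess, since the abstract does not describe it. All function and parameter names are hypothetical.

```python
from collections import Counter

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vote(query_features, db_features, feature_to_image, top_n=2):
    """Coarse step: each query feature votes for the image owning its nearest database feature."""
    votes = Counter()
    for q in query_features:
        # Brute-force stand-in for the approximate nearest-neighbor tree search.
        nearest = min(range(len(db_features)),
                      key=lambda i: squared_distance(q, db_features[i]))
        votes[feature_to_image[nearest]] += 1  # lookup table: feature index -> image
    return [image for image, _count in votes.most_common(top_n)]

def hamming(code_a, code_b):
    """Number of differing bits between two integer hash codes."""
    return bin(code_a ^ code_b).count("1")

def refine(query_codes, candidates, image_codes):
    """Fine step: linear search in Hamming space over the shortlisted images only."""
    def cost(image):
        return sum(min(hamming(q, c) for c in image_codes[image]) for q in query_codes)
    return min(candidates, key=cost)
```

Because `refine` only scans the codes of the top-N candidates, the linear search cost no longer grows with the full database size.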


2022 ◽  
Vol 23 (1) ◽  
pp. 116-128
Author(s):  
Baydaa Khaleel

An image retrieval system retrieves similar images by searching and browsing a large database. Such a system can be a reliable tool for making the best use of accumulated images, so finding efficient retrieval methods is very important. Recent decades have seen increased research interest in the field of image retrieval. Retrieval relies on a set of important image features. In this work, a combination of methods was used to examine all the images in a database and detect those that match a query image. Linear Discriminant Analysis (LDA) was used to extract features from the images in the dataset. The images in the database were processed by extracting their important and robust features and storing them in a feature store. Likewise, strong features were extracted from each query image. Similarity was evaluated with metaheuristic algorithms, namely Cuckoo Search (CS) and Ant Colony Optimization (ACO), and with an artificial neural network, the single-layer Perceptron Neural Network (PNN). The work also proposes two new methods, obtained by hybridizing PNN and CS with fuzzy logic: the Fuzzy Single Layer Perceptron Neural Network (FPNN) and Fuzzy Cuckoo Search (FCS), which examine the similarity between the features of the query image and the features of the images in the database. The efficiency of the methods was evaluated by calculating the precision and recall of the results. The proposed FCS method outperformed PNN, ACO, CS, and FPNN in terms of precision and recall.


2021 ◽  
Vol 2050 (1) ◽  
pp. 012006
Author(s):  
Xili Dai ◽  
Chunmei Ma ◽  
Jingwei Sun ◽  
Tao Zhang ◽  
Haigang Gong ◽  
...  

Training deep neural networks from only a few examples is an interesting problem that has motivated few-shot learning. In this paper, we study fine-grained image classification in a challenging few-shot learning setting and propose the Self-Amplificated Network (SAN), a meta-learning method for this problem. The SAN model consists of three parts: the Encoder, Amplification, and Similarity Modules. The Encoder Module encodes a fine-grained input image into a feature vector. The Amplification Module amplifies subtle differences between fine-grained images using a self-attention mechanism composed of multi-head attention. The Similarity Module measures how similar the query image is to the support set in order to determine the classification result. In-depth experiments on three benchmark datasets show that our network achieves superior performance over competing baselines.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Wang Li ◽  
Zhang Yong ◽  
Yuan Wei ◽  
Shi Hongxing

Vehicle reidentification is the task of matching vehicles across nonoverlapping cameras, and it is one of the critical problems of intelligent transportation systems. Because vehicles on the road resemble one another, traditional methods perform poorly on vehicles with high similarity. In this paper, we use a hypergraph representation to integrate image features and tackle vehicle re-ID with hypergraph learning algorithms. A single feature descriptor can only extract features from a single aspect. Merging multiple feature descriptors requires an efficient and appropriate representation, and a hypergraph is naturally suited to modeling high-order relationships. In addition, the spatiotemporal correlation of traffic status between cameras is a constraint beyond the image itself, and it can greatly improve the re-ID accuracy for different vehicles with similar appearances. The proposed method uses hypergraph optimization to learn the similarity between the query image and the images in the library. By using the pairwise and higher-order relationships between the query objects and the image library, the similarity measurement improves on direct matching. Experiments conducted on the image library constructed in this paper demonstrate the effectiveness of using multifeature hypergraph fusion and the spatiotemporal correlation model to address vehicle reidentification.


Author(s):  
Wenbin Li ◽  
Lei Wang ◽  
Jing Huo ◽  
Yinghuan Shi ◽  
Yang Gao ◽  
...  

The core idea of metric-based few-shot image classification is to directly measure the relations between query images and support classes in order to learn transferable feature embeddings. Previous work mainly focuses on image-level feature representations, which cannot effectively estimate a class's distribution because samples are scarce. Some recent work shows that local-descriptor-based representations are richer than image-level representations. However, such work still relies on a less effective instance-level metric, in particular a symmetric metric, to measure the relation between a query image and a support class. Given the naturally asymmetric relation between a query image and a support class, we argue that an asymmetric measure is more suitable for metric-based few-shot learning. To that end, we propose a novel Asymmetric Distribution Measure (ADM) network for few-shot learning, which calculates a joint local and global asymmetric measure between the two multivariate local distributions of a query and a class. Moreover, a task-aware Contrastive Measure Strategy (CMS) is proposed to further enhance the measure function. On the popular miniImageNet and tieredImageNet benchmarks, ADM achieves state-of-the-art results, validating our design of asymmetric distribution measures for few-shot learning. The source code can be downloaded from https://github.com/WenbinLee/ADM.git.
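To make the notion of an asymmetric measure between two distributions concrete, the sketch below computes the Kullback-Leibler divergence between two Gaussians. The abstract above does not name the measure that ADM uses, so the choice of KL divergence, the diagonal-covariance simplification, and the function name `kl_diag_gaussians` are assumptions for illustration only.

```python
import math

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for two Gaussians with diagonal covariance (per-dimension variances)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension terms: variance ratio, scaled mean shift, and log-variance ratio.
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total

# Swapping the arguments changes the value, which is what "asymmetric" means here.
forward = kl_diag_gaussians([0.0], [1.0], [0.0], [4.0])
backward = kl_diag_gaussians([0.0], [4.0], [0.0], [1.0])
```

In this example `forward` and `backward` differ, so the direction in which a query distribution is compared with a class distribution matters.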


2016 ◽  
Vol 3 (2) ◽  
pp. 189-196
Author(s):  
Budi Hartono ◽  
Veronica Lusiana

Image search based on image content is often called image object search. If an image in the data set contains an object similar to the query image, the search process is expected to recognize it. The object that resembles the query image may appear at any position in the image data, and that position becomes the main focus, or region of interest (ROI). The object may also differ in size, being larger or smaller than the object in the query image. This research uses image data of two sizes, 512×512 and 256×256 pixels. The experimental results show that preparing a model of multilevel sub-images, each resized to the 128×128-pixel size of the query image, helps to find the ROI position in the image data. Image data similar to the query image is found by calculating the Euclidean distance between the features of the query image and those of the image data.
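The multilevel sub-image search can be sketched as a sliding window over several window sizes, with the best window chosen by Euclidean distance. The abstract does not say which features are extracted, so the quadrant-mean feature below is a hypothetical placeholder. Because that feature does not depend on window size, the resizing step is omitted here. Images are assumed to be grayscale lists of rows, and all names are illustrative.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def feature(img):
    """Hypothetical feature: mean intensity of each of the four quadrants."""
    h, w = len(img), len(img[0])
    means = []
    for r0, r1 in ((0, h // 2), (h // 2, h)):
        for c0, c1 in ((0, w // 2), (w // 2, w)):
            values = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            means.append(sum(values) / len(values))
    return means

def crop(img, top, left, size):
    return [row[left:left + size] for row in img[top:top + size]]

def find_roi(image, query, sizes, stride):
    """Return (distance, top, left, size) of the sub-image closest to the query."""
    target = feature(query)
    best = None
    for size in sizes:  # one level of sub-images per window size
        for top in range(0, len(image) - size + 1, stride):
            for left in range(0, len(image[0]) - size + 1, stride):
                d = euclidean(target, feature(crop(image, top, left, size)))
                if best is None or d < best[0]:
                    best = (d, top, left, size)
    return best
```

Scanning several window sizes is what lets the search find objects that are larger or smaller than the object in the query image.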

