Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval

2019 ◽  
Vol 28 (4) ◽  
pp. 1993-2007 ◽  
Author(s):  
Gengshen Wu ◽  
Jungong Han ◽  
Yuchen Guo ◽  
Li Liu ◽  
Guiguang Ding ◽  
...  

Author(s):
Yingxin Wang ◽  
Xiushan Nie ◽  
Yang Shi ◽  
Xin Zhou ◽  
Yilong Yin

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3094
Author(s):  
Hanqing Chen ◽  
Chunyan Hu ◽  
Feifei Lee ◽  
Chaowei Lin ◽  
Wei Yao ◽  
...  

Recently, with the popularization of camera-equipped devices such as mobile phones and the rise of short video platforms, vast numbers of videos are uploaded to the Internet every day, making a video retrieval system with fast retrieval speed and high precision necessary. Content-based video retrieval (CBVR) has therefore attracted the interest of many researchers. A typical CBVR system consists of two essential parts: video feature extraction and similarity comparison. Video feature extraction is particularly challenging: previous retrieval methods mostly extract features from single video frames, which discards the temporal information in the video. Hashing methods are widely used in multimedia information retrieval due to their retrieval efficiency, but most have so far been applied only to image retrieval. To address these problems, we build an end-to-end framework called deep supervised video hashing (DSVH), which employs a 3D convolutional neural network (CNN) to obtain spatio-temporal video features and then trains a set of hash functions, via supervised hashing, that map these features into a binary space to produce compact video codes. The network is trained with a triplet loss. Extensive experiments on three public video datasets, UCF-101, JHMDB, and HMDB-51, show that the proposed method outperforms many state-of-the-art video retrieval methods. Compared with the DVH method, mAP on the UCF-101 dataset improves by 9.3%, and even the smallest improvement, on the JHMDB dataset, is 0.3%. We also demonstrate the stability of the algorithm on the HMDB-51 dataset.
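As a rough illustration of the training objective this abstract describes, the PyTorch sketch below pairs a hashing head with a triplet margin loss; the 3D CNN backbone is omitted, and the feature dimension, code length, and margin are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: hashing head trained with a triplet loss. Clip features
# would come from a 3D CNN backbone (omitted here); all shapes are dummies.
import torch
import torch.nn as nn

class HashHead(nn.Module):
    """Maps a clip feature vector to a k-bit code via tanh relaxation."""
    def __init__(self, feat_dim=4096, code_bits=64):
        super().__init__()
        self.fc = nn.Linear(feat_dim, code_bits)

    def forward(self, x):
        # tanh keeps outputs in (-1, 1); sign() binarizes at retrieval time.
        return torch.tanh(self.fc(x))

head = HashHead()
triplet = nn.TripletMarginLoss(margin=1.0)

# anchor/positive share a class label; negative does not (dummy features).
anchor, positive, negative = (torch.randn(8, 4096) for _ in range(3))
loss = triplet(head(anchor), head(positive), head(negative))
loss.backward()

# At retrieval time, codes are binarized and compared by Hamming distance.
codes = torch.sign(head(anchor)).detach()
```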


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Chen Zhang ◽  
Bin Hu ◽  
Yucong Suo ◽  
Zhiqiang Zou ◽  
Yimu Ji

In this paper, we study the challenge of image-to-video retrieval, which uses a query image to search for relevant frames in a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of a key-frame extraction algorithm and a feature aggregation strategy. Specifically, the key-frame extraction algorithm uses clustering to remove redundant information from the video data, greatly reducing storage cost. The feature aggregation strategy adopts average pooling to encode deep local convolutional features, followed by coarse-to-fine retrieval, which allows rapid search in a large-scale video database. Results from extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency and accuracy compared with other state-of-the-art visual search methods.
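The Python sketch below illustrates the two components in spirit: clustering-based key-frame selection and average pooling of deep local convolutional features. The shapes, cluster count, and use of scikit-learn k-means are assumptions for illustration, not the authors' exact implementation.

```python
# Sketch under assumed shapes: pick the frame nearest each k-means centroid
# as a key frame, and average-pool a conv feature map into one descriptor.
import numpy as np
from sklearn.cluster import KMeans

def select_keyframes(frame_feats, n_keyframes=5):
    """Cluster per-frame features; keep the frame closest to each centroid."""
    km = KMeans(n_clusters=n_keyframes, n_init=10).fit(frame_feats)
    keyframes = []
    for c in km.cluster_centers_:
        keyframes.append(int(np.argmin(np.linalg.norm(frame_feats - c, axis=1))))
    return sorted(set(keyframes))

def aggregate(conv_map):
    """Average-pool a CxHxW conv feature map into a C-dim global descriptor."""
    return conv_map.reshape(conv_map.shape[0], -1).mean(axis=1)

frame_feats = np.random.rand(120, 512)             # 120 frames, 512-d (dummy)
print(select_keyframes(frame_feats))               # indices of key frames
print(aggregate(np.random.rand(512, 7, 7)).shape)  # -> (512,)
```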


2021 ◽  
Author(s):  
D. Chandra Mouli ◽  
G. Varun Kumar ◽  
S. V. Kiran ◽  
Sanjeev Kumar

2021 ◽  
Author(s):  
Chen Jiang ◽  
Kaiming Huang ◽  
Sifeng He ◽  
Xudong Yang ◽  
Wei Zhang ◽  
...  

Author(s):  
Kimiaki Shirahama ◽  
Kuniaki Uehara

This paper examines video retrieval based on the Query-By-Example (QBE) approach, where shots relevant to a query are retrieved from large-scale video data based on their similarity to example shots. This involves two crucial problems: first, similarity in features does not necessarily imply similarity in semantic content; second, computing the similarity of a huge number of shots to the example shots is computationally expensive. The authors have developed a method that filters out a large number of shots irrelevant to a query, based on a video ontology, a knowledge base of concepts displayed in a shot. The method utilizes various concept relationships (e.g., generalization/specialization, sibling, part-of, and co-occurrence) defined in the video ontology. In addition, although the video ontology assumes that shots are accurately annotated with concepts, accurate annotation is difficult due to the diversity of forms and appearances of the concepts. Dempster-Shafer theory is therefore used to account for the uncertainty in determining the relevance of a shot from its inaccurate annotation. Experimental results on TRECVID 2009 video data validate the effectiveness of the method.
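As a toy illustration of the evidence-combination step, the sketch below implements Dempster's rule of combination for two basic belief assignments over a small frame of discernment. The concept names and mass values are invented for the example and are not taken from the paper.

```python
# Dempster's rule: fuse two basic belief assignments (dicts mapping
# frozenset -> mass), discarding and renormalizing conflicting mass.
from itertools import product

def combine(m1, m2):
    fused, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to disjoint hypotheses
    return {s: w / (1.0 - conflict) for s, w in fused.items()}

# Two (hypothetical) annotators' beliefs about the concepts in a shot.
m1 = {frozenset({"car"}): 0.6, frozenset({"car", "road"}): 0.4}
m2 = {frozenset({"road"}): 0.3, frozenset({"car", "road"}): 0.7}
print(combine(m1, m2))
```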


2020 ◽  
Vol 14 (5) ◽  
Author(s):  
Ling Shen ◽  
Richang Hong ◽  
Yanbin Hao
