A novel video retrieval algorithm based on spatiotemporal feature

Text-video retrieval faces a major challenge in the semantic gap between modalities. Some existing methods map text and video into a common subspace to measure their similarity. However, these methods do not impose a semantic consistency constraint when associating the semantic encodings of the two modalities, so the resulting association is poor. In this paper, we propose a multi-modal retrieval algorithm based on semantic association and multi-task learning. First, multi-level features of the video and the text are extracted with multiple deep learning networks, so that the information in both modalities is fully encoded. Then, in the common feature space into which both modalities are mapped, we propose a multi-task learning framework that combines semantic similarity measurement with semantic consistency classification based on text-video features. The semantic consistency classification task constrains the learning of the semantic association task, so multi-task learning guides a better feature mapping for the two modalities and improves the construction of the unified feature subspace. Finally, experimental results on the Microsoft Video Description dataset (MSVD) and MSR-Video to Text (MSR-VTT) show that the proposed algorithm outperforms existing methods, which demonstrates that it improves cross-modal retrieval performance.
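The abstract does not give the loss functions, so the following is only a minimal sketch of how a similarity task and a consistency classification task might be combined into one objective. It assumes a hinge ranking loss with in-batch negatives for the similarity task and a binary cross-entropy over "matched vs. mismatched" pair predictions for the consistency task; the function names, the `margin` and `alpha` parameters, and the binary form of the classifier are all hypothetical choices for illustration, not the authors' formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranking_loss(text_vecs, video_vecs, margin=0.2):
    """Hinge ranking loss; pair (i, i) is the match, all other pairs are negatives.

    Assumed stand-in for the abstract's "semantic similarity measurement" task.
    """
    n = len(text_vecs)
    total = 0.0
    for i in range(n):
        pos = cosine(text_vecs[i], video_vecs[i])
        for j in range(n):
            if j == i:
                continue
            # Text i against a wrong video, and a wrong text against video i.
            total += max(0.0, margin - pos + cosine(text_vecs[i], video_vecs[j]))
            total += max(0.0, margin - pos + cosine(text_vecs[j], video_vecs[i]))
    return total / n


def consistency_loss(probs, labels):
    """Binary cross-entropy over pair-level consistency predictions.

    probs[k] is the predicted probability that pair k is semantically
    consistent; labels[k] is 1 for a consistent pair and 0 otherwise.
    Assumed stand-in for "semantic consistency classification".
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return -total / len(probs)


def multitask_loss(text_vecs, video_vecs, probs, labels, alpha=0.5, margin=0.2):
    """Weighted sum of the two task losses (alpha is a hypothetical weight)."""
    return ranking_loss(text_vecs, video_vecs, margin) + alpha * consistency_loss(
        probs, labels
    )
```

In this sketch the classification term acts as the constraint: embeddings that rank well but yield poor consistency predictions still incur a penalty, which is one plausible reading of how the second task "constrains" the first.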


A Multigranularity Surveillance Video Retrieval Algorithm for Human Targets

Advances in Intelligent Systems and Computing - Foundations of Intelligent Systems ◽

10.1007/978-3-642-54924-3_16 ◽

2014 ◽

pp. 165-175

Author(s):

Zhenkun Wen ◽

Jinhua Gao ◽

Fumi Liu ◽

Huisi Wu

Keyword(s):

Video Retrieval ◽

Retrieval Algorithm ◽

Surveillance Video


A novel video retrieval algorithm based on coarse-grained and fine-grained

11th International Conference on Wireless Communications, Networking and Mobile Computing (WiCOM 2015) ◽

10.1049/cp.2015.0689 ◽

2015 ◽

Author(s):

Xu Jie ◽

Yumeng Peng ◽

Sun Jian ◽

Aiyun Xie

Keyword(s):

Video Retrieval ◽

Coarse Grained ◽

Retrieval Algorithm ◽

Fine Grained


INTEGRATION OF COLOR AND MOTION FEATURES FOR VIDEO RETRIEVAL

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001409007089 ◽

2009 ◽

Vol 23 (02) ◽

pp. 313-329 ◽

Cited By ~ 5

Author(s):

LIANG-HUA CHEN ◽

KUO-HAO CHIN ◽

HONG-YUAN MARK LIAO

Keyword(s):

Visual Cues ◽

Video Retrieval ◽

Retrieval Algorithm ◽

Compact Representation ◽

Key Frame ◽

Video Clips ◽

Video Shot ◽

Motion Features ◽

Spatio Temporal ◽

Video Matching

The usefulness of a video database depends on whether the video of interest can be easily located. In this paper, we propose a video retrieval algorithm based on the integration of several visual cues. In contrast to key-frame-based representations of a shot, our approach analyzes all frames within a shot to construct a compact representation of the video shot. In the video matching step, a similarity measure that integrates color and motion features is defined to locate occurrences of similar video clips in the database. Our approach is therefore able to fully exploit the spatio-temporal information contained in video. Experimental results indicate that the proposed approach is effective and outperforms some existing techniques.
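The abstract does not specify the compact representation or the similarity measure, so the sketch below only illustrates the general idea under stated assumptions: each shot is summarized by averaging normalized color histograms and motion magnitudes over all of its frames (rather than a single key frame), and two shots are compared with a weighted sum of histogram intersection and a motion ratio. The function names, the `w_color` weight, and both per-cue measures are hypothetical, not the paper's definitions.

```python
def shot_signature(frame_hists, frame_motions):
    """Summarize a shot from ALL of its frames.

    frame_hists: one normalized color histogram (sums to 1) per frame.
    frame_motions: one non-negative motion magnitude per frame.
    Returns (mean color histogram, mean motion magnitude).
    """
    n = len(frame_hists)
    bins = len(frame_hists[0])
    color = [sum(h[b] for h in frame_hists) / n for b in range(bins)]
    motion = sum(frame_motions) / len(frame_motions)
    return color, motion


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def shot_similarity(sig_a, sig_b, w_color=0.5):
    """Weighted combination of color and motion similarity, in [0, 1]."""
    color_a, motion_a = sig_a
    color_b, motion_b = sig_b
    color_sim = histogram_intersection(color_a, color_b)
    denom = max(motion_a, motion_b)
    # Two static shots (no motion in either) count as identical in motion.
    motion_sim = 1.0 if denom == 0 else min(motion_a, motion_b) / denom
    return w_color * color_sim + (1 - w_color) * motion_sim


# Usage: compare a query shot against a database shot.
query = shot_signature([[0.5, 0.5], [1.0, 0.0]], [2.0, 4.0])
candidate = ([0.25, 0.75], 1.5)
score = shot_similarity(query, candidate)
```

Averaging over every frame keeps the signature small while still reflecting content that a single key frame would miss, which is the contrast the abstract draws with key-frame-based representations.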
