A visual saliency based video hashing algorithm

Author(s): Jing Wang, Jiande Sun, Ju Liu, Xiushan Nie, Hua Yan
2012, Vol 19 (6), pp. 328-331
Author(s): Jiande Sun, Jing Wang, Jie Zhang, Xiushan Nie, Ju Liu
2020, Vol 2020 (4), pp. 218-1-218-7
Author(s): Huajian Liu, Sebastian Fach, Martin Steinebach

A novel robust video hashing scheme is proposed in this paper. Unlike most existing robust video hashing algorithms, the proposed video hash is generated from the motion vectors in the video stream instead of the image textures. Therefore, neither full decoding of the video stream nor complex computation on pixel values is required. Based on an analysis of motion vector properties regarding their suitability for robust hashing, an improved feature extraction mechanism is proposed and several optimization mechanisms are introduced in order to achieve better robustness and discriminability. The proposed hashing scheme is evaluated on a large and modern video data set, and the experimental results demonstrate its excellent performance, which is comparable to or even better than that of the more complicated texture-based approaches.
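The abstract does not describe the exact feature construction, so the following is only a minimal Python sketch of the general idea: motion vectors parsed from the compressed stream are pooled into block-level statistics over groups of frames and binarized against their median, and two hashes are compared by normalized Hamming distance. The function names, grid size, and grouping parameters are illustrative assumptions, not the authors' implementation.

import numpy as np

def motion_vector_hash(mv_fields, grid=(4, 4), group_size=8):
    # Minimal sketch of a motion-vector-based video hash (illustrative, not the paper's method).
    # mv_fields: list of (H, W, 2) arrays of motion vectors, one per inter-coded frame,
    # e.g. parsed from the compressed stream without full decoding.
    bits = []
    # Process frames in temporal groups for robustness to frame dropping and re-encoding.
    for start in range(0, len(mv_fields) - group_size + 1, group_size):
        group = mv_fields[start:start + group_size]
        # Average motion magnitude over the group of frames.
        mag = np.mean([np.linalg.norm(f, axis=-1) for f in group], axis=0)
        # Pool magnitudes into a coarse spatial grid (block-level statistics).
        h, w = mag.shape
        gh, gw = grid
        blocks = mag[:h - h % gh, :w - w % gw].reshape(gh, h // gh, gw, w // gw).mean(axis=(1, 3))
        # Binarize each block against the group median to obtain compact, robust bits.
        feat = blocks.ravel()
        bits.extend((feat > np.median(feat)).astype(np.uint8))
    return np.array(bits, dtype=np.uint8)

def hamming_distance(h1, h2):
    # Normalized Hamming distance between two equal-length binary hashes.
    n = min(len(h1), len(h2))
    return np.count_nonzero(h1[:n] != h2[:n]) / max(n, 1)

A query video would be matched by computing its hash the same way and accepting a match when the normalized distance falls below a threshold chosen for the desired robustness/discriminability trade-off.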


2021, Vol 11 (16), pp. 7217
Author(s): Cristina Luna-Jiménez, Jorge Cristóbal-Martín, Ricardo Kleinlein, Manuel Gil-Martín, José M. Moya, ...

Spatial Transformer Networks are considered a powerful mechanism for learning the most relevant regions of an image, but they can become more effective when the input images carry embedded expert knowledge. This paper aims to improve the performance of conventional Spatial Transformers when applied to Facial Expression Recognition. Building on the Spatial Transformers' capacity for spatial manipulation within networks, we propose different extensions to these models in which effective attentional regions are captured using facial landmarks or facial visual saliency maps. This attentional information is then hardcoded to guide the Spatial Transformers to learn the spatial transformations that best fit the proposed regions, yielding better recognition results. For this study, we use two datasets: AffectNet and FER-2013. On AffectNet, we achieve a 0.35 percentage point absolute improvement relative to the traditional Spatial Transformer, whereas on FER-2013 our solution achieves an increase of 1.49% when the models are fine-tuned with the AffectNet pre-trained weights.
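The abstract does not include code, but the mechanism can be sketched in PyTorch as a Spatial Transformer whose localization network is conditioned on an extra attention channel (a facial-landmark heatmap or a visual saliency map). The class name, layer sizes, and the single concatenated attention channel are assumptions for illustration, not the authors' exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedSpatialTransformer(nn.Module):
    # Sketch of a Spatial Transformer whose localization network also sees an attention map
    # (facial landmarks rendered as a heatmap, or a visual saliency map), so the predicted
    # affine transform focuses on the proposed facial regions. Layer sizes are illustrative.
    def __init__(self, in_channels=1):
        super().__init__()
        # Localization net: image + 1-channel attention map -> 6 affine parameters.
        self.loc = nn.Sequential(
            nn.Conv2d(in_channels + 1, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(True),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(True),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(10 * 4 * 4, 32), nn.ReLU(True),
            nn.Linear(32, 6),
        )
        # Initialize the final layer to the identity transform so training starts from "no warp".
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x, attention_map):
        # attention_map: (N, 1, H, W) saliency or landmark heatmap, same spatial size as x.
        theta = self.loc(torch.cat([x, attention_map], dim=1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        # Warp the input so the downstream expression classifier sees the attended region.
        return F.grid_sample(x, grid, align_corners=False)

Initializing the last layer to the identity transform is the standard Spatial Transformer practice; the extra attention channel then biases the learned transform toward the landmark- or saliency-defined regions described in the paper.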


2021
Author(s): Sai Phani Kumar Malladi, Jayanta Mukhopadhyay, Chaker Larabi, Santanu Chaudhury

Author(s): Shenyi Qian, Yongsheng Shi, Huaiguang Wu, Jinhua Liu, Weiwei Zhang

Author(s): Wenxia Zhang, Chunguang Wang, Haichao Wang, Xiaofei Yin
