An extensive evaluation of deep features of convolutional neural networks for saliency prediction of human visual attention

Author(s):  
Ali Mahdi ◽  
Jun Qin

Author(s):  
Abraham Montoya Obeso ◽  
Jenny Benois-Pineau ◽  
Mireya Sarai Garcia Vazquez ◽  
Alejandro A. Ramirez Acosta

Few-Shot Personalized Saliency Prediction Based on Adaptive Image Selection Considering Object and Visual Attention

Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2170 ◽  
Author(s):  
Yuya Moroto ◽  
Keisuke Maeda ◽  
Takahiro Ogawa ◽  
Miki Haseyama

This paper presents a few-shot personalized saliency prediction method based on adaptive image selection considering object and visual attention. Because general methods for predicting personalized saliency maps (PSMs) require a large number of training images, an approach that works with only a small number of training images is needed. Finding persons whose visual attention is similar to that of a target person is an effective way to tackle this problem, but it requires all persons to gaze at many common images, which is difficult and unrealistic given the burden on participants. Instead, this paper introduces a novel adaptive image selection (AIS) scheme that focuses on the relationship between human visual attention and objects in images. AIS considers both the diversity of objects in images and the variance of the PSMs for those objects. Specifically, AIS selects images containing various kinds of objects to maintain diversity, and it favors images whose PSMs have high variance across persons, since that variance indicates regions that many persons commonly gaze at or do not gaze at. By selecting images with high diversity and variance, the proposed method can identify similar users from only a small number of images; this is the technical contribution of the paper. Experimental results show the effectiveness of the proposed personalized saliency prediction, including the new image selection scheme.
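The AIS scheme described above is essentially a subset-selection problem: pick a few images whose object categories are diverse and whose PSMs vary strongly across persons. Below is a minimal greedy sketch of that idea in Python; all names (select_images, object_labels, psm_stack) and the simple additive diversity-plus-variance score are hypothetical stand-ins, not the authors' actual formulation.

```python
# Minimal sketch of adaptive image selection (AIS), assuming per-image
# object labels and per-person PSMs are already available.
# All identifiers here are hypothetical.
import numpy as np

def select_images(object_labels, psm_stack, k):
    """Greedily pick k images that cover diverse object categories
    and whose PSMs vary strongly across persons.

    object_labels : list of sets, object categories present in each image
    psm_stack     : array of shape (n_images, n_persons, H, W) of PSMs
    k             : number of images to select
    """
    # Variance of gaze across persons, averaged over pixels: high values
    # mark images on which viewers disagree, which are informative for
    # matching a new person to similar users.
    variance = psm_stack.var(axis=1).mean(axis=(1, 2))

    selected, covered = [], set()
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(len(object_labels)):
            if i in selected:
                continue
            novelty = len(object_labels[i] - covered)  # diversity term
            score = novelty + variance[i]              # additive trade-off
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        covered |= object_labels[best]
    return selected
```

In practice the two terms would need weighting or normalization, since object-count novelty and pixelwise PSM variance live on different scales; the sketch only illustrates the diversity-variance trade-off the abstract describes.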


SalSAC: A Video Saliency Prediction Model with Shuffled Attentions and Correlation-Based ConvLSTM

Proceedings of the AAAI Conference on Artificial Intelligence ◽  
2020 ◽  
Vol 34 (07) ◽  
pp. 12410-12417 ◽  
Author(s):  
Xinyi Wu ◽  
Zhenyao Wu ◽  
Jinglin Zhang ◽  
Lili Ju ◽  
Song Wang

The performance of predicting human fixations in videos has been greatly improved by the development of convolutional neural networks (CNNs). In this paper, we propose a novel end-to-end neural network, "SalSAC", for video saliency prediction, which uses CNN-LSTM-Attention as the basic architecture and exploits information from both static and dynamic aspects. To better represent the static information of each frame, we first extract multi-level features of the same size from different layers of the encoder CNN and compute the corresponding multi-level attention maps; we then randomly shuffle these attention maps among levels and multiply them with the extracted multi-level features, respectively. In this way, we leverage attention consistency across different layers to improve the robustness of the network. On the dynamic side, we propose a correlation-based ConvLSTM to appropriately balance the influence of the current and preceding frames on the prediction. Experimental results on the DHF1K, Hollywood2 and UCF-sports datasets show that SalSAC outperforms many existing state-of-the-art methods.
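The cross-level attention shuffling is the most unusual ingredient here, so a small sketch may help. The PyTorch snippet below assumes the multi-level encoder features have already been resized to a common shape; the 1x1-convolution-plus-sigmoid attention head is a hypothetical stand-in for whatever attention module SalSAC actually uses.

```python
# Sketch of cross-level attention shuffling, assuming all feature levels
# share the same (B, C, H, W) shape. The attention head is a stand-in,
# not the paper's exact design.
import random
import torch
import torch.nn as nn

class ShuffledAttention(nn.Module):
    def __init__(self, channels, num_levels):
        super().__init__()
        # One attention head per feature level.
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
            for _ in range(num_levels)
        )

    def forward(self, feats):
        # feats: list of tensors, each (B, C, H, W), one per level.
        attns = [head(f) for head, f in zip(self.heads, feats)]
        # Randomly permute the attention maps among levels, so each
        # level's features may be reweighted by another level's attention.
        if self.training:
            random.shuffle(attns)
        return [f * a for f, a in zip(feats, attns)]

# Toy usage: three levels of 256-channel features at 28x28 resolution.
sa = ShuffledAttention(channels=256, num_levels=3)
feats = [torch.randn(2, 256, 28, 28) for _ in range(3)]
out = sa(feats)
```

Shuffling only at training time acts as a regularizer: each level's features must remain compatible with any level's attention map, which is one way to read the paper's claim of leveraging attention consistency across layers.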


2018 ◽  
Vol 77 (22) ◽  
pp. 29231-29244 ◽  
Author(s):  
Meijun Sun ◽  
Ziqi Zhou ◽  
Dong Zhang ◽  
Zheng Wang
