Visual Saliency Prediction Using Deep Learning

The human attention mechanism can be understood and simulated by closely associating the saliency prediction task to neuroscience and psychology. Furthermore, saliency prediction is widely used in computer vision and interdisciplinary subjects. In recent years, with the rapid development of deep learning, deep models have made amazing achievements in saliency prediction. Deep learning models can automatically learn features, thus solving many drawbacks of the classic models, such as handcrafted features and task settings, among others. Nevertheless, the deep models still have some limitations, for example in tasks involving multi-modality and semantic understanding. This study focuses on summarizing the relevant achievements in the field of saliency prediction, including the early neurological and psychological mechanisms and the guiding role of classic models, followed by the development process and data comparison of classic and deep saliency prediction models. This study also discusses the relationship between the model and human vision, as well as the factors that cause the semantic gaps, the influences of attention in cognitive research, the limitations of the saliency model, and the emerging applications, to provide new saliency predictions for follow-up work and the necessary help and advice.

Download Full-text

A novel fully convolutional network for visual saliency prediction

PeerJ Computer Science ◽

10.7717/peerj-cs.280 ◽

2020 ◽

Vol 6 ◽

pp. e280

Author(s):

Bashir Muftah Ghariba ◽

Mohamed S. Shehata ◽

Peter McGuire

Keyword(s):

Deep Learning ◽

Visual Saliency ◽

Superior Performance ◽

Convolutional Network ◽

Fully Convolutional Network ◽

Proposed Model ◽

Saliency Prediction ◽

Benchmark Datasets ◽

The Many ◽

Deep Learning Model

A human Visual System (HVS) has the ability to pay visual attention, which is one of the many functions of the HVS. Despite the many advancements being made in visual saliency prediction, there continues to be room for improvement. Deep learning has recently been used to deal with this task. This study proposes a novel deep learning model based on a Fully Convolutional Network (FCN) architecture. The proposed model is trained in an end-to-end style and designed to predict visual saliency. The entire proposed model is fully training style from scratch to extract distinguishing features. The proposed model is evaluated using several benchmark datasets, such as MIT300, MIT1003, TORONTO, and DUT-OMRON. The quantitative and qualitative experiment analyses demonstrate that the proposed model achieves superior performance for predicting visual saliency.

Download Full-text

Lighter and Faster Cross-Concatenated Multi-Scale Residual Block Based Network for Visual Saliency Prediction

10.1109/icip42928.2021.9506710 ◽

2021 ◽

Author(s):

Sai Phani Kumar Malladi ◽

Jayanta Mukhopadhyay ◽

Chaker Larabi ◽

Santanu Chaudhury

Keyword(s):

Visual Saliency ◽

Multi Scale ◽

Saliency Prediction ◽

Block Based ◽

Residual Block

Download Full-text

Multi-level Net: A Visual Saliency Prediction Model

Lecture Notes in Computer Science - Computer Vision – ECCV 2016 Workshops ◽

10.1007/978-3-319-48881-3_21 ◽

2016 ◽

pp. 302-315 ◽

Cited By ~ 5

Author(s):

Marcella Cornia ◽

Lorenzo Baraldi ◽

Giuseppe Serra ◽

Rita Cucchiara

Keyword(s):

Prediction Model ◽

Visual Saliency ◽

Saliency Prediction ◽

Multi Level

Download Full-text

Facial Expression Recognition Using Visual Saliency and Deep Learning

2017 IEEE International Conference on Computer Vision Workshops (ICCVW) ◽

10.1109/iccvw.2017.327 ◽

2017 ◽

Cited By ~ 11

Author(s):

Viraj Mavani ◽

Shanmuganathan Raman ◽

Krishna P. Miyapuram

Keyword(s):

Deep Learning ◽

Facial Expression ◽

Facial Expression Recognition ◽

Visual Saliency ◽

Expression Recognition

Download Full-text

Audiovisual saliency prediction via deep learning

Neurocomputing ◽

10.1016/j.neucom.2020.12.011 ◽

2021 ◽

Vol 428 ◽

pp. 248-258

Author(s):

Jiazhong Chen ◽

Qingqing Li ◽

Hefei Ling ◽

Dakai Ren ◽

Ping Duan

Keyword(s):

Deep Learning ◽

Saliency Prediction

Download Full-text

RETRACTED: Deep network for visual saliency prediction by encoding image composition

Journal of Visual Communication and Image Representation ◽

10.1016/j.jvcir.2018.08.010 ◽

2018 ◽

Vol 55 ◽

pp. 789-794

Author(s):

Bo Dai ◽

Weijing Ye ◽

Jing Zheng ◽

Qianyi Chai ◽

Yiyang Yao

Keyword(s):

Visual Saliency ◽

Image Composition ◽

Deep Network ◽

Saliency Prediction

Download Full-text

Visual Saliency Prediction Using a Mixture of Deep Neural Networks

IEEE Transactions on Image Processing ◽

10.1109/tip.2018.2834826 ◽

2018 ◽

Vol 27 (8) ◽

pp. 4080-4090 ◽

Cited By ~ 9

Author(s):

Samuel F. Dodge ◽

Lina J. Karam

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Visual Saliency ◽

Saliency Prediction

Download Full-text

Visual Saliency Prediction Based on Deep Learning

Information ◽

10.3390/info10080257 ◽

2019 ◽

Vol 10 (8) ◽

pp. 257 ◽

Cited By ~ 7

Author(s):

Bashir Ghariba ◽

Mohamed S. Shehata ◽

Peter McGuire

Keyword(s):

Deep Learning ◽

Saliency Detection ◽

Visual Saliency ◽

Semantic Segmentation ◽

Input Image ◽

Human Eye ◽

Proposed Model ◽

Global Accuracy ◽

Visual Saliency Detection ◽

Deep Learning Model

Human eye movement is one of the most important functions for understanding our surroundings. When a human eye processes a scene, it quickly focuses on dominant parts of the scene, commonly known as a visual saliency detection or visual attention prediction. Recently, neural networks have been used to predict visual saliency. This paper proposes a deep learning encoder-decoder architecture, based on a transfer learning technique, to predict visual saliency. In the proposed model, visual features are extracted through convolutional layers from raw images to predict visual saliency. In addition, the proposed model uses the VGG-16 network for semantic segmentation, which uses a pixel classification layer to predict the categorical label for every pixel in an input image. The proposed model is applied to several datasets, including TORONTO, MIT300, MIT1003, and DUT-OMRON, to illustrate its efficiency. The results of the proposed model are quantitatively and qualitatively compared to classic and state-of-the-art deep learning models. Using the proposed deep learning model, a global accuracy of up to 96.22% is achieved for the prediction of visual saliency.

Download Full-text