Synergetic Object Recognition Based on Visual Attention Saliency Map

Gaze movement and visual stimuli have been utilized to analyze human visual attention intuitively. Gaze behavior studies mainly present statistical analyses of eye movements and human visual attention. During these analyses, eye movement data and the saliency map are presented to the analysts as separate views or merged views. However, analysts become frustrated when they need to memorize all of the separate views or when the eye movements obscure the saliency map in the merged views. It is therefore not easy to analyze how visual stimuli affect gaze movements, since existing techniques focus excessively on the eye movement data. In this paper, we propose a novel visualization technique for analyzing gaze behavior that uses saliency features as visual clues to express the visual attention of an observer. The visual clues that represent visual attention are analyzed to reveal which saliency features are prominent for the visual stimulus analysis. We visualize the gaze data together with the saliency features to interpret the visual attention. We then analyze gaze behavior with the proposed visualization to evaluate whether embedding saliency features within the visualization helps us understand the visual attention of an observer.

Bottom-up visual attention model for still image: a preliminary study

International Journal of Advances in Intelligent Informatics ◽

10.26555/ijain.v6i1.469 ◽

2020 ◽

Vol 6 (1) ◽

pp. 82

Author(s):

Adhi Prahara ◽

Murinto Murinto ◽

Dewi Pramudi Ismi

Keyword(s):

Visual Attention ◽

Object Detection ◽

Video Compression ◽

Saliency Map ◽

Bottom Up ◽

Attention Model ◽

Intrinsic Cues ◽

Preliminary Study ◽

Segmentation Image ◽

Human Visual Attention

The philosophy of human visual attention is scientifically explained in the fields of cognitive psychology and neuroscience and then computationally modeled in the fields of computer science and engineering. Visual attention models have been applied in computer vision systems such as object detection, object recognition, image segmentation, image and video compression, action recognition, visual tracking, and so on. This work studies bottom-up visual attention, namely human fixation prediction and salient object detection models. The preliminary study briefly covers the topic from the biological perspective of visual attention, including the visual pathway and the theory of visual attention, to the computational models of bottom-up visual attention that generate saliency maps. The study compares some models at each stage and observes whether the stage is inspired by the biological architecture, concept, or behavior of human visual attention. According to the study, the use of low-level features, the center-surround mechanism, sparse representation, and higher-level guidance with intrinsic cues dominates the bottom-up visual attention approaches. The study also highlights the correlation between bottom-up visual attention and curiosity.
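The center-surround mechanism mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `box_mean` and `center_surround_saliency`, the box-filter approximation, and the radii are all assumptions chosen for illustration. The idea shown is only that a pixel is salient when its local (center) average differs from the average of a wider (surround) neighbourhood.

```python
def box_mean(img, radius):
    """Mean of each pixel's square neighbourhood, clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = sum(img[j][i] for j in range(y0, y1) for i in range(x0, x1))
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out


def center_surround_saliency(img, center_radius=1, surround_radius=3):
    """Absolute center-minus-surround contrast, scaled so the peak is 1.0.

    Hypothetical helper: real bottom-up models combine several feature
    channels and scales; this uses a single intensity channel.
    """
    center = box_mean(img, center_radius)
    surround = box_mean(img, surround_radius)
    h, w = len(img), len(img[0])
    diff = [[abs(center[y][x] - surround[y][x]) for x in range(w)]
            for y in range(h)]
    peak = max(max(row) for row in diff)
    if peak == 0:
        return diff  # uniform image: nothing stands out
    return [[v / peak for v in row] for row in diff]


# Toy input: a dark 9x9 image with one bright pixel in the middle.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
saliency = center_surround_saliency(image)
```

On this toy image the highest saliency falls on and immediately around the bright pixel, while the far corners, whose surround window never reaches it, stay at zero.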

Influence of Movement Expertise on Visual Perception of Objects, Events and Motor Action

Developing and Applying Biologically-Inspired Vision Systems ◽

10.4018/978-1-4666-2539-6.ch001 ◽

2012 ◽

pp. 1-30 ◽

Cited By ~ 1

Author(s):

Kai Essig ◽

Oleg Strogan ◽

Helge Ritter ◽

Thomas Schack

Keyword(s):

Eye Movements ◽

Visual Perception ◽

Visual Attention ◽

Saliency Map ◽

Long Term Memory ◽

Bottom Up ◽

Term Memory ◽

Perceptual Skills ◽

Control Learning

Various computational models of visual attention rely on the extraction of salient points or proto-objects, i.e., discrete units of attention, computed from bottom-up image features. In recent years, different solutions integrating top-down mechanisms were implemented, as research has shown that although eye movements are initially influenced solely by bottom-up information, after some time goal-driven (high-level) processes dominate the guidance of visual attention towards regions of interest (Hwang, Higgins & Pomplun, 2009). However, even these improved modeling approaches are unlikely to generalize to a broader range of application contexts, because basic principles of visual attention, such as cognitive control, learning and expertise, have thus far not been sufficiently taken into account (Tatler, Hayhoe, Land & Ballard, 2011). In some recent work, the authors showed the functional role and representational nature of long-term memory structures for human perceptual skills and motor control. Based on these findings, the chapter extends a widely applied saliency-based model of visual attention (Walther & Koch, 2006) in two ways: first, it computes the saliency map using the cognitive visual attention approach (CVA) that shows a correspondence between regions of high saliency values and regions of visual interest indicated by participants’ eye movements (Oyekoya & Stentiford, 2004). Second, it adds an expertise-based component (Schack, 2012) to represent the influence of the quality of mental representation structures in long-term memory (LTM) and the roles of learning on the visual perception of objects, events, and motor actions.

Visual Attention Guided Object Detection and Tracking

Innovative Research in Attention Modeling and Computer Vision Applications - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-4666-8723-3.ch004 ◽

2016 ◽

pp. 99-114

Author(s):

Debi Prosad Dogra

Keyword(s):

Computer Vision ◽

Object Recognition ◽

Visual Attention ◽

Visual Saliency ◽

Video Object ◽

Salient Region Detection ◽

Region Detection ◽

Video Object Tracking ◽

Detection And Tracking ◽

Pros And Cons

Scene understanding and object recognition depend heavily on the success of visual attention guided salient region detection in images and videos. Therefore, a summary of computer vision techniques that use visual attention models to accomplish video object recognition and tracking can be helpful to researchers in the computer vision community. This chapter aims to present a philosophical overview of the possible applications of visual attention models in the context of object recognition and tracking. The chapter begins with a brief introduction to various visual saliency models suitable for object recognition, followed by discussions of possible applications of attention models to video object tracking. The chapter also provides a commentary on the existing techniques available in this domain and discusses some of their possible extensions. Prospective readers should benefit, since the chapter guides them through the pros and cons of this particular topic.

An Application of Deep Learning to Tactile Data for Object Recognition under Visual Guidance

Sensors ◽

10.3390/s19071534 ◽

2019 ◽

Vol 19 (7) ◽

pp. 1534 ◽

Cited By ~ 3

Author(s):

Ghazal Rouhafzay ◽

Ana-Maria Cretu

Keyword(s):

Object Recognition ◽

Visual Attention ◽

Visual Information ◽

Support Vector ◽

Sequential Data ◽

K Nearest Neighbors ◽

Object A ◽

Haptic Exploration ◽

Virtual Force ◽

Neuroscience Research

Drawing inspiration from haptic exploration of objects by humans, the current work proposes a novel framework for robotic tactile object recognition, where visual information in the form of a set of visually interesting points is employed to guide the process of tactile data acquisition. Neuroscience research confirms the integration of cutaneous data as a response to surface changes sensed by humans with data from joints, muscles, and bones (kinesthetic cues) for object recognition. On the other hand, psychological studies demonstrate that humans tend to follow object contours to perceive their global shape, which leads to object recognition. In compliance with these findings, a series of contours are determined around a set of 24 virtual objects, from which bimodal tactile data (kinesthetic and cutaneous) are obtained sequentially while adaptively changing the size of the sensor surface according to the geometry of each object. A virtual Force Sensing Resistor (FSR) array is employed to capture cutaneous cues. Two different methods for sequential data classification are then implemented using Convolutional Neural Networks (CNN) and conventional classifiers, including support vector machines and k-nearest neighbors. In the case of conventional classifiers, we exploit the contourlet transformation to extract features from tactile images. In the case of CNN, two networks are trained for cutaneous and kinesthetic data and a novel hybrid decision-making strategy is proposed for object recognition. The proposed framework is tested both for contours determined blindly (randomly determined contours of objects) and contours determined using a model of visual attention. Trained classifiers are tested on 4560 new sequential tactile data samples, and the CNN trained on tactile data from object contours selected by the model of visual attention yields an accuracy of 98.97%, the highest among the implemented approaches.

A Neurodynamical cortical model of visual attention and invariant object recognition

Vision Research ◽

10.1016/j.visres.2003.09.037 ◽

2004 ◽

Vol 44 (6) ◽

pp. 621-642 ◽

Cited By ~ 193

Author(s):

Gustavo Deco ◽

Edmund T. Rolls

Keyword(s):

Object Recognition ◽

Visual Attention ◽

Cortical Model ◽

Invariant Object Recognition

Extraction of Visual Attention with Gaze Duration and Saliency Map

IEEE International Symposium on Intelligent Control ◽

10.1109/cca.2006.285931 ◽

2006 ◽

Cited By ~ 5

Author(s):

H. Igarashi ◽

S. Suzuki ◽

T. Sugita ◽

M. Kurisu ◽

M. Kakikura

Keyword(s):

Visual Attention ◽

Saliency Map

Infant visual attention and object recognition

Behavioural Brain Research ◽

10.1016/j.bbr.2015.01.015 ◽

2015 ◽

Vol 285 ◽

pp. 34-43 ◽

Cited By ~ 34

Author(s):

Greg D. Reynolds

Keyword(s):

Object Recognition ◽

Visual Attention

Extraction of visual attention with gaze duration and saliency map

10.1109/cacsd-cca-isic.2006.4776707 ◽

2006 ◽

Cited By ~ 2

Author(s):

Hiroshi Igarashi ◽

Satoshi Suzuki ◽

Tetsuro Sugita ◽

Masamitsu Kurisu ◽

Masayoshi Kakikura

Keyword(s):

Visual Attention ◽

Saliency Map

The Application of Visual Attention Mechanism in Road Disaster Identification and Early Warning System

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.385-386.523 ◽

2013 ◽

Vol 385-386 ◽

pp. 523-526

Author(s):

Shu Yue Hua ◽

Nan Feng Xiao

Keyword(s):

Visual Attention ◽

Early Warning ◽

Early Warning System ◽

Warning System ◽

Saliency Map ◽

Attention Mechanism ◽

Total System ◽

Time Performance ◽

Disaster Monitoring ◽

Visual Attention Mechanism

A visual attention mechanism is introduced into the traditional road disaster monitoring and early warning system. In this system, the disaster region is the focus of attention (FOA), which is also the object that needs to be processed. Itti's algorithm [1] is used to extract the saliency map and then quickly locate the regions that may contain a disaster according to their saliency, so that recognition and early warning of the disaster can be completed quickly. The method was tested on simulated snowstorms and rolling stones, and the corresponding experimental results are given. The experimental results show the correctness and efficiency of introducing the visual attention mechanism into the road disaster monitoring and early warning system. This is of great significance and practical value for reducing computation and improving the real-time performance of the total system.
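The step of locating candidate regions from a saliency map can be sketched as a winner-take-all selection with inhibition of return, which is how focus-of-attention selection is commonly described for Itti-style models. This is an illustrative assumption rather than the authors' implementation: the function name `focus_of_attention`, its parameters, and the toy map are hypothetical, and the saliency map is taken as already computed.

```python
def focus_of_attention(saliency, n_foci=2, inhibit_radius=1):
    """Return up to n_foci (row, col) peaks of a 2D saliency map.

    Each selected peak is followed by inhibition of return: the square
    neighbourhood around it is zeroed so the next pick lands elsewhere.
    """
    work = [row[:] for row in saliency]  # copy so the input is not modified
    h, w = len(work), len(work[0])
    foci = []
    for _ in range(n_foci):
        # Winner-take-all: the most salient remaining location wins.
        value, y, x = max((work[j][i], j, i)
                          for j in range(h) for i in range(w))
        if value <= 0:
            break  # nothing salient is left
        foci.append((y, x))
        for j in range(max(0, y - inhibit_radius), min(h, y + inhibit_radius + 1)):
            for i in range(max(0, x - inhibit_radius), min(w, x + inhibit_radius + 1)):
                work[j][i] = 0.0
    return foci


# Toy saliency map with a strong peak at (1, 1) and a weaker one at (3, 3).
toy_map = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.1],
]
foci = focus_of_attention(toy_map)
```

Only the few returned locations would then need to be passed to a slower recognition stage, which is the kind of computational saving the abstract attributes to the attention mechanism.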