Separation between top-down and bottom-up control of visual attention

<p>The human visual attention (HVA) system encompasses a set of interconnected neurological modules that analyze visual stimuli by attending to the regions that are salient. Two contrasting biological mechanisms exist in the HVA system: bottom-up, data-driven attention and top-down, task-driven attention. The former is mostly responsible for low-level instinctive behaviors, while the latter is responsible for performing complex visual tasks such as target object detection. Very few computational models of top-down attention have been proposed, for three main reasons. First, the top-down process involves many influential factors. Second, top-down responses vary from task to task. Finally, many biological aspects of the top-down process are not yet well understood. For these reasons, it is difficult to devise a generalized top-down model that could be applied to all high-level visual tasks. Instead, this thesis addresses some outstanding issues in modelling top-down attention for one particular task, target object detection. Target object detection is an essential step in analyzing images before performing more complex visual tasks. It has not been investigated thoroughly in the context of top-down saliency and hence constitutes the main application domain of this thesis. The thesis investigates methods to model top-down attention through various high-level data acquired from images. Furthermore, it investigates different strategies to dynamically combine bottom-up and top-down processes to improve the detection accuracy, as well as the computational efficiency, of existing and new visual attention models. The following techniques and approaches are proposed to address the outstanding issues in modelling top-down saliency:
1. A top-down saliency model that weights low-level attentional features through contextual knowledge of a scene. The proposed model assigns weights to the features of a novel image by extracting a contextual descriptor of the image. The contextual descriptor tunes the weighting of low-level features to maximize detection accuracy. Incorporating context into the feature weighting mechanism improves the quality of the weights assigned to these features.
2. Two modules of target features combined with contextual weighting to improve detection accuracy of the target object. In this model, two sets of attentional feature weights are learned, one through context and the other through target features. When both sources of knowledge are used to model top-down attention, a drastic increase in detection accuracy is achieved in images with complex backgrounds and a variety of target objects.
3. A top-down and bottom-up attention combination model based on feature interaction. This model provides a dynamic way of combining both processes by formulating the problem as feature selection. The feature selection exploits the interaction between these features, yielding a robust set of features that maximizes both the detection accuracy and the overall efficiency of the system.
4. A feature map quality score estimation model that accurately predicts the detection accuracy score of any previously unseen feature map without the need for ground-truth data. The model extracts various local, global, geometrical and statistical characteristics from a feature map. These characteristics guide a regression model to estimate the quality of a novel map.
5. A dynamic feature integration framework for combining bottom-up and top-down saliencies at runtime. If the estimation model can predict the quality score of any novel feature map accurately, then dynamic feature map integration can be performed based on the estimated value. We propose two frameworks for feature map integration using the estimation model. The proposed integration framework achieves higher human fixation prediction accuracy with a minimal number of feature maps than that achieved by combining all feature maps.
The work proposed in this thesis provides new directions in modelling top-down saliency for target object detection. In addition, the dynamic approaches for top-down and bottom-up combination show considerable improvements over existing approaches in both efficiency and accuracy.</p>
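
The quality-driven integration described in item 5 can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `integrate_feature_maps`, the `top_k` parameter, and the choice of a quality-weighted average are all assumptions made for illustration, and the quality scores are taken as given (in the thesis they would come from the regression-based estimation model).

```python
def integrate_feature_maps(feature_maps, quality_scores, top_k=2):
    """Combine the top_k feature maps with the highest estimated quality.

    feature_maps: list of 2-D maps (lists of rows), all with the same shape.
    quality_scores: one estimated score per map (hypothetical regressor output).
    Returns the combined map and the indices of the maps that were used.
    """
    if len(feature_maps) != len(quality_scores):
        raise ValueError("need one quality score per feature map")

    # Rank maps by estimated quality, best first, and keep only the top_k.
    ranked = sorted(range(len(feature_maps)),
                    key=lambda i: quality_scores[i], reverse=True)
    selected = ranked[:top_k]

    total = sum(quality_scores[i] for i in selected)
    if total <= 0:
        raise ValueError("selected quality scores must sum to a positive value")

    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    combined = [[0.0] * cols for _ in range(rows)]

    # Quality-weighted average (an assumed combination rule).
    for i in selected:
        weight = quality_scores[i] / total
        for r in range(rows):
            for c in range(cols):
                combined[r][c] += weight * feature_maps[i][r][c]
    return combined, selected


# Toy example: three 2x2 maps with made-up quality scores.
maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
scores = [0.75, 0.10, 0.25]
combined, used = integrate_feature_maps(maps, scores, top_k=2)
print(used)
print(combined)
```

Selecting only the highest-scoring maps reflects the abstract's point that a small number of well-chosen maps can outperform combining all of them; the weighting rule itself is only one plausible choice.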

A visual attention model combining top-down and bottom-up mechanisms for salient object detection

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp.2011.5946648 ◽ 2011 ◽ Cited By ~ 18

Author(s): Yuming Fang ◽ Weisi Lin ◽ Chiew Tong Lau ◽ Bu-Sung Lee

Keyword(s): Visual Attention ◽ Object Detection ◽ Salient Object Detection ◽ Salient Object ◽ Top Down ◽ Bottom Up ◽ Visual Attention Model ◽ Attention Model ◽ Model Combining

Neural substrates of visual percepts, imagery, and hallucinations

Behavioral and Brain Sciences ◽ 10.1017/s0140525x02350040 ◽ 2002 ◽ Vol 25 (2) ◽ pp. 194-195

Author(s): Stephen Grossberg

Keyword(s): Visual Attention ◽ Normal Control ◽ Mental Imagery ◽ Visual Information ◽ Neural Substrates ◽ Visual Hallucinations ◽ Neural Models ◽ Top Down ◽ Bottom Up

Recent neural models clarify many properties of mental imagery as part of the process whereby bottom-up visual information is influenced by top-down expectations, and how these expectations control visual attention. Volitional signals can transform modulatory top-down signals into supra-threshold imagery. Visual hallucinations can occur when the normal control of these volitional signals is lost.

Dissection of early bottom-up and top-down deficits during visual attention in schizophrenia

Clinical Neurophysiology ◽ 10.1016/j.clinph.2010.06.011 ◽ 2011 ◽ Vol 122 (1) ◽ pp. 90-98 ◽ Cited By ~ 19

Author(s): Andres H. Neuhaus ◽ Christine Karl ◽ Eric Hahn ◽ Niklas R. Trempler ◽ Carolin Opgen-Rhein ◽ ...

Keyword(s): Visual Attention ◽ Top Down ◽ Bottom Up

Visual Attention With Cognitive Aging

Oxford Research Encyclopedia of Psychology ◽ 10.1093/acrefore/9780190236557.013.369 ◽ 2019 ◽ Cited By ~ 1

Author(s): David J. Madden ◽ Zachary A. Monge

Keyword(s): Visual Attention ◽ Visual Processing ◽ Cognitive Aging ◽ Brain Regions ◽ Object Identification ◽ Features Selection ◽ Top Down ◽ Bottom Up ◽ Attentional Functioning ◽ Age Related

Age-related decline occurs in several aspects of fluid, speed-dependent cognition, particularly those related to attention. Empirical research on visual attention has determined that attention-related effects occur across a range of information processing components, including the sensory registration of features, selection of information from working memory, controlling motor responses, and coordinating multiple perceptual and cognitive tasks. Thus, attention is a multifaceted construct that is relevant at virtually all stages of object identification. A fundamental theme of attentional functioning is the interaction between the bottom-up salience of visual features and top-down allocation of processing based on the observer’s goals. An underlying age-related slowing is prominent throughout visual processing stages, which in turn contributes to age-related decline in some aspects of attention, such as the inhibition of irrelevant information and the coordination of multiple tasks. However, some age-related preservation of attentional functioning is also evident, particularly the top-down allocation of attention. Neuroimaging research has identified networks of frontal and parietal brain regions relevant for top-down and bottom-up attentional processing. Disconnection among these networks contributes to an age-related decline in attention, but preservation and perhaps even increased patterns of functional brain activation and connectivity also contribute to preserved attentional functioning.

A Novel Method of Visual Attention for Targets Detection

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.347-350.3764 ◽ 2013 ◽ Vol 347-350 ◽ pp. 3764-3768 ◽ Cited By ~ 1

Author(s): Zhuo Zhang ◽ Xin Nan Fan ◽ Xue Wu Zhang ◽ Hai Yan Xu ◽ Min Li

Keyword(s): Visual Attention ◽ Information Extraction ◽ Feature Space ◽ Gain Factor ◽ Top Down ◽ Bottom Up ◽ Attention Model ◽ Proposed Model ◽ Novel Method ◽ Statistical Prior

Inspired by research on the human visual system in neuroanatomy and psychology, the paper proposes a two-way collaborative visual attention model for target detection. In this method, bottom-up attention information cooperates with top-down attention information to detect a target rapidly and accurately. First, statistical prior knowledge of the target and background is applied to optimize bottom-up attention information in different feature spaces and scale spaces. Second, after the SNR of the salience difference between target and interference is computed, the bottom-up gain factor is obtained. Third, the gain factor is applied to adjust bottom-up attention information extraction and thereby maximize the salience contrast between target and background. Finally, the target is detected from the adjusted saliency. Experimental results show that the proposed model can improve the real-time capability and reliability of target detection.
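
The gain-factor idea in this abstract can be sketched roughly as follows. The abstract does not give the formulas, so everything here is an assumption for illustration: the SNR is taken to be the ratio of mean target salience to mean background salience per feature, the gains are those SNRs normalised to average 1, and the function names `gain_factors` and `modulated_saliency` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def gain_factors(target_salience, background_salience, eps=1e-9):
    """Per-feature gains proportional to an assumed target/background SNR.

    target_salience[k] and background_salience[k] hold sample salience values
    of feature k at target and background locations (hypothetical prior data).
    """
    snrs = [mean(t) / (mean(b) + eps)
            for t, b in zip(target_salience, background_salience)]
    total = sum(snrs)
    if total <= 0:
        raise ValueError("at least one feature must respond to the target")
    # Normalise so the gains average to 1 and the overall scale is preserved.
    return [len(snrs) * s / total for s in snrs]


def modulated_saliency(feature_maps, gains):
    """Gain-weighted sum of same-shaped 2-D bottom-up feature maps."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for fmap, gain in zip(feature_maps, gains):
        for r in range(rows):
            for c in range(cols):
                saliency[r][c] += gain * fmap[r][c]
    return saliency


# Toy prior: feature 0 responds to the target, feature 1 to the background.
gains = gain_factors(
    target_salience=[[0.8, 0.9], [0.2, 0.3]],
    background_salience=[[0.2, 0.2], [0.4, 0.6]],
)

# One-row image: pixel 0 is the target, pixel 1 is a distractor.
maps = [[[0.9, 0.1]], [[0.2, 0.6]]]
saliency = modulated_saliency(maps, gains)
print(gains)
print(saliency)
```

With these made-up numbers the feature that discriminates the target receives the larger gain, so the target pixel ends up with a higher contrast against the distractor than in an unweighted sum.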
