Information Bottleneck Approach to Spatial Attention Learning

Author(s):  
Qiuxia Lai ◽  
Yu Li ◽  
Ailing Zeng ◽  
Minhao Liu ◽  
Hanqiu Sun ◽  
...  

The selective visual attention mechanism in the human visual system (HVS) restricts the amount of information that reaches visual awareness when perceiving natural scenes, allowing near real-time information processing with limited computational capacity. This kind of selectivity acts as an ‘Information Bottleneck (IB)’, which seeks a trade-off between information compression and predictive accuracy. However, such information constraints are rarely explored in the attention mechanisms of deep neural networks (DNNs). In this paper, we propose an IB-inspired spatial attention module for DNN structures built for visual recognition. The module takes as input an intermediate representation of the input image, and outputs a variational 2D attention map that minimizes the mutual information (MI) between the attention-modulated representation and the input, while maximizing the MI between the attention-modulated representation and the task label. To further restrict the information passed by the attention map, we quantize the continuous attention scores to a set of learnable anchor values during training. Extensive experiments show that the proposed IB-inspired spatial attention mechanism can yield attention maps that neatly highlight the regions of interest while suppressing backgrounds, and bootstrap standard DNN structures for visual recognition tasks (e.g., image classification, fine-grained recognition, cross-domain classification). The attention maps are interpretable for the decision making of the DNNs, as verified in the experiments. Our code is available at this https URL.
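The quantization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each continuous attention score is snapped to its nearest anchor value; the abstract does not state the assignment rule or how the anchors are learned, so the function name `quantize_to_anchors` and the anchor values below are hypothetical and shown for illustration only.

```python
from typing import List, Sequence


def quantize_to_anchors(
    scores: Sequence[Sequence[float]], anchors: Sequence[float]
) -> List[List[float]]:
    """Replace every score in a 2D attention map with its nearest anchor.

    Assumption: nearest-anchor assignment. The paper may use a different
    rule, and it learns the anchors during training; here they are fixed.
    """
    if not anchors:
        raise ValueError("at least one anchor value is required")
    return [
        [min(anchors, key=lambda a: abs(a - s)) for s in row]
        for row in scores
    ]


# Hypothetical 2x2 attention map and hand-picked (not learned) anchors.
attention_map = [[0.05, 0.48], [0.71, 0.93]]
anchors = [0.0, 0.5, 1.0]
quantized = quantize_to_anchors(attention_map, anchors)
print(quantized)
```

Restricting the scores to a few anchor values limits how many distinct attention levels the map can express, which is one plausible reading of how quantization further restricts the information the map lets through.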

Author(s):  
Chaojian Yu ◽  
Xinyi Zhao ◽  
Qi Zheng ◽  
Peng Zhang ◽  
Xinge You

2021 ◽  
Author(s):  
Daniel Jenkins

Multisensory integration describes the cognitive processes by which information from various perceptual domains is combined to create coherent percepts. For consciously aware perception, multisensory integration can be inferred when information in one perceptual domain influences subjective experience in another. Yet the relationship between integration and awareness is not well understood. One current question is whether multisensory integration can occur in the absence of perceptual awareness. Because there is no subjective experience for unconscious perception, researchers have had to develop novel tasks to infer integration indirectly. For instance, Palmer and Ramsey (2012) presented auditory recordings of spoken syllables alongside videos of faces speaking either the same or different syllables, while masking the videos to prevent visual awareness. The conjunction of matching voices and faces predicted the location of a subsequent Gabor grating (target) on each trial. Participants indicated the location/orientation of the target more accurately when it appeared in the cued location (80% chance); the authors therefore inferred that auditory and visual speech events were integrated in the absence of visual awareness. In this thesis, I investigated whether these findings generalise to the integration of auditory and visual expressions of emotion. In Experiment 1, I presented spatially informative cues in which congruent facial and vocal emotional expressions predicted the target location, with and without visual masking. I found no evidence of spatial cueing in either awareness condition. To investigate the lack of spatial cueing, in Experiment 2, I repeated the task with aware participants only, and had half of those participants explicitly report the emotional prosody. A significant spatial-cueing effect was found only when participants reported emotional prosody, suggesting that audiovisual congruence can cue spatial attention during aware perception. It remains unclear whether audiovisual congruence can cue spatial attention without awareness, and whether such effects genuinely imply multisensory integration.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yongyi Li ◽  
Shiqi Wang ◽  
Shuang Dong ◽  
Xueling Lv ◽  
Changzhi Lv ◽  
...  

Person reidentification (Re-ID) based on attention mechanisms has attracted considerable research interest. Although an attention module can improve the representation ability and reidentification accuracy of a Re-ID model to a certain extent, this improvement depends on the coupling of the attention module and the original network. In this paper, a person reidentification model that combines multiple attentions and multiscale residuals is proposed. The model introduces a combined attention fusion module and a multiscale residual fusion module into the ResNet-50 backbone network to enhance the feature flow between residual blocks and to better fuse multiscale features. Furthermore, a global branch and a local branch are designed: the dual ensemble attention module enhances the channel aggregation and position perception ability of the network, and fine-grained feature expression is obtained by using multiproportion block and reorganization. Thus, both the global and the local features are enhanced. Experimental results on the Market-1501 and DukeMTMC-reID datasets show that the indexes of the presented model, especially Rank-1 accuracy, reach 96.20% and 89.59%, respectively, which can be considered progress in Re-ID.
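For readers unfamiliar with the Rank-1 metric reported above, the following is a minimal sketch of how it is commonly computed: a query counts as a hit when its nearest gallery image carries the same identity. It is a simplification and not the authors' evaluation code; standard Re-ID protocols also filter out gallery images from the same camera as the query, which is omitted here. The function name `rank1_accuracy` and the toy distances are hypothetical.

```python
from typing import Sequence


def rank1_accuracy(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> float:
    """Fraction of queries whose closest gallery item has the same identity.

    dist[q][g] is the distance between query q and gallery item g.
    Simplification: no same-camera filtering is applied.
    """
    if not dist:
        raise ValueError("at least one query is required")
    hits = 0
    for q, row in enumerate(dist):
        # Index of the gallery item with the smallest distance to query q.
        best = min(range(len(row)), key=row.__getitem__)
        if gallery_ids[best] == query_ids[q]:
            hits += 1
    return hits / len(dist)


# Toy example: 3 queries against 3 gallery images (made-up distances).
dist = [
    [0.2, 0.9, 0.4],  # nearest is gallery 0 (identity 1): hit
    [0.8, 0.3, 0.1],  # nearest is gallery 2 (identity 3): miss
    [0.5, 0.6, 0.2],  # nearest is gallery 2 (identity 3): hit
]
accuracy = rank1_accuracy(dist, query_ids=[1, 2, 3], gallery_ids=[1, 2, 3])
print(f"Rank-1 accuracy: {accuracy:.2%}")
```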


2021 ◽  
Author(s):  
Zongbao Liang ◽  
Xing Liu ◽  
Bo Chen ◽  
YunFei Yuan ◽  
Yang Song ◽  
...  
