Combining an information-maximization-based attention mechanism and illumination invariance theory for the recognition of green apples in natural scenes

2020 ◽  
Vol 79 (37-38) ◽  
pp. 28301-28327
Author(s):  
Sashuang Sun ◽  
Mei Jiang ◽  
Ning Liang ◽  
Dongjian He ◽  
Yan Long ◽  
...  
2019 ◽  
Vol 10 (1) ◽  
pp. 101 ◽  
Author(s):  
Yadong Yang ◽  
Chengji Xu ◽  
Feng Dong ◽  
Xiaofeng Wang

Computer vision systems are insensitive to the scale of objects in natural scenes, so it is important to study the multi-scale representation of features. Res2Net implements hierarchical multi-scale convolution in residual blocks, but its random grouping method affects the robustness and intuitive interpretability of the network. We propose a new multi-scale convolution model based on multiple attention. It introduces the attention mechanism into the structure of a Res2-block to better guide feature expression. First, we adopt channel attention to score channels and sort them in descending order of the feature’s importance (Channels-Sort). The sorted residual blocks are grouped and hierarchically convolved within each block to form a single attention and multi-scale block (AMS-block). Then, we implement channel attention on the residual small blocks to constitute a dual attention and multi-scale block (DAMS-block). Finally, we introduce spatial attention before sorting the channels to form a multi-attention multi-scale block (MAMS-block). A MAMS-convolutional neural network (CNN) is a series of multiple MAMS-blocks. It enables significant information to be expressed at more levels, and can also be easily grafted into different convolutional structures. Limited by hardware conditions, we only prove the validity of the proposed ideas through convolutional networks of the same magnitude. The experimental results show that the convolution model with an attention mechanism and multi-scale features is superior in image classification.
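
The Channels-Sort step described in this abstract (score channels, sort them in descending order of importance, then group them) can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring function below is an assumption, a sigmoid of each channel's global average standing in for a learned channel-attention module, and the names `channel_scores` and `sort_and_group` are hypothetical.

```python
import math


def channel_scores(feature_map):
    """Score each channel with a sigmoid of its global average.

    Assumption: this replaces the learned channel attention of the paper,
    which the abstract does not specify in detail.
    """
    scores = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        scores.append(1.0 / (1.0 + math.exp(-mean)))
    return scores


def sort_and_group(feature_map, num_groups):
    """Sort channel indices by descending score and split them into equal groups."""
    scores = channel_scores(feature_map)
    order = sorted(range(len(feature_map)), key=lambda i: scores[i], reverse=True)
    if len(order) % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = len(order) // num_groups
    groups = [order[g * size:(g + 1) * size] for g in range(num_groups)]
    return order, groups


# Toy feature map: 4 channels, each 2x2 (values are illustrative only).
fmap = [
    [[0.1, 0.2], [0.0, 0.1]],      # channel 0, mean 0.1
    [[2.0, 1.0], [1.5, 1.5]],      # channel 1, mean 1.5
    [[-1.0, -0.5], [-0.5, -1.0]],  # channel 2, mean -0.75
    [[0.5, 0.5], [0.5, 0.5]],      # channel 3, mean 0.5
]
order, groups = sort_and_group(fmap, num_groups=2)
print(order)   # channel indices, most important first
print(groups)  # sorted indices split into groups for hierarchical convolution
```

In this sketch the groups only hold channel indices; the hierarchical convolution within each group, as in a Res2-block, is left out.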


Author(s):  
Qiuxia Lai ◽  
Yu Li ◽  
Ailing Zeng ◽  
Minhao Liu ◽  
Hanqiu Sun ◽  
...  

The selective visual attention mechanism in the human visual system (HVS) restricts the amount of information that reaches visual awareness when perceiving natural scenes, allowing near real-time information processing with limited computational capacity. This kind of selectivity acts as an ‘Information Bottleneck (IB)’, which seeks a trade-off between information compression and predictive accuracy. However, such information constraints are rarely explored in the attention mechanism for deep neural networks (DNNs). In this paper, we propose an IB-inspired spatial attention module for DNN structures built for visual recognition. The module takes as input an intermediate representation of the input image, and outputs a variational 2D attention map that minimizes the mutual information (MI) between the attention-modulated representation and the input, while maximizing the MI between the attention-modulated representation and the task label. To further restrict the information bypassed by the attention map, we quantize the continuous attention scores to a set of learnable anchor values during training. Extensive experiments show that the proposed IB-inspired spatial attention mechanism can yield attention maps that neatly highlight the regions of interest while suppressing backgrounds, and bootstrap standard DNN structures for visual recognition tasks (e.g., image classification, fine-grained recognition, cross-domain classification). The attention maps are interpretable for the decision making of the DNNs as verified in the experiments. Our code is available at this https URL.
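
The quantization step mentioned in this abstract, mapping continuous attention scores to a small set of anchor values, can be sketched as nearest-anchor rounding. This is an illustration under stated assumptions rather than the authors' method: the anchors here are fixed constants, whereas the paper describes them as learnable, and the function name `quantize_attention` is hypothetical.

```python
def quantize_attention(attention_map, anchors):
    """Replace every attention score with the nearest anchor value.

    Assumption: anchors are fixed numbers in this sketch; in the paper they
    are learned during training.
    """
    return [
        [min(anchors, key=lambda a: abs(a - score)) for score in row]
        for row in attention_map
    ]


# Toy 2x2 attention map with scores in [0, 1] (illustrative values only).
attention = [
    [0.05, 0.40],
    [0.80, 0.55],
]
anchors = [0.0, 0.5, 1.0]

quantized = quantize_attention(attention, anchors)
print(quantized)
```

Restricting the map to a few discrete values limits how much information the attention map itself can carry, which matches the compression side of the trade-off the abstract describes.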


1995 ◽  
Author(s):  
S.N. Yendrikhovskij ◽  
H. DE Ridder ◽  
E.A. Fedorovskaya

2020 ◽  
Vol 140 (12) ◽  
pp. 1393-1401
Author(s):  
Hiroki Chinen ◽  
Hidehiro Ohki ◽  
Keiji Gyohten ◽  
Toshiya Takami
