A New Multi-Scale Convolutional Model Based on Multiple Attention for Image Classification

2019 ◽  
Vol 10 (1) ◽  
pp. 101 ◽  
Author(s):  
Yadong Yang ◽  
Chengji Xu ◽  
Feng Dong ◽  
Xiaofeng Wang

Computer vision systems are insensitive to the scale of objects in natural scenes, so it is important to study the multi-scale representation of features. Res2Net implements hierarchical multi-scale convolution in residual blocks, but its random grouping method affects the robustness and intuitive interpretability of the network. We propose a new multi-scale convolution model based on multiple attention. It introduces the attention mechanism into the structure of a Res2-block to better guide feature expression. First, we adopt channel attention to score channels and sort them in descending order of feature importance (Channels-Sort). The sorted residual blocks are grouped and hierarchically convolved within each block to form a single attention and multi-scale block (AMS-block). Then, we apply channel attention to the small residual blocks to form a dual attention and multi-scale block (DAMS-block). Finally, we introduce spatial attention before sorting the channels to form a multi-attention and multi-scale block (MAMS-block). A MAMS-convolutional neural network (CNN) is a series of multiple MAMS-blocks. It enables significant information to be expressed at more levels and can be easily grafted into different convolutional structures. Limited by hardware conditions, we only validate the proposed ideas on convolutional networks of the same magnitude. The experimental results show that the convolution model with an attention mechanism and multi-scale features is superior in image classification.
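The Channels-Sort step and the hierarchical grouping described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the scoring function (global average pooling followed by a sigmoid, with no learned layers), the function names, and the `transform` placeholder standing in for a learned 3x3 convolution are all assumptions made for illustration.

```python
import numpy as np

def channel_scores(x):
    """Score each channel of x (shape C, H, W) in (0, 1).

    Assumption: global average pooling followed by a sigmoid. A trained
    channel-attention module would add learned layers in between.
    """
    pooled = x.mean(axis=(1, 2))
    return 1.0 / (1.0 + np.exp(-pooled))

def channels_sort(x):
    """Reorder channels in descending order of their attention score."""
    order = np.argsort(-channel_scores(x), kind="stable")
    return x[order], order

def hierarchical_groups(x, groups, transform):
    """Res2Net-style hierarchy over channel groups.

    The first group passes through unchanged, the second is transformed on
    its own, and each later group is added to the previous group's output
    before being transformed.
    """
    chunks = np.split(x, groups, axis=0)
    outputs = [chunks[0]]
    previous = None
    for chunk in chunks[1:]:
        inp = chunk if previous is None else chunk + previous
        previous = transform(inp)
        outputs.append(previous)
    return np.concatenate(outputs, axis=0)

# Hypothetical usage: four channels with constant values, identity transform.
x = np.zeros((4, 2, 2))
for channel, value in enumerate([0.1, 3.0, -1.0, 1.0]):
    x[channel] = value
sorted_x, order = channels_sort(x)
out = hierarchical_groups(sorted_x, groups=4, transform=lambda t: t)
```

Sorting before grouping means that the most highly scored channels land in the first groups, in contrast to the random grouping that the abstract attributes to Res2Net.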

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yongyi Li ◽  
Shiqi Wang ◽  
Shuang Dong ◽  
Xueling Lv ◽  
Changzhi Lv ◽  
...  

At present, person reidentification (Re-ID) based on attention mechanisms has attracted the interest of many scholars. Although an attention module can improve the representation ability and reidentification accuracy of a Re-ID model to a certain extent, the improvement depends on the coupling between the attention module and the original network. In this paper, a person reidentification model that combines multiple attentions and multiscale residuals is proposed. The model introduces a combined attention fusion module and a multiscale residual fusion module into the ResNet-50 backbone network to enhance the feature flow between residual blocks and better fuse multiscale features. Furthermore, a global branch and a local branch are designed: the dual ensemble attention module enhances the channel aggregation and position perception ability of the network, while fine-grained feature expression is obtained through multiproportion blocking and reorganization. Thus, both the global and the local features are enhanced. The experimental results on the Market-1501 and DukeMTMC-reID datasets show that the Rank-1 accuracy of the presented model reaches 96.20% and 89.59%, respectively, which can be considered progress in Re-ID.
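Rank-1 accuracy, the metric reported above, is the fraction of query images whose nearest gallery image shares the query's identity. The sketch below is a simplified illustration using Euclidean distance on hypothetical feature vectors. Benchmark protocols typically apply extra filtering of gallery entries (for example by camera), which is omitted here.

```python
import numpy as np

def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose nearest gallery feature has the same identity."""
    q = np.asarray(query_feats, dtype=float)
    g = np.asarray(gallery_feats, dtype=float)
    # Pairwise Euclidean distances, shape (num_queries, num_gallery).
    dists = np.linalg.norm(q[:, None, :] - g[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    hits = np.asarray(gallery_ids)[nearest] == np.asarray(query_ids)
    return float(hits.mean())

# Hypothetical toy data: the first query is matched correctly, the second is not.
acc = rank1_accuracy(
    query_feats=[[0.0, 0.0], [1.0, 1.0]],
    query_ids=[1, 2],
    gallery_feats=[[0.0, 0.1], [1.0, 0.9], [5.0, 5.0]],
    gallery_ids=[1, 3, 2],
)
```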


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Qi Zhang

Image classification plays an important role in computer vision. Existing convolutional neural network methods have several problems in image classification, such as low accuracy in tumor classification and poor feature expression and feature extraction ability. Therefore, we propose a novel ResNet101 model based on dense dilated convolution for liver tumor classification in medical images. The multi-scale feature extraction module is used to extract multi-scale features of images and to increase the receptive field of the network. The depth feature extraction module is used to reduce background noise and focus on the effective features of the focal region. To obtain broader and deeper semantic information, a dense dilated convolution module is deployed in the network. This module combines the advantages of Inception, residual structure, and multi-scale dilated convolution to obtain deeper feature information without causing exploding or vanishing gradients. To address the feature loss that is common in classification networks, the up-sampling and down-sampling module in the network is improved, and multiple convolution kernels with different scales are cascaded to widen the network, which effectively avoids feature loss. Finally, experiments are carried out on the proposed method. Compared with the existing mainstream classification networks, the proposed method improves classification performance and achieves accurate classification of liver tumors. The effectiveness of the proposed method is further verified by ablation experiments.

Highlights: The multi-scale feature extraction module is introduced to extract multi-scale features of images; it can extract deep context information of the lesion region and surrounding tissues to enhance the feature extraction ability of the network. The depth feature extraction module is used to focus on the local features of the lesion region along both the channel and spatial dimensions, weaken the influence of irrelevant information, and strengthen the ability to recognize the lesion region. The feature extraction module is enhanced by the parallel structure of dense dilated convolution, and deeper feature information is obtained without losing image feature information, which improves the classification accuracy.
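To make concrete why dilated convolution enlarges the receptive field, the following sketch computes the effective kernel size of a dilated convolution, d(k - 1) + 1, and the receptive field of a sequential stack of stride-1 layers. The dilation rates used in the example are illustrative assumptions, not values taken from the paper.

```python
def effective_kernel(kernel_size, dilation):
    """Span covered by a kernel whose taps are `dilation` pixels apart."""
    return dilation * (kernel_size - 1) + 1

def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs applied in sequence.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += effective_kernel(kernel_size, dilation) - 1
    return rf

# Three 3x3 layers: plain versus dilated with hypothetical rates 1, 2, 4.
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
```

With the same number of layers and parameters, the dilated stack covers a 15-pixel span instead of 7.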


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 39069-39082 ◽  
Author(s):  
Md. Mostafa Kamal Sarker ◽  
Hatem A. Rashwan ◽  
Farhan Akram ◽  
Estefania Talavera ◽  
Syeda Furruka Banu ◽  
...  

2021 ◽  
Vol 25 (2) ◽  
pp. 359-382
Author(s):  
Tao Luo ◽  
Xudong Cao ◽  
Jin Li ◽  
Kun Dong ◽  
Rui Zhang ◽  
...  

The energy load data in the micro-energy network are a time series with sequential and nonlinear characteristics. This paper proposes a model based on the encoder-decoder architecture and ConvLSTM for multi-scale prediction of multi-energy loads in the micro-energy network. We combine ConvLSTM, LSTM, the attention mechanism, and multi-task learning to construct a model specifically for energy load forecasting in the micro-energy network. ConvLSTM is used to encode the input time series. The attention mechanism assigns different weights to the encoded features, which are subsequently decoded by the decoder LSTM layer. Finally, a fully connected layer interprets the output. The model is applied to forecast the multi-energy load data of the micro-energy network in a certain area of Northwest China. The test results show that our model converges and that its evaluation index values are better than those of the multi-task FC-LSTM and the single-task FC-LSTM. In particular, the attention mechanism makes the model converge faster and with higher precision.
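The step in which attention "assigns different weights to the features" before decoding can be sketched as follows. This is a generic dot-product attention over encoder outputs, chosen for illustration. The abstract does not state which scoring function the authors use, and the names `attend`, `encoded`, and `query` are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def attend(encoded, query):
    """Weight encoder outputs by their similarity to a decoder state.

    encoded: (T, D) array, one feature vector per time step.
    query:   (D,) array, for example the current decoder state.
    Returns the weights (T,) and the weighted context vector (D,).
    """
    weights = softmax(encoded @ query)
    context = weights @ encoded
    return weights, context

# Hypothetical toy input: three time steps with two features each.
encoded = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
weights, context = attend(encoded, query=np.array([1.0, 1.0]))
```

The context vector is what a decoder layer would consume in place of an unweighted summary of the encoded sequence.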


2020 ◽  
Vol 34 (07) ◽  
pp. 12581-12588
Author(s):  
Chuanguang Yang ◽  
Zhulin An ◽  
Hui Zhu ◽  
Xiaolong Hu ◽  
Kun Zhang ◽  
...  

We propose a simple yet effective method to reduce the redundancy of DenseNet: we substantially decrease the number of stacked modules by replacing the original bottleneck with our SMG module, which is augmented by a local residual. Furthermore, the SMG module is equipped with an efficient two-stage pipeline aimed at DenseNet-like architectures that need to integrate all previous outputs: it first squeezes the incoming informative but redundant features gradually through hierarchical convolutions in an hourglass shape, and then excites them with multi-kernel depthwise convolutions, whose output is compact and holds more informative multi-scale features. We further develop a forget gate and an update gate by introducing popular attention modules, so that reused and new features are fused effectively rather than by simple addition. Because of the Hybrid Connectivity (a nested combination of global dense and local residual connections) and the Gated mechanisms, we call our network HCGNet. Experimental results on the CIFAR and ImageNet datasets show that HCGNet is markedly more efficient than DenseNet and can also significantly outperform state-of-the-art networks with less complexity. Moreover, HCGNet shows remarkable interpretability and robustness under network dissection and adversarial defense, respectively. On MS-COCO, HCGNet consistently learns better features than popular backbones.
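The contrast between simple addition and gated fusion of reused and new features can be illustrated with a small sketch. The gate design below (a per-channel sigmoid gate computed from globally pooled features) is an assumption for illustration only. The abstract does not specify which attention modules HCGNet uses, and all names and parameters here are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_gate(x, weight, bias):
    """Per-channel gate in (0, 1) for x of shape (C, H, W).

    weight: (C, C) and bias: (C,) are hypothetical learned parameters
    applied to the globally average-pooled features.
    """
    pooled = x.mean(axis=(1, 2))
    return sigmoid(weight @ pooled + bias)

def gated_fusion(reused, new, forget_params, update_params):
    """Scale reused features by a forget gate and new features by an
    update gate, then sum them, instead of computing reused + new."""
    forget = channel_gate(reused, *forget_params)
    update = channel_gate(new, *update_params)
    return forget[:, None, None] * reused + update[:, None, None] * new

# With zero weights and biases both gates equal 0.5, so the result is
# the average of the two inputs.
channels = 2
zero_params = (np.zeros((channels, channels)), np.zeros(channels))
reused = np.ones((channels, 3, 3))
new = 3.0 * np.ones((channels, 3, 3))
fused = gated_fusion(reused, new, zero_params, zero_params)
```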


2021 ◽  
Vol 2082 (1) ◽  
pp. 012006
Author(s):  
Runyi Li ◽  
Sen Wang ◽  
Zizhou Wang ◽  
Lei Zhang

With ongoing development, image classification algorithms based on deep learning have shown good performance on some large datasets. Many proposals related to the attention mechanism have greatly improved model accuracy while also increasing the interpretability of the network structure. However, on medical image data, the performance of classification algorithms is not as good as expected. The reason is that fine-grained image data differ little between classes, which makes the domain knowledge hard for models to learn. We (1) proposed an EfficientNet model based on the CBAM attention mechanism and added a multi-scale fusion method; (2) applied the model to a breast cancer medical image dataset and completed the breast cancer classification task (Phase I, Phase II, Phase III, etc.) with high accuracy; and (3) compared our method with other existing image classification algorithms and found that it has the highest accuracy. We therefore conclude that EfficientNet with CBAM and multi-scale fusion improves classification performance. This result is helpful for deeper research on medical image processing and breast cancer staging.
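CBAM applies channel attention followed by spatial attention to a feature map. The sketch below shows that order of operations in NumPy under stated simplifications: the weights are hypothetical inputs rather than trained parameters, and the spatial step mixes the two pooled maps with two scalar weights (a 1x1 stand-in) instead of the 7x7 convolution used in CBAM. The multi-scale fusion mentioned in the abstract is not covered.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Channel weights for x of shape (C, H, W).

    Average-pooled and max-pooled channel descriptors pass through the
    same two-layer MLP (w1: (C // r, C), w2: (C, C // r)) and are summed.
    """
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)
    avg = x.mean(axis=(1, 2))
    peak = x.max(axis=(1, 2))
    return sigmoid(mlp(avg) + mlp(peak))

def spatial_attention(x, mix):
    """Spatial weights of shape (H, W) from channel-wise mean and max maps.

    mix: two scalar weights, a simplification of CBAM's 7x7 convolution.
    """
    avg = x.mean(axis=0)
    peak = x.max(axis=0)
    return sigmoid(mix[0] * avg + mix[1] * peak)

def cbam(x, w1, w2, mix):
    """Apply channel attention first, then spatial attention."""
    x = x * channel_attention(x, w1, w2)[:, None, None]
    return x * spatial_attention(x, mix)[None, :, :]

# Hypothetical usage with random weights on a 4-channel, 3x3 feature map.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 3))
refined = cbam(features, rng.normal(size=(2, 4)), rng.normal(size=(4, 2)), rng.normal(size=2))
```

Because both attention maps lie in (0, 1), the block only rescales the input features and leaves the tensor shape unchanged, which is what allows it to be inserted into an existing backbone.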

