Efficient Attention Mechanism for Dynamic Convolution in Lightweight Neural Network

2021 ◽  
Vol 11 (7) ◽  
pp. 3111
Author(s):  
Enjie Ding ◽  
Yuhao Cheng ◽  
Chengcheng Xiao ◽  
Zhongyu Liu ◽  
Wanli Yu

Lightweight convolutional neural networks (CNNs) suffer from limited feature representation capability because of their low computational budgets, which degrades performance. To make CNNs more efficient, dynamic neural networks (DyNet) have been proposed to increase model complexity by using the Squeeze-and-Excitation (SE) module to adaptively obtain the importance of each convolution kernel through the attention mechanism. However, the attention mechanism in the SE network (SENet) uses all channel information in its calculations, which raises two challenges: (a) interference caused by internal redundant information; and (b) an increased amount of network computation. To address these problems, this work proposes a dynamic convolutional network (termed EAM-DyNet) that reduces the number of channels in feature maps by extracting only the useful spatial information. EAM-DyNet first uses random channel reduction and channel grouping reduction to remove redundancy in the information. Because downsampling can lose useful information, it then applies adaptive average pooling to maintain information integrity. Extensive experiments on the baseline demonstrate that EAM-DyNet outperforms existing approaches, achieving higher test accuracy with fewer network parameters.
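
The kernel-attention idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single linear layer standing in for the SE module, and the use of plain averaging for channel grouping reduction are all assumptions made for illustration. Because both pooling and group averaging are linear, the sketch pools each channel first and then averages the pooled values within groups.

```python
import math


def group_reduce(channel_means, group_size):
    """Channel grouping reduction (assumed form): average consecutive groups."""
    groups = [
        channel_means[i:i + group_size]
        for i in range(0, len(channel_means), group_size)
    ]
    return [sum(g) / len(g) for g in groups]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kernel_attention(feature_map, fc_weights, group_size):
    """Return one attention weight per candidate kernel.

    feature_map: list of C channels, each an H x W list of lists.
    fc_weights:  K rows of length ceil(C / group_size); a hypothetical
                 single linear layer used here in place of the SE module.
    """
    # Global average pooling: one scalar per channel.
    channel_means = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    reduced = group_reduce(channel_means, group_size)
    logits = [sum(w * x for w, x in zip(row, reduced)) for row in fc_weights]
    return softmax(logits)


def aggregate_kernels(kernels, attention):
    """Dynamic convolution kernel: attention-weighted sum of K flat kernels."""
    size = len(kernels[0])
    return [sum(a * k[i] for a, k in zip(attention, kernels)) for i in range(size)]
```

With four channels and `group_size=2`, the attention layer sees two inputs instead of four, which is the source of the computational saving that the abstract attributes to channel reduction.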

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2158
Author(s):  
Juan Du ◽  
Kuanhong Cheng ◽  
Yue Yu ◽  
Dabao Wang ◽  
Huixin Zhou

Panchromatic (PAN) images contain abundant spatial information that is useful for earth observation, but always suffer from low resolution (LR) due to sensor limitations and the large-scale field of view. Current super-resolution (SR) methods based on traditional attention mechanisms have shown remarkable advantages but still reconstruct the edge details of SR images imperfectly. To address this problem, an improved SR model built on a self-attention augmented Wasserstein generative adversarial network (SAA-WGAN) is designed to dig out the reference information among multiple features for detail enhancement. We use an encoder-decoder network followed by a fully convolutional network (FCN) as the backbone to extract multi-scale features and reconstruct the high-resolution (HR) results. To exploit the relevance between multi-layer feature maps, we first integrate a convolutional block attention module (CBAM) into each skip-connection of the encoder-decoder subnet, generating weighted maps that automatically enhance both channel-wise and spatial-wise feature representation. Moreover, because the HR results and LR inputs are highly similar in structure, a similarity that traditional attention mechanisms cannot fully reflect, we design a self augmented attention (SAA) module in which the attention weights are produced dynamically via a similarity function between hidden features. This design allows the network to flexibly adjust the fractional relevance among multi-layer features and keep the long-range inter information, which helps preserve details. In addition, the pixel-wise loss is combined with perceptual and gradient losses to achieve comprehensive supervision. Experiments on benchmark datasets demonstrate that the proposed method outperforms other SR methods in terms of both objective evaluation and visual effect.
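
The abstract does not state which similarity function the SAA module uses. As a hedged illustration only, the sketch below assumes a dot-product similarity followed by a softmax, which is one common way to produce attention weights dynamically from hidden features. The function name and the argument layout are hypothetical.

```python
import math


def similarity_attention(query, keys, values):
    """Weight each value vector by softmax(dot(query, key)).

    The dot-product similarity is an assumption; the paper only says that
    weights come from "a similarity function between hidden features".
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    # Fuse the value vectors using the attention weights.
    fused = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, fused
```

Features that resemble the query receive larger weights, so the fused output leans toward the most relevant feature maps rather than treating all layers equally.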


Author(s):  
Juan Du ◽  
Kuanhong Cheng ◽  
Yue Yu ◽  
Dabao Wang ◽  
Huixin Zhou

Panchromatic (PAN) images contain abundant spatial information that is useful for earth observation, but always suffer from low resolution (LR) due to sensor limitations and the large-scale field of view. Current super-resolution (SR) methods based on traditional attention mechanisms have shown remarkable advantages but still reconstruct the edge details of SR images imperfectly. To address this problem, an improved super-resolution model built on a self-attention augmented Wasserstein generative adversarial network (WGAN) is designed to dig out the reference information among multiple features for detail enhancement. We use an encoder-decoder network followed by a fully convolutional network (FCN) as the backbone to extract multi-scale features and reconstruct the high-resolution (HR) results. To exploit the relevance between multi-layer feature maps, we first integrate a convolutional block attention module (CBAM) into each skip-connection of the encoder-decoder subnet, generating weighted maps that automatically enhance both channel-wise and spatial-wise feature representation. Moreover, because the HR results and LR inputs are highly similar in structure, a similarity that traditional attention mechanisms cannot fully reflect, we design a self augmented attention (SAA) module in which the attention weights are produced dynamically via a similarity function between hidden features. This design allows the network to flexibly adjust the fractional relevance among multi-layer features and keep the long-range inter information, which helps preserve details. In addition, the pixel-wise loss is combined with perceptual and gradient losses to achieve comprehensive supervision. Experiments on benchmark datasets demonstrate that the proposed method outperforms other SR methods in terms of both objective evaluation and visual effect.


2022 ◽  
Vol 16 (1) ◽  
pp. 1-20
Author(s):  
Ping Zhao ◽  
Zhijie Fan* ◽  
Zhiwei Cao ◽  
Xin Li

To improve the ability to detect network attacks, traditional intrusion detection models often use convolutional neural networks to encode spatial information or recurrent neural networks to obtain temporal features of the data. Some models combine the two methods to extract spatio-temporal features. However, these approaches use separate models and learn features insufficiently. This paper presents an improved model based on temporal convolutional networks (TCN) and an attention mechanism. The causal and dilated convolutions capture the spatio-temporal dependencies of the data. The residual blocks allow the network to transfer information across layers, enabling deep network learning. Meanwhile, the attention mechanism enhances the model's attention to the relevant anomalous features of different attacks. Finally, this paper compares model results on the KDD CUP99 and UNSW-NB15 datasets. In addition, the authors apply the model to video surveillance network attack detection scenarios. The results show that the model has advantages in the evaluation metrics.
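
The causal, dilated convolution at the heart of a TCN can be sketched in a few lines. This is a generic illustration of the operation, not the model from the paper: the function names are hypothetical, and the receptive-field formula assumes one convolution per layer with the same kernel size throughout.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Inputs before t = 0 are treated as zero, so y[t] never depends on
    future samples (the causal property).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convolutions, one per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

Doubling the dilation at each layer (1, 2, 4, ...) makes the receptive field grow exponentially with depth, which is why a TCN can cover long traffic sequences with few layers.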


2019 ◽  
Vol 11 (24) ◽  
pp. 2970 ◽  
Author(s):  
Ziran Ye ◽  
Yongyong Fu ◽  
Muye Gan ◽  
Jinsong Deng ◽  
Alexis Comber ◽  
...  

Automated methods to extract buildings from very high resolution (VHR) remote sensing data have applications in a wide range of fields. Many convolutional neural network (CNN) based methods have been proposed and have achieved significant advances in the building extraction task. To refine predictions, many recent approaches fuse features from earlier layers of CNNs to introduce abundant spatial information, a technique known as skip connection. However, reusing earlier features directly without processing can reduce the performance of the network. To address this problem, we propose a novel fully convolutional network (FCN) that adopts attention-based re-weighting to extract buildings from aerial imagery. Specifically, we consider the semantic gap between features from different stages and leverage the attention mechanism to bridge the gap prior to the fusion of features. The inferred attention weights along the spatial and channel-wise dimensions make the low-level feature maps adaptive to the high-level feature maps in a target-oriented manner. Experimental results on three publicly available aerial imagery datasets show that the proposed model (RFA-UNet) achieves comparable or improved performance relative to other state-of-the-art models for building extraction.


2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Jian Wang ◽  
Mengying Li ◽  
Qishuai Diao ◽  
Hongfei Lin ◽  
Zhihao Yang ◽  
...  

Background: Biomedical document triage is the foundation of biomedical information extraction, which is important to precision medicine. Recently, some neural network-based methods have been proposed to classify biomedical documents automatically. In the biomedical domain, documents are often very long and often contain very complicated sentences. However, the current methods still find it difficult to capture important features across sentences. Results: In this paper, we propose a hierarchical attention-based capsule model for biomedical document triage. The proposed model effectively employs a hierarchical attention mechanism and capsule networks to capture valuable features across sentences and construct a final latent feature representation for a document. We evaluated our model on three public corpora. Conclusions: Experimental results showed that both the hierarchical attention mechanism and capsule networks are helpful in the biomedical document triage task. Our method proved itself highly competitive or superior compared with other state-of-the-art methods.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Shasha Sun ◽  
Chuanpeng Li ◽  
Ning Lv ◽  
Xiaoman Zhang ◽  
Zhaoyan Yu ◽  
...  

Sleep staging is an important basis for diagnosing sleep-related problems. In this paper, an attention-based convolutional network for automatic sleep staging is proposed. The network takes a time-frequency image as input and predicts the sleep stage for each 30-s epoch as output. For each CNN feature map, our model generates attention maps along two separate dimensions, time and filter, which are then multiplied to form the final attention map. A residual-like fusion structure is used to append the attention map to the input feature map for adaptive feature refinement. In addition, to obtain the global feature representation with less information loss, generalized mean pooling is introduced. To demonstrate the efficacy of the proposed method, we compared it with two baseline methods on the Sleep-EDF dataset under different settings of the framework and input channel type. The experimental results show that the proposed model achieves significant improvements in terms of overall accuracy, Cohen’s kappa, MF1, sensitivity and specificity. Compared with state-of-the-art algorithms, the proposed network achieves an overall accuracy of 83.4%, a macro F1-score of 77.3%, κ = 0.77, sensitivity = 77.1% and specificity = 95.4%. The experimental results demonstrate the superiority of the proposed network.
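
Generalized mean (GeM) pooling, mentioned above, interpolates between average pooling and max pooling through an exponent p. The sketch below uses the standard definition; the function name, the default p = 3, and the clamping constant `eps` are assumptions chosen for illustration and are not taken from the paper.

```python
def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized mean pooling: (mean(v ** p)) ** (1 / p).

    Values are clamped to at least eps so the power is well defined for
    non-positive activations. p = 1 gives average pooling; as p grows the
    result approaches max pooling.
    """
    clamped = [max(v, eps) for v in values]
    mean_pow = sum(v ** p for v in clamped) / len(clamped)
    return mean_pow ** (1.0 / p)
```

Because larger exponents emphasize the strongest activations without discarding the rest, GeM can retain more information than max pooling while being more selective than a plain average.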


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Jie Xu ◽  
Hanyuan Wang ◽  
Mingzhu Xu ◽  
Fan Yang ◽  
Yifei Zhou ◽  
...  

Object detection is used widely in smart cities, including for safety monitoring, traffic control, and car driving. However, in the smart city scenario, many objects are occluded. Moreover, most popular object detectors are sensitive to various real-world occlusions. This paper proposes a feature-enhanced occlusion perception object detector that simultaneously detects occluded objects and fully utilizes spatial information. To generate hard examples with occlusions, a mask generator localizes and masks discriminative regions with weakly supervised methods. To obtain an enriched feature representation, we design a multiscale representation fusion module to combine hierarchical feature maps. Moreover, this method exploits contextual information by stacking representations from different regions in feature maps. The model is trained end-to-end by minimizing the multitask loss. Our model obtains superior performance compared to previous object detectors: 77.4% mAP and 74.3% mAP on PASCAL VOC 2007 and PASCAL VOC 2012, respectively. It also achieves 24.6% mAP on MS COCO. Experiments demonstrate that the proposed method improves the effectiveness of object detection, making it highly suitable for smart city applications that need to discover key objects with occlusions.


Author(s):  
Ainaz Hajimoradlou ◽  
Gioachino Roberti ◽  
David Poole

Landslides, the movement of soil and rock under the influence of gravity, are common phenomena that cause significant human and economic losses every year. Experts use heterogeneous features such as slope, elevation, land cover, lithology, rock age, and rock family to predict landslides. To work with such features, we adapted convolutional neural networks to consider relative spatial information for the prediction task. Traditional filters in these networks either have a fixed orientation or are rotationally invariant. Intuitively, the filters should orient uphill, but there is not enough data to learn the concept of uphill; instead, it can be provided as prior knowledge. We propose a model called Locally Aligned Convolutional Neural Network (LACNN) that follows the ground surface at multiple scales to predict possible landslide occurrence for a single point. To validate our method, we created a standardized dataset of georeferenced images consisting of the heterogeneous features as inputs, and compared our method to several baselines, including linear regression, a neural network, and a convolutional network, using log-likelihood error and Receiver Operating Characteristic curves on the test set. Our model achieves a 2-7% improvement in accuracy and a 2-15% boost in log likelihood compared to the other proposed baselines.


2020 ◽  
Vol 12 (11) ◽  
pp. 178
Author(s):  
Kerang Cao ◽  
Jingyu Gao ◽  
Kwang-nam Choi ◽  
Lini Duan

To classify image material on the internet, deep learning, especially deep neural networks, is the most effective but also the costliest of the computer vision methods. Convolutional neural networks (CNNs) learn a comprehensive feature representation by exploiting local information with a fixed receptive field, demonstrating distinguished capacities in image classification. Recent works concentrate on efficient feature exploration but neglect the global information needed for holistic consideration. Considerable effort has gone into reducing the computational costs of deep neural networks. Here, we provide a hierarchical global attention mechanism that improves the network representation with a restricted increase in computational complexity. Unlike nonlocal-based methods, the hierarchical global attention mechanism requires no matrix multiplication and can be flexibly applied in various modern network designs. Experimental results demonstrate that the proposed hierarchical global attention mechanism can markedly improve image classification precision, reducing Top-1 and Top-5 errors by 7.94% and 16.63%, respectively, with little increase in computational complexity (6.23%) in comparison to competing approaches.


2019 ◽  
Vol 14 ◽  
pp. 155892501989739 ◽  
Author(s):  
Zhoufeng Liu ◽  
Chi Zhang ◽  
Chunlei Li ◽  
Shumin Ding ◽  
Yan Dong ◽  
...  

Fabric defect recognition is an important measure for quality control in a textile factory. This article utilizes a deep convolutional neural network to recognize defects in fabrics that have complicated textures. Although convolutional neural networks are very powerful, their large number of parameters consumes considerable computation time and memory bandwidth. In real-world applications, however, the fabric defect recognition task needs to be carried out in a timely fashion on a computation-limited platform. To optimize a deep convolutional neural network, a novel method is introduced to reveal the input pattern that originally caused a specific activation in the network feature maps. Using this visualization technique, this study visualizes the features in a fully trained convolutional model and attempts to change the architecture of the original neural network to reduce the computational load. After a series of improvements, a new convolutional network is obtained that is more efficient at fabric image feature extraction, and the computation load and the total number of parameters in the new network are 23% and 8.9%, respectively, of those of the original model. The proposed neural network is specifically tailored for fabric defect recognition in resource-constrained environments. All of the source code and pretrained models are available online at https://github.com/ZCmeteor.

