Efficient Attention Mechanism for Dynamic Convolution in Lightweight Neural Network

Lightweight convolutional neural networks (CNNs) have limited feature representation capability because of their low computational budgets, which degrades performance. To make CNNs more efficient, dynamic neural networks (DyNet) have been proposed; they increase model complexity by using the Squeeze-and-Excitation (SE) module to adaptively obtain the importance of each convolution kernel through an attention mechanism. However, the attention mechanism in the SE network (SENet) uses all channel information in its calculations, which raises two challenges: (a) interference caused by internal redundant information; and (b) an increased number of network calculations. To address these problems, this work proposes a dynamic convolutional network (termed EAM-DyNet) that reduces the number of channels in feature maps by extracting only the useful spatial information. EAM-DyNet first uses random channel reduction and channel grouping reduction to remove redundancy in the information. Because downsampling can discard useful information, it then applies adaptive average pooling to maintain information integrity. Extensive experiments against the baseline demonstrate that EAM-DyNet outperforms existing approaches, achieving higher test accuracy with fewer network parameters.
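
The kernel-attention idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`group_reduce`, `global_avg_pool`, `kernel_attention`) and the `weights` matrix are hypothetical, random channel reduction is omitted, and global average pooling stands in for adaptive average pooling (it is the special case with a 1x1 output).

```python
import math

def group_reduce(feature_map, groups):
    """Average the channels inside each of `groups` contiguous groups.
    feature_map: list of C channels, each an H x W list of lists.
    Assumes C is divisible by `groups` (an assumption of this sketch)."""
    size = len(feature_map) // groups
    reduced = []
    for g in range(groups):
        block = feature_map[g * size:(g + 1) * size]
        h, w = len(block[0]), len(block[0][0])
        reduced.append([[sum(ch[i][j] for ch in block) / size for j in range(w)]
                        for i in range(h)])
    return reduced

def global_avg_pool(feature_map):
    """Squeeze each channel to a single value (adaptive pooling to 1x1)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kernel_attention(feature_map, weights, groups):
    """Return one attention weight per candidate kernel.
    weights: K rows of `groups` values, a hypothetical linear layer."""
    squeezed = global_avg_pool(group_reduce(feature_map, groups))
    scores = [sum(w * s for w, s in zip(row, squeezed)) for row in weights]
    return softmax(scores)
```

In a dynamic convolution layer, the returned weights would be used to form a weighted sum of the K candidate kernels before convolving; reducing the channels first shrinks the input to the attention branch.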

Panchromatic Image Super-Resolution Via Self Attention-Augmented Wasserstein Generative Adversarial Network

Sensors ◽

10.3390/s21062158 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2158

Author(s):

Juan Du ◽

Kuanhong Cheng ◽

Yue Yu ◽

Dabao Wang ◽

Huixin Zhou

Keyword(s):

Large Scale ◽

Spatial Information ◽

Super Resolution ◽

Attention Mechanism ◽

Feature Representation ◽

Similarity Function ◽

Feature Maps ◽

Generative Adversarial Network ◽

Convolutional Network ◽

Adversarial Network

Panchromatic (PAN) images contain abundant spatial information that is useful for earth observation, but always suffer from low resolution (LR) because of sensor limitations and the large-scale field of view. Current super-resolution (SR) methods based on traditional attention mechanisms have shown remarkable advantages but still reconstruct the edge details of SR images imperfectly. To address this problem, an improved SR model built on a self-attention augmented Wasserstein generative adversarial network (SAA-WGAN) is designed to mine the reference information among multiple features for detail enhancement. We use an encoder-decoder network followed by a fully convolutional network (FCN) as the backbone to extract multi-scale features and reconstruct the high-resolution (HR) results. To exploit the relevance between multi-layer feature maps, we first integrate a convolutional block attention module (CBAM) into each skip connection of the encoder-decoder subnet, generating weighted maps that automatically enhance both channel-wise and spatial-wise feature representation. In addition, because the HR results and LR inputs are highly similar in structure, and this similarity cannot be fully reflected by traditional attention mechanisms, we design a self-augmented attention (SAA) module in which the attention weights are produced dynamically via a similarity function between hidden features. This design allows the network to flexibly adjust the fractional relevance among multi-layer features and keep the long-range inter information, which helps preserve details. The pixel-wise loss is also combined with perceptual and gradient losses to achieve comprehensive supervision. Experiments on benchmark datasets demonstrate that the proposed method outperforms other SR methods in both objective evaluation and visual effect.
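
As a rough illustration of attention weights "produced dynamically via a similarity function between hidden features", the sketch below uses a dot product as the similarity function followed by a softmax. This choice is an assumption: the abstract does not say which similarity function the SAA module uses, and the name `similarity_attention` is hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def similarity_attention(queries, keys, values):
    """For each query feature vector, weight `values` by the softmax of its
    dot-product similarity with every key (assumed similarity function)."""
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [dot(q, k) for k in keys]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([sum(w * v[d] for w, v in zip(weights, values))
                        for d in range(dim)])
    return outputs
```

Because the weights depend on the features themselves rather than on fixed learned maps, every output position can draw on any other position, which is one common way to retain long-range information.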

Panchromatic Image Super-Resolution via Self Attention-augmented WGAN

10.20944/preprints202012.0592.v1 ◽

2020 ◽

Author(s):

Juan Du ◽

Kuanhong Cheng ◽

Yue Yu ◽

Dabao Wang ◽

Huixin Zhou

Keyword(s):

Large Scale ◽

Spatial Information ◽

Super Resolution ◽

Objective Evaluation ◽

Attention Mechanism ◽

Feature Representation ◽

Similarity Function ◽

Feature Maps ◽

Convolutional Network ◽

Benchmark Datasets

Panchromatic (PAN) images contain abundant spatial information that is useful for earth observation, but always suffer from low resolution (LR) because of sensor limitations and the large-scale field of view. Current super-resolution (SR) methods based on traditional attention mechanisms have shown remarkable advantages but still reconstruct the edge details of SR images imperfectly. To address this problem, an improved SR model built on a self-attention augmented WGAN is designed to mine the reference information among multiple features for detail enhancement. We use an encoder-decoder network followed by a fully convolutional network (FCN) as the backbone to extract multi-scale features and reconstruct the high-resolution (HR) results. To exploit the relevance between multi-layer feature maps, we first integrate a convolutional block attention module (CBAM) into each skip connection of the encoder-decoder subnet, generating weighted maps that automatically enhance both channel-wise and spatial-wise feature representation. In addition, because the HR results and LR inputs are highly similar in structure, and this similarity cannot be fully reflected by traditional attention mechanisms, we design a self-augmented attention (SAA) module in which the attention weights are produced dynamically via a similarity function between hidden features. This design allows the network to flexibly adjust the fractional relevance among multi-layer features and keep the long-range inter information, which helps preserve details. The pixel-wise loss is also combined with perceptual and gradient losses to achieve comprehensive supervision. Experiments on benchmark datasets demonstrate that the proposed method outperforms other SR methods in both objective evaluation and visual effect.
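
The combined supervision described above can be sketched as a weighted sum of loss terms. Everything specific here is an assumption for illustration: mean absolute error as the pixel-wise loss, forward differences as the image gradient, the weights `w_pix`, `w_perc`, and `w_grad`, and a perceptual term passed in as a precomputed number (in practice it would come from a pretrained feature extractor, which is not shown).

```python
def pixel_loss(pred, target):
    """Mean absolute error over all entries of two equally sized 2-D lists."""
    n = len(pred) * len(pred[0])
    return sum(abs(p - t) for pr, tr in zip(pred, target)
               for p, t in zip(pr, tr)) / n

def gradients(img):
    """Horizontal and vertical forward differences (image must be at least 2x2)."""
    h, w = len(img), len(img[0])
    gx = [[img[i][j + 1] - img[i][j] for j in range(w - 1)] for i in range(h)]
    gy = [[img[i + 1][j] - img[i][j] for j in range(w)] for i in range(h - 1)]
    return gx, gy

def gradient_loss(pred, target):
    """Penalize differences between edge maps, which emphasizes edge detail."""
    pgx, pgy = gradients(pred)
    tgx, tgy = gradients(target)
    return pixel_loss(pgx, tgx) + pixel_loss(pgy, tgy)

def total_loss(pred, target, perceptual=0.0, w_pix=1.0, w_perc=1.0, w_grad=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_pix * pixel_loss(pred, target)
            + w_perc * perceptual
            + w_grad * gradient_loss(pred, target))
```

The gradient term compares edge maps rather than raw intensities, so it penalizes blurred edges that a pixel-wise loss alone may tolerate.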

Intrusion Detection Model Using Temporal Convolutional Network Blend Into Attention Mechanism

International Journal of Information Security and Privacy ◽

10.4018/ijisp.290832 ◽

2022 ◽

Vol 16 (1) ◽

pp. 1-20

Author(s):

Ping Zhao ◽

Zhijie Fan ◽

Zhiwei Cao ◽

Xin Li

Keyword(s):

Neural Networks ◽

Intrusion Detection ◽

Spatial Information ◽

Attack Detection ◽

Attention Mechanism ◽

Surveillance Network ◽

Convolutional Network ◽

Detection Model ◽

Temporal Features ◽

Spatio Temporal

To improve the detection of network attacks, traditional intrusion detection models often use convolutional neural networks to encode spatial information or recurrent neural networks to obtain temporal features of the data. Some models combine the two methods to extract spatio-temporal features. However, these approaches use separate models and learn features insufficiently. This paper presents an improved model based on temporal convolutional networks (TCN) and an attention mechanism. Causal and dilated convolutions capture the spatio-temporal dependencies of the data. Residual blocks allow the network to transfer information across layers, enabling deeper network learning. Meanwhile, the attention mechanism enhances the model's focus on the relevant anomalous features of different attacks. Finally, the paper compares model results on the KDD CUP99 and UNSW-NB15 datasets and applies the model to attack detection in video surveillance networks. The results show that the model has advantages in the evaluation metrics.
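
The causal, dilated convolution and residual block mentioned here can be sketched in a few lines. This is a generic TCN building block under simplifying assumptions (single channel, zero padding on the left, ReLU activation), not the paper's architecture; the function names are hypothetical.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """y[t] = sum_k kernel[k] * x[t - k * dilation].
    Samples before t = 0 are treated as zero, so y[t] never sees the future."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y

def residual_block(x, kernel, dilation):
    """Convolution followed by ReLU, added back to the input (skip connection)."""
    conv = causal_dilated_conv1d(x, kernel, dilation)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]
```

Stacking such blocks with dilations 1, 2, 4, and so on grows the receptive field exponentially with depth, while the skip connection lets information pass across layers unchanged.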

Building Extraction from Very High Resolution Aerial Imagery Using Joint Attention Deep Neural Network

Remote Sensing ◽

10.3390/rs11242970 ◽

2019 ◽

Vol 11 (24) ◽

pp. 2970 ◽

Cited By ~ 3

Author(s):

Ziran Ye ◽

Yongyong Fu ◽

Muye Gan ◽

Jinsong Deng ◽

Alexis Comber ◽

...

Keyword(s):

Neural Network ◽

High Resolution ◽

Spatial Information ◽

Remote Sensing Data ◽

Aerial Imagery ◽

Building Extraction ◽

Feature Maps ◽

Convolutional Network ◽

Wide Range ◽

Very High

Automated methods to extract buildings from very high resolution (VHR) remote sensing data have applications in a wide range of fields. Many convolutional neural network (CNN) based methods have been proposed and have achieved significant advances in the building extraction task. To refine predictions, many recent approaches fuse features from earlier layers of CNNs to introduce abundant spatial information, a technique known as skip connection. However, reusing earlier features directly without processing could reduce the performance of the network. To address this problem, we propose a novel fully convolutional network (FCN) that adopts attention-based re-weighting to extract buildings from aerial imagery. Specifically, we consider the semantic gap between features from different stages and leverage the attention mechanism to bridge the gap before the features are fused. The inferred attention weights along the spatial and channel-wise dimensions make the low-level feature maps adaptive to the high-level feature maps in a target-oriented manner. Experimental results on three publicly available aerial imagery datasets show that the proposed model (RFA-UNet) achieves comparable or improved performance relative to other state-of-the-art models for building extraction.
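
The re-weighting step before feature fusion can be illustrated with a heavily simplified sketch. It covers only channel-wise attention, has no learned parameters, and assumes the low-level and high-level maps already share channel count and spatial size; RFA-UNet itself is not reproduced here, and the names `channel_gates` and `reweight_and_fuse` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_gates(high_feats):
    """One gate in (0, 1) per channel, from the global average of the
    corresponding high-level feature map."""
    return [sigmoid(sum(sum(row) for row in ch) / (len(ch) * len(ch[0])))
            for ch in high_feats]

def reweight_and_fuse(low_feats, high_feats):
    """Scale each low-level channel by its gate, then concatenate the result
    with the high-level channels (the fusion step of a skip connection)."""
    gates = channel_gates(high_feats)
    reweighted = [[[g * v for v in row] for row in ch]
                  for g, ch in zip(gates, low_feats)]
    return reweighted + high_feats
```

The point of the design is that low-level features are filtered by high-level context before fusion instead of being concatenated as they are.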

Biomedical document triage using a hierarchical attention-based capsule network

BMC Bioinformatics ◽

10.1186/s12859-020-03673-5 ◽

2020 ◽

Vol 21 (S13) ◽

Author(s):

Jian Wang ◽

Mengying Li ◽

Qishuai Diao ◽

Hongfei Lin ◽

Zhihao Yang ◽

...

Keyword(s):

Neural Networks ◽

Information Extraction ◽

Precision Medicine ◽

State Of The Art ◽

Attention Mechanism ◽

Feature Representation ◽

Experimental Results ◽

Biomedical Domain ◽

Proposed Model ◽

Document Triage

Background: Biomedical document triage is the foundation of biomedical information extraction, which is important to precision medicine. Recently, some neural network-based methods have been proposed to classify biomedical documents automatically. In the biomedical domain, documents are often very long and often contain very complicated sentences. However, the current methods still find it difficult to capture important features across sentences. Results: In this paper, we propose a hierarchical attention-based capsule model for biomedical document triage. The proposed model effectively employs a hierarchical attention mechanism and capsule networks to capture valuable features across sentences and construct a final latent feature representation for a document. We evaluated our model on three public corpora. Conclusions: Experimental results showed that both the hierarchical attention mechanism and capsule networks are helpful in the biomedical document triage task. Our method proved highly competitive with or superior to other state-of-the-art methods.

Attention based convolutional network for automatic sleep stage classification

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2020-0051 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Shasha Sun ◽

Chuanpeng Li ◽

Ning Lv ◽

Xiaoman Zhang ◽

Zhaoyan Yu ◽

...

Keyword(s):

Sleep Stage ◽

Feature Representation ◽

Experimental Results ◽

Sleep Staging ◽

Feature Maps ◽

Convolutional Network ◽

Data Set ◽

Time Frequency ◽

Sleep Stage Classification ◽

Important Basis

Sleep staging is an important basis for diagnosing sleep-related problems. This paper proposes an attention-based convolutional network for automatic sleep staging. The network takes a time-frequency image as input and predicts the sleep stage for each 30-s epoch as output. For each CNN feature map, the model generates attention maps along two separate dimensions, time and filter, which are then multiplied to form the final attention map. A residual-like fusion structure appends the attention map to the input feature map for adaptive feature refinement. In addition, generalized mean pooling is introduced to obtain a global feature representation with less information loss. To demonstrate the efficacy of the proposed method, it is compared with two baseline methods on the Sleep-EDF data set under different framework settings and input channel types; the experimental results show that the proposed model achieves significant improvements in overall accuracy, Cohen's kappa, MF1, sensitivity, and specificity. Compared against state-of-the-art algorithms, the proposed network attains an overall accuracy of 83.4%, a macro F1-score of 77.3%, κ = 0.77, sensitivity of 77.1%, and specificity of 95.4%. The experimental results demonstrate the superiority of the proposed network.
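
Generalized mean (GeM) pooling, mentioned above, computes the p-th root of the mean of the p-th powers of the activations. The sketch below shows the standard formula on a flat list of values; the exponent `p = 3.0` and the clamp `eps` are illustrative defaults, not values taken from the paper.

```python
def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized mean pooling: (mean(v ** p)) ** (1 / p).
    Values are clamped to at least `eps` so that fractional powers are defined."""
    clamped = [max(v, eps) for v in values]
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)
```

With p = 1 this reduces to average pooling, and as p grows it approaches max pooling, so a single exponent lets the pooling sit between the two and keep more information than either extreme.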

Feature-Enhanced Occlusion Perception Object Detection for Smart Cities

Wireless Communications and Mobile Computing ◽

10.1155/2021/5544194 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Jie Xu ◽

Hanyuan Wang ◽

Mingzhu Xu ◽

Fan Yang ◽

Yifei Zhou ◽

...

Keyword(s):

Object Detection ◽

Traffic Control ◽

Spatial Information ◽

Smart Cities ◽

Contextual Information ◽

Feature Representation ◽

Superior Performance ◽

Feature Maps ◽

Pascal Voc ◽

Occluded Objects

Object detection is widely used in smart cities, including in safety monitoring, traffic control, and car driving. However, in the smart city scenario, many objects are occluded, and most popular object detectors are sensitive to real-world occlusions. This paper proposes a feature-enhanced occlusion perception object detector that simultaneously detects occluded objects and fully utilizes spatial information. To generate hard examples with occlusions, a mask generator localizes and masks discriminative regions using weakly supervised methods. To obtain an enriched feature representation, we design a multiscale representation fusion module that combines hierarchical feature maps. Moreover, the method exploits contextual information by stacking representations from different regions in feature maps. The model is trained end-to-end by minimizing the multitask loss. Our model obtains superior performance compared to previous object detectors: 77.4% mAP and 74.3% mAP on PASCAL VOC 2007 and PASCAL VOC 2012, respectively. It also achieves 24.6% mAP on MS COCO. Experiments demonstrate that the proposed method improves the effectiveness of object detection, making it highly suitable for smart city applications that need to discover key objects with occlusions.

Predicting Landslides Using Locally Aligned Convolutional Neural Networks

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/462 ◽

2020 ◽

Cited By ~ 2

Author(s):

Ainaz Hajimoradlou ◽

Gioachino Roberti ◽

David Poole

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Networks ◽

Spatial Information ◽

Multiple Scales ◽

Single Point ◽

Economic Losses ◽

Convolutional Network ◽

Heterogeneous Features ◽

Log Likelihood

Landslides, movement of soil and rock under the influence of gravity, are common phenomena that cause significant human and economic losses every year. Experts use heterogeneous features such as slope, elevation, land cover, lithology, rock age, and rock family to predict landslides. To work with such features, we adapted convolutional neural networks to consider relative spatial information for the prediction task. Traditional filters in these networks either have a fixed orientation or are rotationally invariant. Intuitively, the filters should orient uphill, but there is not enough data to learn the concept of uphill; instead, it can be provided as prior knowledge. We propose a model called Locally Aligned Convolutional Neural Network, LACNN, that follows the ground surface at multiple scales to predict possible landslide occurrence for a single point. To validate our method, we created a standardized dataset of georeferenced images consisting of the heterogeneous features as inputs, and compared our method to several baselines, including linear regression, a neural network, and a convolutional network, using log-likelihood error and Receiver Operating Characteristic curves on the test set. Our model achieves 2-7% improvement in terms of accuracy and 2-15% boost in terms of log likelihood compared to the other proposed baselines.

Learning a Hierarchical Global Attention for Image Classification

Future Internet ◽

10.3390/fi12110178 ◽

2020 ◽

Vol 12 (11) ◽

pp. 178

Author(s):

Kerang Cao ◽

Jingyu Gao ◽

Kwang-nam Choi ◽

Lini Duan

Keyword(s):

Neural Networks ◽

Image Classification ◽

Deep Neural Networks ◽

Matrix Multiplication ◽

Attention Mechanism ◽

Feature Representation ◽

Global Information ◽

Computation Complexity ◽

Computational Costs ◽

Large Effort

For classifying image material on the internet, deep learning, and deep neural networks in particular, is the most effective but also the costliest of all computer vision methods. Convolutional neural networks (CNNs) learn a comprehensive feature representation by exploiting local information with a fixed receptive field, demonstrating distinguished capacities on image classification. Recent works concentrate on efficient feature exploration but neglect the global information needed for holistic consideration. Considerable effort has gone into reducing the computational costs of deep neural networks. Here, we provide a hierarchical global attention mechanism that improves the network representation with a restricted increase in computation complexity. Unlike nonlocal-based methods, the hierarchical global attention mechanism requires no matrix multiplication and can be flexibly applied in various modern network designs. Experimental results demonstrate that the proposed hierarchical global attention mechanism can conspicuously improve image classification precision (a reduction of 7.94% and 16.63% in Top-1 and Top-5 errors, respectively) with little increase in computation complexity (6.23%) in comparison to competing approaches.

Fabric defect recognition using optimized neural networks

Journal of Engineered Fibers and Fabrics ◽

10.1177/1558925019897396 ◽

2019 ◽

Vol 14 ◽

pp. 155892501989739 ◽

Cited By ~ 1

Author(s):

Zhoufeng Liu ◽

Chi Zhang ◽

Chunlei Li ◽

Shumin Ding ◽

Yan Dong ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Recognition Task ◽

Deep Convolutional Neural Network ◽

Image Feature ◽

Feature Maps ◽

Convolutional Network ◽

Fabric Defect ◽

Defect Recognition

Fabric defect recognition is an important measure for quality control in a textile factory. This article utilizes a deep convolutional neural network to recognize defects in fabrics that have complicated textures. Although convolutional neural networks are very powerful, their large number of parameters consumes considerable computation time and memory bandwidth. In real-world applications, however, the fabric defect recognition task needs to be carried out in a timely fashion on a computation-limited platform. To optimize a deep convolutional neural network, a novel method is introduced to reveal the input pattern that originally caused a specific activation in the network feature maps. Using this visualization technique, this study visualizes the features in a fully trained convolutional model and attempts to change the architecture of the original neural network to reduce the computational load. After a series of improvements, a new convolutional network is obtained that is more efficient at fabric image feature extraction; its computation load and total number of parameters are 23% and 8.9%, respectively, of those of the original model. The proposed neural network is specifically tailored for fabric defect recognition in resource-constrained environments. All of the source code and pretrained models are available online at https://github.com/ZCmeteor.
