Nonlocal spatial attention module for image classification

To enhance the capability of neural networks, research on attention mechanisms has deepened. In this area, attention modules perform forward inference along the channel and spatial dimensions sequentially, in parallel, or simultaneously. However, we have found that spatial attention modules mainly apply convolution layers to generate attention maps, which aggregate feature responses based only on local receptive fields. In this article, we take advantage of this finding to create a nonlocal spatial attention module (NL-SAM), which collects context information from all pixels to adaptively recalibrate spatial responses in a convolutional feature map. NL-SAM overcomes the limitations of repeating local operations and outputs a 2D spatial attention map to emphasize or suppress responses at different locations. Experiments on three benchmark datasets show improvements of at least 0.58% on ResNet variants. Furthermore, this module is simple and can be easily integrated with existing channel attention modules, such as squeeze-and-excitation and gather-excite, to outperform these models at a minimal additional computational cost (0.196%).
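The idea of gathering context from all pixels before gating each location can be illustrated with a minimal sketch. The abstract does not give NL-SAM's actual layers, so everything below is an assumption made for illustration: the dot-product affinity, the softmax over all positions, the channel-mean reduction, and the function name `nonlocal_spatial_attention` are hypothetical and are not the authors' design.

```python
import math


def nonlocal_spatial_attention(feature_map):
    """Gate each location of a C x H x W feature map with a globally informed weight.

    feature_map: list of C channels, each an H x W nested list of floats.
    Returns (attention, recalibrated); attention is an H x W map with values in (0, 1).
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    # One C-dimensional vector per spatial position, in row-major order.
    pixels = [
        [feature_map[c][i][j] for c in range(channels)]
        for i in range(height)
        for j in range(width)
    ]
    scores = []
    for query in pixels:
        # Affinity of this position to every position (global, not a local window).
        logits = [sum(q * k for q, k in zip(query, key)) for key in pixels]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(value - peak) for value in logits]
        total = sum(weights)
        # Context vector: affinity-weighted average over all pixels.
        context = [
            sum(w * key[c] for w, key in zip(weights, pixels)) / total
            for c in range(channels)
        ]
        # Assumed reduction: collapse the context to one scalar by a channel mean.
        scores.append(sum(context) / channels)
    # A sigmoid turns the scalars into a 2D map that emphasizes or suppresses locations.
    attention = [
        [1.0 / (1.0 + math.exp(-scores[i * width + j])) for j in range(width)]
        for i in range(height)
    ]
    recalibrated = [
        [
            [feature_map[c][i][j] * attention[i][j] for j in range(width)]
            for i in range(height)
        ]
        for c in range(channels)
    ]
    return attention, recalibrated
```

Because every position attends to every other position, the cost of this sketch grows quadratically with the number of pixels, which is the usual trade-off of nonlocal operations compared with a local convolution.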

An efficient pruning scheme of deep neural networks for Internet of Things applications

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-021-00744-4 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Chen Qi ◽

Shibo Shen ◽

Rongpeng Li ◽

Zhifeng Zhao ◽

Qing Liu ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Internet Of Things ◽

Deep Neural Networks ◽

Computational Cost ◽

Superior Performance ◽

Compact Structure ◽

Resource Limited ◽

Benchmark Datasets ◽

Iot Devices

Nowadays, deep neural networks (DNNs) have been rapidly deployed to realize a number of functionalities such as sensing, imaging, classification, and recognition. However, the computationally intensive nature of DNNs makes them difficult to deploy on resource-limited Internet of Things (IoT) devices. In this paper, we propose a novel pruning-based paradigm that aims to reduce the computational cost of DNNs by uncovering a more compact structure and learning the effective weights therein, without compromising the expressive capability of DNNs. In particular, our algorithm achieves efficient end-to-end training that directly transforms a redundant neural network into a compact one with a specifically targeted compression rate. We comprehensively evaluate our approach on various representative benchmark datasets and compare it with typical advanced convolutional neural network (CNN) architectures. The experimental results verify the superior performance and robust effectiveness of our scheme. For example, when pruning VGG on CIFAR-10, our proposed scheme reduces its FLOPs (floating-point operations) and number of parameters by 76.2% and 94.1%, respectively, while still maintaining a satisfactory accuracy. To sum up, our scheme could facilitate the integration of DNNs into the common machine-learning-based IoT framework and establish distributed training of neural networks in both cloud and edge.
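To make the notion of a targeted compression rate concrete, here is a minimal sketch that uses plain magnitude pruning as a stand-in. This is not the end-to-end scheme described in the abstract, whose details are not given here; the function name `prune_to_rate` and the choice of weight magnitude as the pruning criterion are assumptions for illustration only.

```python
def prune_to_rate(weights, compression_rate):
    """Zero out the smallest-magnitude weights until the target rate is reached.

    weights: flat list of floats.
    compression_rate: fraction of weights to remove, between 0.0 and 1.0.
    """
    if not 0.0 <= compression_rate <= 1.0:
        raise ValueError("compression_rate must lie in [0, 1]")
    n_prune = int(len(weights) * compression_rate)
    # Indices sorted by ascending magnitude; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda index: abs(weights[index]))
    pruned = set(order[:n_prune])
    return [0.0 if index in pruned else w for index, w in enumerate(weights)]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)
```

For scale, the reported 94.1% reduction in parameters means that about 5.9% of the original parameters remain, and the 76.2% reduction in FLOPs leaves about 23.8% of the original operations.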

Hyperspectral image classification via local receptive fields based random weights networks

2015 11th International Conference on Natural Computation (ICNC) ◽

10.1109/icnc.2015.7378123 ◽

2015 ◽

Author(s):

Qi Lv ◽

Yong Dou ◽

Jiaqing Xu ◽

Xin Niu ◽

Fei Xia

Keyword(s):

Image Classification ◽

Hyperspectral Image ◽

Receptive Fields ◽

Hyperspectral Image Classification ◽

Random Weights ◽

Local Receptive Fields

Robust Image Classification with Cognitive-Driven Color Priors

Electronics ◽

10.3390/electronics9111837 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1837

Author(s):

Peng Gu ◽

Chengfei Zhu ◽

Xiaosong Lan ◽

Jie Wang ◽

Shuxiao Li

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Human Memory ◽

Layer By Layer ◽

Training Methods ◽

Prior Model ◽

Benchmark Datasets ◽

Robust Image ◽

Classification Probability

Existing image classification methods based on convolutional neural networks usually use a large number of samples to learn classification features hierarchically, causing the problems of over-fitting and error propagation layer by layer. Thus, they are vulnerable to adversarial samples generated by adding imperceptible disturbances to input samples. To address this issue, we propose a cognitive-driven color prior model, inspired by the characteristics of human memory, that memorizes the color attributes of target samples. At the inference stage, color priors are indexed from the memory and fused with features of convolutional neural networks to achieve robust image classification. The proposed color prior model is cognitive-driven and has no training parameters, so it generalizes well and can effectively defend against adversarial samples. In addition, our method directly combines the features of the prior model with the classification probability of the convolutional neural network, without changing the network structure or the parameters of the existing algorithm. It can be combined with other adversarial attack defense methods, such as preprocessing modules like PixelDefense or adversarial training methods, to improve the robustness of image classification. Experiments on several benchmark datasets show that the proposed method improves the anti-interference ability of image classification algorithms.
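The late-fusion step, in which prior information is combined with the network's classification probability without touching the network itself, might look like the sketch below. The abstract does not state the fusion rule, so the weighted product followed by renormalization, the `prior_weight` parameter, and the function name `fuse_with_prior` are all assumptions for illustration.

```python
def fuse_with_prior(cnn_probs, prior_scores, prior_weight=1.0):
    """Combine per-class CNN probabilities with per-class prior scores.

    cnn_probs: class probabilities from an unmodified classifier.
    prior_scores: non-negative scores from a parameter-free prior model.
    prior_weight: exponent on the prior; 0.0 leaves the CNN ranking unchanged.
    """
    if len(cnn_probs) != len(prior_scores):
        raise ValueError("one prior score per class is required")
    # Assumed fusion rule: element-wise product, with the prior raised to prior_weight.
    fused = [p * (s ** prior_weight) for p, s in zip(cnn_probs, prior_scores)]
    total = sum(fused)
    if total == 0.0:
        # The prior rules out every class; fall back to the CNN output.
        return list(cnn_probs)
    return [value / total for value in fused]
```

Because the fusion operates only on output probabilities, a rule of this kind can be placed after any existing classifier, which matches the abstract's claim that the network structure and parameters stay unchanged.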

Spectral-Spatial Attention Networks for Hyperspectral Image Classification

Remote Sensing ◽

10.3390/rs11080963 ◽

2019 ◽

Vol 11 (8) ◽

pp. 963 ◽

Cited By ~ 25

Author(s):

Xiaoguang Mei ◽

Erting Pan ◽

Yong Ma ◽

Xiaobing Dai ◽

Jun Huang ◽

...

Keyword(s):

Neural Network ◽

Image Classification ◽

Spatial Attention ◽

Spatial Information ◽

Hyperspectral Image ◽

Spatial Dimension ◽

Hyperspectral Image Classification ◽

Attention Networks ◽

Land Covers

Many deep learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have been successfully applied to extracting deep features for hyperspectral tasks. Hyperspectral image classification makes it possible to distinguish land covers by utilizing their abundant information. Motivated by the attention mechanism of the human visual system, in this study, we propose a spectral-spatial attention network for hyperspectral image classification. In our method, an RNN with attention learns inner spectral correlations within a continuous spectrum, while a CNN with attention is designed to focus on salient features and the spatial relevance between neighboring pixels in the spatial dimension. Experimental results demonstrate that our method can fully utilize the spectral and spatial information to obtain competitive performance.

Hybrid pooling with wavelets for convolutional neural networks

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219223 ◽

2022 ◽

pp. 1-10

Author(s):

Daniel Trevino-Sanchez ◽

Vicente Alarcon-Aquino

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Computational Cost ◽

Relevant Information ◽

Accuracy Improvement ◽

Proposed Model ◽

Benchmark Datasets ◽

Augmentation Techniques ◽

High Computational Cost

Detecting and classifying objects correctly is a constant challenge; recognizing them at different scales and in different scenarios, sometimes cropped or badly lit, is not an easy task. Convolutional neural networks (CNNs) have become a widely applied technique since they are completely trainable and well suited to feature extraction. However, the growing number of CNN applications constantly pushes for improvements in their accuracy. Initially, those improvements involved the use of large datasets, augmentation techniques, and complex algorithms. These methods may have a high computational cost. Nevertheless, feature extraction is known to be the heart of the problem. As a result, other approaches combine different technologies to extract better features and improve accuracy without the need for more powerful hardware resources. In this paper, we propose a hybrid pooling method that incorporates multiresolution analysis within the CNN layers to reduce the feature map size without losing details. To prevent relevant information from being lost during the downsampling process, an existing pooling method is combined with a wavelet transform technique, keeping those details "alive" and enriching other stages of the CNN. Achieving better-quality characteristics improves CNN accuracy. To validate this study, ten pooling methods, including the proposed model, are tested using four benchmark datasets. The results are compared with four of the evaluated methods, which are also considered state of the art.
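One way to picture combining an existing pooling method with a wavelet transform is the sketch below, which blends 2x2 max pooling with the Haar approximation coefficient of the same block. The abstract does not specify which pooling method or wavelet the paper uses, nor how they are merged, so the Haar wavelet, the linear blend, the `alpha` parameter, and the function name `hybrid_pool` are assumptions for illustration.

```python
def hybrid_pool(feature, alpha=0.5):
    """Downsample an H x W map by 2 in each direction (H and W must be even).

    Each output value blends the block maximum with the block's Haar
    approximation coefficient: alpha * max + (1 - alpha) * haar_ll.
    """
    height, width = len(feature), len(feature[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    pooled = []
    for i in range(0, height, 2):
        row = []
        for j in range(0, width, 2):
            a, b = feature[i][j], feature[i][j + 1]
            c, d = feature[i + 1][j], feature[i + 1][j + 1]
            max_value = max(a, b, c, d)
            # Orthonormal 2D Haar low-pass (LL) coefficient: twice the block mean.
            haar_ll = (a + b + c + d) / 2.0
            row.append(alpha * max_value + (1.0 - alpha) * haar_ll)
        pooled.append(row)
    return pooled
```

Setting `alpha=1.0` reduces this to ordinary max pooling, while `alpha=0.0` keeps only the wavelet approximation, so the parameter makes the trade-off between the two explicit.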

Hyperspectral image classification via kernel extreme learning machine using local receptive fields

2016 IEEE International Conference on Image Processing (ICIP) ◽

10.1109/icip.2016.7532358 ◽

2016 ◽

Cited By ~ 2

Author(s):

Qi Lv ◽

Xin Niu ◽

Yong Dou ◽

Yueqing Wang ◽

Jiaqing Xu ◽

...

Keyword(s):

Image Classification ◽

Extreme Learning Machine ◽

Hyperspectral Image ◽

Receptive Fields ◽

Hyperspectral Image Classification ◽

Kernel Extreme Learning Machine ◽

Learning Machine ◽

Local Receptive Fields

Attribute Aware Pooling for Pedestrian Attribute Recognition

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/341 ◽

2019 ◽

Cited By ~ 5

Author(s):

Kai Han ◽

Yunhe Wang ◽

Han Shu ◽

Chuanjian Liu ◽

Chunjing Xu ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Recognition Problem ◽

Context Information ◽

Deep Convolutional Neural Networks ◽

Attribute Data ◽

Branch Architecture ◽

Benchmark Datasets ◽

Attribute Classification ◽

Attribute Recognition

This paper extends the strength of deep convolutional neural networks (CNNs) to the pedestrian attribute recognition problem by devising a novel attribute aware pooling algorithm. Existing vanilla CNNs cannot be straightforwardly applied to multi-attribute data because of the larger label space as well as the attribute entanglement and correlations. We tackle these challenges, which hamper the development of CNNs for multi-attribute classification, by fully exploiting the correlation between different attributes. A multi-branch architecture is adopted for focusing on attributes at different regions. Besides the prediction based on each branch itself, the context information of each branch is employed for the decision as well. The attribute aware pooling is developed to integrate both kinds of information. Therefore, attributes that are indistinct or tangled with others can be accurately recognized by exploiting the context information. Experiments on benchmark datasets demonstrate that the proposed pooling method appropriately explores and exploits the correlations between attributes for pedestrian attribute recognition.

Random Shifting for CNN: a Solution to Reduce Information Loss in Down-Sampling Layers

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/486 ◽

2017 ◽

Cited By ~ 3

Author(s):

Gangming Zhao ◽

Jingdong Wang ◽

Zhaoxiang Zhang

Keyword(s):

Neural Networks ◽

Computational Cost ◽

Receptive Fields ◽

Information Loss ◽

Network Architectures ◽

Training Process ◽

Feature Maps ◽

Improve Performance ◽

Deep Convolutional Neural Networks ◽

Random Strategy

Down-sampling is widely adopted in deep convolutional neural networks (DCNNs) for reducing the number of network parameters while preserving the transformation invariance. However, it cannot utilize information effectively because it only adopts a fixed stride strategy, which may result in poor generalization ability and information loss. In this paper, we propose a novel random strategy to alleviate these problems by embedding random shifting in the down-sampling layers during the training process. Random shifting can be universally applied to diverse DCNN models to dynamically adjust receptive fields by shifting kernel centers on feature maps in different directions. Thus, it can generate more robust features in networks and further enhance the transformation invariance of down-sampling operators. In addition, random shifting can not only be integrated into all down-sampling layers, including strided convolutional layers and pooling layers, but also improves the performance of DCNNs with negligible additional computational cost. We evaluate our method on different tasks (e.g., image classification and segmentation) with various network architectures (i.e., AlexNet, FCN and DFN-MR). Experimental results demonstrate the effectiveness of our proposed method.
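The contrast between a fixed stride and randomly shifted sampling centers can be sketched as follows. The abstract does not give the shift distribution or how borders are handled, so the uniform integer shifts, the clamping at the edges, and the names `random_shift_downsample` and `max_shift` are assumptions for illustration, applied here to plain strided subsampling rather than to a full convolution or pooling layer.

```python
import random


def random_shift_downsample(feature, stride=2, max_shift=1, training=True, rng=None):
    """Strided subsampling of an H x W map with randomly shifted sampling centers.

    During training, each sampling center is moved by an independent integer
    offset in [-max_shift, max_shift] along each axis. At inference time
    (training=False) the fixed-stride grid is used unchanged.
    """
    rng = rng or random.Random()
    height, width = len(feature), len(feature[0])
    output = []
    for i in range(0, height, stride):
        row = []
        for j in range(0, width, stride):
            if training:
                shift_i = rng.randint(-max_shift, max_shift)
                shift_j = rng.randint(-max_shift, max_shift)
            else:
                shift_i = shift_j = 0
            # Clamp so that shifted centers stay inside the feature map.
            source_i = min(max(i + shift_i, 0), height - 1)
            source_j = min(max(j + shift_j, 0), width - 1)
            row.append(feature[source_i][source_j])
        output.append(row)
    return output
```

Since the shift only changes which existing value is read, the extra work per output element is a pair of random draws, which is consistent with the abstract's claim of negligible additional computational cost.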
