Learning Instance-wise Sparsity for Accelerating Deep Models

Exploring deep convolutional neural networks with high efficiency and low memory usage is essential for a wide variety of machine learning tasks. Most existing approaches accelerate deep models by manipulating parameters or filters without reference to the data, e.g., pruning and decomposition. In contrast, we study this problem from a different perspective by respecting the differences between data instances. An instance-wise feature pruning method is developed by identifying informative features for different instances. Specifically, by investigating a feature decay regularization, we expect the intermediate feature maps of each instance in deep neural networks to be sparse while preserving the overall network performance. During online inference, subtle features of input images extracted by intermediate layers of a well-trained neural network can be eliminated to accelerate the subsequent calculations. We further use the coefficient of variation as a measure to select the layers that are appropriate for acceleration. Extensive experiments conducted on benchmark datasets and networks demonstrate the effectiveness of the proposed method.
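The two ingredients named in the abstract, dropping subtle features at inference time and scoring a layer by its coefficient of variation, can be illustrated with a minimal sketch. The abstract does not give the regularizer, the pruning rule, or the layer-selection criterion, so the magnitude threshold, the function names, and the sample activations below are assumptions made only for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Return population std / mean; 0.0 when the mean is zero."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return statistics.pstdev(values) / mean


def prune_small_features(values, threshold):
    """Zero out activations whose magnitude falls below `threshold`."""
    return [v if abs(v) >= threshold else 0.0 for v in values]


# Hypothetical post-ReLU activations of one instance at one layer.
activations = [0.01, 0.02, 1.5, 0.0, 2.0, 0.03]

# Assumed rule: treat anything below 0.1 as a "subtle" feature.
pruned = prune_small_features(activations, threshold=0.1)
sparsity = pruned.count(0.0) / len(pruned)
cv = coefficient_of_variation(activations)

print(pruned, round(sparsity, 3), round(cv, 3))
```

A layer whose activations have a high coefficient of variation has a few dominant features and many small ones, which is one plausible reading of why such a layer would tolerate pruning well.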

BD-ELM: A Regularized Extreme Learning Machine Using Biased DropConnect and Biased Dropout

Mathematical Problems in Engineering ◽

10.1155/2020/3604579 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Jie Lai ◽

Xiaodan Wang ◽

Rui Li ◽

Yafei Song ◽

Lei Lei

Keyword(s):

Extreme Learning Machine ◽

Regularization Method ◽

Network Performance ◽

Structural Complexity ◽

Benchmark Datasets ◽

Fixed Constant ◽

The Difference ◽

Learning Machine ◽

Hidden Layer ◽

Hidden Nodes

To prevent overfitting and improve the generalization performance of the Extreme Learning Machine (ELM), this paper proposes a new regularization method, Biased DropConnect, and a new regularized ELM that uses Biased DropConnect and Biased Dropout (BD-ELM). Just as Biased Dropout does for hidden nodes, Biased DropConnect exploits the differences among connection weights to retain more of the network's information after dropping. Regular Dropout and DropConnect set the connection weights and the outputs of the hidden layer to 0 with a single fixed probability. In contrast, Biased DropConnect and Biased Dropout divide the connection weights and hidden nodes into high and low groups by a threshold, and set the two groups to 0 with different probabilities. Connection weights with high values and hidden nodes with high activation values, which contribute more to network performance, are given a lower drop probability, while weights and hidden nodes with low values are given a higher drop probability so that the drop probability of the whole network stays at a fixed constant. With Biased DropConnect and Biased Dropout regularization, BD-ELM has sparser parameters and lower structural complexity. Experiments on various benchmark datasets show that Biased DropConnect and Biased Dropout effectively address overfitting, and that BD-ELM provides higher classification accuracy than ELM, R-ELM, and Drop-ELM.
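The grouping idea described above can be sketched as follows. The abstract states only that the two groups get different drop probabilities and that the overall drop probability stays constant; the balance equation used to derive the low-group probability, the function names, and the example values are assumptions, not the paper's formulas.

```python
import random


def low_group_drop_prob(n_high, n_low, overall_p, p_high):
    """Solve n_high*p_high + n_low*p_low = overall_p*(n_high + n_low) for p_low."""
    if n_low == 0:
        raise ValueError("low group is empty")
    p_low = (overall_p * (n_high + n_low) - n_high * p_high) / n_low
    if not 0.0 <= p_low <= 1.0:
        raise ValueError("no valid p_low for these settings")
    return p_low


def biased_dropout(values, threshold, overall_p, p_high, rng):
    """Drop high-valued entries with p_high and low-valued ones with p_low."""
    n_high = sum(1 for v in values if v >= threshold)
    p_low = low_group_drop_prob(n_high, len(values) - n_high, overall_p, p_high)
    out = []
    for v in values:
        p = p_high if v >= threshold else p_low
        out.append(0.0 if rng.random() < p else v)
    return out, p_low


# Hypothetical hidden-node activations; 3 fall in the high group, 5 in the low.
hidden = [0.9, 0.8, 0.1, 0.2, 0.3, 0.05, 0.7, 0.15]
dropped, p_low = biased_dropout(hidden, 0.5, overall_p=0.5, p_high=0.2,
                                rng=random.Random(0))
print(round(p_low, 2), dropped)
```

With these numbers the expected number of dropped nodes is 3 × 0.2 + 5 × 0.68 = 4, i.e. half of the eight nodes, matching the fixed overall rate of 0.5. The same scheme would apply to connection weights for the DropConnect variant.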

Orthogonal Representations of Object Shape and Category in Deep Convolutional Neural Networks and Human Visual Cortex

10.1101/555193 ◽

2019 ◽

Author(s):

Astrid A. Zeman ◽

J. Brendan Ritchie ◽

Stefania Bracci ◽

Hans Op de Beeck

Keyword(s):

Neural Networks ◽

Visual Cortex ◽

Convolutional Neural Networks ◽

Network Performance ◽

Temporal Cortex ◽

Visual Object ◽

Visual Object Recognition ◽

Deep Convolutional Neural Networks ◽

Shape Information ◽

Category Information

Deep Convolutional Neural Networks (CNNs) are gaining traction as the benchmark model of visual object recognition, with performance now surpassing humans. While CNNs can accurately assign one image to potentially thousands of categories, network performance could be the result of layers that are tuned to represent the visual shape of objects, rather than object category, since both are often confounded in natural images. Using two stimulus sets that explicitly dissociate shape from category, we correlate these two types of information with each layer of multiple CNNs. We also compare CNN output with fMRI activation along the human visual ventral stream by correlating artificial with biological representations. We find that CNNs encode category information independently from shape, peaking at the final fully connected layer in all tested CNN architectures. Comparing CNNs with fMRI brain data, early visual cortex (V1) and early layers of CNNs encode shape information. Anterior ventral temporal cortex encodes category information, which correlates best with the final layer of CNNs. The interaction between shape and category that is found along the human visual ventral pathway is echoed in multiple deep networks. Our results suggest CNNs represent category information independently from shape, much like the human visual system.

A mixed-scale dense convolutional neural network for image analysis

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1715832114 ◽

2017 ◽

Vol 115 (2) ◽

pp. 254-259 ◽

Cited By ~ 60

Author(s):

Daniël M. Pelt ◽

James A. Sethian

Keyword(s):

Neural Network ◽

Neural Networks ◽

Image Processing ◽

Network Architecture ◽

Training Data ◽

Network Architectures ◽

Feature Maps ◽

Deep Convolutional Neural Networks ◽

Single Set ◽

Reduced Risk

Deep convolutional neural networks have been successfully applied to many image-processing problems in recent works. Popular network architectures often add additional operations and connections to the standard architecture to enable training deeper networks. To achieve accurate results in practice, a large number of trainable parameters are often required. Here, we introduce a network architecture based on using dilated convolutions to capture features at different image scales and densely connecting all feature maps with each other. The resulting architecture is able to achieve accurate results with relatively few parameters and consists of a single set of operations, making it easier to implement, train, and apply in practice, and automatically adapts to different problems. We compare results of the proposed network architecture with popular existing architectures for several segmentation problems, showing that the proposed architecture is able to achieve accurate results with fewer parameters, with a reduced risk of overfitting the training data.
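As a rough illustration of how dilated convolutions capture features at different scales, the sketch below applies the same three-tap kernel to a 1D signal at two dilation rates and then collects all maps, as a densely connected layer would receive them. It is a simplified 1D analogue under assumed names and values, not the architecture from the paper.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Cross-correlate with zero padding so output length equals input length."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            j = i + (k - center) * dilation
            if 0 <= j < len(signal):
                total += weight * signal[j]
        out.append(total)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
edge_kernel = [1.0, 0.0, -1.0]  # hypothetical kernel: left neighbour minus right

scale_1 = dilated_conv1d(signal, edge_kernel, dilation=1)
scale_2 = dilated_conv1d(signal, edge_kernel, dilation=2)

# Dense connectivity: a later layer would see the input and every earlier map.
stacked = [signal, scale_1, scale_2]
print(scale_1)
print(scale_2)
```

Changing only the dilation widens the span of input the kernel sees without adding parameters, which is consistent with the abstract's claim of accurate results with relatively few parameters.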

A hyperspectral image classification algorithm based on atrous convolution

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-019-1594-y ◽

2019 ◽

Vol 2019 (1) ◽

Cited By ~ 2

Author(s):

Xiaoqing Zhang ◽

Yongguo Zheng ◽

Weike Liu ◽

Zhiyong Wang

Keyword(s):

Neural Networks ◽

Receptive Field ◽

High Efficiency ◽

Hyperspectral Image ◽

Hyperspectral Images ◽

Spectral Features ◽

Hyperspectral Image Classification ◽

Deep Convolutional Neural Networks ◽

Multi Level ◽

Spatial Size

Hyperspectral images have a high spectral dimension, while the datasets containing them are small in spatial size. To address this problem, we design the NG-APC (non-gridding multi-level concatenated Atrous Pyramid Convolution) module based on combined atrous convolution. By expanding the receptive field of a three-layer convolution from 7 to 45, the module can obtain a distanced combination of the spectral features of hyperspectral pixels and solve the gridding problem of atrous convolution. Based on the NG-APC module, we construct a 15-layer Deep Convolutional Neural Network (DCNN) model to classify each hyperspectral pixel. In experiments on the Pavia University dataset, the model reaches 97.9% accuracy with only 0.25 M parameters. Compared with other CNN algorithms, our method achieves the best OA (overall accuracy) and Kappa metrics, while the NG-APC module maintains good performance and high efficiency with a smaller number of parameters.
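The receptive-field figures quoted above are consistent with the standard formula for stacked stride-1 convolutions, RF = 1 + Σ (k − 1)·d. The sketch below checks the arithmetic. It assumes kernel size 3 and serial stacking; the dilation rates (2, 7, 13) are purely hypothetical values that sum to 22, since the abstract does not state the rates the authors used.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of stacked stride-1 convolutions along one axis."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        # Each layer extends the field by (k - 1) * d input positions.
        rf += (k - 1) * d
    return rf


# Three ordinary size-3 convolutions: 1 + 3 * 2 = 7.
plain = receptive_field([3, 3, 3], [1, 1, 1])

# Hypothetical dilation rates summing to 22: 1 + 2 * 22 = 45.
atrous = receptive_field([3, 3, 3], [2, 7, 13])

print(plain, atrous)
```

Any three rates that sum to 22 give the same field of 45; which rates avoid the gridding problem is a separate design question that this sketch does not address.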

Residual Augmented Attentional U-Shaped Network for Spectral Reconstruction from RGB Images

Remote Sensing ◽

10.3390/rs13010115 ◽

2020 ◽

Vol 13 (1) ◽

pp. 115

Author(s):

Jiaojiao Li ◽

Chaoxiong Wu ◽

Rui Song ◽

Yunsong Li ◽

Weiying Xie

Keyword(s):

Superior Performance ◽

Feature Maps ◽

Deep Convolutional Neural Networks ◽

Second Order Statistics ◽

Feature Representations ◽

Quantitative Measurements ◽

Spectral Reconstruction ◽

Perceptual Comparison ◽

Benchmark Datasets ◽

RGB Images

Deep convolutional neural networks (CNNs) have been successfully applied to spectral reconstruction (SR) and have achieved superior performance. Nevertheless, existing CNN-based SR approaches integrate hierarchical features from different layers indiscriminately and do not investigate the relationships among intermediate feature maps, which limits the learning power of CNNs. To tackle this problem, we propose a deep residual augmented attentional U-shaped network (RA2UN) with several double improved residual blocks (DIRB) instead of paired plain convolutional units. Specifically, a trainable spatial augmented attention (SAA) module is developed to bridge the encoder and decoder and to emphasize the features in the informative regions. Furthermore, we present a novel channel augmented attention (CAA) module, embedded in the DIRB, that rescales features adaptively and enhances residual learning by using first-order and second-order statistics for stronger feature representations. Finally, a boundary-aware constraint is employed to focus on the salient edge information and recover more accurate high-frequency details. Experimental results on four benchmark datasets demonstrate that the proposed RA2UN network outperforms state-of-the-art SR methods under quantitative measurements and perceptual comparison.

Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/120 ◽

2019 ◽

Cited By ~ 3

Author(s):

Chunlei Liu ◽

Wenrui Ding ◽

Xin Xia ◽

Yuan Hu ◽

Baochang Zhang ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Superior Performance ◽

Feature Maps ◽

Unified Framework ◽

Deep Convolutional Neural Networks ◽

Convolutional Networks ◽

Computation Efficiency ◽

Significant Performance ◽

Binary Network

Binarized convolutional neural networks (BCNNs) are widely used to improve the memory and computation efficiency of deep convolutional neural networks (DCNNs) for mobile and AI-chip-based applications. However, current BCNNs are not able to fully exploit their corresponding full-precision models, causing a significant performance gap between them. In this paper, we propose rectified binary convolutional networks (RBCNs), towards optimized BCNNs, by combining full-precision kernels and feature maps to rectify the binarization process in a unified framework. In particular, we use a GAN to train the 1-bit binary network with the guidance of its corresponding full-precision model, which significantly improves the performance of BCNNs. The rectified convolutional layers are generic and flexible, and can be easily incorporated into existing DCNNs such as WideResNets and ResNets. Extensive experiments demonstrate the superior performance of the proposed RBCNs over state-of-the-art BCNNs. In particular, our method shows strong generalization on the object tracking task.

Attribute Aware Pooling for Pedestrian Attribute Recognition

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/341 ◽

2019 ◽

Cited By ~ 5

Author(s):

Kai Han ◽

Yunhe Wang ◽

Han Shu ◽

Chuanjian Liu ◽

Chunjing Xu ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Recognition Problem ◽

Context Information ◽

Deep Convolutional Neural Networks ◽

Attribute Data ◽

Branch Architecture ◽

Benchmark Datasets ◽

Attribute Classification ◽

Attribute Recognition

This paper extends the strength of deep convolutional neural networks (CNNs) to the pedestrian attribute recognition problem by devising a novel attribute aware pooling algorithm. Existing vanilla CNNs cannot be straightforwardly applied to multi-attribute data because of the larger label space as well as attribute entanglement and correlations. We tackle these challenges, which hamper the development of CNNs for multi-attribute classification, by fully exploiting the correlation between different attributes. A multi-branch architecture is adopted for focusing on attributes at different regions. Besides the prediction based on each branch itself, the context information of each branch is employed for the decision as well. The attribute aware pooling is developed to integrate both kinds of information. Therefore, attributes that are indistinct or tangled with others can be accurately recognized by exploiting the context information. Experiments on benchmark datasets demonstrate that the proposed pooling method appropriately explores and exploits the correlations between attributes for pedestrian attribute recognition.

Training Group Orthogonal Neural Networks with Privileged Information

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/212 ◽

2017 ◽

Cited By ~ 9

Author(s):

Yunpeng Chen ◽

Xiaojie Jin ◽

Jiashi Feng ◽

Shuicheng Yan

Keyword(s):

Neural Networks ◽

State Of The Art ◽

Training Group ◽

Privileged Information ◽

Training Process ◽

Deep Convolutional Neural Networks ◽

Absolute Value ◽

Generalization Ability ◽

Training Images ◽

Benchmark Datasets

Learning rich and diverse representations is critical for the performance of deep convolutional neural networks (CNNs). In this paper, we consider how to use privileged information to promote the inherent diversity of a single CNN model such that the model can learn better representations and offer stronger generalization ability. To this end, we propose a novel group orthogonal convolutional neural network (GoCNN) that learns untangled representations within each layer by exploiting the provided privileged information and effectively enhances representation diversity. We take image classification as an example, where image segmentation annotations are used as privileged information during the training process. Experiments on two benchmark datasets (ImageNet and PASCAL VOC) clearly demonstrate the strong generalization ability of our proposed GoCNN model. On the ImageNet dataset, GoCNN improves the performance of the state-of-the-art ResNet-152 model by an absolute 1.2% while using privileged information for only 10% of the training images, confirming the effectiveness of GoCNN in utilizing available privileged knowledge to train better CNNs.

From photos to sketches - how humans and deep neural networks process objects across different levels of visual abstraction

10.31234/osf.io/xg2uy ◽

2021 ◽

Author(s):

Johannes Janek Daniel Singer ◽

Katja Seeliger ◽

Tim Christian Kietzmann ◽

Martin N Hebart

Keyword(s):

Neural Networks ◽

Poor Performance ◽

Classification Performance ◽

Natural Images ◽

Fine Tuning ◽

Deep Convolutional Neural Networks ◽

Line Drawings ◽

General Utility ◽

Intermediate Layers ◽

Latent Representations

Line drawings convey meaning with just a few strokes. Despite strong simplifications, humans can recognize objects depicted in such abstracted images without effort. To what degree do deep convolutional neural networks (CNNs) mirror this human ability to generalize to abstracted object images? While CNNs trained on natural images have been shown to exhibit poor classification performance on drawings, other work has demonstrated highly similar latent representations in the networks for abstracted and natural images. Here, we address these seemingly conflicting findings by analyzing the activation patterns of a CNN trained on natural images across a set of photos, drawings and sketches of the same objects and comparing them to human behavior. We find a highly similar representational structure across levels of visual abstraction in early and intermediate layers of the network. This similarity, however, does not translate to later stages in the network, resulting in low classification performance for drawings and sketches. We identified that texture bias in CNNs contributes to the dissimilar representational structure in late layers and the poor performance on drawings. Finally, by fine-tuning late network layers with object drawings, we show that performance can be largely restored, demonstrating the general utility of features learned on natural images in early and intermediate layers for the recognition of drawings. In conclusion, generalization to abstracted images such as drawings seems to be an emergent property of CNNs trained on natural images, which is, however, suppressed by domain-related biases that arise during later processing stages in the network.

Random Shifting for CNN: a Solution to Reduce Information Loss in Down-Sampling Layers

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/486 ◽

2017 ◽

Cited By ~ 3

Author(s):

Gangming Zhao ◽

Jingdong Wang ◽

Zhaoxiang Zhang

Keyword(s):

Neural Networks ◽

Computational Cost ◽

Receptive Fields ◽

Information Loss ◽

Network Architectures ◽

Training Process ◽

Feature Maps ◽

Improve Performance ◽

Deep Convolutional Neural Networks ◽

Random Strategy

Down-sampling is widely adopted in deep convolutional neural networks (DCNNs) for reducing the number of network parameters while preserving transformation invariance. However, it cannot utilize information effectively because it adopts only a fixed stride strategy, which may result in poor generalization ability and information loss. In this paper, we propose a novel random strategy to alleviate these problems by embedding random shifting in the down-sampling layers during the training process. Random shifting can be universally applied to diverse DCNN models to dynamically adjust receptive fields by shifting kernel centers on feature maps in different directions. Thus, it can generate more robust features in networks and further enhance the transformation invariance of down-sampling operators. In addition, random shifting can not only be integrated into all down-sampling layers, including strided convolutional layers and pooling layers, but also improve the performance of DCNNs with negligible additional computational cost. We evaluate our method on different tasks (e.g., image classification and segmentation) with various network architectures (i.e., AlexNet, FCN and DFN-MR). Experimental results demonstrate the effectiveness of our proposed method.
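A minimal 1D sketch of the idea: a fixed-stride down-sampler always keeps the same positions, while a randomly shifted one picks a different starting offset on each training pass. The paper shifts kernel centers on 2D feature maps; the 1D subsampling, the function names, and the choice of a fixed offset of 0 outside training are simplifying assumptions made here.

```python
import random


def downsample(signal, stride, shift=0):
    """Keep every `stride`-th sample, starting at offset `shift`."""
    if not 0 <= shift < stride:
        raise ValueError("shift must lie in [0, stride)")
    return signal[shift::stride]


def random_shift_downsample(signal, stride, rng, training=True):
    """Use a random offset while training and the fixed offset 0 otherwise."""
    shift = rng.randrange(stride) if training else 0
    return downsample(signal, stride, shift)


features = list(range(8))
rng = random.Random(0)

fixed = downsample(features, stride=2)               # always the even positions
train_out = random_shift_downsample(features, 2, rng)  # even or odd positions
eval_out = random_shift_downsample(features, 2, rng, training=False)
print(fixed, train_out, eval_out)
```

Over many training passes, every input position is sampled at some point, so no position is permanently discarded by the stride, and the extra cost is a single random draw per layer.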
