Weak sub-network pruning for strong and efficient neural networks

2021 ◽  
Vol 144 ◽  
pp. 614-626
Author(s):  
Qingbei Guo ◽  
Xiao-Jun Wu ◽  
Josef Kittler ◽  
Zhiquan Feng
Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Saad Naeem ◽  
Noreen Jamil ◽  
Habib Ullah Khan ◽  
Shah Nazir

Neural networks employ a massive interconnection of simple computing units, called neurons, to solve problems that are highly nonlinear and cannot be hard-coded into a program. These networks are computation-intensive: training them requires a lot of training data, and each training example demands heavy computation. We look at different ways to reduce this computational burden and possibly make neural networks work on mobile devices. In this paper, we survey various techniques that can be matched and combined to improve the training time of neural networks, and we also review additional recommendations for making the process work on mobile devices. Finally, we survey the deep compression technique, which tries to solve the problem through network pruning, quantization, and encoding of the network weights. Deep compression reduces the time required to train the network by first pruning irrelevant connections (the pruning stage), then quantizing the remaining weights by choosing centroids for each layer, and finally, in the third stage, applying the Huffman coding algorithm to address the storage cost of the surviving weights.
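To make the three-stage pipeline concrete, here is a minimal sketch of magnitude pruning, centroid-based weight sharing, and Huffman coding applied to a single layer's weight matrix. It illustrates the general technique, not the paper's implementation; the function names, the 90% sparsity, and the 16 centroids are our own assumptions.

```python
import heapq, itertools
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans


def prune(w, sparsity=0.9):
    # Stage 1: magnitude pruning -- zero the smallest |w| entries (sparsity assumed).
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= thresh, w, 0.0)


def quantize(w, k=16):
    # Stage 2: weight sharing -- cluster the surviving weights around k centroids.
    nz = w[w != 0].reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10).fit(nz)
    w_q = w.copy()
    w_q[w != 0] = km.cluster_centers_[km.labels_, 0]
    return w_q, km.labels_


def huffman(symbols):
    # Stage 3: Huffman-code the centroid indices of the surviving weights.
    tie = itertools.count()
    heap = [[f, next(tie), {s: ""}] for s, f in Counter(symbols).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        lo[2] = {s: "0" + c for s, c in lo[2].items()}
        hi[2] = {s: "1" + c for s, c in hi[2].items()}
        heapq.heappush(heap, [lo[0] + hi[0], next(tie), {**lo[2], **hi[2]}])
    return heap[0][2]                      # symbol -> variable-length bit string


layer = np.random.randn(256, 128)          # stand-in for one layer's weight matrix
sparse = prune(layer)
shared, idx = quantize(sparse)
codebook = huffman(idx.tolist())
```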


Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2687
Author(s):  
Eun-Hun Lee ◽  
Hyeoncheol Kim

A significant advantage of deep neural networks is that, by stacking layers deeply, the upper layers can capture high-level features of the data based on information acquired from the lower layers. Since it is challenging to interpret what knowledge a neural network has learned, various studies on explaining neural networks have emerged to address this problem. However, these studies generate local explanations of a single instance rather than a generalized global interpretation of the neural network model itself. To overcome these drawbacks of previous approaches, we propose a global interpretation method for deep neural networks based on features of the model. We first analyze the relationship between the input and hidden layers to represent the high-level features of the model, and then interpret the decision-making process of the network through these high-level features. In addition, we apply network pruning techniques to produce concise explanations and analyze the effect of layer complexity on interpretability. We present experiments on the proposed approach using three different datasets and show that it can generate global explanations of deep neural network models with high accuracy and fidelity.
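As a rough illustration of the idea (relating inputs to hidden features, relating hidden features to the decision, and pruning weak features for conciseness), consider the toy sketch below. The weight-based attributions and the median pruning threshold are our assumptions for illustration, not the authors' method.

```python
import torch
import torch.nn as nn

# Toy two-layer network; hidden units stand in for "high-level features".
net = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 3))
W_in, W_out = net[0].weight.detach(), net[2].weight.detach()

# Relate inputs to hidden features: which inputs drive each hidden unit.
input_to_hidden = W_in.abs() / W_in.abs().sum(dim=1, keepdim=True)    # (8, 20)

# Relate hidden features to classes: how each feature sways the decision.
hidden_to_class = W_out.abs() / W_out.abs().sum(dim=1, keepdim=True)  # (3, 8)

# Prune hidden units with little influence on any class so that the surviving
# features give a more concise global explanation (threshold is illustrative).
influence = W_out.abs().sum(dim=0)                  # one score per hidden unit
keep = influence >= influence.median()

# Chain the two relations over the kept units: a class-by-input relevance map.
explanation = hidden_to_class[:, keep] @ input_to_hidden[keep]        # (3, 20)
```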


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yisu Ge ◽  
Shufang Lu ◽  
Fei Gao

Many current convolutional neural networks struggle to meet practical application requirements because of their enormous number of parameters. To accelerate network inference, more and more attention has been paid to network compression. Network pruning is one of the simplest and most efficient ways to compress and speed up a network. In this paper, a pruning algorithm for lightweight tasks is proposed, and a pruning strategy based on feature representation is investigated. Unlike other pruning approaches, the proposed strategy is guided by the practical task and eliminates the filters that are irrelevant to it. After pruning, the network is compacted to a smaller size and easily recovers its accuracy with fine-tuning. The performance of the proposed pruning algorithm is validated on well-established image datasets, and the experimental results show that the proposed algorithm is better suited to pruning the filters that are irrelevant to the fine-tuning dataset.
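A task-guided filter-pruning step could look roughly like the sketch below, which scores each filter by its average feature response on the fine-tuning data and rebuilds the layer from the most responsive filters. The activation-based criterion, the helper names, and the keep-half usage are illustrative assumptions, not necessarily the paper's exact strategy.

```python
import torch
import torch.nn as nn


def filter_activation_scores(conv, post, loader, device="cpu"):
    # Average absolute feature response of each filter on the target (fine-tuning)
    # data; `post` is whatever follows the conv (e.g., BN + ReLU).
    scores = torch.zeros(conv.out_channels, device=device)
    batches = 0
    with torch.no_grad():
        for x, _ in loader:
            fmap = post(conv(x.to(device)))          # (B, C, H, W)
            scores += fmap.abs().mean(dim=(0, 2, 3))
            batches += 1
    return scores / max(batches, 1)


def prune_filters(conv, keep_idx):
    # Build a smaller conv layer that keeps only the selected output filters.
    new_conv = nn.Conv2d(conv.in_channels, len(keep_idx),
                         kernel_size=conv.kernel_size, stride=conv.stride,
                         padding=conv.padding, bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep_idx].clone()
    return new_conv


# Usage sketch: keep the most task-relevant half of the filters, then fine-tune.
# scores = filter_activation_scores(conv, nn.Sequential(bn, nn.ReLU()), loader)
# conv = prune_filters(conv, scores.topk(conv.out_channels // 2).indices)
```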


Deep learning allows us to build powerful models to solve problems like image classification, time series prediction, and natural language processing. This is achieved at the cost of huge storage and processing requirements, which are sometimes beyond machines with limited resources. In this paper, we compare different methods that tackle this problem with network pruning. A selection of pruning methodologies from the deep learning literature were implemented to display their results. Modern neural architectures combine different layers, such as convolutional layers, pooling layers, and dense layers. We compare pruning techniques for dense layers (such as unit/neuron pruning and weight pruning) as well as for convolutional layers (using the L1 norm, a Taylor expansion of the loss to determine the importance of convolutional filters, and Variable Importance in Projection using Partial Least Squares) on the image classification task. This study aims to ease the optimization overhead of the model for academic as well as commercial use of deep neural networks.
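The criteria compared for dense and convolutional layers can be summarized with the hedged sketch below (the PLS-based Variable Importance in Projection criterion is omitted for brevity); the helper names and thresholds are ours, not the study's code.

```python
import torch


def weight_magnitude_mask(linear, sparsity=0.5):
    # Weight pruning for dense layers: drop the smallest-magnitude weights.
    w = linear.weight.detach().abs()
    thresh = w.flatten().kthvalue(int(sparsity * w.numel())).values
    return (w > thresh).float()          # multiply element-wise into the weights


def unit_scores(linear):
    # Unit/neuron pruning: score each output neuron by the L2 norm of its row.
    return linear.weight.detach().norm(p=2, dim=1)


def l1_filter_scores(conv):
    # L1-norm criterion for conv filters: sum of |w| per output filter.
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))


def taylor_filter_scores(fmap):
    # First-order Taylor criterion: |activation * gradient|, averaged per filter.
    # `fmap` must be a conv output saved with retain_grad() after a backward pass.
    return (fmap * fmap.grad).abs().mean(dim=(0, 2, 3))
```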


Author(s):  
Pilar Bachiller ◽  
Julia González

Feed-forward neural networks have emerged as a good solution for many problems, such as classification, recognition and identification, and signal processing. However, the importance of selecting an adequate hidden structure for this neural model should not be underestimated. When the hidden structure of the network is too large and complex for the model being developed, the network may tend to memorize the input and output sets rather than learning the relationships between them. Such a network may train well but test poorly when inputs outside the training set are presented. In addition, training time increases significantly when the network is unnecessarily large and complex. Most of the proposed solutions to this problem consist of training a larger-than-necessary network, pruning unnecessary links and nodes, and retraining the reduced network. We propose a new method to optimize the size of a feed-forward neural network using orthogonal transformations. This approach prunes unnecessary nodes during the training process, avoiding the retraining phase of the reduced network that most pruning techniques require.
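One way to realize the idea with an orthogonal transformation is a rank-revealing QR factorization of the hidden-layer activation matrix collected during training: columns that contribute nothing beyond the numerical rank correspond to redundant nodes that can be dropped without retraining. The sketch below illustrates this with SciPy; the tolerance and the choice of pivoted QR are our assumptions, not necessarily the authors' exact transformation.

```python
import numpy as np
from scipy.linalg import qr


def redundant_hidden_nodes(hidden_acts, tol=1e-3):
    # hidden_acts: (n_samples, n_hidden) activations gathered during training.
    # Column-pivoted QR reveals which hidden nodes are (nearly) linear
    # combinations of the others and can therefore be pruned.
    _, R, piv = qr(hidden_acts, mode="economic", pivoting=True)
    diag = np.abs(np.diag(R))
    rank = int(np.sum(diag > tol * diag[0]))     # numerical rank estimate
    keep = np.sort(piv[:rank])                   # independent nodes
    drop = np.sort(piv[rank:])                   # redundant nodes to prune
    return keep, drop


# Example: three hidden nodes where one is a linear combination of the others,
# so one node of the dependent set can be pruned without losing information.
H = np.random.rand(200, 3)
H[:, 2] = H[:, 0] + H[:, 1]
keep, drop = redundant_hidden_nodes(H)
print("keep:", keep, "prune:", drop)
```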


2020 ◽  
Vol 39 (5) ◽  
pp. 7403-7410
Author(s):  
Yangke Huang ◽  
Zhiming Wang

Network pruning has been widely used to reduce the high computational cost of deep convolutional neural networks (CNNs). The dominant pruning method, channel pruning, removes filters from layers based on their importance or on sparsity training. However, these methods often give a limited acceleration ratio and encounter difficulties when pruning CNNs with skip connections. Block pruning methods take a sequence of consecutive layers (e.g., Conv-BN-ReLU) as a block and remove an entire block at a time. However, previous methods usually introduce new parameters to assist pruning, which leads to additional parameters and extra computation. This work proposes a novel multi-granularity pruning approach that combines block pruning with channel pruning (BPCP). The block pruning (BP) module removes blocks by directly searching for redundant blocks with gradient descent and leaves no extra parameters in the final model, which is friendly to hardware optimization. The channel pruning (CP) module removes redundant channels based on importance criteria and handles CNNs with skip connections properly, which further improves the overall compression ratio. As a result, on CIFAR10, BPCP reduces the number of parameters and MACs of a ResNet56 model by up to 78.9% and 80.3%, respectively, with <3% accuracy drop. In terms of speed, it gives a 3.17× acceleration ratio. Our code has been made available at https://github.com/Pokemon-Huang/BPCP.
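The block-pruning side of such an approach can be sketched as a scalar gate attached to each residual block during the gradient-descent search, after which low-gate blocks collapse to the skip path and the gates disappear from the final model. The gate formulation, threshold, and folding step below are illustrative assumptions, not the released BPCP code.

```python
import torch
import torch.nn as nn


class GatedBlock(nn.Module):
    # Residual block wrapped with a scalar gate used only during the search phase.
    def __init__(self, block):
        super().__init__()
        self.block = block
        self.gate = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return x + self.gate * self.block(x)


def finalize(gated_blocks, threshold=0.1):
    # Keep blocks whose gate survived the search; replace the rest with the
    # identity (i.e., only the skip connection remains). In practice a surviving
    # gate value can be folded into the block's last layer, so the final model
    # carries no extra parameters.
    kept = []
    for gb in gated_blocks:
        if gb.gate.abs().item() >= threshold:
            kept.append(gb.block)
        else:
            kept.append(nn.Identity())
    return nn.ModuleList(kept)
```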


2015 ◽  
Vol 2015 ◽  
pp. 1-6
Author(s):  
Ruliang Wang ◽  
Huanlong Sun ◽  
Benbo Zha ◽  
Lei Wang

We propose an improved adaptive growing and pruning algorithm (IAGP). Pruning is based on the sigmoidal activation value of each node and the weights of all its outgoing connections: low-saliency nodes are pruned directly, while nodes that have internal relations with others are retained. Growing is based on the idea of variance, directly copying nodes with high correlation. The improved algorithm raises network performance and efficiency. Simulation results show that, compared with the AGP algorithm, the improved method (IAGP) predicts traffic capacity more quickly and accurately.
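A hedged sketch of the two ingredients described here (an activation-times-outgoing-weight pruning score and a correlation test for the copy-based growing step) might look like this; the exact scoring formula and the correlation threshold are our guesses, not the published IAGP equations.

```python
import numpy as np


def node_saliency(hidden_acts, w_out):
    # Pruning score per hidden node: average sigmoidal activation magnitude
    # times the total magnitude of its outgoing weights.
    # hidden_acts: (n_samples, n_hidden); w_out: (n_hidden, n_outputs).
    act = np.abs(hidden_acts).mean(axis=0)
    out = np.abs(w_out).sum(axis=1)
    return act * out


def growth_candidates(hidden_acts, corr_thresh=0.95):
    # Growing step: find pairs of highly correlated hidden nodes, which are
    # candidates to be copied when the network is expanded.
    c = np.corrcoef(hidden_acts.T)
    i, j = np.where(np.triu(np.abs(c), k=1) > corr_thresh)
    return list(zip(i.tolist(), j.tolist()))
```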


2020 ◽  
Vol 34 (07) ◽  
pp. 12273-12280
Author(s):  
Yulong Wang ◽  
Xiaolu Zhang ◽  
Lingxi Xie ◽  
Jun Zhou ◽  
Hang Su ◽  
...  

Network pruning is an important research field aiming at reducing the computational cost of neural networks. Conventional approaches follow a fixed paradigm which first trains a large and redundant network and then determines which units (e.g., channels) are less important and can thus be removed. In this work, we find that pre-training an over-parameterized model is not necessary for obtaining the target pruned structure. In fact, a fully trained over-parameterized model reduces the search space for the pruned structure. We empirically show that more diverse pruned structures can be pruned directly from randomly initialized weights, including potential models with better performance. Therefore, we propose a novel network pruning pipeline which allows pruning from scratch with little training overhead. In experiments compressing classification models on the CIFAR10 and ImageNet datasets, our approach not only greatly reduces the pre-training burden of traditional pruning methods but also achieves similar or even higher accuracy under the same computation budgets. Our results encourage the community to rethink the effectiveness of existing techniques used for network pruning.
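A pruning-from-scratch pipeline in this spirit can be sketched as learning per-channel gates on top of frozen, randomly initialized weights and reading the pruned widths off the gates; the gate module, the L1 coefficient, and the 0.05 threshold below are our illustrative assumptions, not the authors' released implementation.

```python
import itertools

import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    # Learnable per-channel gate, assumed to be inserted after each conv layer
    # inside `model`; only the gates are optimized during the structure search.
    def __init__(self, channels):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(channels))

    def forward(self, x):
        return x * self.scale.view(1, -1, 1, 1)


def search_structure(model, gates, loader, loss_fn, l1=1e-4, steps=100):
    # Brief gate optimization on random weights: an L1 penalty pushes gates of
    # unimportant channels toward zero, and the pruned widths are read off.
    for p in model.parameters():
        p.requires_grad_(False)                  # keep the random weights fixed...
    for g in gates:
        g.scale.requires_grad_(True)             # ...but learn the channel gates
    opt = torch.optim.Adam([g.scale for g in gates], lr=0.01)
    batches = itertools.cycle(loader)
    for _ in range(steps):
        x, y = next(batches)
        loss = loss_fn(model(x), y) + l1 * sum(g.scale.abs().sum() for g in gates)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Number of channels to keep per gated layer (threshold is illustrative).
    return [int((g.scale.abs() > 0.05).sum()) for g in gates]
```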

