Weak sub-network pruning for strong and efficient neural networks

2021 ◽  
Vol 144 ◽  
pp. 614-626
Author(s):  
Qingbei Guo ◽  
Xiao-Jun Wu ◽  
Josef Kittler ◽  
Zhiquan Feng
Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Saad Naeem ◽  
Noreen Jamil ◽  
Habib Ullah Khan ◽  
Shah Nazir

Neural networks employ a massive interconnection of simple computing units, called neurons, to solve problems that are highly nonlinear and cannot be hard-coded into a program. These networks are computation-intensive: training them requires a lot of training data, and each training example demands heavy computation. We look at different ways to reduce this computational burden and possibly make neural networks work on mobile devices. In this paper, we survey various techniques that can be matched and combined to improve the training time of neural networks, and we also review additional recommendations for making the process work on mobile devices. Finally, we survey the deep compression technique, which tries to solve the problem through network pruning, quantization, and encoding of the network weights. Deep compression reduces the time required to train the network by first pruning irrelevant connections (the pruning stage), then quantizing the remaining weights by choosing centroids for each layer, and finally, in the third stage, applying the Huffman coding algorithm to address the storage cost of the surviving weights.
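To make the three-stage pipeline concrete, here is a minimal sketch of magnitude pruning, centroid-based weight sharing, and Huffman coding applied to a single layer's weight matrix. It illustrates the general technique, not the paper's implementation; the function names, the 90% sparsity, and the 16 centroids are our own assumptions.

```python
import heapq, itertools
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans


def prune(w, sparsity=0.9):
    # Stage 1: magnitude pruning -- zero the smallest |w| entries (sparsity assumed).
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= thresh, w, 0.0)


def quantize(w, k=16):
    # Stage 2: weight sharing -- cluster the surviving weights around k centroids.
    nz = w[w != 0].reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10).fit(nz)
    w_q = w.copy()
    w_q[w != 0] = km.cluster_centers_[km.labels_, 0]
    return w_q, km.labels_


def huffman(symbols):
    # Stage 3: Huffman-code the centroid indices of the surviving weights.
    tie = itertools.count()
    heap = [[f, next(tie), {s: ""}] for s, f in Counter(symbols).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        lo[2] = {s: "0" + c for s, c in lo[2].items()}
        hi[2] = {s: "1" + c for s, c in hi[2].items()}
        heapq.heappush(heap, [lo[0] + hi[0], next(tie), {**lo[2], **hi[2]}])
    return heap[0][2]                      # symbol -> variable-length bit string


layer = np.random.randn(256, 128)          # stand-in for one layer's weight matrix
sparse = prune(layer)
shared, idx = quantize(sparse)
codebook = huffman(idx.tolist())
```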


Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2687
Author(s):  
Eun-Hun Lee ◽  
Hyeoncheol Kim

A significant advantage of deep neural networks is that, by stacking layers deeply, the upper layers can capture high-level features of the data based on information acquired from the lower layers. Since it is challenging to interpret what knowledge a neural network has learned, various studies on explaining neural networks have emerged to address this problem. However, these studies generate local explanations of a single instance rather than a generalized global interpretation of the neural network model itself. To overcome these drawbacks of previous approaches, we propose a global interpretation method for deep neural networks based on features of the model. We first analyze the relationship between the input and hidden layers to represent the high-level features of the model, and then interpret the decision-making process of the network through these high-level features. In addition, we apply network pruning techniques to produce concise explanations and analyze the effect of layer complexity on interpretability. We present experiments on the proposed approach using three different datasets and show that it can generate global explanations of deep neural network models with high accuracy and fidelity.
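As a rough illustration of the idea (relating inputs to hidden features, relating hidden features to the decision, and pruning weak features for conciseness), consider the toy sketch below. The weight-based attributions and the median pruning threshold are our assumptions for illustration, not the authors' method.

```python
import torch
import torch.nn as nn

# Toy two-layer network; hidden units stand in for "high-level features".
net = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 3))
W_in, W_out = net[0].weight.detach(), net[2].weight.detach()

# Relate inputs to hidden features: which inputs drive each hidden unit.
input_to_hidden = W_in.abs() / W_in.abs().sum(dim=1, keepdim=True)    # (8, 20)

# Relate hidden features to classes: how each feature sways the decision.
hidden_to_class = W_out.abs() / W_out.abs().sum(dim=1, keepdim=True)  # (3, 8)

# Prune hidden units with little influence on any class so that the surviving
# features give a more concise global explanation (threshold is illustrative).
influence = W_out.abs().sum(dim=0)                  # one score per hidden unit
keep = influence >= influence.median()

# Chain the two relations over the kept units: a class-by-input relevance map.
explanation = hidden_to_class[:, keep] @ input_to_hidden[keep]        # (3, 20)
```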


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yisu Ge ◽  
Shufang Lu ◽  
Fei Gao

Many current convolutional neural networks struggle to meet practical application requirements because of their enormous number of parameters. To accelerate network inference, more and more attention has been paid to network compression. Network pruning is one of the simplest and most efficient ways to compress and speed up a network. In this paper, a pruning algorithm for lightweight tasks is proposed, and a pruning strategy based on feature representation is investigated. Unlike other pruning approaches, the proposed strategy is guided by the practical task and eliminates the filters that are irrelevant to it. After pruning, the network is compacted to a smaller size and easily recovers its accuracy with fine-tuning. The performance of the proposed pruning algorithm is validated on well-established image datasets, and the experimental results show that the proposed algorithm is better suited to pruning the filters that are irrelevant to the fine-tuning dataset.
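A task-guided filter-pruning step could look roughly like the sketch below, which scores each filter by its average feature response on the fine-tuning data and rebuilds the layer from the most responsive filters. The activation-based criterion, the helper names, and the keep-half usage are illustrative assumptions, not necessarily the paper's exact strategy.

```python
import torch
import torch.nn as nn


def filter_activation_scores(conv, post, loader, device="cpu"):
    # Average absolute feature response of each filter on the target (fine-tuning)
    # data; `post` is whatever follows the conv (e.g., BN + ReLU).
    scores = torch.zeros(conv.out_channels, device=device)
    batches = 0
    with torch.no_grad():
        for x, _ in loader:
            fmap = post(conv(x.to(device)))          # (B, C, H, W)
            scores += fmap.abs().mean(dim=(0, 2, 3))
            batches += 1
    return scores / max(batches, 1)


def prune_filters(conv, keep_idx):
    # Build a smaller conv layer that keeps only the selected output filters.
    new_conv = nn.Conv2d(conv.in_channels, len(keep_idx),
                         kernel_size=conv.kernel_size, stride=conv.stride,
                         padding=conv.padding, bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep_idx].clone()
    return new_conv


# Usage sketch: keep the most task-relevant half of the filters, then fine-tune.
# scores = filter_activation_scores(conv, nn.Sequential(bn, nn.ReLU()), loader)
# conv = prune_filters(conv, scores.topk(conv.out_channels // 2).indices)
```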


Deep learning allows us to build powerful models to solve problems like image classification, time series prediction, and natural language processing. This is achieved at the cost of huge storage and processing requirements, which are sometimes beyond machines with limited resources. In this paper, we compare different methods that tackle this problem with network pruning. A selection of pruning methodologies from the deep learning literature were implemented to display their results. Modern neural architectures combine different layers, such as convolutional layers, pooling layers, and dense layers. We compare pruning techniques for dense layers (such as unit/neuron pruning and weight pruning) as well as for convolutional layers (using the L1 norm, a Taylor expansion of the loss to determine the importance of convolutional filters, and Variable Importance in Projection using Partial Least Squares) on the image classification task. This study aims to ease the optimization overhead of the model for academic as well as commercial use of deep neural networks.
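The criteria compared for dense and convolutional layers can be summarized with the hedged sketch below (the PLS-based Variable Importance in Projection criterion is omitted for brevity); the helper names and thresholds are ours, not the study's code.

```python
import torch


def weight_magnitude_mask(linear, sparsity=0.5):
    # Weight pruning for dense layers: drop the smallest-magnitude weights.
    w = linear.weight.detach().abs()
    thresh = w.flatten().kthvalue(int(sparsity * w.numel())).values
    return (w > thresh).float()          # multiply element-wise into the weights


def unit_scores(linear):
    # Unit/neuron pruning: score each output neuron by the L2 norm of its row.
    return linear.weight.detach().norm(p=2, dim=1)


def l1_filter_scores(conv):
    # L1-norm criterion for conv filters: sum of |w| per output filter.
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))


def taylor_filter_scores(fmap):
    # First-order Taylor criterion: |activation * gradient|, averaged per filter.
    # `fmap` must be a conv output saved with retain_grad() after a backward pass.
    return (fmap * fmap.grad).abs().mean(dim=(0, 2, 3))
```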


Author(s):  
Pilar Bachiller ◽  
Julia González

Feed-forward neural networks have emerged as a good solution for many problems, such as classification, recognition and identification, and signal processing. However, the importance of selecting an adequate hidden structure for this neural model should not be underestimated. When the hidden structure of the network is too large and complex for the model being developed, the network may tend to memorize the input and output sets rather than learning the relationships between them. Such a network may train well but test poorly when inputs outside the training set are presented. In addition, training time increases significantly when the network is unnecessarily large and complex. Most of the proposed solutions to this problem consist of training a larger-than-necessary network, pruning unnecessary links and nodes, and retraining the reduced network. We propose a new method to optimize the size of a feed-forward neural network using orthogonal transformations. This approach prunes unnecessary nodes during the training process, avoiding the retraining phase of the reduced network that most pruning techniques require.
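One way to realize the idea with an orthogonal transformation is a rank-revealing QR factorization of the hidden-layer activation matrix collected during training: columns that contribute nothing beyond the numerical rank correspond to redundant nodes that can be dropped without retraining. The sketch below illustrates this with SciPy; the tolerance and the choice of pivoted QR are our assumptions, not necessarily the authors' exact transformation.

```python
import numpy as np
from scipy.linalg import qr


def redundant_hidden_nodes(hidden_acts, tol=1e-3):
    # hidden_acts: (n_samples, n_hidden) activations gathered during training.
    # Column-pivoted QR reveals which hidden nodes are (nearly) linear
    # combinations of the others and can therefore be pruned.
    _, R, piv = qr(hidden_acts, mode="economic", pivoting=True)
    diag = np.abs(np.diag(R))
    rank = int(np.sum(diag > tol * diag[0]))     # numerical rank estimate
    keep = np.sort(piv[:rank])                   # independent nodes
    drop = np.sort(piv[rank:])                   # redundant nodes to prune
    return keep, drop


# Example: three hidden nodes where one is a linear combination of the others,
# so one node of the dependent set can be pruned without losing information.
H = np.random.rand(200, 3)
H[:, 2] = H[:, 0] + H[:, 1]
keep, drop = redundant_hidden_nodes(H)
print("keep:", keep, "prune:", drop)
```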


2020 ◽  
Vol 39 (5) ◽  
pp. 7403-7410
Author(s):  
Yangke Huang ◽  
Zhiming Wang

Network pruning has been widely used to reduce the high computational cost of deep convolutional neural networks (CNNs). The dominant pruning method, channel pruning, removes filters from layers based on their importance or on sparsity training. However, these methods often give a limited acceleration ratio and encounter difficulties when pruning CNNs with skip connections. Block pruning methods take a sequence of consecutive layers (e.g., Conv-BN-ReLU) as a block and remove an entire block at a time. However, previous methods usually introduce new parameters to assist pruning, which leads to additional parameters and extra computation. This work proposes a novel multi-granularity pruning approach that combines block pruning with channel pruning (BPCP). The block pruning (BP) module removes blocks by directly searching for redundant blocks with gradient descent and leaves no extra parameters in the final model, which is friendly to hardware optimization. The channel pruning (CP) module removes redundant channels based on importance criteria and handles CNNs with skip connections properly, which further improves the overall compression ratio. As a result, on CIFAR10, BPCP reduces the number of parameters and MACs of a ResNet56 model by up to 78.9% and 80.3%, respectively, with <3% accuracy drop. In terms of speed, it gives a 3.17× acceleration ratio. Our code has been made available at https://github.com/Pokemon-Huang/BPCP.
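The block-pruning side of such an approach can be sketched as a scalar gate attached to each residual block during the gradient-descent search, after which low-gate blocks collapse to the skip path and the gates disappear from the final model. The gate formulation, threshold, and folding step below are illustrative assumptions, not the released BPCP code.

```python
import torch
import torch.nn as nn


class GatedBlock(nn.Module):
    # Residual block wrapped with a scalar gate used only during the search phase.
    def __init__(self, block):
        super().__init__()
        self.block = block
        self.gate = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return x + self.gate * self.block(x)


def finalize(gated_blocks, threshold=0.1):
    # Keep blocks whose gate survived the search; replace the rest with the
    # identity (i.e., only the skip connection remains). In practice a surviving
    # gate value can be folded into the block's last layer, so the final model
    # carries no extra parameters.
    kept = []
    for gb in gated_blocks:
        if gb.gate.abs().item() >= threshold:
            kept.append(gb.block)
        else:
            kept.append(nn.Identity())
    return nn.ModuleList(kept)
```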


2015 ◽  
Vol 2015 ◽  
pp. 1-6
Author(s):  
Ruliang Wang ◽  
Huanlong Sun ◽  
Benbo Zha ◽  
Lei Wang

We propose an improved adaptive growing and pruning algorithm (IAGP). Pruning is based on the sigmoidal activation value of each node and the weights of all its outgoing connections: low-saliency nodes are pruned directly, while nodes that have internal relations with others are retained. Growing is based on the idea of variance, directly copying nodes with high correlation. The improved algorithm raises network performance and efficiency. Simulation results show that, compared with the AGP algorithm, the improved method (IAGP) predicts traffic capacity more quickly and accurately.
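A hedged sketch of the two ingredients described here (an activation-times-outgoing-weight pruning score and a correlation test for the copy-based growing step) might look like this; the exact scoring formula and the correlation threshold are our guesses, not the published IAGP equations.

```python
import numpy as np


def node_saliency(hidden_acts, w_out):
    # Pruning score per hidden node: average sigmoidal activation magnitude
    # times the total magnitude of its outgoing weights.
    # hidden_acts: (n_samples, n_hidden); w_out: (n_hidden, n_outputs).
    act = np.abs(hidden_acts).mean(axis=0)
    out = np.abs(w_out).sum(axis=1)
    return act * out


def growth_candidates(hidden_acts, corr_thresh=0.95):
    # Growing step: find pairs of highly correlated hidden nodes, which are
    # candidates to be copied when the network is expanded.
    c = np.corrcoef(hidden_acts.T)
    i, j = np.where(np.triu(np.abs(c), k=1) > corr_thresh)
    return list(zip(i.tolist(), j.tolist()))
```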


2020 ◽  
Vol 34 (07) ◽  
pp. 12273-12280
Author(s):  
Yulong Wang ◽  
Xiaolu Zhang ◽  
Lingxi Xie ◽  
Jun Zhou ◽  
Hang Su ◽  
...  

Network pruning is an important research field aiming at reducing the computational cost of neural networks. Conventional approaches follow a fixed paradigm which first trains a large and redundant network and then determines which units (e.g., channels) are less important and can thus be removed. In this work, we find that pre-training an over-parameterized model is not necessary for obtaining the target pruned structure. In fact, a fully trained over-parameterized model reduces the search space for the pruned structure. We empirically show that more diverse pruned structures can be pruned directly from randomly initialized weights, including potential models with better performance. Therefore, we propose a novel network pruning pipeline which allows pruning from scratch with little training overhead. In experiments compressing classification models on the CIFAR10 and ImageNet datasets, our approach not only greatly reduces the pre-training burden of traditional pruning methods but also achieves similar or even higher accuracy under the same computation budgets. Our results encourage the community to rethink the effectiveness of existing techniques used for network pruning.
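A pruning-from-scratch pipeline in this spirit can be sketched as learning per-channel gates on top of frozen, randomly initialized weights and reading the pruned widths off the gates; the gate module, the L1 coefficient, and the 0.05 threshold below are our illustrative assumptions, not the authors' released implementation.

```python
import itertools

import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    # Learnable per-channel gate, assumed to be inserted after each conv layer
    # inside `model`; only the gates are optimized during the structure search.
    def __init__(self, channels):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(channels))

    def forward(self, x):
        return x * self.scale.view(1, -1, 1, 1)


def search_structure(model, gates, loader, loss_fn, l1=1e-4, steps=100):
    # Brief gate optimization on random weights: an L1 penalty pushes gates of
    # unimportant channels toward zero, and the pruned widths are read off.
    for p in model.parameters():
        p.requires_grad_(False)                  # keep the random weights fixed...
    for g in gates:
        g.scale.requires_grad_(True)             # ...but learn the channel gates
    opt = torch.optim.Adam([g.scale for g in gates], lr=0.01)
    batches = itertools.cycle(loader)
    for _ in range(steps):
        x, y = next(batches)
        loss = loss_fn(model(x), y) + l1 * sum(g.scale.abs().sum() for g in gates)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Number of channels to keep per gated layer (threshold is illustrative).
    return [int((g.scale.abs() > 0.05).sum()) for g in gates]
```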

