Selective Information Control and Network Compression in Multi-layered Neural Networks

Author(s): Ryotaro Kamimura
Structured Sparsification of Gated Recurrent Neural Networks
2020 ◽ Vol 34 (04) ◽ pp. 4989-4996
Author(s): Ekaterina Lobacheva ◽ Nadezhda Chirkova ◽ Alexander Markovich ◽ Dmitry Vetrov

One of the most popular approaches to neural network compression is sparsification, i.e., learning sparse weight matrices. In structured sparsification, weights are set to zero in groups corresponding to structural units, e.g., neurons. We further develop the structured sparsification approach for gated recurrent neural networks such as the Long Short-Term Memory (LSTM). Specifically, in addition to sparsifying individual weights and neurons, we propose sparsifying the preactivations of gates. This makes some gates constant and simplifies the LSTM structure. We test our approach on text classification and language modeling tasks. Our method improves the neuron-wise compression of the model on most of the tasks. We also observe that the resulting structure of gate sparsity depends on the task, and we connect the learned structures to the specifics of the particular tasks.
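To make the gate-preactivation idea concrete, below is a minimal PyTorch sketch (our own illustration, not the authors' code) of a group-lasso penalty over per-gate, per-unit weight rows of an LSTM. Driving an entire row group to zero makes that unit's gate preactivation constant (equal to its bias), which is the simplification the abstract describes; the helper name and the regularization weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn as nn

def gate_group_lasso(lstm: nn.LSTM, lam: float = 1e-4) -> torch.Tensor:
    """Group-lasso penalty that pushes whole gate preactivations toward
    constants. Hypothetical helper, not the paper's implementation."""
    h = lstm.hidden_size
    # PyTorch stacks the gates as (input, forget, cell, output) along dim 0.
    w = torch.cat([lstm.weight_ih_l0, lstm.weight_hh_l0], dim=1)  # (4h, in+h)
    penalty = torch.zeros((), device=w.device)
    for g in range(4):                      # one group of rows per gate
        rows = w[g * h:(g + 1) * h]         # (h, in+h): one row per hidden unit
        # L2 norm within each unit's row, L1 sum across units -> group sparsity
        penalty = penalty + rows.norm(dim=1).sum()
    return lam * penalty

# Usage sketch: total_loss = task_loss + gate_group_lasso(model.lstm)
```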


2021 ◽ Vol 2021 ◽ pp. 1-12
Author(s): Yisu Ge ◽ Shufang Lu ◽ Fei Gao

Many current convolutional neural networks cannot meet practical application requirements because of their enormous number of parameters. To accelerate network inference, increasing attention has been paid to network compression. Network pruning is one of the simplest and most efficient ways to compress and speed up a network. In this paper, a pruning algorithm for lightweight tasks is proposed, and a pruning strategy based on feature representation is investigated. Unlike other pruning approaches, the proposed strategy is guided by the practical task and eliminates filters that are irrelevant to it. After pruning, the network is compacted to a smaller size and easily recovers its accuracy with fine-tuning. The performance of the proposed pruning algorithm is validated on widely used image datasets, and the experimental results show that the proposed algorithm is better suited to pruning filters that are irrelevant to the fine-tuning dataset.
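As a rough illustration of task-guided filter pruning (a generic sketch under our own assumptions, since the paper's exact relevance criterion is not reproduced here), one can score each filter by its feature-map response on calibration data from the target task and zero out the least relevant filters before fine-tuning:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def score_filters(acts: torch.Tensor) -> torch.Tensor:
    """Score each filter by the mean absolute activation of its feature
    map on task data: a simple proxy for task relevance."""
    # acts: (batch, out_channels, H, W) outputs of a conv layer
    # on a calibration set drawn from the target task
    return acts.abs().mean(dim=(0, 2, 3))      # one score per filter

@torch.no_grad()
def prune_filters(conv: nn.Conv2d, scores: torch.Tensor, ratio: float = 0.5):
    """Zero the lowest-scoring filters; accuracy is then recovered
    with fine-tuning, as described in the abstract."""
    k = int(conv.out_channels * ratio)
    idx = scores.argsort()[:k]                 # least task-relevant filters
    conv.weight[idx] = 0.0
    if conv.bias is not None:
        conv.bias[idx] = 0.0
```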


Electronics ◽ 2020 ◽ Vol 9 (7) ◽ pp. 1059
Author(s): Wenzhe Guo ◽ Hasan Erdem Yantır ◽ Mohammed E. Fouda ◽ Ahmed M. Eltawil ◽ Khaled Nabil Salama

To solve real-time challenges, neuromorphic systems generally require deep and complex network structures. It is therefore crucial to find effective solutions that reduce network complexity and improve energy efficiency while maintaining high accuracy. To this end, we propose unsupervised pruning strategies for spiking neural networks (SNNs) that prune neurons during training by exploiting network dynamics. Neuron importance is measured by spike count, on the premise that neurons that fire more spikes contribute more to network performance. Based on this criterion, we demonstrate that pruning with an adaptive spike-count threshold is a simple and effective approach that significantly reduces network size while maintaining high classification accuracy. Online adaptive pruning shows potential for energy-efficient training because it requires fewer memory accesses and fewer weight-update computations. Furthermore, a parallel digital implementation scheme is proposed for implementing SNNs on a field-programmable gate array (FPGA). Notably, the proposed pruning strategies preserve the dense format of the weight matrices, so the implementation architecture remains the same after network compression. The adaptive pruning strategy yields a 2.3× reduction in memory size and a 2.8× improvement in energy efficiency when 400 neurons are pruned from an 800-neuron network, at a cost of 1.69% in classification accuracy. The best pruning percentage depends on the trade-off among accuracy, memory, and energy. This work therefore offers a promising solution for effective network compression and energy-efficient hardware implementation of neuromorphic systems in real-time applications.
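A minimal NumPy sketch of the spike-count idea, assuming per-neuron spike counts accumulated during training; the percentile rule below is an illustrative stand-in for the paper's adaptive threshold. Note that pruned neurons are only masked, so the weight matrix keeps its dense shape, matching the hardware-friendly property described above:

```python
import numpy as np

def spike_count_prune(weights: np.ndarray, spike_counts: np.ndarray,
                      percentile: float = 50.0):
    """Mask neurons whose accumulated spike count falls below an
    adaptive threshold. `weights` is (n_in, n_neurons); columns of
    pruned neurons are zeroed, but the dense shape is preserved."""
    threshold = np.percentile(spike_counts, percentile)  # adaptive threshold
    keep = spike_counts >= threshold                     # active neurons
    weights[:, ~keep] = 0.0      # zero incoming weights of pruned neurons
    return weights, keep

# e.g. percentile=50.0 masks roughly 400 of 800 neurons, the setting
# behind the memory/energy figures quoted above.
```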


Informatics ◽ 2021 ◽ Vol 8 (4) ◽ pp. 77
Author(s): Ali Alqahtani ◽ Xianghua Xie ◽ Mark W. Jones

Deep networks often possess a vast number of parameters, and their significant redundancy in parameterization has become a widely recognized property. This presents significant challenges and restricts many deep learning applications, motivating a focus on reducing the complexity of models while maintaining their powerful performance. In this paper, we present an overview of popular methods and review recent work on compressing and accelerating deep neural networks. We consider not only pruning methods but also quantization and low-rank factorization methods. The review also clarifies these major concepts and highlights their characteristics, advantages, and shortcomings.
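Of the three method families mentioned, low-rank factorization is perhaps the easiest to show in a few lines. Here is a small NumPy sketch (our own illustration, not taken from the review) that replaces a dense weight matrix with two thin factors via truncated SVD:

```python
import numpy as np

def low_rank_factorize(W: np.ndarray, rank: int):
    """Approximate W (m x n) by A (m x r) @ B (r x n), cutting the
    parameter count from m*n to r*(m+n) when r << min(m, n)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (m, r): left factor, scaled by singular values
    B = Vt[:rank]                # (r, n): right factor
    return A, B

# Example: a 1024x1024 layer at rank 64 shrinks from ~1.05M to ~131k
# parameters (about 8x smaller), at the cost of an approximation error
# governed by the discarded singular values.
```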

