Combine-Net: An Improved Filter Pruning Algorithm

Information ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 264
Author(s):  
Jinghan Wang ◽  
Guangyue Li ◽  
Wenzhao Zhang

Deep learning performs strongly across many tasks, but as research has advanced, neural networks have grown more complex and harder to deploy on resource-constrained devices. Model compression algorithms make artificial intelligence on edge devices possible. Among them, structured model pruning is widely used because of its versatility: it prunes the network itself, discarding relatively unimportant structures to reduce the model's size. However, previous pruning work still suffers from errors in evaluating pruned networks, empirically chosen pruning rates, and inefficient retraining. We therefore propose Combine-Net, an accurate, objective, and efficient pruning algorithm that introduces Adaptive BN to eliminate evaluation errors, the Kneedle algorithm to determine the pruning rate objectively, and knowledge distillation to improve retraining efficiency. Results show that, without precision loss, Combine-Net achieves 95% parameter compression and 83% computation compression for VGG16 on CIFAR10, and 71% parameter compression and 41% computation compression for ResNet50 on CIFAR100. Experiments on different datasets and models show that Combine-Net can efficiently compress a neural network's parameters and computation.
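
The abstract names the Kneedle algorithm as the tool for choosing a pruning rate but does not say which curve it is applied to or how it is configured. As a rough illustration of the core idea only, the sketch below normalises a concave, increasing curve to the unit square and picks the point farthest above the diagonal. It leaves out the smoothing and sensitivity threshold of the full Kneedle algorithm, and the function name `find_knee` and its inputs are hypothetical, not taken from the paper.

```python
def find_knee(xs, ys):
    """Return the x value at the knee of a concave, increasing curve.

    Simplified Kneedle-style sketch (assumption: no smoothing, no
    sensitivity threshold, points already sorted by x).
    """
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) points of equal length")

    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    if x_max == x_min or y_max == y_min:
        raise ValueError("curve is flat; no knee to find")

    # Normalise both axes to [0, 1] so the two scales are comparable.
    xn = [(x - x_min) / (x_max - x_min) for x in xs]
    yn = [(y - y_min) / (y_max - y_min) for y in ys]

    # Difference curve: how far each point lies above the diagonal y = x.
    diffs = [y - x for x, y in zip(xn, yn)]

    # The knee is taken to be the point with the largest difference.
    best = max(range(len(diffs)), key=diffs.__getitem__)
    return xs[best]
```

In a pruning setting, `xs` and `ys` might be something like a resource budget and the accuracy measured at that budget; that pairing is an assumption made here for illustration, not a detail stated in the abstract.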

Informatica ◽  
2017 ◽  
Vol 28 (1) ◽  
pp. 193-214 ◽  
Author(s):  
Tung-Tso Tsai ◽  
Sen-Shan Huang ◽  
Yuh-Min Tseng

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4496
Author(s):  
Vlad Pandelea ◽  
Edoardo Ragusa ◽  
Tommaso Apicella ◽  
Paolo Gastaldo ◽  
Erik Cambria

Emotion recognition, among other natural language processing tasks, has greatly benefited from the use of large transformer models. Deploying these models on resource-constrained devices, however, is a major challenge due to their computational cost. In this paper, we show that the combination of large transformers, as high-quality feature extractors, and simple hardware-friendly classifiers based on linear separators can achieve competitive performance while allowing real-time inference and fast training. Various solutions including batch and Online Sequential Learning are analyzed. Additionally, our experiments show that latency and performance can be further improved via dimensionality reduction and pre-training, respectively. The resulting system is implemented on two types of edge device, namely an edge accelerator and two smartphones.
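
The pairing described above, a frozen feature extractor feeding a simple linear separator that learns one sample at a time, can be sketched as follows. The abstract does not specify the online update rule, so a classic perceptron update is swapped in here as a stand-in. The class `OnlineLinearClassifier` and the toy feature vectors are hypothetical; in practice the vectors would be embeddings produced by the transformer.

```python
class OnlineLinearClassifier:
    """Binary linear separator trained one sample at a time.

    Assumption: perceptron updates stand in for whatever online
    sequential learning rule the paper actually uses.
    """

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def score(self, x):
        # Signed distance proxy: w . x + b
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def partial_fit(self, x, y):
        """Update on one (features, label) pair; y must be -1 or +1.

        Returns True if the weights changed.
        """
        if y * self.score(x) <= 0:  # misclassified or on the boundary
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
            self.b += y
            return True
        return False


# Toy stream of precomputed feature vectors (hypothetical values).
stream = [
    ([2.0, 1.0], 1),
    ([1.5, 2.0], 1),
    ([-1.0, -1.5], -1),
    ([-2.0, -0.5], -1),
]

clf = OnlineLinearClassifier(n_features=2)
for features, label in stream:
    clf.partial_fit(features, label)
```

Because each update touches only one weight vector, the per-sample cost grows linearly with the feature dimension, which is one reason reducing the feature dimensionality could also reduce latency.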

