Model compression via pruning and knowledge distillation for person re-identification

The powerful performance of deep learning is evident to all. With the deepening of research, neural networks have become more complex and not easily generalized to resource-constrained devices. The emergence of a series of model compression algorithms makes artificial intelligence on edge possible. Among them, structured model pruning is widely utilized because of its versatility. Structured pruning prunes the neural network itself and discards some relatively unimportant structures to compress the model’s size. However, in the previous pruning work, problems such as evaluation errors of networks, empirical determination of pruning rate, and low retraining efficiency remain. Therefore, we propose an accurate, objective, and efficient pruning algorithm—Combine-Net, introducing Adaptive BN to eliminate evaluation errors, the Kneedle algorithm to determine the pruning rate objectively, and knowledge distillation to improve the efficiency of retraining. Results show that, without precision loss, Combine-Net achieves 95% parameter compression and 83% computation compression on VGG16 on CIFAR10, 71% of parameter compression and 41% computation compression on ResNet50 on CIFAR100. Experiments on different datasets and models have proved that Combine-Net can efficiently compress the neural network’s parameters and computation.

Download Full-text

Data-Free Ensemble Knowledge Distillation for Privacy-conscious Multimedia Model Compression

10.1145/3474085.3475329 ◽

2021 ◽

Author(s):

Zhiwei Hao ◽

Yong Luo ◽

Han Hu ◽

Jianping An ◽

Yonggang Wen

Keyword(s):

Multimedia Model ◽

Model Compression ◽

Knowledge Distillation

Download Full-text

Revisiting knowledge distillation for light-weight visual object detection

Transactions of the Institute of Measurement and Control ◽

10.1177/01423312211022877 ◽

2021 ◽

Vol 43 (13) ◽

pp. 2888-2898

Author(s):

Tianze Gao ◽

Yunfeng Gao ◽

Yu Li ◽

Peiyuan Qin

Keyword(s):

Object Detection ◽

Essential Element ◽

Detection Algorithm ◽

Positive Sample ◽

Detection Methods ◽

Visual Object ◽

Light Weight ◽

Model Compression ◽

Novel Approach ◽

Knowledge Distillation

An essential element for intelligent perception in mechatronic and robotic systems (M&RS) is the visual object detection algorithm. With the ever-increasing advance of artificial neural networks (ANN), researchers have proposed numerous ANN-based visual object detection methods that have proven to be effective. However, networks with cumbersome structures do not befit the real-time scenarios in M&RS, necessitating the techniques of model compression. In the paper, a novel approach to training light-weight visual object detection networks is developed by revisiting knowledge distillation. Traditional knowledge distillation methods are oriented towards image classification is not compatible with object detection. Therefore, a variant of knowledge distillation is developed and adapted to a state-of-the-art keypoint-based visual detection method. Two strategies named as positive sample retaining and early distribution softening are employed to yield a natural adaption. The mutual consistency between teacher model and student model is further promoted through a hint-based distillation. By extensive controlled experiments, the proposed method is testified to be effective in enhancing the light-weight network’s performance by a large margin.

Download Full-text

Model Compression with NAS and Knowledge Distillation for Medical Image Segmentation

10.1145/3478905.3478940 ◽

2021 ◽

Author(s):

Zhong Zheng ◽

Guixia Kang

Keyword(s):

Image Segmentation ◽

Medical Image ◽

Medical Image Segmentation ◽

Model Compression ◽

Knowledge Distillation

Download Full-text

Sequence-level Knowledge Distillation for Model Compression of Attention-based Sequence-to-sequence Speech Recognition

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2019.8683171 ◽

2019 ◽

Cited By ~ 2

Author(s):

Raden Mu'az Mun'im ◽

Nakamasa Inoue ◽

Koichi Shinoda

Keyword(s):

Speech Recognition ◽

Model Compression ◽

Knowledge Distillation

Download Full-text

Pairwise Ranking Distillation for Deep Face Recognition

10.51130/graphicon-2020-2-3-30 ◽

2020 ◽

pp. paper30-1-paper30-13

Author(s):

Mikhail Nikitin ◽

Vadim Konushin ◽

Anton Konushin

Keyword(s):

Face Recognition ◽

Metric Learning ◽

High Capacity ◽

Recognition Task ◽

Learning Task ◽

Face Verification ◽

Baseline Method ◽

Task Knowledge ◽

Model Compression ◽

Knowledge Distillation

This work addresses the problem of knowledge distillation for deep face recognition task. Knowledge distillation technique is known to be an effective way of model compression, which implies transferring of the knowledge from high-capacity teacher to a lightweight student. The knowledge and the way how it is distilled can be defined in different ways depending on the problem where the technique is applied. Considering the fact that face recognition is a typical metric learning task, we propose to perform knowledge distillation on a score-level. Specifically, for any pair of matching scores computed by teacher, our method forces student to have the same order for the corresponding matching scores. We evaluate proposed pairwise ranking distillation (PWR) approach using several face recognition benchmarks for both face verification and face identification scenarios. Experimental results show that PWR not only can improve over the baseline method by a large margin, but also outperforms other score-level distillation approaches.

Download Full-text

HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression

10.18653/v1/2021.emnlp-main.250 ◽

2021 ◽

Author(s):

Chenhe Dong ◽

Yaliang Li ◽

Ying Shen ◽

Minghui Qiu

Keyword(s):

Language Model ◽

Model Compression ◽

Cross Domain ◽

Relational Knowledge ◽

Knowledge Distillation

Download Full-text

Model Compression Based on Knowledge Distillation and Its Application in HRRP

2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC) ◽

10.1109/itnec48623.2020.9084894 ◽

2020 ◽

Author(s):

Xiaojiao Chen ◽

Zhenyu An ◽

Liansheng Huang ◽

Shiying He ◽

Zhen Wang

Keyword(s):

Model Compression ◽

Knowledge Distillation

Download Full-text

Feature fusion-based collaborative learning for knowledge distillation

International Journal of Distributed Sensor Networks ◽

10.1177/15501477211057037 ◽

2021 ◽

Vol 17 (11) ◽

pp. 155014772110570

Author(s):

Yiting Li ◽

Liyuan Sun ◽

Jianping Gou ◽

Lan Du ◽

Weihua Ou

Keyword(s):

Collaborative Learning ◽

Large Scale ◽

Feature Fusion ◽

Model Performance ◽

Data Sets ◽

Great Success ◽

Compression Technique ◽

Model Compression ◽

Knowledge Distillation ◽

Self Driving Cars

Deep neural networks have achieved a great success in a variety of applications, such as self-driving cars and intelligent robotics. Meanwhile, knowledge distillation has received increasing attention as an effective model compression technique for training very efficient deep models. The performance of the student network obtained through knowledge distillation heavily depends on whether the transfer of the teacher’s knowledge can effectively guide the student training. However, most existing knowledge distillation schemes require a large teacher network pre-trained on large-scale data sets, which can increase the difficulty of knowledge distillation in different applications. In this article, we propose a feature fusion-based collaborative learning for knowledge distillation. Specifically, during knowledge distillation, it enables networks to learn from each other using the feature/response-based knowledge in different network layers. We concatenate the features learned by the teacher and the student networks to obtain a more representative feature map for knowledge transfer. In addition, we also introduce a network regularization method to further improve the model performance by providing a positive knowledge during training. Experiments and ablation studies on two widely used data sets demonstrate that the proposed method, feature fusion-based collaborative learning, significantly outperforms recent state-of-the-art knowledge distillation methods.

Download Full-text