A survey on GAN acceleration using memory compression techniques

2021 ◽  
Vol 68 (1) ◽  
Author(s):  
Dina Tantawy ◽  
Mohamed Zahran ◽  
Amr Wassal

Abstract
Since its invention, the generative adversarial network (GAN) has shown outstanding results in many applications. GANs are powerful yet resource-hungry deep learning models. The main differences between GANs and ordinary deep learning models are the nature of their output and their training instability: for example, a GAN's output can be a whole image, whereas other models detect objects or classify images. Thus, the architecture and numeric precision of the network affect both the quality and the speed of the solution. Hence, accelerating GANs is pivotal. Data transfer is considered the main source of energy consumption, which is why memory compression is a very efficient technique for accelerating and optimizing GANs. Two main types of memory compression exist: lossless and lossy. Lossless compression techniques are general across all models; thus, this paper focuses on lossy techniques. Lossy compression techniques are further classified into (a) pruning, (b) knowledge distillation, (c) low-rank factorization, (d) lowering numeric precision, and (e) encoding. In this paper, we survey lossy compression techniques for CNN-based GANs. Our findings show the superiority of knowledge distillation over pruning alone, and identify gaps in the research field that need to be explored, such as encoding and different combinations of compression techniques.
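Of the lossy techniques listed in the taxonomy above, pruning is the simplest to illustrate. A minimal sketch of magnitude-based weight pruning (the function and NumPy formulation are illustrative, not taken from the survey):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out (at least) the smallest-magnitude fraction `sparsity`
    of the entries in `weights`; larger weights are kept unchanged."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of entries to prune
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold             # keep only strictly larger
    return weights * mask
```

In practice, pruning is usually followed by fine-tuning to recover accuracy; the mask can also be applied per-layer rather than globally.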

2021 ◽  
Vol 7 ◽  
pp. e474
Author(s):  
Abdolmaged Alkhulaifi ◽  
Fahad Alsahli ◽  
Irfan Ahmad

Deep learning based models are relatively large, and it is hard to deploy such models on resource-limited devices such as mobile phones and embedded devices. One possible solution is knowledge distillation, whereby a smaller model (the student model) is trained using information from a larger model (the teacher model). In this paper, we present an overview of knowledge distillation techniques applied to deep learning models. To compare the performance of different techniques, we propose a new measure called the distillation metric, which compares knowledge distillation solutions based on models' sizes and accuracy scores. Based on the survey, some interesting conclusions are drawn and presented in this paper, including current challenges and possible research directions.
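The proposed distillation metric combines model sizes and accuracy scores; its exact formulation is defined in the paper, so the sketch below is only an illustrative stand-in that rewards a small size ratio and a small accuracy drop (lower is better):

```python
def distillation_metric(student_size, teacher_size,
                        student_acc, teacher_acc, alpha=0.5):
    """Hypothetical metric, not the paper's definition: blends the
    student/teacher size ratio with the relative accuracy drop."""
    size_ratio = student_size / teacher_size
    acc_drop = max(0.0, (teacher_acc - student_acc) / teacher_acc)
    return alpha * size_ratio + (1 - alpha) * acc_drop
```

Any metric of this shape lets heterogeneous distillation solutions be ranked on a single scale, which is the point the survey makes.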


2020 ◽  
pp. 42-49
Author(s):  
Monika Gupta

Internet of Things (IoT) based healthcare applications have grown exponentially over the past decade. With the increasing number of fatalities due to cardiovascular diseases (CVDs), it is the need of the hour to detect any signs of cardiac abnormalities as early as possible. This calls for automating the detection and classification of such cardiac abnormalities, tasks currently performed by physicians. The problem is that there is not enough data to train deep learning models to classify ECG signals accurately, because of the sensitive nature of the data and the rarity of certain cases involved in CVDs. In this paper, we propose a framework that uses generative adversarial networks (GANs) to create synthetic training data for the classes with fewer data points, improving the performance of deep learning models trained on the dataset. With data input from sensors via the cloud and this model classifying the ECG signals, we expect the framework to be functional, accurate, and efficient.


2022 ◽  
Vol 2022 ◽  
pp. 1-11
Author(s):  
Zhipeng Dong ◽  
Yucheng Liu ◽  
Jianshe Kang ◽  
Shaohui Zhang

Deep learning is widely used in fault diagnosis of mechanical equipment and has achieved good results. However, these deep learning models require a large number of labeled samples for training, and enough labeled samples are difficult to obtain in the actual production process; unlabeled samples, by contrast, are easier to obtain in industrial environments. To overcome this problem, this paper proposes a novel method to generate enough labeled samples for training deep learning models. Unlike generative adversarial networks, which require substantial computation, the proposed generative method is simple and effective. First, we calculate the Euclidean distance between each training sample and each test sample; then, a weight coefficient between the training sample and the test sample is set to generate pseudosamples; finally, combined with the pseudosamples, the deep learning model is trained for machine fault diagnosis. To verify the effectiveness of the proposed method, experiments on two datasets, from planetary gearboxes and wind gearboxes, are carried out with different activation functions. Experimental results show that the proposed method is effective for models with most activation functions.
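The abstract describes the pseudosample procedure only at a high level. A minimal sketch, assuming inverse-distance weighting (the specific weighting scheme is not given in the abstract, so this is one plausible reading):

```python
import numpy as np

def generate_pseudosamples(train_x, test_x):
    """For each test sample, weight every training sample by inverse
    Euclidean distance and blend them into one pseudosample."""
    pseudo = []
    for t in test_x:
        d = np.linalg.norm(train_x - t, axis=1)   # distances to all training samples
        w = 1.0 / (d + 1e-8)                      # closer samples get larger weight
        w /= w.sum()                              # normalize weights to sum to 1
        pseudo.append(w @ train_x)                # weighted combination
    return np.array(pseudo)
```

The generated pseudosamples would then be added to the labeled training set before fitting the diagnosis model.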


Informatics ◽  
2021 ◽  
Vol 8 (4) ◽  
pp. 77
Author(s):  
Ali Alqahtani ◽  
Xianghua Xie ◽  
Mark W. Jones

Deep networks often possess a vast number of parameters, and their significant redundancy in parameterization has become a widely recognized property. This redundancy presents significant challenges and restricts many deep learning applications, shifting the focus to reducing the complexity of models while maintaining their powerful performance. In this paper, we present an overview of popular methods and review recent works on compressing and accelerating deep neural networks. We consider not only pruning methods but also quantization and low-rank factorization methods. This review also intends to clarify these major concepts and highlight their characteristics, advantages, and shortcomings.


Ergodesign ◽  
2020 ◽  
Vol 2020 (4) ◽  
pp. 167-176
Author(s):  
Yuriy Malakhov ◽  
Aleksandr Androsov ◽  
Andrey Averchenkov

The article discusses generative adversarial networks for obtaining high-quality images. Models, architectures, and a comparison of network operation are presented. The features of building deep learning models for the super-resolution task, as well as methods for improving their performance, are considered.


Mathematics ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. 1652
Author(s):  
Jaeyong Kang ◽  
Jeonghwan Gwak

In recent years, deep learning models have been used successfully in almost every field, in both industry and academia, especially for computer vision tasks. However, these models are huge in size, with millions (and billions) of parameters, and thus cannot be deployed on systems and devices with limited resources (e.g., embedded systems and mobile phones). To tackle this, several techniques for model compression and acceleration have been proposed. As a representative one, knowledge distillation offers a way to effectively learn a small student model from large teacher model(s). It has attracted increasing attention since it showed promising performance. In this work, we propose an ensemble model that combines feature-based, response-based, and relation-based lightweight knowledge distillation models for simple image classification tasks. In our knowledge distillation framework, we use ResNet-20 as the student network and ResNet-110 as the teacher network. Experimental results demonstrate that our proposed ensemble model outperforms other knowledge distillation models, as well as the large teacher model, for image classification tasks, with less computational power than the teacher model.
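Response-based distillation, one of the three kinds combined in the ensemble above, can be sketched with Hinton-style soft targets. The temperature T and blend weight alpha below are illustrative hyperparameters, not values from the paper:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """KL(teacher_T || student_T) * T^2, blended with hard-label
    cross-entropy on the student's unscaled predictions."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1).mean() * (T ** 2)
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]).mean()
    return alpha * soft + (1 - alpha) * hard
```

Feature-based and relation-based variants instead match intermediate activations and inter-sample relations, respectively.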


2021 ◽  
Vol 13 (22) ◽  
pp. 4590
Author(s):  
Yunpeng Yue ◽  
Hai Liu ◽  
Xu Meng ◽  
Yinguang Li ◽  
Yanliang Du

Deep learning models have achieved success in image recognition and have shown great potential for interpretation of ground penetrating radar (GPR) data. However, training reliable deep learning models requires massive labeled data, which are usually not easy to obtain due to the high costs of data acquisition and field validation. This paper proposes an improved least squares generative adversarial network (LSGAN) model which employs the loss functions of LSGAN and convolutional neural networks (CNNs) to generate GPR images. This model can generate high-precision GPR data to address the scarcity of labeled GPR data. We evaluate the proposed model using the Fréchet Inception Distance (FID) and compare it with two other existing GAN models, finding that it outperforms both with a lower FID score. In addition, the adaptability of the LSGAN-generated images for GPR data augmentation is investigated using a YOLOv4 model employed to detect rebars in field GPR images. It is verified that including LSGAN-generated images in the training GPR dataset can increase target diversity and improve detection precision by 10%, compared with the model trained on a dataset containing 500 field GPR images.
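The least-squares losses that give LSGAN its name can be sketched as follows. The target labels a, b, c follow the common 0/1 convention; the paper's improved loss, which also incorporates a CNN term, is not reproduced here:

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake, a=0.0, b=1.0):
    """Discriminator loss: push outputs on real data toward b
    and outputs on generated data toward a."""
    return 0.5 * np.mean((d_real - b) ** 2) + 0.5 * np.mean((d_fake - a) ** 2)

def lsgan_g_loss(d_fake, c=1.0):
    """Generator loss: push discriminator outputs on fakes toward c."""
    return 0.5 * np.mean((d_fake - c) ** 2)
```

Replacing the standard GAN's sigmoid cross-entropy with these least-squares terms penalizes samples by their distance to the decision boundary, which tends to stabilize training.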


2022 ◽  
Vol 16 (4) ◽  
pp. 1-55
Author(s):  
Manish Gupta ◽  
Puneet Agrawal

In recent years, the fields of natural language processing (NLP) and information retrieval (IR) have made tremendous progress thanks to deep learning models like Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs), and Long Short-Term Memory (LSTM) networks, and Transformer [121] based models like Bidirectional Encoder Representations from Transformers (BERT) [24], Generative Pre-training Transformer (GPT-2) [95], Multi-task Deep Neural Network (MT-DNN) [74], Extra-Long Network (XLNet) [135], Text-to-text transfer transformer (T5) [96], T-NLG [99], and GShard [64]. But these models are humongous in size. On the other hand, real-world applications demand small model sizes, low response times, and low computational power wattage. In this survey, we discuss six types of methods (Pruning, Quantization, Knowledge Distillation (KD), Parameter Sharing, Tensor Decomposition, and Sub-quadratic Transformer-based methods) for compressing such models to enable their deployment in real industry NLP projects. Given the critical need for building applications with efficient and small models, and the large amount of recently published work in this area, we believe that this survey organizes the plethora of work done by the "deep learning for NLP" community in the past few years and presents it as a coherent story.
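Of the six method families listed in this survey, quantization has perhaps the simplest core mechanic. A minimal sketch of symmetric per-tensor post-training int8 quantization (illustrative only, not a specific method from the survey):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q,
    with the scale chosen so the largest magnitude maps to 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from the int8 codes."""
    return q.astype(np.float32) * scale
```

This cuts storage 4x versus float32; production schemes typically add per-channel scales and calibration over activation statistics.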

