Ensemble Learning of Lightweight Deep Learning Models Using Knowledge Distillation for Image Classification

Jaeyong Kang; Jeonghwan Gwak

doi:10.3390/math8101652

Mathematics ◽

10.3390/math8101652 ◽

2020 ◽

Vol 8 (10) ◽

pp. 1652

Author(s):

Jaeyong Kang ◽

Jeonghwan Gwak

Keyword(s):

Deep Learning ◽

Image Classification ◽

Limited Resources ◽

Ensemble Model ◽

Learning Models ◽

Model Compression ◽

Knowledge Distillation ◽

Feature Based ◽

Classification Tasks ◽

Teacher Model

In recent years, deep learning models have been used successfully in almost every field, in both industry and academia, and especially for computer vision tasks. However, these models are huge, with millions (or even billions) of parameters, and thus cannot be deployed on systems and devices with limited resources (e.g., embedded systems and mobile phones). To tackle this, several model compression and acceleration techniques have been proposed. As a representative example, knowledge distillation offers a way to effectively train a small student model from one or more large teacher models. It has attracted increasing attention because of its promising performance. In this work, we propose an ensemble model that combines feature-based, response-based, and relation-based lightweight knowledge distillation models for simple image classification tasks. In our knowledge distillation framework, we use ResNet-20 as the student network and ResNet-110 as the teacher network. Experimental results demonstrate that our proposed ensemble model outperforms other knowledge distillation models as well as the large teacher model on image classification tasks, while requiring less computational power than the teacher model.
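
The abstract above does not spell out its loss functions, so the following is only a minimal sketch of the response-based case, assuming the common soft-target formulation (temperature-scaled softmax plus KL divergence). The function names, the `temperature` and `alpha` values, and the probability-averaging ensemble are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.9):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are assumed hyperparameters, not values
    taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft_loss = (temperature ** 2) * kl  # T^2 keeps gradient scale comparable
    # Ordinary cross-entropy against the ground-truth label (T = 1)
    hard_loss = -math.log(softmax(student_logits)[label])
    return alpha * soft_loss + (1 - alpha) * hard_loss

def ensemble_predict(list_of_logits):
    """Average the class probabilities of several students; return the argmax."""
    probs = [softmax(logits) for logits in list_of_logits]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])
```

With `alpha` close to 1 the student mostly imitates the teacher's softened outputs; with `alpha` close to 0 it mostly fits the ground-truth labels.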

Knowledge distillation in deep learning and its applications

PeerJ Computer Science ◽

10.7717/peerj-cs.474 ◽

2021 ◽

Vol 7 ◽

pp. e474

Author(s):

Abdolmaged Alkhulaifi ◽

Fahad Alsahli ◽

Irfan Ahmad

Keyword(s):

Deep Learning ◽

Mobile Phones ◽

Learning Models ◽

Student Model ◽

Embedded Devices ◽

Research Directions ◽

Resource Limited ◽

Knowledge Distillation ◽

Teacher Model

Deep learning-based models are relatively large, and it is hard to deploy such models on resource-limited devices such as mobile phones and embedded devices. One possible solution is knowledge distillation, whereby a smaller model (the student model) is trained using information from a larger model (the teacher model). In this paper, we present an overview of knowledge distillation techniques applied to deep learning models. To compare the performance of different techniques, we propose a new metric, called the distillation metric, which compares different knowledge distillation solutions based on model sizes and accuracy scores. Based on the survey, some interesting conclusions are drawn and presented in this paper, including the current challenges and possible research directions.
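
The abstract does not give the formula for the distillation metric. As a hedged illustration only, one way to fold model size and accuracy into a single score is a weighted sum of the size ratio and the relative accuracy loss. The function name, the `alpha` weight, and the exact form below are assumptions, not the paper's definition.

```python
def distillation_score(student_size, teacher_size,
                       student_acc, teacher_acc, alpha=0.5):
    """Hypothetical size/accuracy trade-off score; lower is better.

    alpha weights compression against accuracy retention and is an
    assumed parameter for this sketch.
    """
    size_ratio = student_size / teacher_size  # smaller student -> lower
    acc_ratio = student_acc / teacher_acc     # retained accuracy -> higher
    return alpha * size_ratio + (1 - alpha) * (1 - acc_ratio)
```

Under this form, a student one tenth the size of its teacher that matches the teacher's accuracy scores lower (better) than one that loses accuracy at the same size.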

Detecting Pneumonia Using Convolutions and Dynamic Capsule Routing for Chest X-ray Images

Sensors ◽

10.3390/s20041068 ◽

2020 ◽

Vol 20 (4) ◽

pp. 1068 ◽

Cited By ~ 10

Author(s):

Ansh Mittal ◽

Deepika Kumar ◽

Mamta Mittal ◽

Tanzila Saba ◽

Ibrahim Abunadi ◽

...

Keyword(s):

Deep Learning ◽

Simple Model ◽

Image Classification ◽

State Of The Art ◽

Test Accuracy ◽

Learning Models ◽

X Ray ◽

Chest X Ray ◽

Classification Tasks

An entity’s existence in an image can be represented by the activity (instantiation) vector of a group of neurons, called a capsule. Recently, multi-layered capsule networks, called CapsNet, have proven to be state-of-the-art for image classification tasks. This research uses this algorithm to detect pneumonia from chest X-ray (CXR) images. Here, an entity in the CXR image can help determine whether the patient (whose CXR is used) is suffering from pneumonia. A simple model of capsules (also known as Simple CapsNet) provided results comparable to the best deep learning models used earlier. Subsequently, a combination of convolutions and capsules is used to obtain two models that outperform all previously proposed models. These models, Integration of convolutions with capsules (ICC) and Ensemble of convolutions with capsules (ECC), detect pneumonia with test accuracies of 95.33% and 95.90%, respectively. The latter model is studied in detail to obtain a variant called EnCC, where n = 3, 4, 8, 16. Here, the E4CC model works best and gives a test accuracy of 96.36%. All these models were trained, validated, and tested on 5857 images from Mendeley.
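
To make the idea of a capsule's activity vector concrete: in the original CapsNet formulation, a "squash" nonlinearity shrinks the vector so that its length lies in [0, 1) and can be read as the probability that the entity is present. The sketch below assumes that standard formulation; the abstract itself does not describe the nonlinearity these models use.

```python
import math

def squash(vector):
    """Scale a capsule's raw output so its length lies in [0, 1).

    The direction is preserved; only the magnitude changes.
    """
    sq_norm = sum(x * x for x in vector)
    if sq_norm == 0.0:
        return [0.0 for _ in vector]  # avoid division by zero
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / norm
    return [scale * x for x in vector]

def vector_length(vector):
    """Euclidean length, read here as the entity's presence probability."""
    return math.sqrt(sum(x * x for x in vector))
```

Short vectors are squashed toward zero length and long vectors toward length one, which is what lets the length act as a presence score.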

Quantum algorithm for quicker clinical prognostic analysis: an application and experimental study using CT scan images of COVID-19 patients

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01588-6 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Kinshuk Sengupta ◽

Praveen Ranjan Srivastava

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Image Classification ◽

Machine Learning Algorithms ◽

Classification Task ◽

Clinical Image ◽

Prototype Model ◽

Learning Models ◽

Accuracy Measure ◽

Quantum Machine Learning

Background: In medical diagnosis and clinical practice, diagnosing a disease early is crucial for accurate treatment and for lessening the stress on the healthcare system. In medical imaging research, image processing techniques are vital for analyzing and resolving diseases with a high degree of accuracy. This paper establishes a new image classification and segmentation method through simulation techniques, applied to images of COVID-19 patients in India, introducing the use of Quantum Machine Learning (QML) in medical practice. Methods: This study establishes a prototype model for classifying COVID-19, comparing it with non-COVID pneumonia signals in computed tomography (CT) images. The simulation work evaluates the use of quantum machine learning algorithms while assessing the efficacy of deep learning models for image classification problems, and thereby establishes the performance quality required for an improved prediction rate when dealing with complex clinical image data exhibiting high biases. Results: The study considers a novel algorithmic implementation leveraging a quantum neural network (QNN). The proposed model outperformed conventional deep learning models on the specific classification task. The performance gain is attributed to the efficiency of quantum simulation and its faster convergence when solving the optimization problem for network training, particularly for large-scale biased image classification tasks. The model run-time observed on quantum-optimized hardware was 52 min, while on K80 GPU hardware it was 1 h 30 min for a similar sample size. The simulation shows that the QNN outperforms DNN, CNN, and 2D CNN by more than 2.92% in the accuracy measure, with an average recall of around 97.7%. Conclusion: The results suggest that quantum neural networks outperform deep learning on the COVID-19 trait classification task with respect to model efficacy and training time. However, a further study needs to be conducted to evaluate implementation scenarios by integrating the model within medical devices.

An Ensemble Energy Consumption Forecasting Model Based on Spatial-Temporal Clustering Analysis in Residential Buildings

Energies ◽

10.3390/en14113020 ◽

2021 ◽

Vol 14 (11) ◽

pp. 3020

Author(s):

Anam-Nawaz Khan ◽

Naeem Iqbal ◽

Atif Rizwan ◽

Rashid Ahmad ◽

Do-Hyeun Kim

Keyword(s):

Cluster Analysis ◽

Deep Learning ◽

Residential Buildings ◽

Ensemble Forecasting ◽

Forecasting Model ◽

Ensemble Model ◽

Learning Models ◽

Short Term ◽

Floor Level ◽

Consumption Data

Due to the availability of smart metering infrastructure, high-resolution electric consumption data is readily available to study the dynamics of residential electric consumption at finely resolved spatial and temporal scales. Analyzing the electric consumption data enables policymakers and building owners to understand consumers’ demand-consumption behaviors. Furthermore, analysis and accurate forecasting of electric consumption are essential for consumer involvement in time-of-use tariffs, critical peak pricing, and consumer-specific demand response initiatives. Alongside its vast economic and sustainability implications, such as reducing energy wastage and decarbonizing the energy sector, accurate consumption forecasting facilitates power system planning and stable grid operations. Energy consumption forecasting is an active research area; despite the abundance of devised models, electric consumption forecasting in residential buildings remains challenging due to the high variability of occupant energy use behavior. Hence, the search for an appropriate model for accurate electric consumption forecasting continues. To this end, this paper presents a spatial and temporal ensemble forecasting model for short-term electric consumption forecasting. The proposed work involves exploring electric consumption profiles at the apartment level through cluster analysis based on the k-means algorithm. The ensemble forecasting model consists of two deep learning models: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). First, the apartment-level historical electric consumption data is clustered. Then the clusters are aggregated based on the consumption profiles of consumers. At the building and floor level, the ensemble models are trained using aggregated electric consumption data. The proposed ensemble model forecasts electric consumption at three spatial scales (apartment, building, and floor level) for hourly, daily, and weekly forecasting horizons. Furthermore, the impact of spatial-temporal granularity and cluster analysis on prediction accuracy is analyzed. The dataset used in this study comprises high-resolution electric consumption data acquired through smart meters and recorded on an hourly basis over a period of one year. The consumption data belongs to four multifamily residential buildings situated in an urban area of South Korea. To demonstrate the effectiveness of our proposed forecasting model, we compared it with widely known machine learning models and deep learning variants. The results achieved by our proposed ensemble scheme verify that the model has learned the sequential behavior of electric consumption, producing superior performance with the lowest MAPE of 4.182 and 4.54 at building and floor level prediction, respectively. The experimental findings suggest that the model has efficiently captured the dynamic electric consumption characteristics, exploiting ensemble model diversity to achieve lower forecasting error. The proposed ensemble forecasting scheme is well suited for predictive modeling and short-term load forecasting.
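
The abstract reports results as MAPE (mean absolute percentage error) but does not say how the LSTM and GRU outputs are combined. The sketch below shows the standard MAPE definition together with a simple averaging ensemble; the averaging rule and all sample numbers are illustrative assumptions, not the authors' method or data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def average_ensemble(forecasts):
    """Combine several forecast series by averaging them point by point."""
    return [sum(values) / len(values) for values in zip(*forecasts)]

# Made-up hourly consumption values, for illustration only
actual = [100.0, 200.0]
lstm_forecast = [110.0, 190.0]  # hypothetical LSTM output
gru_forecast = [90.0, 220.0]    # hypothetical GRU output

combined = average_ensemble([lstm_forecast, gru_forecast])
```

In this toy case the two models err in opposite directions on the first point, so averaging cancels part of the error. That cancellation is the usual motivation for combining diverse forecasters.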

Low–High-Power Consumption Architectures for Deep-Learning Models Applied to Hyperspectral Image Classification

IEEE Geoscience and Remote Sensing Letters ◽

10.1109/lgrs.2018.2881045 ◽

2019 ◽

Vol 16 (5) ◽

pp. 776-780 ◽

Cited By ~ 8

Author(s):

Juan M. Haut ◽

Sergio Bernabe ◽

Mercedes E. Paoletti ◽

Ruben Fernandez-Beltran ◽

Antonio Plaza ◽

...

Keyword(s):

Deep Learning ◽

Power Consumption ◽

Image Classification ◽

High Power ◽

Hyperspectral Image ◽

Learning Models ◽

Hyperspectral Image Classification ◽

High Power Consumption

Efficient Deep Learning Models for DGA Domain Detection

Security and Communication Networks ◽

10.1155/2021/8887881 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Juhong Namgung ◽

Siwoon Son ◽

Yang-Sae Moon

Keyword(s):

Deep Learning ◽

Short Term Memory ◽

Ensemble Model ◽

Learning Models ◽

Short Term ◽

Domain Names ◽

Additional Information ◽

Domain Sequence ◽

Long Short Term Memory ◽

And Control

In recent years, cyberattacks using command and control (C&C) servers have significantly increased. To hide their C&C servers, attackers often use a domain generation algorithm (DGA), which automatically generates domain names for the C&C servers. Accordingly, extensive research on DGA domain detection has been conducted. However, existing methods cannot accurately detect continuously generated DGA domains and can easily be evaded by an attacker. Recently, long short-term memory (LSTM)-based deep learning models have been introduced to detect DGA domains in real time using only domain names, without feature extraction or additional information. In this paper, we propose an efficient DGA domain detection method based on bidirectional LSTM (BiLSTM), which learns bidirectional information as opposed to the unidirectional information learned by LSTM. We further maximize the detection performance with a convolutional neural network (CNN) + BiLSTM ensemble model using an attention mechanism, which allows the model to learn both local and global information in a domain sequence. Experimental results show that existing CNN and LSTM models achieved F1-scores of 0.9384 and 0.9597, respectively, while the proposed BiLSTM and ensemble models achieved higher F1-scores of 0.9618 and 0.9666, respectively. In addition, the ensemble model achieved the best performance for most DGA domain classes, enabling more accurate DGA domain detection than existing models.
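
For readers unfamiliar with the F1-score used to compare the models above, the following minimal sketch computes it from confusion-matrix counts for a single class. The function name and the zero-division handling are illustrative choices; how the paper aggregates F1 across DGA classes is not stated in the abstract.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall for one class.

    Returns 0.0 when precision or recall is undefined (an assumed
    convention for this sketch).
    """
    predicted_pos = true_positives + false_positives
    actual_pos = true_positives + false_negatives
    precision = true_positives / predicted_pos if predicted_pos else 0.0
    recall = true_positives / actual_pos if actual_pos else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because it is a harmonic mean, F1 is pulled toward the lower of precision and recall, so a detector cannot score well by flagging every domain or by flagging almost none.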

WORKFLOW FOR TRAINING AND SERVING DEEP LEARNING MODELS FOR IMAGE CLASSIFICATION AND OBJECT DETECTION - APPLICATION TO FAULT DETECTION ON ELECTRIC POLES

10.1049/icp.2021.1557 ◽

2021 ◽

Author(s):

C. Coello ◽

R. Sanchez ◽

S. de Lange ◽

J. Halvorsen ◽

M. Bertani-Økland ◽

...

Keyword(s):

Deep Learning ◽

Fault Detection ◽

Object Detection ◽

Image Classification ◽

Learning Models

Histopathological Image and Lymphoma Image Classification using customized Deep Learning models and different optimization algorithms

2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT) ◽

10.1109/icccnt49239.2020.9225616 ◽

2020 ◽

Author(s):

Ambarish Ganguly ◽

Rik Das ◽

S. K. Setua

Keyword(s):

Deep Learning ◽

Image Classification ◽

Optimization Algorithms ◽

Learning Models ◽

Histopathological Image

The Unreasonable Effectiveness of the Baseline: Discussing SVMs in Legal Text Classification

10.3233/faia210317 ◽

2021 ◽

Author(s):

Benjamin Clavié ◽

Marc Alphonsus

Keyword(s):

Deep Learning ◽

Language Processing ◽

Text Classification ◽

Traditional Approach ◽

Error Reduction ◽

Support Vector ◽

Learning Models ◽

Legal Text ◽

Classification Tasks ◽

Legal Domain

We aim to highlight an interesting trend to contribute to the ongoing debate around advances within legal Natural Language Processing. Recently, the focus for most legal text classification tasks has shifted towards large pre-trained deep learning models such as BERT. In this paper, we show that a more traditional approach based on Support Vector Machine classifiers reaches competitive performance with deep learning models. We also highlight that error reduction obtained by using specialised BERT-based models over baselines is noticeably smaller in the legal domain when compared to general language tasks. We discuss some hypotheses for these results to support future discussions.

A comparative review on deep learning models for text classification

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v19.i1.pp325-335 ◽

2020 ◽

Vol 19 (1) ◽

pp. 325

Author(s):

Muhammad Zulqarnain ◽

Rozaida Ghazali ◽

Yana Mazwin Mohmad Hassim ◽

Muhammad Rehan

Keyword(s):

Neural Network ◽

Deep Learning ◽

Language Processing ◽

Text Classification ◽

Question Answering ◽

Learning Models ◽

Semantic Classification ◽

Analysis Question ◽

Comparative Review ◽

Classification Tasks

Text classification is a fundamental task in several areas of natural language processing (NLP), including word semantic classification, sentiment analysis, question answering, and dialog management. This paper investigates three basic deep learning architectures for text classification: Deep Belief Network (DBN), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN). These three main types of deep learning architecture have been widely explored for handling various classification tasks. DBNs have excellent learning capabilities for extracting highly distinguishable features and are good for general-purpose use. CNNs are supposed to be better at extracting the position of various related features, while RNNs model sequential, long-term dependencies. This paper presents a systematic comparison of DBN, CNN, and RNN on text classification tasks. Finally, we show the results of the deep models through experiments. The aim of this paper is to provide basic guidance on which deep learning models are best suited to text classification.
