Ensemble Learning of Lightweight Deep Learning Models Using Knowledge Distillation for Image Classification

Mathematics ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. 1652
Author(s):  
Jaeyong Kang ◽  
Jeonghwan Gwak

In recent years, deep learning models have been used successfully in almost every field, in both industry and academia, especially for computer vision tasks. However, these models are huge, with millions (and even billions) of parameters, and thus cannot be deployed on systems and devices with limited resources (e.g., embedded systems and mobile phones). To tackle this, several model compression and acceleration techniques have been proposed. One representative technique, knowledge distillation, offers a way to effectively train a small student model from one or more large teacher models. It has attracted increasing attention because of its promising performance. In this work, we propose an ensemble model that combines feature-based, response-based, and relation-based lightweight knowledge distillation models for simple image classification tasks. In our knowledge distillation framework, we use ResNet-20 as the student network and ResNet-110 as the teacher network. Experimental results demonstrate that our proposed ensemble model outperforms other knowledge distillation models, as well as the large teacher model, on image classification tasks while requiring less computational power than the teacher model.
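The abstract does not spell out the loss functions or how the three student models are combined, so the following is only a minimal sketch of the widely used response-based formulation: a temperature-softened KL term against the teacher plus a cross-entropy term against the true label. The function names, the default `temperature` and `alpha` values, and the probability-averaging ensemble are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Response-based distillation loss for one example (assumed form).

    alpha weights the soft-target KL term; (1 - alpha) weights the
    ordinary cross-entropy against the hard label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy of the unsoftened student output against the label
    hard = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


def ensemble_predict(list_of_logits):
    """Average class probabilities of several students (assumed combiner)."""
    probs = [softmax(logits) for logits in list_of_logits]
    n_classes = len(probs[0])
    avg = [sum(p[i] for p in probs) / len(probs) for i in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


# Illustrative logits only; these are not values from the paper.
loss = distillation_loss([2.0, 0.5, 0.1], [3.0, 0.2, 0.1], label=0)
predicted_class = ensemble_predict([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative ranking of the incorrect classes.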

2021 ◽  
Vol 7 ◽  
pp. e474
Author(s):  
Abdolmaged Alkhulaifi ◽  
Fahad Alsahli ◽  
Irfan Ahmad

Deep learning based models are relatively large, and it is hard to deploy such models on resource-limited devices such as mobile phones and embedded devices. One possible solution is knowledge distillation whereby a smaller model (student model) is trained by utilizing the information from a larger model (teacher model). In this paper, we present an outlook of knowledge distillation techniques applied to deep learning models. To compare the performances of different techniques, we propose a new metric called distillation metric which compares different knowledge distillation solutions based on models' sizes and accuracy scores. Based on the survey, some interesting conclusions are drawn and presented in this paper including the current challenges and possible research directions.


Sensors ◽  
2020 ◽  
Vol 20 (4) ◽  
pp. 1068 ◽  
Author(s):  
Ansh Mittal ◽  
Deepika Kumar ◽  
Mamta Mittal ◽  
Tanzila Saba ◽  
Ibrahim Abunadi ◽  
...  

An entity’s existence in an image can be depicted by the activity instantiation vector from a group of neurons (called a capsule). Recently, multi-layered capsule networks, called CapsNet, have proven to be state-of-the-art for image classification tasks. This research uses this algorithm to detect pneumonia from chest X-ray (CXR) images. Here, an entity in the CXR image can help determine whether the patient (whose CXR is used) is suffering from pneumonia. A simple model of capsules (also known as Simple CapsNet) provided results comparable to the best deep learning models used earlier. Subsequently, a combination of convolutions and capsules is used to obtain two models that outperform all previously proposed models. These models, Integration of convolutions with capsules (ICC) and Ensemble of convolutions with capsules (ECC), detect pneumonia with test accuracies of 95.33% and 95.90%, respectively. The latter model is studied in detail to obtain a variant called EnCC, where n = 3, 4, 8, 16. Here, the E4CC model works optimally and gives a test accuracy of 96.36%. All these models were trained, validated, and tested on 5857 images from Mendeley.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Kinshuk Sengupta ◽  
Praveen Ranjan Srivastava

Background: In medical diagnosis and clinical practice, diagnosing a disease early is crucial for accurate treatment and for lessening the stress on the healthcare system. In medical imaging research, image processing techniques are vital for analyzing and resolving diseases with a high degree of accuracy. This paper establishes a new image classification and segmentation method through simulation techniques, conducted over images of COVID-19 patients in India, introducing the use of Quantum Machine Learning (QML) in medical practice. Methods: This study establishes a prototype model for classifying COVID-19, comparing it with non-COVID pneumonia signals in computed tomography (CT) images. The simulation work evaluates the use of quantum machine learning algorithms while assessing the efficacy of deep learning models for image classification problems, and thereby establishes the performance quality required for an improved prediction rate when dealing with complex, highly biased clinical image data. Results: The study considers a novel algorithmic implementation leveraging a quantum neural network (QNN). The proposed model outperformed conventional deep learning models on the specific classification task. The performance gain is attributed to the efficiency of quantum simulation and the faster convergence when solving the optimization problem for network training, particularly for large-scale biased image classification tasks. The model run-time observed on quantum optimized hardware was 52 min, while on K80 GPU hardware it was 1 h 30 min for a similar sample size. The simulation shows that QNN outperforms DNN, CNN, and 2D CNN by more than 2.92% in accuracy, with an average recall of around 97.7%. Conclusion: The results suggest that quantum neural networks outperform deep learning in the COVID-19 trait classification task with respect to model efficacy and training time. However, a further study needs to be conducted to evaluate implementation scenarios by integrating the model within medical devices.


Energies ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 3020
Author(s):  
Anam-Nawaz Khan ◽  
Naeem Iqbal ◽  
Atif Rizwan ◽  
Rashid Ahmad ◽  
Do-Hyeun Kim

Due to the availability of smart metering infrastructure, high-resolution electric consumption data is readily available to study the dynamics of residential electric consumption at finely resolved spatial and temporal scales. Analyzing the electric consumption data enables policymakers and building owners to understand consumers' demand-consumption behaviors. Furthermore, analysis and accurate forecasting of electric consumption are essential for consumer involvement in time-of-use tariffs, critical peak pricing, and consumer-specific demand response initiatives. Alongside its vast economic and sustainability implications, such as reducing energy wastage and decarbonizing the energy sector, accurate consumption forecasting facilitates power system planning and stable grid operations. Energy consumption forecasting is an active research area; despite the abundance of devised models, electric consumption forecasting in residential buildings remains challenging due to the high variability of occupant energy use behavior. Hence, the search for an appropriate model for accurate electric consumption forecasting continues. To this end, this paper presents a spatial and temporal ensemble forecasting model for short-term electric consumption forecasting. The proposed work involves exploring electric consumption profiles at the apartment level through cluster analysis based on the k-means algorithm. The ensemble forecasting model consists of two deep learning models: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). First, the apartment-level historical electric consumption data is clustered. Then the clusters are aggregated based on the consumption profiles of consumers. At the building and floor level, the ensemble models are trained using aggregated electric consumption data. The proposed ensemble model forecasts electric consumption at three spatial scales (apartment, building, and floor level) for hourly, daily, and weekly forecasting horizons. Furthermore, the impact of spatial-temporal granularity and cluster analysis on the prediction accuracy is analyzed. The dataset used in this study comprises high-resolution electric consumption data acquired through smart meters and recorded on an hourly basis over a period of one year. The consumption data comes from four multifamily residential buildings situated in an urban area of South Korea. To demonstrate the effectiveness of our proposed forecasting model, we compared it with widely known machine learning models and deep learning variants. The results achieved by our proposed ensemble scheme indicate that the model has learned the sequential behavior of electric consumption, producing superior performance with the lowest MAPE of 4.182 and 4.54 at building and floor level prediction, respectively. The experimental findings suggest that the model has efficiently captured the dynamic electric consumption characteristics, exploiting ensemble model diversity to achieve lower forecasting error. The proposed ensemble forecasting scheme is well suited for predictive modeling and short-term load forecasting.
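For readers unfamiliar with the reported error measure, the sketch below shows how a mean absolute percentage error (MAPE) is usually computed, together with one simple way of combining two forecasts. The abstract does not say how the LSTM and GRU outputs are merged, so the plain average in `average_ensemble` is an assumption, and the readings are made-up illustrative numbers rather than data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    divided by the corresponding actual value.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def average_ensemble(forecast_a, forecast_b):
    """Element-wise mean of two forecasts (assumed combination rule)."""
    return [(x + y) / 2 for x, y in zip(forecast_a, forecast_b)]


# Hypothetical hourly readings (kWh) and two hypothetical model forecasts.
actual = [100.0, 200.0, 150.0]
lstm_forecast = [110.0, 190.0, 150.0]
gru_forecast = [90.0, 190.0, 160.0]

combined = average_ensemble(lstm_forecast, gru_forecast)
error = mape(actual, combined)
```

Averaging can cancel errors that the two models make in opposite directions, which is one common motivation for ensembling forecasters.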


2019 ◽  
Vol 16 (5) ◽  
pp. 776-780 ◽  
Author(s):  
Juan M. Haut ◽  
Sergio Bernabe ◽  
Mercedes E. Paoletti ◽  
Ruben Fernandez-Beltran ◽  
Antonio Plaza ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Juhong Namgung ◽  
Siwoon Son ◽  
Yang-Sae Moon

In recent years, cyberattacks using command and control (C&C) servers have significantly increased. To hide their C&C servers, attackers often use a domain generation algorithm (DGA), which automatically generates domain names for the C&C servers. Accordingly, extensive research on DGA domain detection has been conducted. However, existing methods cannot accurately detect continuously generated DGA domains and can easily be evaded by an attacker. Recently, long short-term memory (LSTM)-based deep learning models have been introduced to detect DGA domains in real time using only domain names, without feature extraction or additional information. In this paper, we propose an efficient DGA domain detection method based on bidirectional LSTM (BiLSTM), which learns bidirectional information as opposed to the unidirectional information learned by LSTM. We further maximize the detection performance with a convolutional neural network (CNN) + BiLSTM ensemble model that uses an attention mechanism, which allows the model to learn both local and global information in a domain sequence. Experimental results show that existing CNN and LSTM models achieved F1-scores of 0.9384 and 0.9597, respectively, while the proposed BiLSTM and ensemble models achieved higher F1-scores of 0.9618 and 0.9666, respectively. In addition, the ensemble model achieved the best performance for most DGA domain classes, enabling more accurate DGA domain detection than existing models.
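The F1-scores quoted above are the harmonic mean of precision and recall. As a minimal sketch of that standard definition for the binary case, the function below computes it from label lists; the function name, the label encoding (1 for a DGA domain, 0 for a benign one), and the sample labels are hypothetical and do not come from the paper.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Binary F1-score: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = DGA-generated domain, 0 = benign domain.
truth = [1, 1, 1, 0, 0]
guess = [1, 1, 0, 1, 0]
score = f1_score(truth, guess)
```

For a multi-class setting such as per-family DGA detection, this score would typically be computed per class and then averaged, but the averaging scheme used in the paper is not given in the abstract.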


2021 ◽  
Author(s):  
Benjamin Clavié ◽  
Marc Alphonsus

We aim to highlight an interesting trend to contribute to the ongoing debate around advances within legal Natural Language Processing. Recently, the focus for most legal text classification tasks has shifted towards large pre-trained deep learning models such as BERT. In this paper, we show that a more traditional approach based on Support Vector Machine classifiers reaches competitive performance with deep learning models. We also highlight that error reduction obtained by using specialised BERT-based models over baselines is noticeably smaller in the legal domain when compared to general language tasks. We discuss some hypotheses for these results to support future discussions.


Author(s):  
Muhammad Zulqarnain ◽  
Rozaida Ghazali ◽  
Yana Mazwin Mohmad Hassim ◽  
Muhammad Rehan

Text classification is a fundamental task in several areas of natural language processing (NLP), including word semantic classification, sentiment analysis, question answering, and dialog management. This paper investigates three basic deep learning architectures for text classification: Deep Belief Network (DBN), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN). These three main types of deep learning architectures have been widely explored to handle various classification tasks. DBNs have excellent learning capabilities for extracting highly distinguishable features and are good for general purposes. CNNs are considered better at extracting the position of various related features, while RNNs model long-term dependencies in sequences. This paper presents a systematic comparison of DBN, CNN, and RNN on text classification tasks. Finally, we show the results of the deep models through experiments. The aim of this paper is to provide basic guidance on which deep learning models are best suited to text classification.

