Continuous Training and Deployment of Deep Learning Models

Author(s):  
Ioannis Prapas ◽  
Behrouz Derakhshan ◽  
Alireza Rezaei Mahdiraji ◽  
Volker Markl

Abstract: Deep Learning (DL) has consistently surpassed other Machine Learning methods and achieved state-of-the-art performance in multiple cases. Several modern applications, such as financial and recommender systems, require models that are constantly updated with fresh data. The prominent approach for keeping a DL model fresh is to trigger full retraining from scratch when enough new data are available. However, retraining large and complex DL models is time-consuming and compute-intensive, which makes full retraining costly, wasteful, and slow. In this paper, we present an approach to continuously train and deploy DL models. First, we enable continuous training through proactive training, which combines samples of historical data with new streaming data. Second, we enable continuous deployment through gradient sparsification, which allows us to send a small percentage of the model updates per training iteration. Our experimental results with LeNet5 on MNIST and modern DL models on CIFAR-10 show that proactive training keeps models fresh with comparable, if not superior, performance to full retraining at a fraction of the time. Combined with gradient sparsification, sparse proactive training enables very fast updates of a deployed model with arbitrarily large sparsity, reducing communication per iteration by up to four orders of magnitude, with minimal, if any, loss in model quality. Sparse training, however, comes at a price: it incurs an overhead that depends on the size of the model and increases the training time by factors ranging from 1.25 to 3 in our experiments. Arguably, this is a small price to pay for enabling the continuous training and deployment of large DL models.
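As a rough illustration of the gradient sparsification idea described above, the sketch below keeps only the top-k largest-magnitude entries of a model update before transmission; the keep ratio, the plain top-k rule, and the simple apply step are illustrative assumptions, not the authors' exact scheme.

```python
import torch

def sparsify_update(update: torch.Tensor, keep_ratio: float = 1e-4):
    """Keep only the largest-magnitude entries of a flattened model update.

    Returns the indices and values to transmit; everything else is dropped
    (error-feedback variants would instead accumulate it locally).
    This is a generic top-k sketch, not the paper's exact scheme.
    """
    flat = update.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    _, indices = torch.topk(flat.abs(), k)
    return indices, flat[indices]

def apply_sparse_update(param: torch.Tensor, indices, values, lr=0.01):
    """Apply a received sparse update to a deployed model's parameter tensor."""
    flat = param.data.flatten()
    flat[indices] -= lr * values
    param.data = flat.view_as(param)

# Example: transmit ~0.01% of a 1M-entry gradient (four orders of magnitude less).
grad = torch.randn(1_000_000)
idx, vals = sparsify_update(grad, keep_ratio=1e-4)
print(idx.numel(), "of", grad.numel(), "entries transmitted")
```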

2021 ◽  
Author(s):  
Messaoud Ahmed Ouameur ◽  
Dương Tuấn Anh Lê ◽  
Gwanggil Jeon ◽  
Felipe A.P. De Figueiredo ◽  
Daniel Massicotte

Abstract: Even though reconfigurable intelligent surfaces (RISs) are adopted in various scenarios to enable the implementation of a smart radio environment, their real-time operation remains challenging due to the need for costly full-dimensional channel estimation with offline exhaustive search or online exhaustive beam training. Deep learning (DL) tools are therefore favored to enable feasible solutions. In this work, we propose two low-training-overhead and energy-efficient adversarial bandit-based schemes with outstanding performance gains compared to reference DL-based reflection beamforming methods. The resulting deep learning models are also discussed using state-of-the-art model quality prediction trends.
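The abstract does not detail the two proposed schemes, but a generic adversarial bandit for beam selection can be sketched with EXP3, where each arm is one candidate RIS reflection pattern and the reward is a normalized received-signal measurement; the codebook size and reward model below are assumptions.

```python
import numpy as np

class EXP3BeamSelector:
    """Generic EXP3 adversarial bandit over a fixed RIS reflection-beam codebook
    (an illustrative stand-in for the paper's schemes, which are not specified
    in the abstract)."""

    def __init__(self, num_beams: int, gamma: float = 0.1):
        self.num_beams = num_beams
        self.gamma = gamma
        self.weights = np.ones(num_beams)

    def probabilities(self) -> np.ndarray:
        w = self.weights / self.weights.sum()
        return (1 - self.gamma) * w + self.gamma / self.num_beams

    def select_beam(self) -> int:
        return int(np.random.choice(self.num_beams, p=self.probabilities()))

    def update(self, beam: int, reward: float) -> None:
        # reward is assumed to be normalized to [0, 1]
        p = self.probabilities()[beam]
        estimated = reward / p  # importance-weighted reward estimate
        self.weights[beam] *= np.exp(self.gamma * estimated / self.num_beams)

# Usage: pick a beam, measure received power, feed the normalized value back.
selector = EXP3BeamSelector(num_beams=64)
beam = selector.select_beam()
measured_snr = 0.7  # hypothetical normalized measurement
selector.update(beam, measured_snr)
```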


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Yue Wang ◽  
Yiming Jiang ◽  
Julong Lan

When traditional machine learning methods are applied to network intrusion detection, they need to rely on expert knowledge to extract feature vectors in advance, which results in a lack of flexibility and versatility. Recently, deep learning methods have shown superior performance compared with traditional machine learning methods. Deep learning methods can learn from the raw data directly, but they face expensive computing costs. To solve this problem, a preprocessing method based on a multipacket input unit and compression is proposed, which takes m data packets as the input unit to maximize the retention of information and greatly compresses the raw traffic to shorten the learning and training time. In our proposed method, the CNN structure is optimized and the weights of some convolution layers are assigned directly by using Gabor filters. Experimental results on the benchmark data set show that, compared with existing models, the proposed method improves the detection accuracy by 2.49% and reduces the training time by 62.1%. In addition, the experiments show that the proposed compression method has obvious advantages in detection accuracy and computational efficiency compared with existing compression methods.
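A minimal sketch of assigning convolution weights from a Gabor filter bank is shown below, assuming an OpenCV-generated bank with evenly spaced orientations; the actual filter parameters and which layers are initialized this way are not specified in the abstract.

```python
import numpy as np
import cv2
import torch
import torch.nn as nn

def gabor_conv_layer(num_filters: int = 16, ksize: int = 5) -> nn.Conv2d:
    """Build a Conv2d whose weights are fixed Gabor kernels at evenly spaced
    orientations instead of being learned. The sigma/lambda/gamma values below
    are common defaults, not the paper's settings."""
    conv = nn.Conv2d(1, num_filters, kernel_size=ksize, padding=ksize // 2, bias=False)
    kernels = []
    for i in range(num_filters):
        theta = np.pi * i / num_filters  # orientation of this filter
        k = cv2.getGaborKernel((ksize, ksize), sigma=2.0, theta=theta,
                               lambd=4.0, gamma=0.5, psi=0.0)
        kernels.append(torch.tensor(k, dtype=torch.float32))
    with torch.no_grad():
        conv.weight.copy_(torch.stack(kernels).unsqueeze(1))
    conv.weight.requires_grad_(False)  # keep the assigned weights fixed
    return conv

# Example: apply the Gabor layer to one single-channel "packet image".
layer = gabor_conv_layer()
features = layer(torch.randn(1, 1, 28, 28))
print(features.shape)  # torch.Size([1, 16, 28, 28])
```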


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call "Levenshtein augmentation", which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation demonstrated increased performance over non-augmented and conventionally SMILES-randomization-augmented data when used for training the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as "attentional gain": an enhancement in the pattern recognition capabilities of the underlying network for molecular motifs.
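One simplified reading of Levenshtein augmentation is sketched below: among several randomized SMILES of a reactant, keep the one with the smallest edit distance to the product SMILES. The sampling count, the selection rule, and the toy molecules are assumptions; the published procedure may differ in detail.

```python
from rdkit import Chem

def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def levenshtein_pair(reactant_smiles: str, product_smiles: str, n_samples: int = 20) -> str:
    """Among n_samples randomized SMILES of the reactant, return the one closest
    (by edit distance) to the product SMILES, so the training pair is written
    in a locally similar way."""
    mol = Chem.MolFromSmiles(reactant_smiles)
    candidates = {Chem.MolToSmiles(mol, canonical=False, doRandom=True)
                  for _ in range(n_samples)}
    return min(candidates, key=lambda s: levenshtein(s, product_smiles))

# Hypothetical usage on a toy esterification pair.
print(levenshtein_pair("CCO", "CC(=O)OCC"))
```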


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.


2021 ◽  
Vol 11 (5) ◽  
pp. 2284
Author(s):  
Asma Maqsood ◽  
Muhammad Shahid Farid ◽  
Muhammad Hassan Khan ◽  
Marcin Grzegorzek

Malaria is a fatal disease caused by a microscopic parasite transmitted to humans through the bites of infected female mosquitoes, and it is endemic in many regions of the world. Quick diagnosis of this disease is very valuable for patients, as traditional methods require tedious work for its detection. Recently, some automated methods have been proposed that exploit hand-crafted feature extraction techniques; however, their accuracies are not reliable. Deep learning approaches, in contrast, have demonstrated superior performance. Convolutional Neural Networks (CNNs) scale well to image classification tasks and extract features through the hidden layers of the model without any hand-crafting. The detection of malaria-infected red blood cells from segmented microscopic blood images using convolutional neural networks can assist in quick diagnosis, which is useful for regions with fewer healthcare experts. The contributions of this paper are two-fold. First, we evaluate the performance of different existing deep learning models for efficient malaria detection. Second, we propose a customized CNN model that outperforms all observed deep learning models. It exploits bilateral filtering and image augmentation techniques to highlight features of red blood cells before training the model. Due to the image augmentation techniques, the customized CNN model generalizes well and avoids over-fitting. All experimental evaluations are performed on the benchmark NIH Malaria Dataset, and the results reveal that the proposed algorithm is 96.82% accurate in detecting malaria from microscopic blood smears.
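The preprocessing described above can be sketched as bilateral filtering followed by standard image augmentation; the filter parameters and the specific augmentations below are common defaults rather than the paper's settings, and the image size and transforms are assumptions.

```python
import cv2
import numpy as np
import torch
from torchvision import transforms

def preprocess_cell_image(path: str) -> np.ndarray:
    """Read a blood-smear cell image and apply bilateral filtering, which
    smooths noise while keeping cell edges sharp. The parameters
    (d=9, sigmaColor=75, sigmaSpace=75) are common defaults, not the paper's."""
    img = cv2.imread(path)
    img = cv2.resize(img, (64, 64))
    return cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75)

# Augmentation pipeline applied at training time to reduce over-fitting;
# the specific transforms are illustrative assumptions.
augment = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(20),
    transforms.ToTensor(),
])

# Hypothetical usage (file name is a placeholder):
# x = augment(preprocess_cell_image("cell_0001.png")).unsqueeze(0)  # (1, 3, 64, 64)
```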


Author(s):  
S. Priya ◽  
R. Annie Uthra

Abstract: In recent times, data science has become popular for supporting and improving decision-making processes. With the wide range of data streaming applications now accessible, class imbalance and concept drift have become crucial learning problems. Deep learning (DL) models have proven useful for classification under concept drift in data streaming applications. This paper presents an effective class imbalance with concept drift detection (CIDD) model using Adadelta optimizer-based deep neural networks (ADODNN), named CIDD-ADODNN, for the classification of highly imbalanced streaming data. The presented model involves four processes, namely preprocessing, class imbalance handling, concept drift detection, and classification. The proposed model uses the adaptive synthetic (ADASYN) technique for handling class-imbalanced data, which utilizes a weighted distribution for different minority class examples based on their level of difficulty in learning. Next, a drift detection technique called adaptive sliding window (ADWIN) is employed to detect the existence of concept drift. The ADODNN model is then utilized for the classification process. To increase the classification performance of the DNN model, an ADO-based hyperparameter tuning process determines the optimal parameters of the DNN model. The performance of the presented model is evaluated using three streaming datasets, namely an intrusion detection (NSL-KDDCup) dataset, a Spam dataset, and a Chess dataset. A detailed comparative analysis of the results verifies the superior performance of the presented model, which obtains a maximum accuracy of 0.9592, 0.9320, and 0.7646 on the KDDCup, Spam, and Chess datasets, respectively.
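A minimal sketch of the class-imbalance handling and classification stages is given below, using ADASYN oversampling and a small DNN trained with the Adadelta optimizer; the toy data, network size, and hyperparameters are assumptions, and the ADWIN drift-detection loop is only indicated in a comment.

```python
import torch
import torch.nn as nn
from imblearn.over_sampling import ADASYN
from sklearn.datasets import make_classification

# Toy imbalanced data standing in for a streaming batch (not the paper's datasets).
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95, 0.05],
                           random_state=0)

# Step 1: ADASYN generates synthetic minority samples, weighted toward
# minority examples that are harder to learn.
X_bal, y_bal = ADASYN(random_state=0).fit_resample(X, y)

# Step 2: a small DNN trained with the Adadelta optimizer (hyperparameters here
# are library defaults, not the tuned values the paper searches for).
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0, rho=0.9)
loss_fn = nn.CrossEntropyLoss()

X_t = torch.tensor(X_bal, dtype=torch.float32)
y_t = torch.tensor(y_bal, dtype=torch.long)
for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X_t), y_t)
    loss.backward()
    optimizer.step()

# Step 3 (not shown): an ADWIN detector monitors the prediction-error stream
# and triggers re-training of this model whenever a drift is flagged.
```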


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning the models on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.
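A compact grayscale-input CNN in the spirit of CovidNet might look like the sketch below; the layer widths and depth are illustrative guesses, not the published architecture.

```python
import torch
import torch.nn as nn

class SmallGrayscaleCNN(nn.Module):
    """A compact CNN taking single-channel (grayscale) inputs with a small
    parameter budget; an illustrative stand-in, not the published CovidNet."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SmallGrayscaleCNN()
print(sum(p.numel() for p in model.parameters()), "parameters")  # small budget
logits = model(torch.randn(1, 1, 224, 224))  # one single-channel chest image
```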

