Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study

Deep learning with a large number of parameters requires distributed training, where model accuracy and runtime are two important factors to be considered. However, there has been no systematic study of the tradeoff between these two factors during the model training process. This paper presents Rudra, a parameter server based distributed computing framework tuned for training large-scale deep neural networks. Using variants of the asynchronous stochastic gradient descent algorithm we study the impact of synchronization protocol, stale gradient updates, minibatch size, learning rates, and number of learners on runtime performance and model accuracy. We introduce a new learning-rate modulation strategy to counter the effect of stale gradients and propose a new synchronization protocol that can effectively bound the staleness in gradients, improve runtime performance and achieve good model accuracy. Our empirical investigation reveals a principled approach for distributed training of neural networks: the mini-batch size per learner should be reduced as more learners are added to the system to preserve the model accuracy. We validate this approach using commonly-used image classification benchmarks: CIFAR10 and ImageNet.
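
The two levers this abstract mentions, damping stale gradients through the learning rate and shrinking the per-learner mini-batch as learners are added, can be sketched as follows. This is only an illustration: the abstract does not give Rudra's exact rules, so dividing the base learning rate by the staleness and holding the global batch fixed are assumptions, and both function names are hypothetical.

```python
def staleness_scaled_lr(base_lr: float, staleness: int) -> float:
    """Assumed rule: shrink the step in proportion to gradient staleness."""
    # A gradient computed on weights that are `staleness` updates old gets a
    # proportionally smaller step; a staleness of 0 or 1 leaves base_lr intact.
    return base_lr / max(1, staleness)


def per_learner_batch(global_batch: int, num_learners: int) -> int:
    """Assumed rule: keep (per-learner batch x number of learners) roughly constant."""
    if num_learners < 1:
        raise ValueError("num_learners must be at least 1")
    return max(1, global_batch // num_learners)


if __name__ == "__main__":
    # Hypothetical global batch of 128 samples split across more and more learners.
    for learners in (1, 4, 16):
        print(learners, per_learner_batch(128, learners))
    print(staleness_scaled_lr(0.1, staleness=4))
```

Holding the product of per-learner batch size and learner count fixed is one simple way to read the abstract's recommendation; other schedules are possible.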


Developing a Loss Prediction-based Asynchronous Stochastic Gradient Descent Algorithm for Distributed Training of Deep Neural Networks

49th International Conference on Parallel Processing - ICPP ◽

10.1145/3404397.3404432 ◽

2020 ◽

Author(s):

Junyu Li ◽

Ligang He ◽

Shenyuan Ren ◽

Rui Mao

Keyword(s):

Neural Networks ◽

Gradient Descent ◽

Deep Neural Networks ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Descent Algorithm ◽

Distributed Training ◽

Gradient Descent Algorithm ◽

Loss Prediction


Implementation of an incremental deep learning model for survival prediction of cardiovascular patients

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v10.i1.pp101-109 ◽

2021 ◽

Vol 10 (1) ◽

pp. 101

Author(s):

Sanaa Elyassami ◽

Achraf Ait Kaddour

Keyword(s):

Neural Networks ◽

Heart Disease ◽

Deep Learning ◽

Learning Model ◽

Classification Model ◽

Stochastic Gradient Descent ◽

Survival Prediction ◽

Proposed Model ◽

The Impact ◽

Deep Learning Model

Cardiovascular diseases remain the leading cause of death, taking an estimated 17.9 million lives each year and representing 31% of all global deaths. The patient records including blood reports, cardiac echo reports, and physician’s notes can be used to perform feature analysis and to accurately classify heart disease patients. In this paper, an incremental deep learning model was developed and trained with stochastic gradient descent using feedforward neural networks. The chi-square test and the dropout regularization have been incorporated into the model to improve the generalization capabilities and the performance of the heart disease patients' classification model. The impact of the learning rate and the depth of neural networks on the performance were explored. The hyperbolic tangent, the rectifier linear unit, the Maxout, and the exponential rectifier linear unit were used as activation functions for the hidden and the output layer neurons. To avoid over-optimistic results, the performance of the proposed model was evaluated using balanced accuracy and the overall predictive value in addition to the accuracy, sensitivity, and specificity. The obtained results are promising, and the proposed model can be applied to a larger dataset and used by physicians to accurately classify heart disease patients.
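
Why balanced accuracy guards against over-optimistic results can be seen in a small sketch. Balanced accuracy is the mean of sensitivity and specificity; the confusion-matrix counts below are invented for illustration and do not come from the paper.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute accuracy, sensitivity, specificity and balanced accuracy.

    Assumes at least one positive and one negative case, so no denominator is zero.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


if __name__ == "__main__":
    # Hypothetical imbalanced test set: 10 patients with disease, 90 without.
    # The classifier misses half of the positive cases.
    print(classification_metrics(tp=5, fn=5, tn=90, fp=0))
```

On these made-up counts plain accuracy is 0.95 while balanced accuracy is 0.75, which exposes the missed positive cases that the headline accuracy hides.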


Share Market Data Prediction Strategies using Deep Learning Algorithm

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666191209093139 ◽

2019 ◽

Vol 13 ◽

Author(s):

A John. ◽

D. Praveen Dominic ◽

M. Adimoolam ◽

N. M. Balamurugan

Keyword(s):

Neural Network ◽

Deep Learning ◽

Stock Market ◽

Predictive Analytics ◽

Learning Algorithm ◽

Market Price ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Mining Machine ◽

Gradient Descent Algorithm

Background: Predictive analytics draws on a range of statistical techniques from predictive modelling, data mining, and machine learning. It analyzes current and historical data to make predictions about future or otherwise unknown events. Most predictive models are used in business analytics to reduce losses and increase profits by exploiting patterns in historical data. Objective: Investors follow strategies for predicting stock values in order to invest in the more profitable stocks, and these strategies for analyzing stock market prices are incorporated into intelligent methods and tools. Such strategies can increase investors' profits and minimize their risks. Prediction therefore plays a vital role in stock market gains, and it is also an intricate and challenging process. Method: The proposed strategy is a Deep Neural Network with Stochastic Gradient for stock prediction. The neural network is trained using the back-propagation algorithm and the stochastic gradient descent algorithm. Results: The experiment on stock market price prediction is conducted in the Python language with the visual package. The RELIANCE.NS, TATAMOTORS.NS, and TATAGLOBAL.NS datasets, downloaded from the National Stock Exchange site, are used as input. The artificial neural network component, including the Deep Learning model, is most effective when more than 100,000 data points are available for training. The proposed model is developed on daily stock market prices to understand how to build a model with better performance than the existing national exchange method.


Data augmentation for computed tomography angiography via synthetic image generation and neural domain adaptation

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2020-0015 ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Malte Seemann ◽

Lennart Bargsten ◽

Alexander Schlaefer

Keyword(s):

Computed Tomography ◽

Neural Networks ◽

Deep Learning ◽

Medical Imaging ◽

Computed Tomography Angiography ◽

Data Augmentation ◽

Domain Adaptation ◽

Synthetic Image ◽

Wide Range ◽

The Impact

Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of artery lumen in computed tomography angiography (CTA) data. However, to perform sufficiently, neural networks have to be trained on large amounts of high quality annotated data. In the realm of medical imaging, annotations are not only quite scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step moderately realistic images are generated in a purely numerical fashion. In the second step these images are improved by applying neural domain adaptation. We evaluated the impact of synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing resulting performances. Improvements of up to 5% in terms of Dice coefficient and 20% for Hausdorff distance represent a proof of concept that the proposed augmentation procedure can be used to enhance deep learning-based segmentation for artery lumen in CTA images.
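
For readers unfamiliar with the Dice coefficient used to score the segmentations, a minimal sketch follows. It treats masks as flat sequences of 0/1 labels; the function name and the convention of returning 1.0 for two empty masks are assumptions made for this illustration, not details from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]   # hypothetical flattened lumen mask
    reference = [1, 0, 1, 0]   # hypothetical flattened annotation
    print(dice_coefficient(predicted, reference))
```

A value of 1 means the predicted and annotated lumen overlap exactly, and 0 means they do not overlap at all.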


Optical Recognition of Handwritten Logic Formulas Using Neural Networks

Electronics ◽

10.3390/electronics10222761 ◽

2021 ◽

Vol 10 (22) ◽

pp. 2761

Author(s):

Vaios Ampelakiotis ◽

Isidoros Perikos ◽

Ioannis Hatzilygeroudis ◽

George Tsihrintzis

Keyword(s):

Neural Networks ◽

Character Recognition ◽

Gradient Descent ◽

Feedforward Neural Networks ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Training Algorithms ◽

Gradient Descent Algorithm ◽

Two Stages ◽

And Training

In this paper, we present a handwritten character recognition (HCR) system that aims to recognize first-order logic handwritten formulas and create editable text files of the recognized formulas. Dense feedforward neural networks (NNs) are utilized, and their performance is examined under various training conditions and methods. More specifically, after three training algorithms (backpropagation, resilient propagation and stochastic gradient descent) had been tested, we created and trained an NN with the stochastic gradient descent algorithm, optimized by the Adam update rule, which was proved to be the best, using a trainset of 16,750 handwritten image samples of 28 × 28 each and a testset of 7947 samples. The final accuracy achieved is 90.13%. The general methodology followed consists of two stages: the image processing and the NN design and training. Finally, an application has been created that implements the methodology and automatically recognizes handwritten logic formulas. An interesting feature of the application is that it allows for creating new, user-oriented training sets and parameter settings, and thus new NN models.
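
The Adam update rule mentioned in this abstract can be sketched for a single scalar parameter as below. The default hyperparameters are the commonly used ones for Adam, not settings reported by the authors, and the quadratic objective in the usage example is purely illustrative.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


if __name__ == "__main__":
    # Minimise f(theta) = theta**2, whose gradient is 2 * theta.
    theta, m, v = 1.0, 0.0, 0.0
    for t in range(1, 2001):
        theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.01)
    print(theta)
```

Because the step is normalised by the running gradient magnitude, the first update moves the parameter by roughly the learning rate regardless of how large the gradient is.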


MapReduce and Optimized Deep Network for Rainfall Prediction in Agriculture

The Computer Journal ◽

10.1093/comjnl/bxz164 ◽

2020 ◽

Vol 63 (6) ◽

pp. 900-912

Author(s):

Oswalt Manoj S ◽

Ananth J P

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Prediction Models ◽

Short Term Memory ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Mean Square ◽

Rainfall Prediction ◽

Gradient Descent Algorithm ◽

Major Factors

Rainfall prediction is an active area of research, as it enables farmers to make effective decisions regarding agriculture in both cultivation and irrigation. The existing prediction models are scary as the prediction of rainfall depended on three major factors including the humidity, rainfall and rainfall recorded in the previous years, which resulted in huge time consumption and leveraged huge computational efforts associated with the analysis. Thus, this paper introduces a rainfall prediction model based on a deep learning network, the convolutional long short-term memory (convLSTM) system, which promises a prediction based on spatial-temporal patterns. The weights of the convLSTM are tuned optimally using the proposed Salp-stochastic gradient descent algorithm (S-SGD), which integrates the Salp swarm algorithm (SSA) into the stochastic gradient descent (SGD) algorithm in order to facilitate the global optimal tuning of the weights and to assure a better prediction accuracy. In addition, the proposed deep learning framework is built in the MapReduce framework, which enables effective handling of big data. The analysis using the rainfall prediction database reveals that the proposed model achieved a minimal mean square error (MSE) and percentage root mean square difference (PRD) of 0.001 and 0.0021, respectively.
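
The two error measures reported above can be computed as in the sketch below. The abstract does not state its formulas, so the PRD definition used here (root of the squared-error sum over the squared-signal sum, times 100) is the commonly used one and is an assumption; the paper's values may be reported on a different scale.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean square error between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def prd(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Percentage root mean square difference (assumed common definition)."""
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    signal_energy = sum(a * a for a in actual)
    return 100.0 * math.sqrt(squared_error / signal_energy)


if __name__ == "__main__":
    observed = [3.0, 4.0]   # hypothetical rainfall readings
    forecast = [3.0, 3.5]   # hypothetical model output
    print(mse(observed, forecast), prd(observed, forecast))
```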


Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks

Proceedings of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rspa.2020.0334 ◽

2020 ◽

Vol 476 (2239) ◽

pp. 20200334 ◽

Cited By ~ 2

Author(s):

Ameya D. Jagtap ◽

Kenji Kawaguchi ◽

George Em Karniadakis

Keyword(s):

Neural Networks ◽

Adaptive Learning ◽

Gradient Descent ◽

Activation Function ◽

Stochastic Gradient Descent ◽

Activation Functions ◽

Gradient Descent Algorithm ◽

Locally Adaptive ◽

The Matrix ◽

Base Method

We propose two approaches of locally adaptive activation functions namely, layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of activation function is achieved by introducing a scalable parameter in each layer (layer-wise) and for every neuron (neuron-wise) separately, and then optimizing it using a variant of stochastic gradient descent algorithm. In order to further increase the training speed, an activation slope-based slope recovery term is added in the loss function, which further accelerates convergence, thereby reducing the training cost. On the theoretical side, we prove that in the proposed method, the gradient descent algorithms are not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate, and that the gradient dynamics of the proposed method is not achievable by base methods with any (adaptive) learning rates. We further show that the adaptive activation methods accelerate the convergence by implicitly multiplying conditioning matrices to the gradient of the base method without any explicit computation of the conditioning matrix and the matrix–vector product. The different adaptive activation functions are shown to induce different implicit conditioning matrices. Furthermore, the proposed methods with the slope recovery are shown to accelerate the training process.
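
The core idea of a trainable slope inside the activation can be illustrated with a toy example: a single tanh unit whose slope `a` is learned by stochastic gradient descent. This is a simplified sketch rather than the authors' method; it uses one scalar slope, a squared-error loss, and no slope recovery term, and all names and data are hypothetical.

```python
import math


def adaptive_tanh(x: float, a: float) -> float:
    """tanh activation with a trainable slope parameter a."""
    return math.tanh(a * x)


def grad_wrt_slope(x: float, a: float, target: float) -> float:
    """Derivative of 0.5 * (tanh(a * x) - target)**2 with respect to a."""
    y = math.tanh(a * x)
    return (y - target) * (1.0 - y * y) * x


def fit_slope(samples, a: float = 1.0, lr: float = 0.5, epochs: int = 200) -> float:
    """Learn the slope with plain per-sample gradient descent updates."""
    for _ in range(epochs):
        for x, target in samples:
            a -= lr * grad_wrt_slope(x, a, target)
    return a


if __name__ == "__main__":
    # Toy data generated by a unit whose true slope is 2.0.
    xs = (-0.5, -0.25, 0.25, 0.5)
    data = [(x, adaptive_tanh(x, 2.0)) for x in xs]
    print(fit_slope(data))
```

In a layer-wise variant one such slope would be shared by all neurons in a layer, while a neuron-wise variant would give each neuron its own.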


A Novel Stochastic Gradient Descent Algorithm Based on Grouping over Heterogeneous Cluster Systems for Distributed Deep Learning

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) ◽

10.1109/ccgrid.2019.00053 ◽

2019 ◽

Author(s):

Wenbin Jiang ◽

Geyan Ye ◽

Laurence T. Yang ◽

Jian Zhu ◽

Yang Ma ◽

...

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Heterogeneous Cluster ◽

Cluster Systems ◽

Descent Algorithm ◽

Gradient Descent Algorithm


An Efficient, Distributed Stochastic Gradient Descent Algorithm for Deep-Learning Applications

2017 46th International Conference on Parallel Processing (ICPP) ◽

10.1109/icpp.2017.10 ◽

2017 ◽

Cited By ~ 2

Author(s):

Guojing Cong ◽

Onkar Bhardwaj ◽

Minwei Feng

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Descent Algorithm ◽

Gradient Descent Algorithm


Plant Diseases Identification through a Discount Momentum Optimizer in Deep Learning

Applied Sciences ◽

10.3390/app11209468 ◽

2021 ◽

Vol 11 (20) ◽

pp. 9468

Author(s):

Yunyun Sun ◽

Yutong Liu ◽

Haocheng Zhou ◽

Huijuan Hu

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Adaptive Learning ◽

Learning Rate ◽

Plant Diseases ◽

Stochastic Gradient Descent ◽

Automatic Identification ◽

Deep Convolutional Neural Networks ◽

Adaptive Learning Rate

Deep learning has shown promising results in various domains. The automatic identification of plant diseases with deep convolutional neural networks attracts a lot of attention at present. This article extends the stochastic gradient descent momentum optimizer and presents a discount momentum (DM) deep learning optimizer for plant diseases identification. To examine the recognition and generalization capability of the DM optimizer, we discuss the hyper-parameter tuning and convolutional neural network models across the PlantVillage dataset. We further conduct comparison experiments on popular non-adaptive learning rate methods. The proposed approach achieves an average validation accuracy of no less than 97% for plant diseases prediction on several state-of-the-art deep learning models and holds a low sensitivity to hyper-parameter settings. Experimental results demonstrate that the DM method can bring a higher identification performance, while still maintaining a competitive performance over other non-adaptive learning rate methods in terms of both training speed and generalization.
