Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study

Author(s):  
Suyog Gupta ◽  
Wei Zhang ◽  
Fei Wang

Deep learning with a large number of parameters requires distributed training, where model accuracy and runtime are two important factors to be considered. However, there has been no systematic study of the tradeoff between these two factors during the model training process. This paper presents Rudra, a parameter-server-based distributed computing framework tuned for training large-scale deep neural networks. Using variants of the asynchronous stochastic gradient descent algorithm, we study the impact of synchronization protocol, stale gradient updates, mini-batch size, learning rates, and number of learners on runtime performance and model accuracy. We introduce a new learning-rate modulation strategy to counter the effect of stale gradients and propose a new synchronization protocol that can effectively bound the staleness in gradients, improve runtime performance, and achieve good model accuracy. Our empirical investigation reveals a principled approach for distributed training of neural networks: the mini-batch size per learner should be reduced as more learners are added to the system to preserve the model accuracy. We validate this approach using commonly used image classification benchmarks: CIFAR10 and ImageNet.
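The two adjustments named in this abstract, a smaller per-learner mini-batch and a staleness-aware learning rate, can be sketched as follows. The abstract does not give the exact rules, so this sketch assumes that the product of learner count and per-learner batch size is held roughly constant and that the learning rate is divided by the gradient's staleness. All function names and values are hypothetical and are not taken from Rudra.

```python
def per_learner_batch_size(global_batch: int, num_learners: int) -> int:
    """Shrink the per-learner mini-batch as learners are added.

    Assumption: num_learners * per-learner batch is held roughly
    constant; the abstract states only that the per-learner size
    should be reduced.
    """
    if num_learners < 1:
        raise ValueError("num_learners must be >= 1")
    return max(1, global_batch // num_learners)


def modulated_learning_rate(base_lr: float, staleness: int) -> float:
    """Scale the learning rate down for stale gradients.

    Assumption: divide by the staleness (how many weight updates the
    gradient lags behind); a fresh gradient keeps base_lr.
    """
    return base_lr / max(1, staleness)


def apply_stale_gradient(weights, grad, base_lr, staleness):
    """One parameter-server style update with a possibly stale gradient."""
    lr = modulated_learning_rate(base_lr, staleness)
    return [w - lr * g for w, g in zip(weights, grad)]
```

With these assumptions, going from 1 to 4 learners at a global batch of 128 gives each learner 32 samples per step, and a gradient that is 4 updates stale is applied with a quarter of the base learning rate.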

Author(s):  
Sanaa Elyassami ◽  
Achraf Ait Kaddour

Cardiovascular diseases remain the leading cause of death, taking an estimated 17.9 million lives each year and representing 31% of all global deaths. The patient records including blood reports, cardiac echo reports, and physician’s notes can be used to perform feature analysis and to accurately classify heart disease patients. In this paper, an incremental deep learning model was developed and trained with stochastic gradient descent using feedforward neural networks. The chi-square test and the dropout regularization have been incorporated into the model to improve the generalization capabilities and the performance of the heart disease patients' classification model. The impact of the learning rate and the depth of neural networks on the performance were explored. The hyperbolic tangent, the rectifier linear unit, the Maxout, and the exponential rectifier linear unit were used as activation functions for the hidden and the output layer neurons. To avoid over-optimistic results, the performance of the proposed model was evaluated using balanced accuracy and the overall predictive value in addition to the accuracy, sensitivity, and specificity. The obtained results are promising, and the proposed model can be applied to a larger dataset and used by physicians to accurately classify heart disease patients.
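To illustrate why balanced accuracy guards against over-optimistic results on imbalanced classes, the sketch below computes it alongside accuracy, sensitivity, and specificity from a binary confusion matrix. The function name and the example counts are hypothetical. The "overall predictive value" is left out because the abstract does not define it.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary metrics from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative,
    so that neither denominator below is zero.
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Balanced accuracy averages the two per-class rates, so a large
    # majority class cannot dominate the score.
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts: 10 diseased patients, 90 healthy patients.
metrics = classification_metrics(tp=8, fp=10, tn=80, fn=2)
```

A classifier that labels every patient as healthy would score 90% accuracy on these counts but only 50% balanced accuracy.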


Author(s):  
A. John ◽  
D. Praveen Dominic ◽  
M. Adimoolam ◽  
N. M. Balamurugan

Background: Predictive analytics draws on a range of statistical techniques from predictive modelling, data mining, and machine learning. It examines current and historical data to make predictions about future or otherwise unknown events. Most predictive models are used in business analytics to reduce losses and increase profits by exploiting patterns in historical data. Objective: Investors follow various strategies to predict stock values so that they can invest in the more profitable stocks, and some of these strategies for analysing stock market prices are incorporated into intelligent methods and tools. Such strategies can increase investors' profits and minimize their risks. Prediction therefore plays a vital role in stock market gains, but it is also an intricate and challenging process. Method: The proposed strategy is a deep neural network with stochastic gradient descent for stock prediction. The network is trained using the back-propagation algorithm and the stochastic gradient descent algorithm. Results: The experiment on stock market price prediction was conducted in Python with the visual package. The RELIANCE.NS, TATAMOTORS.NS, and TATAGLOBAL.NS datasets, downloaded from the National Stock Exchange site, were used as input. The artificial neural network component, including the deep learning model, is most effective when more than 100,000 data points are available for training. The proposed model was developed on daily stock prices to understand how to build a model that performs better than the existing national exchange method.


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Malte Seemann ◽  
Lennart Bargsten ◽  
Alexander Schlaefer

Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of artery lumen in computed tomography angiography (CTA) data. However, to perform sufficiently, neural networks have to be trained on large amounts of high quality annotated data. In the realm of medical imaging, annotations are not only quite scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step moderately realistic images are generated in a purely numerical fashion. In the second step these images are improved by applying neural domain adaptation. We evaluated the impact of synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing resulting performances. Improvements of up to 5% in terms of Dice coefficient and 20% for Hausdorff distance represent a proof of concept that the proposed augmentation procedure can be used to enhance deep learning-based segmentation for artery lumen in CTA images.
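For readers unfamiliar with the Dice coefficient reported above, here is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name and the toy masks are hypothetical, and returning 1.0 for two empty masks is a common convention assumed here, not something the abstract specifies.

```python
def dice_coefficient(pred, target) -> float:
    """Dice overlap between two binary masks given as flat sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels in each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy 2x2 masks, flattened: prediction marks two pixels, truth marks one.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```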


Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2761
Author(s):  
Vaios Ampelakiotis ◽  
Isidoros Perikos ◽  
Ioannis Hatzilygeroudis ◽  
George Tsihrintzis

In this paper, we present a handwritten character recognition (HCR) system that aims to recognize first-order logic handwritten formulas and create editable text files of the recognized formulas. Dense feedforward neural networks (NNs) are utilized, and their performance is examined under various training conditions and methods. More specifically, after three training algorithms (backpropagation, resilient propagation and stochastic gradient descent) had been tested, we created and trained an NN with the stochastic gradient descent algorithm, optimized by the Adam update rule, which was proved to be the best, using a trainset of 16,750 handwritten image samples of 28 × 28 each and a testset of 7947 samples. The final accuracy achieved is 90.13%. The general methodology followed consists of two stages: the image processing and the NN design and training. Finally, an application has been created that implements the methodology and automatically recognizes handwritten logic formulas. An interesting feature of the application is that it allows for creating new, user-oriented training sets and parameter settings, and thus new NN models.
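As a reminder of what the Adam update rule mentioned above does, the following sketch applies one Adam step to a flat list of parameters. It follows the commonly published form of the rule with its usual default hyper-parameters. The function name is hypothetical, and nothing here reflects the authors' actual network or settings.

```python
import math


def adam_step(params, grads, m, v, t,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count.

    Returns the new parameters and the updated first and second
    moment estimates.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # running mean of gradients
        v_i = beta2 * v_i + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m_i / (1 - beta1 ** t)             # bias correction
        v_hat = v_i / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly `lr` regardless of the gradient's magnitude.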


2020 ◽  
Vol 63 (6) ◽  
pp. 900-912
Author(s):  
Oswalt Manoj S ◽  
Ananth J P

Rainfall prediction is an active area of research, as it enables farmers to make effective decisions about both cultivation and irrigation. Existing prediction models are problematic because the prediction of rainfall depends on three major factors, namely the humidity, the rainfall, and the rainfall recorded in previous years, which results in huge time consumption and demands huge computational effort for the analysis. Thus, this paper introduces a rainfall prediction model based on a deep learning network, the convolutional long short-term memory (convLSTM) system, which bases its prediction on spatial-temporal patterns. The weights of the convLSTM are tuned using the proposed Salp-stochastic gradient descent algorithm (S-SGD), which integrates the Salp swarm algorithm (SSA) into the stochastic gradient descent (SGD) algorithm in order to facilitate globally optimal tuning of the weights and to ensure better prediction accuracy. In addition, the proposed deep learning framework is built in the MapReduce framework, which enables effective handling of big data. The analysis using the rainfall prediction database reveals that the proposed model achieved a minimal mean square error (MSE) and percentage root mean square difference (PRD) of 0.001 and 0.0021, respectively.


Author(s):  
Ameya D. Jagtap ◽  
Kenji Kawaguchi ◽  
George Em Karniadakis

We propose two approaches of locally adaptive activation functions namely, layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of activation function is achieved by introducing a scalable parameter in each layer (layer-wise) and for every neuron (neuron-wise) separately, and then optimizing it using a variant of stochastic gradient descent algorithm. In order to further increase the training speed, an activation slope-based slope recovery term is added in the loss function, which further accelerates convergence, thereby reducing the training cost. On the theoretical side, we prove that in the proposed method, the gradient descent algorithms are not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate, and that the gradient dynamics of the proposed method is not achievable by base methods with any (adaptive) learning rates. We further show that the adaptive activation methods accelerate the convergence by implicitly multiplying conditioning matrices to the gradient of the base method without any explicit computation of the conditioning matrix and the matrix–vector product. The different adaptive activation functions are shown to induce different implicit conditioning matrices. Furthermore, the proposed methods with the slope recovery are shown to accelerate the training process.
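The idea of a trainable slope inside the activation can be sketched for a single unit as follows. This is an illustrative assumption rather than the authors' formulation: it uses tanh as the base activation, one slope `a` for the unit, a squared-error loss, and plain SGD, and it omits the slope recovery term. All names are hypothetical.

```python
import math


def adaptive_tanh(x: float, a: float) -> float:
    """Base activation with a trainable slope a (assumed form: tanh(a * x))."""
    return math.tanh(a * x)


def grad_wrt_slope(x: float, a: float) -> float:
    """d/da tanh(a * x) = x * (1 - tanh(a * x)^2)."""
    return x * (1.0 - math.tanh(a * x) ** 2)


def sgd_step_on_slope(x: float, target: float, a: float, lr: float) -> float:
    """One SGD step on a for the loss (tanh(a * x) - target)^2."""
    y = adaptive_tanh(x, a)
    dloss_da = 2.0 * (y - target) * grad_wrt_slope(x, a)
    return a - lr * dloss_da


# Toy usage: the slope grows so the output moves toward the target.
a_new = sgd_step_on_slope(x=1.0, target=0.9, a=1.0, lr=0.1)
```

In a layer-wise scheme one such slope would be shared by all neurons of a layer, while a neuron-wise scheme would keep one per neuron. Either way the slope is updated by the same optimizer as the weights.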


2021 ◽  
Vol 11 (20) ◽  
pp. 9468
Author(s):  
Yunyun Sun ◽  
Yutong Liu ◽  
Haocheng Zhou ◽  
Huijuan Hu

Deep learning has shown promising results in various domains. The automatic identification of plant diseases with deep convolutional neural networks currently attracts a lot of attention. This article extends the stochastic gradient descent momentum optimizer and presents a discount momentum (DM) deep learning optimizer for plant disease identification. To examine the recognition and generalization capability of the DM optimizer, we discuss hyper-parameter tuning and convolutional neural network models on the PlantVillage dataset. We further conduct comparison experiments on popular non-adaptive learning rate methods. The proposed approach achieves an average validation accuracy of no less than 97% for plant disease prediction on several state-of-the-art deep learning models and shows low sensitivity to hyper-parameter settings. Experimental results demonstrate that the DM method can deliver higher identification performance while remaining competitive with other non-adaptive learning rate methods in terms of both training speed and generalization.
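The abstract does not spell out the discount momentum update, so the sketch below shows only the baseline it extends: stochastic gradient descent with classical momentum. The function name and hyper-parameter values are hypothetical, and this is not the DM optimizer itself.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.1, momentum=0.9):
    """One step of SGD with classical momentum.

    velocity accumulates an exponentially decaying sum of past
    gradient steps; each parameter then moves along its velocity.
    """
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Two steps with the same gradient: the second step is larger
# because the velocity from the first step carries over.
params, velocity = [1.0], [0.0]
params, velocity = sgd_momentum_step(params, [1.0], velocity)
params, velocity = sgd_momentum_step(params, [1.0], velocity)
```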

