Neural network training based on quasi-Newton method using Nesterov's accelerated gradient

Author(s):  
Hiroshi Ninomiya
2018, Vol 26 (8), pp. 1575-1579
Author(s):  
Qiang Liu ◽  
Jia Liu ◽  
Ruoyu Sang ◽  
Jiajun Li ◽  
Tao Zhang ◽  
...  

Author(s):  
Hesam Karim ◽  
Sharareh R. Niakan ◽  
Reza Safdari

Heart disease is the leading cause of death in many countries. Artificial neural network (ANN) techniques can be used to predict or classify heart disease in patients. Several training algorithms exist for ANNs. We compared eight neural network training algorithms for classifying heart disease data from the UCI repository, containing 303 samples. Performance measures for each algorithm, namely training speed, number of epochs, accuracy, and mean square error (MSE), were obtained and analyzed. Our results showed that the training time for gradient descent algorithms (8-10 seconds) was longer than for the other training algorithms. In contrast, quasi-Newton algorithms were the fastest (<=0 seconds, i.e., below the timer's resolution). MSE for all algorithms was between 0.117 and 0.228. While there was a significant association between the training algorithm and training time (p < 0.05), the number of neurons in the hidden layer had no significant effect on the MSE or accuracy of the models (p > 0.05). Based on our findings, when developing an ANN classification model for heart disease, quasi-Newton training algorithms are the best choice because of their superior speed and accuracy.
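The comparison described in the abstract can be sketched in miniature: a minimal example, under stated assumptions, contrasting plain gradient descent with a quasi-Newton (BFGS) optimizer on synthetic data shaped like the UCI heart-disease set (303 samples, 13 features). The model here (a single logistic unit trained on MSE), the step size, and the iteration count are illustrative choices, not the paper's exact setup.

```python
# Hedged sketch: first-order (gradient descent) vs. quasi-Newton (BFGS)
# training on a toy binary-classification task, mirroring the paper's
# training-time / MSE comparison. Synthetic data stands in for the UCI
# heart-disease set; all hyperparameters are illustrative.
import time
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(303, 13))          # 303 samples, 13 features (as in the UCI set)
true_w = rng.normal(size=13)
y = (X @ true_w + rng.normal(scale=0.5, size=303) > 0).astype(float)

def loss(w):
    # Logistic unit as a minimal stand-in "network"; MSE matches the
    # error measure reported in the abstract.
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return np.mean((p - y) ** 2)

def grad(w):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return 2.0 / len(y) * X.T @ ((p - y) * p * (1 - p))

# Plain gradient descent with a fixed step size.
w = np.zeros(13)
t0 = time.perf_counter()
for _ in range(2000):
    w -= 0.5 * grad(w)
gd_time, gd_mse = time.perf_counter() - t0, loss(w)

# Quasi-Newton training via BFGS (scipy's implementation).
t0 = time.perf_counter()
res = minimize(loss, np.zeros(13), jac=grad, method="BFGS")
qn_time, qn_mse = time.perf_counter() - t0, res.fun

print(f"gradient descent:  {gd_time:.3f}s  MSE={gd_mse:.3f}")
print(f"BFGS quasi-Newton: {qn_time:.3f}s  MSE={qn_mse:.3f}")
```

On problems of this size the quasi-Newton run typically reaches a comparable or lower MSE in far fewer function evaluations, which is the qualitative pattern the abstract reports.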


2021, Vol 12 (3), pp. 554-574
Author(s):  
Shahrzad Mahboubi ◽  
Indrapriyadarsini S ◽  
Hiroshi Ninomiya ◽  
Hideki Asai

Entropy, 2021, Vol 23 (6), pp. 711
Author(s):  
Mina Basirat ◽  
Bernhard C. Geiger ◽  
Peter M. Roth

Information plane analysis, describing the mutual information between the input and a hidden layer and between a hidden layer and the target over time, has recently been proposed to analyze the training of neural networks. Since the activations of a hidden layer are typically continuous-valued, this mutual information cannot be computed analytically and must thus be estimated, resulting in apparently inconsistent or even contradicting results in the literature. The goal of this paper is to demonstrate how information plane analysis can still be a valuable tool for analyzing neural network training. To this end, we complement the prevailing binning estimator for mutual information with a geometric interpretation. With this geometric interpretation in mind, we evaluate the impact of regularization and interpret phenomena such as underfitting and overfitting. In addition, we investigate neural network learning in the presence of noisy data and noisy labels.
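The binning estimator the abstract refers to can be sketched as follows: discretizing a continuous hidden-layer activation T into equal-width bins turns I(T; Y) into a plug-in estimate over a joint histogram. The bin count, sample size, and synthetic "activations" below are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch of the binning estimator for mutual information between a
# continuous-valued hidden activation t and a discrete target y, as used in
# information-plane analysis. Bin count and data are illustrative choices.
import numpy as np

def binned_mutual_information(t, y, n_bins=30):
    """Estimate I(T; Y) in bits by discretizing t into equal-width bins."""
    t_binned = np.digitize(t, np.linspace(t.min(), t.max(), n_bins))
    joint = np.zeros((n_bins + 2, int(y.max()) + 1))
    for ti, yi in zip(t_binned, y):
        joint[ti, yi] += 1
    joint /= joint.sum()                      # joint distribution p(t_bin, y)
    pt = joint.sum(axis=1, keepdims=True)     # marginal p(t_bin)
    py = joint.sum(axis=0, keepdims=True)     # marginal p(y)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (pt @ py)[mask])))

rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=5000)
informative = y + 0.3 * rng.normal(size=5000)  # activation correlated with the label
noise = rng.normal(size=5000)                  # activation independent of the label

mi_signal = binned_mutual_information(informative, y)
mi_noise = binned_mutual_information(noise, y)
print(f"I(T;Y), informative activation: {mi_signal:.3f} bits")
print(f"I(T;Y), pure-noise activation:  {mi_noise:.3f} bits")
```

The informative activation should recover most of the one bit of label entropy, while the noise activation should sit near zero; the small positive value on pure noise illustrates the estimation bias that makes binning choices matter, which is the sensitivity the abstract's geometric interpretation addresses.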

