A Novel Learning Rate Schedule in Optimization for Neural Networks and Its Convergence

Symmetry, 2020, Vol 12 (4), pp. 660
Author(s): Jieun Park, Dokkyun Yi, Sangmin Ji

The process of machine learning is to find parameters that minimize the cost function constructed from the training data. This process is called optimization, and the parameters found are called the optimal parameters of the neural network. In searching for the optimum, there have been attempts to solve the symmetric optimization problem or to initialize the parameters symmetrically. Furthermore, to obtain the optimal parameters, existing methods decrease the learning rate over iteration time or change it according to a fixed ratio; both are monotonically decreasing schedules at a constant rate over iteration time. Our idea is to make the learning rate changeable, unlike the monotonically decreasing methods. We introduce a method that finds the optimal parameters by adaptively changing the learning rate according to the value of the cost function. Thus, when the cost function is optimized, learning is complete and the optimal parameters are obtained. This paper proves that the method converges to the optimal parameters, which means that our method achieves a minimum of the cost function (effective learning). Numerical experiments demonstrate that learning is effective in a variety of situations when using the proposed learning rate schedule.
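As a rough illustration of the idea, the sketch below scales the step size by the current cost value, so learning slows as the cost approaches zero; the scaling rule eta0·f/(c+f) and the toy quadratic cost are our own placeholders, not the schedule from the paper.

```python
def cost(theta):
    # Toy quadratic cost with its minimum at theta = 3 (an assumption).
    return (theta - 3.0) ** 2

def grad(theta):
    return 2.0 * (theta - 3.0)

def adaptive_lr_descent(theta, eta0=0.5, c=1.0, steps=200):
    """Gradient descent whose learning rate shrinks with the cost value."""
    for _ in range(steps):
        f = cost(theta)
        eta = eta0 * f / (c + f)  # large cost -> near eta0; cost near 0 -> near 0
        theta -= eta * grad(theta)
    return theta

print(adaptive_lr_descent(theta=10.0))  # approaches 3.0
```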

2021, Vol 11 (2), pp. 850
Author(s): Dokkyun Yi, Sangmin Ji, Jieun Park

Artificial intelligence (AI) is achieved by optimizing a cost function constructed from the learning data. Changing the parameters of the cost function is the AI learning process (AI learning for short). If AI learning is performed well, the value of the cost function reaches the global minimum. For AI to be well learned, the parameters should stop changing once the cost function reaches its global minimum. One useful optimization method is the momentum method; however, the momentum method has difficulty stopping the parameters when the value of the cost function reaches the global minimum (the non-stop problem). The proposed method is based on the momentum method. To solve the non-stop problem, we incorporate the value of the cost function into the update rule: as learning proceeds, this mechanism reduces the amount of change in the parameters in proportion to the value of the cost function. We verified the method through a proof of convergence and through numerical experiments against existing methods to ensure that learning works well.
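The following minimal sketch shows one way such a cost-driven damping could look: the momentum velocity is multiplied by a factor that vanishes as the cost approaches zero, so the parameter stops moving at the minimum. The damping factor f/(1+f) and the toy cost are illustrative assumptions, not the paper's exact update rule.

```python
def cost(theta):
    return (theta - 2.0) ** 2  # toy cost, minimum at theta = 2

def grad(theta):
    return 2.0 * (theta - 2.0)

def damped_momentum(theta, lr=0.1, beta=0.9, steps=200):
    """Momentum descent whose velocity is damped by the cost value."""
    v = 0.0
    for _ in range(steps):
        f = cost(theta)
        damping = f / (1.0 + f)          # tends to 0 as the cost vanishes
        v = beta * damping * v - lr * grad(theta)
        theta += v
    return theta

print(damped_momentum(theta=10.0))  # settles near 2.0 instead of overshooting
```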


Author(s): Tao Wang, Xiaoliang Xing, Xinhua Zhuang

In this paper, we describe an optimal learning algorithm for designing one-layer neural networks by means of global minimization. Taking the properties of a well-defined neural network into account, we derive a cost function that quantitatively measures the goodness of the network. The connection weights are determined by the gradient descent rule so as to minimize the cost function. The optimal learning algorithm is formulated as either an unconstrained or a constrained minimization problem. It ensures the realization of each desired associative mapping with the best noise-reduction ability in the sense of optimization. We also analytically investigate the storage capacity of the neural network, the degree of noise reduction for a desired associative mapping, and the convergence of the learning algorithm. Finally, a large number of computer experimental results are presented.
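A minimal sketch of the basic ingredient, assuming a quadratic cost over tanh units and plain gradient descent on the connection weights; the paper's actual cost function and constraint formulations are not reproduced here.

```python
import numpy as np

def train_one_layer(X, T, lr=0.1, epochs=500):
    """Gradient descent on a quadratic cost for a one-layer tanh network.

    X: (n, d) input patterns, T: (n, m) desired outputs in (-1, 1).
    Returns the (m, d) connection weight matrix W.
    """
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(T.shape[1], X.shape[1]))
    for _ in range(epochs):
        Y = np.tanh(X @ W.T)                          # network outputs
        dW = ((Y - T) * (1.0 - Y**2)).T @ X / len(X)  # cost gradient
        W -= lr * dW
    return W
```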


2019, Vol 70 (1), pp. 46-51
Author(s): Ivan Sekaj, Martin Ernek

Abstract This contribution presents the use of a genetic algorithm to search for the optimal parameters of a set of speed controllers in an isolated power-electricity island. Nine PI controllers are designed. The cost function minimized by the genetic algorithm is the integral of the control error area. Robustness aspects of the control design are considered as well.
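To make the setup concrete, here is a hedged sketch of a genetic algorithm minimizing the integrated control error of a single PI loop. The first-order plant, the mutation-only evolution scheme, and all numeric values are stand-ins; the actual study evaluates nine PI controllers on a power-island model.

```python
import numpy as np

rng = np.random.default_rng(1)

def control_error_area(params):
    """Integral of |error| for the step response of a PI loop.
    A first-order plant stands in for the power-island simulation."""
    kp, ki = params
    y = integ = cost = 0.0
    dt = 0.01
    for _ in range(1000):
        e = 1.0 - y                  # unit step reference
        integ += e * dt
        u = kp * e + ki * integ      # PI control law
        y += dt * (-y + u)           # first-order plant dynamics
        cost += abs(e) * dt          # accumulate the control error area
    return cost

def genetic_algorithm(pop_size=30, gens=40):
    pop = rng.uniform(0.0, 10.0, size=(pop_size, 2))   # (kp, ki) candidates
    for _ in range(gens):
        fitness = np.array([control_error_area(p) for p in pop])
        parents = pop[np.argsort(fitness)[: pop_size // 2]]  # selection
        children = parents + rng.normal(scale=0.3, size=parents.shape)
        pop = np.vstack([parents, np.clip(children, 0.0, 10.0)])
    return min(pop, key=control_error_area)

print(genetic_algorithm())  # best (kp, ki) pair found
```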


Author(s): Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights. However, this approach introduces a mismatch: gradient descent updates the full-precision weights, but not the quantized weights. To address this issue, we propose a novel method that directly updates the quantized weights, with learnable quantization levels, to minimize the cost function by gradient descent. Second, to obtain low bit-width activations, existing works treat all channels equally. However, the activation quantizers can be biased toward a few channels with high variance. To address this issue, we propose a method that accounts for the quantization error of each individual channel, learning activation quantizers that minimize the quantization error in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task with AlexNet, ResNet, and MobileNetV2 architectures on the CIFAR-100 and ImageNet datasets.
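A minimal PyTorch sketch of the first idea, under our own simplifying assumptions (k learnable levels, nearest-level assignment, straight-through gradients); the paper's exact quantizer may differ.

```python
import torch

class LearnableQuantizer(torch.nn.Module):
    """Snap each weight to the nearest of k learnable levels; a
    straight-through estimator lets gradients reach both the levels
    and the full-precision weights."""
    def __init__(self, num_levels=4):
        super().__init__()
        self.levels = torch.nn.Parameter(torch.linspace(-1.0, 1.0, num_levels))

    def forward(self, w):
        # Squared distance from every weight to every quantization level.
        d = (w.unsqueeze(-1) - self.levels) ** 2
        wq = self.levels[d.argmin(dim=-1)]   # nearest-level assignment
        # Forward value is wq; gradients flow to the levels through wq
        # and to w via the identity (straight-through) term.
        return wq + w - w.detach()
```

In a forward pass one would substitute quantizer(layer.weight) for layer.weight, so that gradient descent updates the quantized representation directly.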


2021
Author(s): Amir Valizadeh

Abstract In this paper, an alternative to backpropagation is introduced and tested; it results in faster convergence of the cost function.


2020, Vol 15, pp. 48
Author(s): J. Frédéric Bonnans, Justina Gianatti

We propose a model for the COVID-19 epidemic in which the population is partitioned into age classes (which remain constant during the epidemic). The main feature is to take into account the infection age of the infected population, which makes it possible to better simulate infection propagation, a process that crucially depends on infection age. We discuss how to estimate the coefficients from data that will become available in the future, and we introduce a confinement variable as the control. The cost function is a compromise between a confinement term, the hospitalization peak, and the death toll. Our numerical experiments make it possible to evaluate the benefit of confinement policies that vary across age classes.
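As an illustration of such a compromise, a scalar cost might be assembled as below; the weights and the quadratic confinement-effort term are placeholder assumptions, not the paper's calibration.

```python
def epidemic_cost(confinement, hosp_peak, deaths,
                  w_conf=1.0, w_hosp=10.0, w_death=100.0):
    """Weighted compromise between confinement effort, the
    hospitalization peak, and the death toll (weights are placeholders)."""
    effort = sum(c * c for c in confinement)   # cumulative confinement term
    return w_conf * effort + w_hosp * hosp_peak + w_death * deaths
```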


2019, Vol 147 (7), pp. 2579-2602
Author(s): Hiroshi Sumata, Frank Kauker, Michael Karcher, Rüdiger Gerdes

Abstract The uniqueness of optimal parameter sets for an Arctic sea ice simulation is investigated. A set of parameter optimization experiments is performed using an automatic parameter optimization system that simultaneously optimizes 15 dynamic and thermodynamic process parameters. The system employs a stochastic approach (a genetic algorithm) to find the global minimum of a cost function. The cost function is defined by the model–observation misfit and the observational uncertainties of three sea ice properties (concentration, thickness, drift) covering the entire Arctic Ocean over more than two decades. A total of 11 independent optimizations are carried out to examine the uniqueness of the minimum of the cost function and of the associated optimal parameter sets. All 11 optimizations asymptotically reduce the value of the cost function toward an apparent global minimum and yield strikingly similar sea ice fields. The corresponding optimal parameters, however, exhibit a large spread, showing the existence of multiple optimal solutions. This result shows that the sea ice observations used, even though they cover more than two decades, cannot constrain the process parameters toward a unique solution. A correlation analysis shows that the optimal parameters are interrelated and covariant. A principal component analysis reveals that the first three (six) principal components explain 70% (90%) of the total variance of the optimal parameter sets, indicating a contraction of the parameter space. Analysis of the associated ocean fields reveals a large spread of these fields across the 11 optimized parameter sets, suggesting the importance of ocean properties for achieving a dynamically consistent view of the coupled sea ice–ocean system.
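The variance analysis described above could be reproduced along these lines; the sketch assumes the 11 optimal parameter sets are stacked row-wise and is not the authors' code.

```python
import numpy as np

def cumulative_explained_variance(param_sets):
    """PCA of optimal parameter sets (rows: the 11 optimizations,
    columns: the 15 process parameters); returns the cumulative
    fraction of variance explained by each principal component."""
    X = param_sets - param_sets.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)   # singular values
    var = s ** 2
    return np.cumsum(var) / var.sum()
```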


2021, Vol 0 (0)
Author(s): Idris Kharroubi, Thomas Lim, Xavier Warin

Abstract We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at the times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh of the grid goes to zero. We then focus on the approximation of the discretely constrained BSDE, adopting a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks, under constraints on the neural network and its derivative. We then derive an algorithm that converges to the discretely constrained BSDE as the number of neurons goes to infinity. We conclude with numerical experiments.


Actuators, 2021, Vol 10 (2), pp. 30
Author(s): Pornthep Preechayasomboon, Eric Rombokas

Soft robotic actuators are now being used in practical applications; however, they are often limited to open-loop control that relies on the inherent compliance of the actuator. Achieving human-like manipulation and grasping with soft robotic actuators requires at least some form of sensing, which often comes at the cost of complex fabrication and purpose-built sensor structures. In this paper, we use the actuating fluid itself as a sensing medium to achieve high-fidelity proprioception in a soft actuator. Because our sensors are somewhat unstructured, their readings are difficult to interpret using linear models. We therefore present a proof of concept of a method for deriving the pose of the soft actuator using recurrent neural networks. We present the experimental setup and our learned state estimator, and we show that our method is viable for achieving proprioception and is robust to common sensor failures.
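A plausible shape for such a learned state estimator, assuming a GRU over raw sensor sequences; the layer sizes, sensor count, and pose dimension are illustrative, not those of the paper.

```python
import torch

class PoseEstimator(torch.nn.Module):
    """GRU that maps sequences of raw fluidic-sensor readings to the
    actuator pose at every time step."""
    def __init__(self, n_sensors=8, hidden=64, pose_dim=3):
        super().__init__()
        self.gru = torch.nn.GRU(n_sensors, hidden, batch_first=True)
        self.head = torch.nn.Linear(hidden, pose_dim)

    def forward(self, readings):      # readings: (batch, time, n_sensors)
        h, _ = self.gru(readings)
        return self.head(h)           # (batch, time, pose_dim)
```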

