A Novel Learning Rate Schedule in Optimization for Neural Networks and Its Convergence

Symmetry, 2020, Vol 12 (4), pp. 660
Author(s): Jieun Park, Dokkyun Yi, Sangmin Ji

The process of machine learning is to find parameters that minimize the cost function constructed from the training data. This process is called optimization, and the parameters found are called the optimal parameters of the neural network. In searching for the optimum, there have been attempts to solve the symmetric optimization problem or to initialize the parameters symmetrically. Furthermore, to obtain the optimal parameters, existing methods decrease the learning rate over iteration time or change it according to a fixed ratio; both are monotonically decreasing schedules at a constant rate over iteration time. Our idea is to make the learning rate changeable, unlike the monotonically decreasing methods. We introduce a method that finds the optimal parameters by adaptively changing the learning rate according to the value of the cost function. Thus, when the cost function is optimized, learning is complete and the optimal parameters are obtained. This paper proves that the method converges to the optimal parameters, which means that our method achieves a minimum of the cost function (effective learning). Numerical experiments demonstrate that learning is effective in a variety of situations when using the proposed learning rate schedule.
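As a rough illustration of the idea, the sketch below scales the step size by the current cost value, so learning slows as the cost approaches zero; the scaling rule eta0·f/(c+f) and the toy quadratic cost are our own placeholders, not the schedule from the paper.

```python
def cost(theta):
    # Toy quadratic cost with its minimum at theta = 3 (an assumption).
    return (theta - 3.0) ** 2

def grad(theta):
    return 2.0 * (theta - 3.0)

def adaptive_lr_descent(theta, eta0=0.5, c=1.0, steps=200):
    """Gradient descent whose learning rate shrinks with the cost value."""
    for _ in range(steps):
        f = cost(theta)
        eta = eta0 * f / (c + f)  # large cost -> near eta0; cost near 0 -> near 0
        theta -= eta * grad(theta)
    return theta

print(adaptive_lr_descent(theta=10.0))  # approaches 3.0
```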

2021, Vol 11 (2), pp. 850
Author(s): Dokkyun Yi, Sangmin Ji, Jieun Park

Artificial intelligence (AI) is achieved by optimizing a cost function constructed from the learning data. Changing the parameters of the cost function is the AI learning process (AI learning for short). If AI learning is performed well, the value of the cost function reaches the global minimum. For AI to be well learned, the parameters should stop changing once the cost function reaches its global minimum. One useful optimization method is the momentum method; however, the momentum method has difficulty stopping the parameters when the value of the cost function reaches the global minimum (the non-stop problem). The proposed method is based on the momentum method. To solve the non-stop problem, we incorporate the value of the cost function into the update rule: as learning proceeds, this mechanism reduces the amount of change in the parameters in proportion to the value of the cost function. We verified the method through a proof of convergence and through numerical experiments against existing methods to ensure that learning works well.
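The following minimal sketch shows one way such a cost-driven damping could look: the momentum velocity is multiplied by a factor that vanishes as the cost approaches zero, so the parameter stops moving at the minimum. The damping factor f/(1+f) and the toy cost are illustrative assumptions, not the paper's exact update rule.

```python
def cost(theta):
    return (theta - 2.0) ** 2  # toy cost, minimum at theta = 2

def grad(theta):
    return 2.0 * (theta - 2.0)

def damped_momentum(theta, lr=0.1, beta=0.9, steps=200):
    """Momentum descent whose velocity is damped by the cost value."""
    v = 0.0
    for _ in range(steps):
        f = cost(theta)
        damping = f / (1.0 + f)          # tends to 0 as the cost vanishes
        v = beta * damping * v - lr * grad(theta)
        theta += v
    return theta

print(damped_momentum(theta=10.0))  # settles near 2.0 instead of overshooting
```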


Author(s): Tao Wang, Xiaoliang Xing, Xinhua Zhuang

In this paper, we describe an optimal learning algorithm for designing one-layer neural networks by means of global minimization. Taking the properties of a well-defined neural network into account, we derive a cost function that quantitatively measures the goodness of the network. The connection weights are determined by the gradient descent rule so as to minimize the cost function. The optimal learning algorithm is formulated as either an unconstrained or a constrained minimization problem. It ensures the realization of each desired associative mapping with the best noise-reduction ability in the sense of optimization. We also analytically investigate the storage capacity of the neural network, the degree of noise reduction for a desired associative mapping, and the convergence of the learning algorithm. Finally, a large number of computer experimental results are presented.
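A minimal sketch of the basic ingredient, assuming a quadratic cost over tanh units and plain gradient descent on the connection weights; the paper's actual cost function and constraint formulations are not reproduced here.

```python
import numpy as np

def train_one_layer(X, T, lr=0.1, epochs=500):
    """Gradient descent on a quadratic cost for a one-layer tanh network.

    X: (n, d) input patterns, T: (n, m) desired outputs in (-1, 1).
    Returns the (m, d) connection weight matrix W.
    """
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(T.shape[1], X.shape[1]))
    for _ in range(epochs):
        Y = np.tanh(X @ W.T)                          # network outputs
        dW = ((Y - T) * (1.0 - Y**2)).T @ X / len(X)  # cost gradient
        W -= lr * dW
    return W
```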


2019, Vol 70 (1), pp. 46-51
Author(s): Ivan Sekaj, Martin Ernek

Abstract This contribution presents the use of a genetic algorithm to search for the optimal parameters of a set of speed controllers in an isolated power-electricity island. Nine PI controllers are designed. The cost function minimized by the genetic algorithm is the integral of the control error area. Robustness aspects of the control design are considered as well.
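To make the setup concrete, here is a hedged sketch of a genetic algorithm minimizing the integrated control error of a single PI loop. The first-order plant, the mutation-only evolution scheme, and all numeric values are stand-ins; the actual study evaluates nine PI controllers on a power-island model.

```python
import numpy as np

rng = np.random.default_rng(1)

def control_error_area(params):
    """Integral of |error| for the step response of a PI loop.
    A first-order plant stands in for the power-island simulation."""
    kp, ki = params
    y = integ = cost = 0.0
    dt = 0.01
    for _ in range(1000):
        e = 1.0 - y                  # unit step reference
        integ += e * dt
        u = kp * e + ki * integ      # PI control law
        y += dt * (-y + u)           # first-order plant dynamics
        cost += abs(e) * dt          # accumulate the control error area
    return cost

def genetic_algorithm(pop_size=30, gens=40):
    pop = rng.uniform(0.0, 10.0, size=(pop_size, 2))   # (kp, ki) candidates
    for _ in range(gens):
        fitness = np.array([control_error_area(p) for p in pop])
        parents = pop[np.argsort(fitness)[: pop_size // 2]]  # selection
        children = parents + rng.normal(scale=0.3, size=parents.shape)
        pop = np.vstack([parents, np.clip(children, 0.0, 10.0)])
    return min(pop, key=control_error_area)

print(genetic_algorithm())  # best (kp, ki) pair found
```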


Author(s): Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights. However, this approach introduces a mismatch: gradient descent updates the full-precision weights, but not the quantized weights. To address this issue, we propose a novel method that directly updates the quantized weights, with learnable quantization levels, to minimize the cost function by gradient descent. Second, to obtain low bit-width activations, existing works treat all channels equally. However, the activation quantizers can be biased toward a few channels with high variance. To address this issue, we propose a method that accounts for the quantization error of each individual channel, learning activation quantizers that minimize the quantization error in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task with AlexNet, ResNet, and MobileNetV2 architectures on the CIFAR-100 and ImageNet datasets.
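A minimal PyTorch sketch of the first idea, under our own simplifying assumptions (k learnable levels, nearest-level assignment, straight-through gradients); the paper's exact quantizer may differ.

```python
import torch

class LearnableQuantizer(torch.nn.Module):
    """Snap each weight to the nearest of k learnable levels; a
    straight-through estimator lets gradients reach both the levels
    and the full-precision weights."""
    def __init__(self, num_levels=4):
        super().__init__()
        self.levels = torch.nn.Parameter(torch.linspace(-1.0, 1.0, num_levels))

    def forward(self, w):
        # Squared distance from every weight to every quantization level.
        d = (w.unsqueeze(-1) - self.levels) ** 2
        wq = self.levels[d.argmin(dim=-1)]   # nearest-level assignment
        # Forward value is wq; gradients flow to the levels through wq
        # and to w via the identity (straight-through) term.
        return wq + w - w.detach()
```

In a forward pass one would substitute quantizer(layer.weight) for layer.weight, so that gradient descent updates the quantized representation directly.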


2021
Author(s): Amir Valizadeh

Abstract In this paper, an alternative to backpropagation is introduced and tested; it results in faster convergence of the cost function.


2020, Vol 15, pp. 48
Author(s): J. Frédéric Bonnans, Justina Gianatti

We propose a model for the COVID-19 epidemic in which the population is partitioned into age classes (which remain constant during the epidemic). The main feature is to take into account the infection age of the infected population, which makes it possible to better simulate infection propagation, a process that crucially depends on infection age. We discuss how to estimate the coefficients from data that will become available in the future, and we introduce a confinement variable as the control. The cost function is a compromise between a confinement term, the hospitalization peak, and the death toll. Our numerical experiments make it possible to evaluate the benefit of confinement policies that vary across age classes.
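As an illustration of such a compromise, a scalar cost might be assembled as below; the weights and the quadratic confinement-effort term are placeholder assumptions, not the paper's calibration.

```python
def epidemic_cost(confinement, hosp_peak, deaths,
                  w_conf=1.0, w_hosp=10.0, w_death=100.0):
    """Weighted compromise between confinement effort, the
    hospitalization peak, and the death toll (weights are placeholders)."""
    effort = sum(c * c for c in confinement)   # cumulative confinement term
    return w_conf * effort + w_hosp * hosp_peak + w_death * deaths
```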


2019, Vol 147 (7), pp. 2579-2602
Author(s): Hiroshi Sumata, Frank Kauker, Michael Karcher, Rüdiger Gerdes

Abstract The uniqueness of optimal parameter sets for an Arctic sea ice simulation is investigated. A set of parameter optimization experiments is performed using an automatic parameter optimization system that simultaneously optimizes 15 dynamic and thermodynamic process parameters. The system employs a stochastic approach (a genetic algorithm) to find the global minimum of a cost function. The cost function is defined by the model–observation misfit and the observational uncertainties of three sea ice properties (concentration, thickness, drift) covering the entire Arctic Ocean over more than two decades. A total of 11 independent optimizations are carried out to examine the uniqueness of the minimum of the cost function and of the associated optimal parameter sets. All 11 optimizations asymptotically reduce the value of the cost function toward an apparent global minimum and yield strikingly similar sea ice fields. The corresponding optimal parameters, however, exhibit a large spread, showing the existence of multiple optimal solutions. This result shows that the sea ice observations used, even though they cover more than two decades, cannot constrain the process parameters toward a unique solution. A correlation analysis shows that the optimal parameters are interrelated and covariant. A principal component analysis reveals that the first three (six) principal components explain 70% (90%) of the total variance of the optimal parameter sets, indicating a contraction of the parameter space. Analysis of the associated ocean fields reveals a large spread of these fields across the 11 optimized parameter sets, suggesting the importance of ocean properties for achieving a dynamically consistent view of the coupled sea ice–ocean system.
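The variance analysis described above could be reproduced along these lines; the sketch assumes the 11 optimal parameter sets are stacked row-wise and is not the authors' code.

```python
import numpy as np

def cumulative_explained_variance(param_sets):
    """PCA of optimal parameter sets (rows: the 11 optimizations,
    columns: the 15 process parameters); returns the cumulative
    fraction of variance explained by each principal component."""
    X = param_sets - param_sets.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)   # singular values
    var = s ** 2
    return np.cumsum(var) / var.sum()
```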


2021, Vol 0 (0)
Author(s): Idris Kharroubi, Thomas Lim, Xavier Warin

Abstract We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at the times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh of the grid goes to zero. We then focus on the approximation of the discretely constrained BSDE, adopting a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks, under constraints on the neural network and its derivative. We then derive an algorithm that converges to the discretely constrained BSDE as the number of neurons goes to infinity. We conclude with numerical experiments.


Actuators, 2021, Vol 10 (2), pp. 30
Author(s): Pornthep Preechayasomboon, Eric Rombokas

Soft robotic actuators are now being used in practical applications; however, they are often limited to open-loop control that relies on the inherent compliance of the actuator. Achieving human-like manipulation and grasping with soft robotic actuators requires at least some form of sensing, which often comes at the cost of complex fabrication and purpose-built sensor structures. In this paper, we use the actuating fluid itself as a sensing medium to achieve high-fidelity proprioception in a soft actuator. Because our sensors are somewhat unstructured, their readings are difficult to interpret using linear models. We therefore present a proof of concept of a method for deriving the pose of the soft actuator using recurrent neural networks. We present the experimental setup and our learned state estimator, and we show that our method is viable for achieving proprioception and is robust to common sensor failures.
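A plausible shape for such a learned state estimator, assuming a GRU over raw sensor sequences; the layer sizes, sensor count, and pose dimension are illustrative, not those of the paper.

```python
import torch

class PoseEstimator(torch.nn.Module):
    """GRU that maps sequences of raw fluidic-sensor readings to the
    actuator pose at every time step."""
    def __init__(self, n_sensors=8, hidden=64, pose_dim=3):
        super().__init__()
        self.gru = torch.nn.GRU(n_sensors, hidden, batch_first=True)
        self.head = torch.nn.Linear(hidden, pose_dim)

    def forward(self, readings):      # readings: (batch, time, n_sensors)
        h, _ = self.gru(readings)
        return self.head(h)           # (batch, time, pose_dim)
```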

