A way to reduce the time consumption effect of for-loops for training neural networks: optimized propagation

Author(s): Amir Valizadeh

In this paper, an alternative to backpropagation is introduced and tested, which results in faster convergence of the cost function.

Symmetry, 2020, Vol 12 (4), pp. 660
Author(s): Jieun Park, Dokkyun Yi, Sangmin Ji

The process of machine learning is to find the parameters that minimize the cost function constructed from the training data. This is called optimization, and the resulting parameters are called the optimal parameters of the neural network. In the search for the optimum, previous work has attempted to solve the optimization symmetrically or to initialize the parameters symmetrically. Furthermore, to obtain the optimal parameters, existing methods decrease the learning rate over the iterations or change it according to a fixed ratio; in other words, they decrease the learning rate monotonically at a constant rate as training proceeds. Our idea is to let the learning rate vary, unlike these monotonically decreasing schedules. We introduce a method that finds the optimal parameters by adaptively changing the learning rate according to the value of the cost function, so that when the cost function is minimized, learning is complete and the optimal parameters are obtained. This paper proves that the method converges to the optimal parameters, meaning that it reaches a minimum of the cost function (i.e., learns effectively). Numerical experiments demonstrate that learning is effective with the proposed learning rate schedule in a variety of situations.
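As an illustration of the idea (not the authors' exact update rule), a minimal gradient-descent sketch in which the step size adapts to the current value of the cost might look like the following; the c/(1 + c) scaling and the constants are assumptions made for this example.

```python
import numpy as np

def adaptive_lr_descent(grad, cost, theta0, base_lr=0.1, n_steps=100):
    """Gradient descent whose step size shrinks as the cost shrinks.

    grad(theta) -> gradient of the cost at theta
    cost(theta) -> scalar value of the cost at theta
    The cost-dependent scaling below is an illustrative choice, not the
    paper's specific schedule.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_steps):
        c = cost(theta)
        lr = base_lr * c / (1.0 + c)      # learning rate follows the cost value
        theta = theta - lr * grad(theta)
    return theta

# Example: minimize f(x) = (x - 3)^2; the iterate moves toward the minimizer x = 3.
theta_opt = adaptive_lr_descent(grad=lambda x: 2 * (x - 3.0),
                                cost=lambda x: np.sum((x - 3.0) ** 2),
                                theta0=np.array([0.0]))
print(theta_opt)
```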


Author(s): Tao Wang, Xiaoliang Xing, Xinhua Zhuang

In this paper, we describe an optimal learning algorithm for designing one-layer neural networks by means of global minimization. Taking the properties of a well-defined neural network into account, we derive a cost function to measure the goodness of the network quantitatively. The connection weights are determined by the gradient descent rule to minimize the cost function. The optimal learning algorithm is formulated as either an unconstrained or a constrained minimization problem. It ensures that each desired associative mapping is realized with the best noise-reduction ability in the sense of optimization. We also investigate analytically the storage capacity of the neural network, the degree of noise reduction for a desired associative mapping, and the convergence of the learning algorithm. Finally, a large number of computer experimental results are presented.
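A minimal sketch of the general setting described above, training a single-layer network by gradient descent to realize desired associative mappings, might look as follows; the sigmoid activation, quadratic cost, and sizes are assumptions for this example, not the cost function derived in the paper.

```python
import numpy as np

def train_one_layer(X, T, lr=0.5, epochs=500):
    """Train a one-layer network W by gradient descent on a quadratic cost.

    X: (n_samples, n_in) inputs; T: (n_samples, n_out) desired outputs.
    """
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(X.shape[1], T.shape[1]))
    for _ in range(epochs):
        Y = 1.0 / (1.0 + np.exp(-X @ W))          # sigmoid outputs
        grad = X.T @ ((Y - T) * Y * (1.0 - Y))    # dCost/dW for 0.5*||Y - T||^2
        W -= lr * grad                            # gradient descent rule
    return W

# Tiny associative mapping: two binary patterns mapped to targets 1 and 0.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
T = np.array([[1.0], [0.0]])
W = train_one_layer(X, T)
print(1.0 / (1.0 + np.exp(-X @ W)))  # outputs move toward the desired targets
```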


Author(s): Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights but does not update the quantized weights. To address this issue, we propose a novel method that enables direct updating of the quantized weights, with learnable quantization levels, to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works treat all channels equally. However, the activation quantizers could be biased toward a few channels with high variance. To address this issue, we propose a method that takes into account the quantization errors of individual channels. With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task, using AlexNet, ResNet, and MobileNetV2 architectures on the CIFAR-100 and ImageNet datasets.
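As a generic illustration of quantization with learnable levels, where gradients bypass the non-differentiable rounding via the straight-through estimator (not the authors' exact scheme), a PyTorch-style sketch of a uniform quantizer with a learnable step size could look like this; the class and parameter names are made up for the example.

```python
import torch
import torch.nn as nn

class LearnableUniformQuantizer(nn.Module):
    """Uniform quantizer with a learnable step size (generic sketch)."""
    def __init__(self, n_bits=2, init_step=0.05):
        super().__init__()
        self.levels = 2 ** n_bits
        self.step = nn.Parameter(torch.tensor(init_step))

    def forward(self, w):
        # Clamp to the representable range, then round to the nearest level.
        q = torch.clamp(w / self.step, -self.levels // 2, self.levels // 2 - 1)
        q_rounded = torch.round(q)
        # Straight-through estimator: use rounded values in the forward pass,
        # but let gradients bypass the rounding in the backward pass.
        q_ste = q + (q_rounded - q).detach()
        return q_ste * self.step

# Usage: quantize a weight tensor; gradients reach both w and the step size.
quant = LearnableUniformQuantizer(n_bits=2)
w = torch.randn(8, 8, requires_grad=True)
loss = (quant(w) ** 2).sum()
loss.backward()
print(w.grad is not None, quant.step.grad is not None)
```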


Actuators, 2021, Vol 10 (2), pp. 30
Author(s): Pornthep Preechayasomboon, Eric Rombokas

Soft robotic actuators are now being used in practical applications; however, they are often limited to open-loop control that relies on the inherent compliance of the actuator. Achieving human-like manipulation and grasping with soft robotic actuators requires at least some form of sensing, which often comes at the cost of complex fabrication and purposefully built sensor structures. In this paper, we utilize the actuating fluid itself as a sensing medium to achieve high-fidelity proprioception in a soft actuator. As our sensors are somewhat unstructured, their readings are difficult to interpret using linear models. We therefore present a proof of concept of a method for deriving the pose of the soft actuator using recurrent neural networks. We present the experimental setup and our learned state estimator to show that our method is viable for achieving proprioception and is also robust to common sensor failures.
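A minimal recurrent state estimator in the spirit of the abstract, mapping sequences of fluidic sensor readings to actuator pose, might look like the sketch below; the layer sizes, sensor count, and pose dimensionality are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class PoseEstimator(nn.Module):
    """Recurrent network: a sequence of sensor readings -> pose estimates.

    Sizes are illustrative; the paper's network and sensor setup may differ.
    """
    def __init__(self, n_sensors=4, hidden=64, pose_dim=3):
        super().__init__()
        self.rnn = nn.LSTM(n_sensors, hidden, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)

    def forward(self, x):                  # x: (batch, time, n_sensors)
        out, _ = self.rnn(x)
        return self.head(out)              # pose estimate at every time step

model = PoseEstimator()
readings = torch.randn(2, 50, 4)           # 2 sequences of 50 time steps
pose = model(readings)                     # shape (2, 50, 3)
```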


2021, Vol 11 (2), pp. 850
Author(s): Dokkyun Yi, Sangmin Ji, Jieun Park

Artificial intelligence (AI) is achieved by optimizing a cost function constructed from the learning data. Changing the parameters of the cost function is the AI learning process (AI learning, for convenience). If AI learning is performed well, the value of the cost function reaches the global minimum. For learning to be considered complete, the parameters should stop changing once the cost function reaches the global minimum. One useful optimization method is the momentum method; however, the momentum method has difficulty stopping the parameters when the value of the cost function reaches the global minimum (the non-stop problem). The proposed method is based on the momentum method. To solve the non-stop problem, we incorporate the value of the cost function into the update. Therefore, as learning proceeds, the mechanism in our method reduces the amount of change in the parameters in proportion to the value of the cost function. We verified the method through a proof of convergence and numerical comparisons with existing methods to confirm that learning works well.
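A sketch of the idea (not the authors' exact formulation): a momentum-style update whose step is damped by the current cost value, so that the parameters stop moving as the cost approaches its minimum. The c/(1 + c) damping and the constants are assumptions made here.

```python
import numpy as np

def cost_damped_momentum(grad, cost, theta0, lr=0.05, beta=0.9, n_steps=200):
    """Momentum update scaled by the current cost value (illustrative sketch)."""
    theta = np.asarray(theta0, dtype=float)
    v = np.zeros_like(theta)
    for _ in range(n_steps):
        c = cost(theta)
        v = beta * v + (1.0 - beta) * grad(theta)
        # Near the minimum (c -> 0) the factor c/(1+c) vanishes, so the
        # update stops: this addresses the "non-stop" behaviour of momentum.
        theta = theta - lr * (c / (1.0 + c)) * v
    return theta

# Example: minimize f(x) = (x - 1)^2; the iterate moves toward the minimizer x = 1.
theta_opt = cost_damped_momentum(grad=lambda x: 2 * (x - 1.0),
                                 cost=lambda x: np.sum((x - 1.0) ** 2),
                                 theta0=np.array([5.0]))
print(theta_opt)
```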


2020, Vol 18 (02), pp. 2050006
Author(s): Alexsandro Oliveira Alexandrino, Carla Negri Lintzmayer, Zanoni Dias

One of the main problems in Computational Biology is to find the evolutionary distance among species. In most approaches, such a distance involves only rearrangements, which are mutations that alter large pieces of the species’ genome. When we represent genomes as permutations, the problem of transforming one genome into another is equivalent to the problem of Sorting Permutations by Rearrangement Operations. The traditional approach is to assume that every rearrangement is equally likely, so the goal is to find a minimum-length sequence of operations which sorts the permutation. However, studies have shown that some rearrangements are more likely to happen than others, so a weighted approach is more realistic. In a weighted approach, the goal is to find a sorting sequence of minimum cost. This work introduces a new type of cost function, related to the amount of fragmentation caused by a rearrangement. We present results on lower and upper bounds for the fragmentation-weighted problems and on the relation between the unweighted and the fragmentation-weighted approaches. Our main results are 2-approximation algorithms for five versions of this problem involving reversals and transpositions. We also give bounds for the diameters concerning these problems and provide an improved approximation factor for simple permutations considering transpositions.
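One common way to quantify how "fragmented" a permutation is uses breakpoints (adjacent elements that are not consecutive in the identity); the sketch below shows a breakpoint count and a reversal operation, purely as an illustration, since the paper's fragmentation-weighted cost may be defined differently.

```python
def breakpoints(perm):
    """Number of adjacent pairs that are not consecutive in the identity.

    Breakpoints are a standard proxy for fragmentation; the paper's
    fragmentation-weighted cost may differ.
    """
    extended = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(extended, extended[1:]) if b - a != 1)

def apply_reversal(perm, i, j):
    """Reverse the segment perm[i..j] (0-indexed, inclusive)."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]

perm = [4, 3, 1, 2]
rev = apply_reversal(perm, 0, 1)             # [3, 4, 1, 2]
print(breakpoints(perm), breakpoints(rev))   # fragmentation before and after
```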


2005, Vol 133 (6), pp. 1710-1726
Author(s): Milija Zupanski

A new ensemble-based data assimilation method, named the maximum likelihood ensemble filter (MLEF), is presented. The analysis solution maximizes the likelihood of the posterior probability distribution, obtained by minimization of a cost function that depends on a general nonlinear observation operator. The MLEF belongs to the class of deterministic ensemble filters, since no perturbed observations are employed. As in variational and ensemble data assimilation methods, the cost function is derived using a Gaussian probability density function framework. Like other ensemble data assimilation algorithms, the MLEF produces an estimate of the analysis uncertainty (e.g., analysis error covariance). In addition to the common use of ensembles in calculation of the forecast error covariance, the ensembles in MLEF are exploited to efficiently calculate the Hessian preconditioning and the gradient of the cost function. Owing to superior Hessian preconditioning, 2–3 iterative minimization steps are sufficient. The MLEF method is well suited for use with highly nonlinear observation operators, at a small additional computational cost of minimization. The consistent treatment of nonlinear observation operators through optimization is an advantage of the MLEF over other ensemble data assimilation algorithms. The cost of MLEF is comparable to the cost of existing ensemble Kalman filter algorithms. The method is directly applicable to most complex forecast models and observation operators. In this paper, the MLEF method is applied to data assimilation with the one-dimensional Korteweg–de Vries–Burgers equation. The tested observation operator is quadratic, in order to make the assimilation problem more challenging. The results illustrate the stability of the MLEF performance, as well as the benefit of the cost function minimization. The improvement is noted in terms of the rms error, as well as the analysis error covariance. The statistics of innovation vectors (observation minus forecast) also indicate a stable performance of the MLEF algorithm. Additional experiments suggest the amplified benefit of targeted observations in ensemble data assimilation.
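For reference, cost functions in this Gaussian framework typically take the familiar form (the notation below is a standard convention assumed here, not necessarily that of the paper):

\[
J(\mathbf{x}) = \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm{T}}\,\mathbf{P}_f^{-1}\,(\mathbf{x}-\mathbf{x}_b) + \tfrac{1}{2}\,\big[\mathbf{y}-H(\mathbf{x})\big]^{\mathrm{T}}\,\mathbf{R}^{-1}\,\big[\mathbf{y}-H(\mathbf{x})\big],
\]

where \(\mathbf{x}_b\) is the background (forecast) state, \(\mathbf{P}_f\) the ensemble-estimated forecast error covariance, \(\mathbf{y}\) the observations, \(H\) the (possibly nonlinear) observation operator, and \(\mathbf{R}\) the observation error covariance.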


2000, Vol 25 (2), pp. 209-227
Author(s): Keith R. McLaren, Peter D. Rossitter, Alan A. Powell

2021, pp. 107754632110324
Author(s): Berk Altıner, Bilal Erol, Akın Delibaşı

Adaptive optics systems are powerful tools implemented to mitigate the effects of wavefront aberrations. In this article, the optimal actuator placement problem is addressed to improve the disturbance attenuation capability of adaptive optics systems, since actuator placement is directly related to system performance. For this purpose, a linear-quadratic cost function is chosen so that optimized actuator layouts can be specialized according to the type of wavefront aberration. The placement problem is then cast as a convex optimization problem, and the cost function is formulated for the disturbance attenuation case. The success of the presented method is demonstrated by simulation results.
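The linear-quadratic cost referred to here typically has the standard form below (the symbols are the usual LQ notation, assumed for illustration rather than taken from the article), with \(Q\) weighting the state deviation and \(R\) weighting the control effort:

\[
J = \int_0^{\infty} \Big( x(t)^{\mathrm{T}} Q\, x(t) + u(t)^{\mathrm{T}} R\, u(t) \Big)\, dt .
\]

The article then treats actuator placement, under a cost of this type, as a convex optimization problem for the disturbance attenuation case.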

