nonlinear function approximation
Recently Published Documents



Author(s):  
Saikat Majumder

Wavelet neural networks (WNNs) are a class of single-hidden-layer neural networks that use wavelets as activation functions. WNNs are an alternative to classical multilayer perceptron networks for arbitrary nonlinear function approximation and can provide a compact network representation. This chapter gives a tutorial introduction to the different types of WNNs and their architectures, along with their training algorithm. Subsequently, a novel application of WNNs to the equalization of a nonlinear satellite communication channel is presented. Nonlinearity in a satellite communication channel is mainly caused by operating the transmitter power amplifiers near their saturation region to improve efficiency. Two models describing the amplitude and phase distortion introduced by a power amplifier are explained. The performance of the proposed equalizer is evaluated and compared with that of an existing equalizer from the literature.
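As a rough illustration of the architecture described above, the following sketch computes the output of a single-hidden-layer wavelet network for a scalar input. The choice of the Mexican hat wavelet, the function names, and the parameter layout (one translation and one dilation per hidden unit) are assumptions made for the example; the chapter itself may use different wavelets and parameterizations.

```python
import math


def mexican_hat(z):
    """Mexican hat (Ricker) mother wavelet, up to a normalization constant."""
    return (1.0 - z * z) * math.exp(-0.5 * z * z)


def wnn_forward(x, translations, dilations, weights, bias=0.0):
    """Output of a single-hidden-layer wavelet network for a scalar input x.

    Each hidden unit j applies the mother wavelet to a shifted and scaled
    copy of the input: psi((x - t_j) / s_j). The output is the weighted sum
    of the hidden activations plus a bias. All names are illustrative.
    """
    total = bias
    for t, s, w in zip(translations, dilations, weights):
        total += w * mexican_hat((x - t) / s)
    return total


# Hypothetical three-unit network evaluated at one point.
y = wnn_forward(0.3, translations=[-1.0, 0.0, 1.0],
                dilations=[0.5, 1.0, 0.5],
                weights=[0.2, 1.0, -0.4], bias=0.1)
```

In a trained network, the translations, dilations, weights, and bias would all be fitted to data, typically by gradient descent.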


Author(s):  
Kenny Young ◽  
Baoxiang Wang ◽  
Matthew E. Taylor

Reinforcement learning (RL) has had many successes, but significant hyperparameter tuning is commonly required to achieve good performance. Furthermore, when nonlinear function approximation is used, non-stationarity in the state representation can lead to learning instability. A variety of techniques exist to combat this, most notably experience replay and the use of parallel actors. These techniques stabilize learning by making the RL problem more similar to the supervised setting. However, they come at the cost of moving away from the RL problem as it is typically formulated, that is, a single agent learning online without maintaining a large database of training examples. To address these issues, we propose Metatrace, a meta-gradient descent based algorithm that tunes the step-size online. Metatrace leverages the structure of eligibility traces and works both for tuning a scalar step-size and for tuning a separate step-size for each parameter. We empirically evaluate Metatrace for actor-critic on the Arcade Learning Environment. Results show that Metatrace can speed up learning and improve performance in non-stationary settings.
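The idea of tuning a step-size online by meta-gradient descent can be illustrated with a much simpler, classical relative of the method above: IDBD (Sutton, 1992) for linear supervised regression. The sketch below is not Metatrace; it has no eligibility traces and no actor-critic, and the function name, hyperparameter values, and toy data are assumptions chosen for the example.

```python
import math
import random


def idbd(stream, n_features, theta=0.005, init_alpha=0.05):
    """IDBD: LMS regression with a per-weight step-size adapted online.

    Each step-size is alpha_i = exp(beta_i); beta_i follows a meta-gradient
    of the squared error, using the trace h_i of recent weight updates.
    """
    w = [0.0] * n_features
    beta = [math.log(init_alpha)] * n_features
    h = [0.0] * n_features
    for x, y in stream:
        delta = y - sum(wi * xi for wi, xi in zip(w, x))
        for i in range(n_features):
            beta[i] += theta * delta * x[i] * h[i]   # meta-gradient step
            alpha = math.exp(beta[i])
            w[i] += alpha * delta * x[i]             # ordinary LMS step
            decay = max(0.0, 1.0 - alpha * x[i] * x[i])
            h[i] = h[i] * decay + alpha * delta * x[i]
    return w, [math.exp(b) for b in beta]


# Toy noise-free problem: the third input is irrelevant to the target.
rng = random.Random(0)
data = []
for _ in range(5000):
    x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    data.append((x, 2.0 * x[0] - 1.0 * x[1]))

weights, step_sizes = idbd(data, n_features=3)
```

The per-weight variant shown here corresponds loosely to the "separate step-size for each parameter" setting; a scalar variant would share a single `beta` across all weights.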


Author(s):  
Carles Gelada ◽  
Marc G. Bellemare

In this paper, we revisit the method of off-policy corrections for reinforcement learning (COP-TD) pioneered by Hallak et al. (2017). Under this method, online updates to the value function are reweighted to avoid the divergence issues typical of off-policy learning. While Hallak et al.’s solution is appealing, it cannot easily be transferred to nonlinear function approximation. First, it requires a projection step onto the probability simplex; second, even though the operator describing the expected behavior of the off-policy learning algorithm is convergent, it is not known to be a contraction mapping and hence may be less stable in practice. We address these two issues by introducing a discount factor into COP-TD. We analyze the behavior of discounted COP-TD and find it better behaved from a theoretical perspective. We also propose an alternative soft normalization penalty that can be minimized online and obviates the need for an explicit projection step. We complement our analysis with an empirical evaluation of the two techniques in an off-policy setting on the game Pong from the Atari domain, where we find discounted COP-TD to be better behaved in practice than the soft normalization penalty. Finally, we perform a more extensive evaluation of discounted COP-TD on five games from the Atari domain, where we find performance gains for our approach.


2019 ◽  
Vol 1 (2) ◽  
pp. 745-755 ◽  
Author(s):  
Zeng ◽  
Tan ◽  
Matsunaga ◽  
Shirai

A Support Vector Machine (SVM) for regression is a popular machine learning model that aims to solve nonlinear function approximation problems wherein explicit model equations are difficult to formulate. The performance of an SVM depends largely on the selection of its parameters. Choosing between an SVM that solves an optimization problem with inequality constraints and one that solves a least-squares problem (LS-SVM) adds to the complexity. Various methods have been proposed for tuning parameters, but no article puts the SVM and LS-SVM side by side to discuss the issue using a large real-world dataset, which could be problematic for existing parameter tuning methods. We investigated both the SVM and LS-SVM with an artificial dataset and a dataset of more than 200,000 points used for the reconstruction of the global surface ocean CO2 concentration. The results reveal that: (1) the two models are most sensitive to the parameter of the kernel function, which lies in a narrow range for scaled input data; (2) the optimal values of the other parameters do not change much between datasets; and (3) the LS-SVM performs better than the SVM in general. The LS-SVM is recommended, as it has fewer parameters to tune and yields a smaller bias. Nevertheless, the SVM has the advantages of consuming fewer computing resources and taking less time to train. The results suggest initial parameter guesses for using the models.
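To make the contrast concrete: the LS-SVM replaces the inequality-constrained optimization of the standard SVM with equality constraints, so training reduces to solving one linear system. The sketch below shows this standard formulation with an RBF kernel in NumPy. The function names, the toy sine data, and the values of `gamma` (regularization) and `sigma` (kernel width) are assumptions for illustration, not the settings used in the article.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM regression system for the bias b and coefficients alpha.

    [[0, 1^T], [1, K + I/gamma]] @ [b, alpha] = [0, y]
    """
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]


def lssvm_predict(X_train, b, alpha, sigma, X_new):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b


# Toy example: fit one period of a sine wave (hypothetical parameter values).
X = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y = np.sin(X[:, 0])
b, alpha = lssvm_fit(X, y, gamma=1000.0, sigma=0.5)
y_hat = lssvm_predict(X, b, alpha, sigma=0.5, X_new=X)
```

Only two parameters appear here (`gamma` and the kernel width), whereas SVM regression additionally needs the width of its epsilon-insensitive zone. The dense solve also hints at why memory and training time can become a concern for very large datasets.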


2017 ◽  
Vol 67 (6) ◽  
pp. 603 ◽  
Author(s):  
Hari Om Verma ◽  
N. K. Peyada

Parameter estimation for an unstable aircraft using the extreme learning machine method is presented. In the past, conventional methods such as the output error method, the filter error method, and the equation error method, as well as non-conventional methods such as those based on artificial neural networks, have been used to estimate an aircraft’s aerodynamic parameters. An accurate nonlinear function approximation is now required to represent the aircraft’s equations of motion. Such an approximation is usually obtained with an artificial neural network trained on the aircraft’s input-output flight data. The accuracy of the parameters estimated with the trained network depends strongly on the network’s generalisation capability, which can be improved by using an extreme learning machine based network instead of a conventional artificial neural network. To estimate the unstable aircraft’s parameters from simulated flight data, a Gauss-Newton based optimisation method is used with a predefined aerodynamic model and the trained network. Further, the confidence in the estimated parameters is compared with that of standard parameter estimation methods in terms of the Cramér-Rao bounds.
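For readers unfamiliar with the extreme learning machine, its core idea fits in a few lines: the hidden-layer weights are drawn at random and left fixed, and only the output weights are fitted, in closed form, by least squares. The sketch below shows that generic idea on a toy curve-fitting problem. It does not reproduce the aircraft model or the Gauss-Newton estimation step; the function names, the `tanh` activation, and the network size are assumptions for the example.

```python
import numpy as np


def elm_fit(X, y, n_hidden, rng):
    """Fit an extreme learning machine with one tanh hidden layer.

    The input weights W and biases b are random and never trained; only the
    output weights beta are solved for, by linear least squares.
    """
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                      # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Evaluate the fitted network at the rows of X."""
    return np.tanh(X @ W + b) @ beta


# Toy example: approximate sin(x) on [-3, 3] (hypothetical setup).
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 200)[:, None]
y = np.sin(X[:, 0])
W, b, beta = elm_fit(X, y, n_hidden=40, rng=rng)
rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - y) ** 2)))
```

Because no iterative backpropagation is involved, training cost is dominated by a single least-squares solve.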

