A method to improve the performance of multilayer perceptron by utilizing various activation functions in the last hidden layer and the least squares method

2011 ◽  
Vol 34 (3) ◽  
pp. 293-303 ◽  
Author(s):  
Krzysztof Halawa

2021 ◽  
Vol 39 (1) ◽  
pp. 45
Author(s):  
Suellen Teixeira Zavadzki de PAULI ◽  
Mariana KLEINA ◽  
Wagner Hugo BONAT

The machine learning area has recently gained prominence, and artificial neural networks are among the most popular techniques in this field. Such techniques have a learning capacity that develops during an iterative process of model fitting. The multilayer perceptron (MLP) is one of the first networks that emerged and, for this architecture, backpropagation and its modifications are widely used learning algorithms. In this article, the learning of the MLP neural network is approached from the Bayesian perspective by using Markov chain Monte Carlo (MCMC) simulations. The MLP architecture consists of input, hidden and output layers. In this structure, there are several weights that connect each neuron in each layer. The input layer is composed of the covariates of the model. In the hidden layer there are activation functions. In the output layer, the result is compared with the observed value and the loss function is calculated. We analyzed the network learning through simulated data with known weights in order to understand the estimation by the Bayesian method. Subsequently, we predicted the price of WTI oil and obtained a credibility interval for the forecasts. We provide an R implementation and the datasets as supplementary materials.
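To make the sampling step concrete, here is a minimal sketch of Bayesian MLP learning with a random-walk Metropolis sampler, written in Python rather than the authors' R code. The network size, priors, likelihood, proposal scale, and simulated data are all illustrative assumptions, not details taken from the article.

```python
# Minimal sketch: Bayesian learning of a one-hidden-layer MLP via
# random-walk Metropolis MCMC. All sizes and scales are assumptions.
import numpy as np

rng = np.random.default_rng(0)
D, H = 2, 5                                   # assumed input and hidden sizes

def mlp(theta, X):
    """One-hidden-layer MLP with tanh activation; theta packs all weights."""
    W1 = theta[:D * H].reshape(D, H)
    b1 = theta[D * H:D * H + H]
    W2 = theta[D * H + H:-1]
    b2 = theta[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def log_posterior(theta, X, y, sigma=0.5, prior_sd=2.0):
    """Gaussian likelihood plus independent Gaussian priors on all weights."""
    resid = y - mlp(theta, X)
    return -0.5 * np.sum(resid**2) / sigma**2 - 0.5 * np.sum(theta**2) / prior_sd**2

# Simulated data, mirroring the paper's idea of learning a known structure.
X = rng.normal(size=(100, D))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=100)

n_par = D * H + H + H + 1                     # W1 + b1 + W2 + b2
theta = rng.normal(scale=0.1, size=n_par)
lp = log_posterior(theta, X, y)
samples = []
for _ in range(20000):
    prop = theta + rng.normal(scale=0.05, size=n_par)   # random-walk proposal
    lp_prop = log_posterior(prop, X, y)
    if np.log(rng.uniform()) < lp_prop - lp:            # Metropolis acceptance
        theta, lp = prop, lp_prop
    samples.append(theta.copy())

post = np.array(samples[10000:])              # discard burn-in
pred = np.array([mlp(t, X) for t in post[::100]])
lo, hi = np.percentile(pred, [2.5, 97.5], axis=0)       # 95% credibility band
```

The posterior predictive band at the end corresponds to the kind of credibility interval the authors report for their WTI oil price forecasts.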


1980 ◽  
Vol 59 (9) ◽  
pp. 8
Author(s):  
D.E. Turnbull

2020 ◽  
Vol 1 (3) ◽  
Author(s):  
Maysam Abedi

The presented work examines the application of an Augmented Iteratively Re-weighted and Refined Least Squares (AIRRLS) method to construct a 3D magnetic susceptibility model from potential field magnetic anomalies. The algorithm replaces an lp minimization problem by a sequence of weighted linear systems in which the retrieved magnetic susceptibility model successively converges to an optimum solution, while the stopping iteration number plays the role of the regularization parameter. To avoid the natural tendency of causative magnetic sources to concentrate at shallow depth, a prior depth weighting function is incorporated in the original formulation of the objective function. The lp minimization is accelerated by a preconditioned conjugate gradient method (PCCG) for solving the central system of equations in cases of large-scale magnetic field data. It is assumed that there is no remanent magnetization, since this study focuses on the inversion of a geological structure with low magnetic susceptibility. The method is first applied to multi-source, noise-corrupted synthetic magnetic field data to demonstrate its suitability for 3D inversion, and then to real data pertaining to a geologically plausible porphyry copper unit. The real case study, located in Semnan province of Iran, consists of an arc-shaped porphyry andesite covered by sedimentary units which may host mineral occurrences, especially porphyry copper. It is demonstrated that the structure extends down at depth, and consequently exploratory drilling is highly recommended to acquire more information about its potential for ore-bearing mineralization.
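The core idea, replacing an lp penalty with a sequence of depth-weighted L2 systems solved by conjugate gradients, can be sketched as a generic IRLS loop. This is an assumed skeleton, not the paper's AIRRLS implementation: the kernel G, cell depths z, and all constants below are invented for illustration.

```python
# Generic IRLS sketch of the idea behind AIRRLS: an lp model penalty is
# approximated by successively reweighted least-squares problems, with a
# depth weighting to keep sources from piling up near the surface.
import numpy as np
from scipy.sparse.linalg import cg

def irls_inversion(G, d, z, p=1.0, beta=3.0, mu=1e-2, n_iter=10, eps=1e-8):
    """Minimize ||G m - d||^2 + mu * ||W_depth R(m) m||^2 iteratively, where
    R(m) = diag((|m| + eps)^((p-2)/2)) approximates an lp penalty on m."""
    w_depth = (z + 1.0) ** (-beta / 2.0)       # assumed depth-weighting form
    b = G.T @ d
    # Plain damped least squares as the starting model (unweighted first pass).
    m = np.linalg.solve(G.T @ G + mu * np.diag(w_depth**2), b)
    for _ in range(n_iter):                    # iteration count acts as regularizer
        r = (np.abs(m) + eps) ** ((p - 2.0) / 2.0)
        A = G.T @ G + mu * np.diag((w_depth * r) ** 2)
        m, _ = cg(A, b, x0=m, maxiter=200)     # CG stands in for the paper's PCCG
    return m

# Tiny synthetic demo (purely illustrative).
rng = np.random.default_rng(1)
G = rng.normal(size=(50, 120))                 # stand-in forward operator
z = np.repeat(np.arange(1.0, 11.0), 12)        # depth of each of 120 model cells
m_true = np.zeros(120)
m_true[60] = 1.0                               # one compact susceptible body
d_obs = G @ m_true + rng.normal(scale=0.05, size=50)
m_est = irls_inversion(G, d_obs, z)
```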


2019 ◽  
Vol 12 (3) ◽  
pp. 156-161 ◽  
Author(s):  
Aman Dureja ◽  
Payal Pahwa

Background: Activation functions play an important role in the design of deep neural networks: the choice of activation function affects both optimization and the quality of the results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used in the hidden layers of deep neural networks has not been clearly identified. Objective: The primary objective of this analysis was to determine which activation function should be used in the hidden layers of deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured on a dataset of two classes (Cat/Dog). The network used three convolutional layers, each followed by a pooling layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the remaining 2000 images for testing it. Results: The experimental comparison was performed by analyzing the network with different activation functions (ReLU, Tanh, SELU, PReLU, ELU) in the hidden layers of the CNN, measuring validation error and accuracy on the Cat/Dog dataset. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: A CNN model with ReLU in the hidden layers (three hidden layers here) gives the best results and improves overall performance in terms of both accuracy and speed. These advantages of ReLU across the hidden layers support effective and fast retrieval of images from databases.
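The experimental set-up lends itself to a short sketch: the same three-block CNN rebuilt with a different hidden-layer activation on each run. The sketch below uses Keras in Python; the image size, filter counts, and optimizer are assumptions, since the abstract only fixes the overall structure (three convolutional layers, pooling after each, binary Cat/Dog output).

```python
# Sketch of the comparison: identical CNN, varying only the activation.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(activation):
    model = models.Sequential([
        layers.Input(shape=(64, 64, 3)),           # assumed input size
        layers.Conv2D(32, 3, activation=activation),
        layers.MaxPooling2D(),                     # pooling after each conv layer
        layers.Conv2D(32, 3, activation=activation),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation=activation),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation=activation),
        layers.Dense(1, activation="sigmoid"),     # binary Cat/Dog output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# "selu" and "elu" are built-in Keras activation strings; PReLU would instead
# be inserted as a layers.PReLU() layer after each Conv2D.
for act in ["relu", "tanh", "selu", "elu"]:
    model = build_cnn(act)
    # model.fit(train_images, train_labels, epochs=25,
    #           validation_data=(val_images, val_labels))
```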


1984 ◽  
Vol 49 (4) ◽  
pp. 805-820
Author(s):  
Ján Klas

The accuracy of the least squares method in isotope dilution analysis is studied using two models, viz. a two-parameter straight line and a one-parameter straight line. The equations for the direct and the inverse isotope dilution methods are transformed into linear coordinates, and the intercept and slope of the two-parameter straight line and the slope of the one-parameter straight line are evaluated and treated.
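For concreteness, the two competing fits can be written as ordinary least squares in a few lines of Python; the data below are invented purely to show the mechanics and are not taken from the paper.

```python
# Two-parameter line y = a + b*x versus one-parameter line y = b*x
# (forced through the origin), both fitted by least squares.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])      # illustrative measurements

# Two-parameter fit: solve [1 x] @ [a, b] = y in the least-squares sense.
A2 = np.column_stack([np.ones_like(x), x])
(a, b_two), *_ = np.linalg.lstsq(A2, y, rcond=None)

# One-parameter fit through the origin: b = sum(x*y) / sum(x^2).
b_one = x @ y / (x @ x)

print(f"two-parameter line: intercept={a:.3f}, slope={b_two:.3f}")
print(f"one-parameter line: slope={b_one:.3f}")
```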

