Relating the Slope of the Activation Function and the Learning Rate Within a Recurrent Neural Network

1999 · Vol 11 (5) · pp. 1069-1077 · Danilo P. Mandic, Jonathon A. Chambers

A relationship between the learning rate η in the learning algorithm and the slope β in the nonlinear activation function is provided for a class of recurrent neural networks (RNNs) trained by the real-time recurrent learning (RTRL) algorithm. It is shown that an arbitrary RNN can be obtained from a referent RNN by imposing deterministic rules on its weights and learning rate. Such relationships reduce the number of degrees of freedom in the nonlinear optimization task of finding the optimal RNN parameters.
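For illustration, a minimal numerical sketch of such a rule for a single recurrent neuron trained by RTRL, assuming the scaling relationship known from the feedforward case (referent weights βw, referent learning rate β²η); the toy data and network are illustrative, not taken from the paper:

```python
import numpy as np

def rtrl_run(beta, eta, w_in, w_rec, xs, ds):
    """Train a single recurrent neuron y_t = tanh(beta*(w_in*x_t + w_rec*y_{t-1}))
    online with RTRL; returns the weight trajectory."""
    y, p_in, p_rec = 0.0, 0.0, 0.0           # state and RTRL sensitivities dy/dw
    traj = []
    for x, d in zip(xs, ds):
        net = beta * (w_in * x + w_rec * y)
        y_new = np.tanh(net)
        g = 1.0 - y_new**2                   # tanh'(net)
        # RTRL sensitivity recursions (single-neuron case)
        p_in  = g * beta * (x + w_rec * p_in)
        p_rec = g * beta * (y + w_rec * p_rec)
        e = d - y_new                        # instantaneous error
        w_in  += eta * e * p_in              # gradient step on e^2 / 2
        w_rec += eta * e * p_rec
        y = y_new
        traj.append((w_in, w_rec))
    return np.array(traj)

rng = np.random.default_rng(0)
xs, ds = rng.standard_normal(50), rng.standard_normal(50) * 0.1
beta, eta, w_in, w_rec = 2.5, 0.01, 0.3, -0.2

arbitrary = rtrl_run(beta, eta, w_in, w_rec, xs, ds)
# Referent network: slope 1, weights scaled by beta, learning rate beta^2 * eta.
referent = rtrl_run(1.0, beta**2 * eta, beta * w_in, beta * w_rec, xs, ds)
assert np.allclose(beta * arbitrary, referent)   # identical weight trajectories
```

The assertion passes because the two parameterizations produce identical net inputs, outputs, and (suitably scaled) sensitivities at every step, which is exactly the reduction in degrees of freedom the abstract describes.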

2020 · Vol 34 (04) · pp. 5306-5314 · Takamasa Okudono, Masaki Waga, Taro Sekiyama, Ichiro Hasuo

We present a method to extract a weighted finite automaton (WFA) from a recurrent neural network (RNN). Our method is based on the WFA learning algorithm by Balle and Mohri, which is in turn an extension of Angluin's classic L* algorithm. Our technical novelty is the use of regression methods for the so-called equivalence queries, thus exploiting the internal state space of an RNN to prioritize counterexample candidates. In this way we achieve a quantitative/weighted extension of the recent work by Weiss, Goldberg, and Yahav that extracts DFAs. We experimentally evaluate the accuracy, expressivity, and efficiency of the extracted WFAs.
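For background, a short sketch of the object being extracted: a WFA assigns each string a real-valued weight via an initial vector, one transition matrix per input symbol, and a final vector (the standard definition; the automaton below is a toy example, not from the paper):

```python
import numpy as np

class WFA:
    """Weighted finite automaton: weight(s) = alpha . A[s1] ... A[sn] . beta."""
    def __init__(self, alpha, transitions, beta):
        self.alpha = alpha            # initial weight vector, shape (n,)
        self.A = transitions          # dict: symbol -> (n, n) transition matrix
        self.beta = beta              # final weight vector, shape (n,)

    def weight(self, string):
        v = self.alpha
        for sym in string:
            v = v @ self.A[sym]
        return float(v @ self.beta)

# A 2-state WFA over {'a', 'b'} that counts occurrences of 'a'.
wfa = WFA(
    alpha=np.array([1.0, 0.0]),
    transitions={
        'a': np.array([[1.0, 1.0], [0.0, 1.0]]),   # increments the counter state
        'b': np.array([[1.0, 0.0], [0.0, 1.0]]),   # identity
    },
    beta=np.array([0.0, 1.0]),
)
print(wfa.weight("abab"))   # 2.0
```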


2002 · Vol 14 (9) · pp. 2043-2051 · Boonyanit Mathayomchan, Randall D. Beer

A center-crossing recurrent neural network is one in which the null (hyper)surfaces of each neuron intersect at their exact centers of symmetry, ensuring that each neuron's activation function is centered over the range of net inputs it receives. We demonstrate that, relative to a random initial population, seeding the initial population of an evolutionary search with center-crossing networks significantly improves both the frequency and the speed with which high-fitness oscillatory circuits evolve on a simple walking task. The improvement is especially striking at low mutation variances. Our results suggest that seeding with center-crossing networks may often be beneficial, since a wider range of dynamics is more likely to be easily accessible from a population of center-crossing networks than from a population of random networks.
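For illustration, a sketch of how such a seed population might be generated, assuming the usual sigmoid CTRNN formulation in which the center-crossing condition fixes each bias as θ_i = -Σ_j w_ji / 2; ranges and sizes are illustrative:

```python
import numpy as np

def center_crossing_population(pop_size, n_neurons, weight_range=5.0, seed=0):
    """Generate CTRNN parameter sets whose biases satisfy the center-crossing
    condition theta_i = -sum_j w_ji / 2 (sigmoid with output in (0, 1))."""
    rng = np.random.default_rng(seed)
    population = []
    for _ in range(pop_size):
        # w[j, i] = connection weight from neuron j to neuron i
        w = rng.uniform(-weight_range, weight_range, size=(n_neurons, n_neurons))
        theta = -w.sum(axis=0) / 2.0      # centers each activation function
        population.append({"weights": w, "biases": theta})
    return population

seed_pop = center_crossing_population(pop_size=100, n_neurons=3)
```

Because each neuron's net input lies between the sum of its negative and positive incoming weights, this choice of bias puts the sigmoid's midpoint at the middle of that range.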


Filomat · 2020 · Vol 34 (15) · pp. 5009-5018 · Lei Ding, Lin Xiao, Kaiqing Zhou, Yonghong Lan, Yongsheng Zhang

Compared to a linear activation function, a suitable nonlinear activation function can accelerate convergence. Based on this finding, we propose two modified Zhang neural network (ZNN) models that use different nonlinear activation functions to tackle complex-valued systems of linear equations (CVSLE) in this paper. We first propose a model, NRNN-SBP, obtained by introducing the sign-bi-power activation function, and then a second model, NRNN-IRN, obtained by introducing a tunable activation function. Simulation results demonstrate that NRNN-SBP and NRNN-IRN converge faster than the FTRNN model. These results also reveal that different nonlinear activation functions have different effects on the convergence rate for different CVSLE problems.
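For illustration, a sketch of the general ZNN recipe under one common form of the sign-bi-power function; the design formula ė(t) = -γΦ(e(t)) with e(t) = Av(t) - b is standard, but the constants and the elementwise handling of complex values below are assumptions, not the paper's exact choices:

```python
import numpy as np

def sbp(e, r=0.5):
    """One common form of the sign-bi-power activation, applied elementwise
    to the real and imaginary parts (an assumption, not the paper's choice)."""
    def f(x):
        return 0.5 * (np.abs(x)**r + np.abs(x)**(1.0 / r)) * np.sign(x)
    return f(e.real) + 1j * f(e.imag)

# Complex-valued linear system A v = b.
A = np.array([[2 + 1j, 1 - 1j], [0 + 1j, 3 + 0j]])
b = np.array([1 + 2j, 2 - 1j])

gamma, dt = 10.0, 1e-3
v = np.zeros(2, dtype=complex)                 # neural state
A_inv = np.linalg.inv(A)
for _ in range(2000):                          # Euler integration of the ZNN ODE
    e = A @ v - b                              # error function e(t)
    v = v + dt * (-gamma * (A_inv @ sbp(e)))   # from  A v' = -gamma * Phi(e)
print(np.linalg.norm(A @ v - b))               # residual shrinks rapidly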


2019 · Vol 9 (16) · pp. 3391 · Santiago Pascual, Joan Serrà, Antonio Bonafonte

Conversion from text to speech relies on the accurate mapping from linguistic to acoustic symbol sequences, for which current practice employs recurrent statistical models such as recurrent neural networks. Despite the good performance of such models (in terms of low distortion in the generated speech), their recursive structure with intermediate affine transformations tends to make them slow to train and to sample from. In this work, we explore two different mechanisms that enhance the operational efficiency of recurrent neural networks, and study their performance-speed trade-off. The first mechanism is based on the quasi-recurrent neural network, where expensive affine transformations are removed from temporal connections and placed only on feed-forward computational directions. The second mechanism includes a module based on the transformer decoder network, designed without recurrent connections but emulating them with attention and positioning codes. Our results show that the proposed decoder networks are competitive in terms of distortion when compared to a recurrent baseline, whilst being significantly faster in terms of CPU and GPU inference time. The best-performing model is the one based on the quasi-recurrent mechanism, reaching the same level of naturalness as the recurrent neural network based model with a speedup of 11.2× on CPU and 3.3× on GPU.
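For illustration, a sketch of quasi-recurrent ("f-") pooling as in the quasi-recurrent neural network: the affine transformations are computed in parallel across time, and the only sequential work is an elementwise recurrence. Shapes and initialization are illustrative:

```python
import numpy as np

def qrnn_layer(x, Wz, Wf, width=2):
    """Quasi-recurrent layer: affine transforms are computed in parallel over
    time (a masked convolution over the last `width` inputs); the temporal loop
    is elementwise only: h_t = f_t * h_{t-1} + (1 - f_t) * z_t."""
    T, d_in = x.shape
    d_out = Wz.shape[1]
    # Parallel part: gate pre-activations from a causal window of inputs.
    xw = np.stack([np.concatenate([np.zeros((k, d_in)), x[:T - k]])
                   for k in range(width)], axis=1).reshape(T, width * d_in)
    z = np.tanh(xw @ Wz)                      # candidate values, (T, d_out)
    f = 1.0 / (1.0 + np.exp(-(xw @ Wf)))      # forget gates,    (T, d_out)
    # Sequential part: cheap elementwise recurrence, no per-step matrix product.
    h, hs = np.zeros(d_out), []
    for t in range(T):
        h = f[t] * h + (1.0 - f[t]) * z[t]
        hs.append(h)
    return np.array(hs)

rng = np.random.default_rng(0)
x = rng.standard_normal((20, 8))              # (time, features)
Wz = rng.standard_normal((16, 4)) * 0.1       # width * d_in = 16 input dims
Wf = rng.standard_normal((16, 4)) * 0.1
print(qrnn_layer(x, Wz, Wf).shape)            # (20, 4)
```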


2009 · Vol 21 (11) · pp. 3214-3227 · James Ting-Ho Lo

By a fundamental neural filtering theorem, a recurrent neural network with fixed weights is known to be capable of adapting to an uncertain environment. This letter reports some mathematical results on the performance of such adaptation for series-parallel identification of a dynamical system, as compared with the performance of the best series-parallel identifier possible under the assumption that the precise value of the uncertain environmental process is given. In short, if an uncertain environmental process is either observable from the output of a dynamical system (not necessarily constant) or constant (not necessarily observable), then there exists a recurrent neural network acting as a series-parallel identifier of the dynamical system whose output approaches the output of an optimal series-parallel identifier that uses the environmental process as an additional input.
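For orientation, a toy sketch of the series-parallel configuration the theorem concerns: the identifier regresses the next output on the plant's measured output and input (a parallel identifier would feed back its own prediction instead). The plant, the linear stand-in for the recurrent network, and the constant environmental parameter are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def plant(y_prev, u, c):
    """Toy dynamical system with an uncertain environmental parameter c."""
    return 0.6 * np.tanh(y_prev) + 0.3 * u + 0.2 * c

T = 200
u = rng.standard_normal(T)
c = 0.8                                    # constant, unobserved environment
y = np.zeros(T)
for t in range(1, T):
    y[t] = plant(y[t - 1], u[t - 1], c)

# Series-parallel identification: predict y_{t+1} from the *measured* y_t, u_t.
# A linear-in-features fit stands in for the recurrent network here.
X = np.column_stack([np.tanh(y[:-1]), u[:-1], np.ones(T - 1)])
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
y_hat = X @ theta
print(np.max(np.abs(y_hat - y[1:])))       # near zero: the constant c is absorbed
```

The constant environmental parameter ends up absorbed into the identifier's bias term, which mirrors the theorem's claim that the identifier can match an optimal one given c explicitly.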


2021 · Vol 27 (11) · pp. 1193-1202 · Ashot Baghdasaryan, Hovhannes Bolibekyan

There are three main problems for theorem proving with a standard cut-free system for first-order minimal logic. The first is the possibility of looping. The second is that the system may generate proofs that are mere permutations of one another. Finally, during proof search, choices must be made about which rules to apply and where to apply them. New systems with history mechanisms were introduced to solve the looping problem of automated theorem provers in first-order minimal logic. To address the rule-selection problem, recurrent neural networks are deployed to determine which formula from the context should be used in subsequent steps. As a result, theorem-proving time is reduced.
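For illustration, a sketch of the rule-selection idea: score each formula in the current context with a small recurrent encoder and take the highest-scoring one. The tokenization, architecture, and (untrained) weights below are placeholders, not the authors' system:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {ch: i for i, ch in enumerate("abpqr()->&~ ")}   # toy formula alphabet
D = 16
E  = rng.standard_normal((len(VOCAB), D)) * 0.1          # token embeddings
Wh = rng.standard_normal((D, D)) * 0.1                   # recurrent weights
Wx = rng.standard_normal((D, D)) * 0.1                   # input weights
w  = rng.standard_normal(D) * 0.1                        # scoring head

def encode(formula):
    """Run a plain RNN over the formula's characters; return the final state."""
    h = np.zeros(D)
    for ch in formula:
        h = np.tanh(h @ Wh + E[VOCAB[ch]] @ Wx)
    return h

def select_formula(context):
    """Pick the context formula that the (ideally trained) scorer ranks highest."""
    scores = [float(encode(f) @ w) for f in context]
    return context[int(np.argmax(scores))]

context = ["p -> q", "q -> r", "p"]
print(select_formula(context))   # with trained weights: the most useful premise
```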


2021 · Vol 2021 · pp. 1-11 · Yumin Dong, Xiang Li, Wei Liao, Dong Hou

In this paper, a quantum neural network with a multilayer activation function is proposed, built from a superposition of sigmoid functions together with a learning algorithm that adjusts the quantum intervals. On this basis, the quasi-uniform stability of fractional-order quantum neural networks with mixed delays is studied. For two different cases of the fractional order, conditions for quasi-uniform stability of the networks are derived using linear matrix inequality techniques, and the sufficiency of these conditions is proved. Finally, the feasibility of the conclusions is verified by experiments.
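For illustration, a sketch of a multilevel activation built by superposing shifted sigmoids; the number of levels, the steepness β, and the jump positions θ_k are illustrative (in the paper the quantum intervals are adjusted by the learning algorithm):

```python
import numpy as np

def multilayer_sigmoid(x, thetas, beta=5.0):
    """Superposition of shifted sigmoids: a staircase-like multilevel activation.
    thetas are the jump positions ("quantum intervals"), fixed by hand here;
    in the paper they are adapted by the learning algorithm."""
    x = np.asarray(x, dtype=float)
    levels = [1.0 / (1.0 + np.exp(-beta * (x - th))) for th in thetas]
    return sum(levels) / len(thetas)      # output saturates near k/len(thetas)

xs = np.linspace(-3, 3, 7)
print(multilayer_sigmoid(xs, thetas=[-1.0, 0.0, 1.0]))
```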


2020 · Vol 63 (5) · pp. 1327-1348 · Andrés F. Jiménez, Brenda V. Ortiz, Luca Bondesan, Guilherme Morata, Damianos Damianidis

Highlights:
- NARX and LSTM recurrent neural networks were evaluated for prediction of irrigation prescriptions.
- LSTM neural networks presented the best performance for irrigation scheduling using soil matric potential sensors.
- NARX neural networks had the best performance for predicting irrigation prescriptions using weather data.
- Both recurrent neural networks achieved high performance for several time-ahead predictions, with R² > 0.94.
- The results can be adopted as a decision-support tool for irrigation scheduling in fields with different soil types.

Abstract. Adequate irrigation strategies can be implemented through real-time monitoring of soil water status at several soil depths; however, this represents a complex nonlinear problem due to plant-soil-weather relationships. In this study, two recurrent neural network (RNN) models were evaluated to estimate irrigation prescriptions. Data for this study were collected from an on-farm corn irrigation study conducted between 2017 and 2019 in Samson, Alabama. The study used hourly weather data and soil matric potential (SMP) monitored at three soil depths by 13 sensor probes installed in a loamy fine sand soil and a sandy clay loam soil. Two neural network methods, a nonlinear autoregressive with exogenous input (NARX) network and long short-term memory (LSTM), were trained, validated, and tested with a maximum dataset of 20,052 records and a maximum of eight categorical attributes to estimate one-step irrigation prescriptions. The performance of both methods was evaluated by varying the model development parameters (neurons or blocks, dropout, and epochs) and determining their impact on the final model prediction. Results showed that both RNN models demonstrated good capability in predicting irrigation prescriptions for the soil types studied, with a coefficient of determination (R²) > 0.94 and root mean square error (RMSE) < 1.2 mm. After training the RNNs on the dataset collected in the field, models using only SMP sensors at three soil depths obtained the best performance, followed by models that used only solar radiation, temperature, and relative humidity data. For future applicability, the RNN models can be extended with datasets from other locations for training, which would allow the adoption of a single data-driven soil moisture model for irrigation scheduling across a wide range of soil types.

Keywords: Corn, Irrigation scheduling, Machine learning, Modeling, Soil matric potential sensor.
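For illustration, a sketch of the NARX-style formulation behind such one-step predictions: the next irrigation prescription is regressed on lagged exogenous inputs (e.g., SMP at three depths) and lagged outputs. Feature names, lags, the synthetic data, and the linear stand-in for the network are all illustrative:

```python
import numpy as np

def make_narx_features(smp, irrig, lags=3):
    """Build NARX regressors: y_{t+1} ~ f(y_t..y_{t-lags+1}, u_t..u_{t-lags+1}),
    where u_t holds soil matric potential at three depths and y_t is the
    irrigation prescription (names and lags are illustrative)."""
    X, y = [], []
    for t in range(lags - 1, len(irrig) - 1):
        u_hist = smp[t - lags + 1: t + 1].ravel()      # lagged exogenous inputs
        y_hist = irrig[t - lags + 1: t + 1]            # lagged outputs
        X.append(np.concatenate([u_hist, y_hist]))
        y.append(irrig[t + 1])                         # one-step-ahead target
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
T = 500
smp = rng.uniform(-60, -10, size=(T, 3))               # toy SMP (kPa) at 3 depths
irrig = np.clip(-0.05 * smp.mean(axis=1) - 1.0
                + 0.1 * rng.standard_normal(T), 0, None)

X, y = make_narx_features(smp, irrig)
# A linear least-squares fit stands in for the NARX network's nonlinear map f.
theta, *_ = np.linalg.lstsq(np.column_stack([X, np.ones(len(X))]), y, rcond=None)
```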


1999 · Vol 121 (4) · pp. 724-729 · C. James Li, Yimin Fan

This paper describes a method to diagnose the most frequent faults of a screw compressor and to assess the magnitude of these faults by tracking changes in the compressor's dynamics. To determine the condition of the compressor, a feedforward neural network model is first employed to identify the compressor's dynamics. A recurrent neural network then classifies the identified model into one of three conditions: baseline, gaterotor wear, or excessive friction. Finally, another recurrent neural network estimates the magnitude of the fault from the model. The method's ability to generalize was evaluated, and experimental validation was performed. The results show significant improvement over a previous method that used only feedforward neural networks.
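For illustration, a sketch of the shape of this three-stage pipeline, with simple stand-ins (an ARX fit, nearest-prototype classification, and a norm-based magnitude estimate) in place of the paper's neural networks:

```python
import numpy as np

def identify_dynamics(u, y, order=4):
    """Stage 1 stand-in: fit an ARX model of the compressor's dynamics; the
    coefficient vector plays the role of the identified neural model."""
    X = np.column_stack([y[k:len(y) - order + k] for k in range(order)] +
                        [u[k:len(u) - order + k] for k in range(order)])
    theta, *_ = np.linalg.lstsq(X, y[order:], rcond=None)
    return theta

def classify_condition(theta, prototypes):
    """Stage 2 stand-in: nearest-prototype classification of the identified
    model into baseline, gaterotor wear, or excessive friction."""
    return min(prototypes, key=lambda n: np.linalg.norm(theta - prototypes[n]))

def estimate_magnitude(theta, baseline_theta):
    """Stage 3 stand-in: fault magnitude as deviation from baseline dynamics."""
    return float(np.linalg.norm(theta - baseline_theta))

rng = np.random.default_rng(0)
u = rng.standard_normal(300)
y = np.convolve(u, [0.5, 0.3, 0.1], mode="same")       # toy "compressor" response
theta = identify_dynamics(u, y)
prototypes = {"baseline": theta, "gaterotor wear": theta + 0.2,
              "excessive friction": theta - 0.3}
print(classify_condition(theta, prototypes))            # -> "baseline"
print(estimate_magnitude(theta, prototypes["baseline"]))  # -> 0.0
```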


2021 · Vol 32 (4) · pp. 65-82 · Shengfei Lyu, Jiaqi Liu

Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are two prevailing architectures for text classification. Traditional approaches combine the strengths of the two networks by directly cascading them or by concatenating the features they extract. In this article, a novel approach is proposed to retain the strengths of both to a greater extent. In the proposed approach, a bi-directional RNN encodes each word into forward and backward hidden states. A neural tensor layer then fuses the bi-directional hidden states to obtain word representations. Meanwhile, a convolutional neural network learns the importance of each word for text classification. Experiments are conducted on several text classification datasets; the superior performance of the proposed approach confirms its effectiveness.
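For illustration, a sketch of the fusion step, assuming a neural tensor layer in the style of Socher et al.'s neural tensor network: v = tanh(h_fᵀ T h_b + W [h_f; h_b] + b). Dimensions and initialization are illustrative:

```python
import numpy as np

def neural_tensor_fusion(h_f, h_b, T, W, b):
    """Fuse forward/backward hidden states h_f, h_b (each of size d) into a
    k-dim word representation: v_k = tanh(h_f^T T[k] h_b + W [h_f; h_b] + b)."""
    bilinear = np.einsum('i,kij,j->k', h_f, T, h_b)   # (k,) bilinear term
    linear = W @ np.concatenate([h_f, h_b]) + b       # (k,) standard affine term
    return np.tanh(bilinear + linear)

d, k = 8, 6
rng = np.random.default_rng(0)
h_f, h_b = rng.standard_normal(d), rng.standard_normal(d)
T = rng.standard_normal((k, d, d)) * 0.1              # k bilinear slices
W = rng.standard_normal((k, 2 * d)) * 0.1
b = np.zeros(k)
print(neural_tensor_fusion(h_f, h_b, T, W, b).shape)  # (6,)
```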

