EXPERIMENTAL COMPARISON OF THE EFFECT OF ORDER IN RECURRENT NEURAL NETWORKS

1993 ◽

Vol 07 (04) ◽

pp. 849-872 ◽

Cited By ~ 30

Author(s):

CLIFFORD B. MILLER ◽

C. LEE GILES

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Internal State ◽

Second Order ◽

Convergence Time ◽

Experimental Comparison ◽

Grammatical Inference ◽

Neural Net ◽

First Order ◽

Finite State

There has been much interest in increasing the computational power of neural networks. In addition, there has been much interest in “designing” neural networks better suited to particular problems. Increasing the “order” of the connectivity of a neural network permits both. Though order has played a significant role in feedforward neural networks, its role in dynamically driven recurrent networks is still being understood. This work explores the effect of order in learning grammars. We present an experimental comparison of first order and second order recurrent neural networks, as applied to the task of grammatical inference. We show that for the small grammars studied these two neural net architectures have comparable learning and generalization power, and that both are reasonably capable of extracting the correct finite state automata for the language in question. However, for a larger randomly-generated ten-state grammar, second order networks significantly outperformed first order networks, both in convergence time and generalization capability. We show that these networks learn faster the more neurons they have (our experiments used up to 10 hidden neurons), but that the solutions found by smaller networks are usually of better quality (in terms of generalization performance after training). Second order nets have the advantage that they converge more quickly to a solution and find it more reliably than first order nets, but the second order solutions tend to be of poorer quality than the first order ones when both architectures are trained to the same error tolerance. Despite this, second order nets can more successfully extract finite state machines using heuristic clustering techniques applied to the internal state representations. We speculate that this may be due to restrictions on the ability of the first order architecture to make full use of its internal state representation power, and that this may have implications for the performance of the two architectures when scaled up to larger problems.
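
To make the notion of “order” concrete, the following is a minimal sketch of one state update for each architecture. It assumes the commonly used formulations, in which a first order net sums weighted states and weighted inputs, while a second order net weights products of state and input units. The function names, weight layouts, and the sigmoid activation are illustrative assumptions, not details taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic activation (assumed here; the paper's choice is not stated)."""
    return 1.0 / (1.0 + math.exp(-x))


def first_order_step(state, inp, w_state, w_input):
    """Assumed first order update:
    S_i' = g(sum_j w_state[i][j] * S_j + sum_k w_input[i][k] * I_k)
    """
    return [
        sigmoid(
            sum(w_state[i][j] * state[j] for j in range(len(state)))
            + sum(w_input[i][k] * inp[k] for k in range(len(inp)))
        )
        for i in range(len(w_state))
    ]


def second_order_step(state, inp, w):
    """Assumed second order update:
    S_i' = g(sum_j sum_k w[i][j][k] * S_j * I_k)
    """
    return [
        sigmoid(
            sum(
                w[i][j][k] * state[j] * inp[k]
                for j in range(len(state))
                for k in range(len(inp))
            )
        )
        for i in range(len(w))
    ]


# Hypothetical two-neuron network reading a one-hot encoded symbol.
state = [1.0, 0.0]
symbol = [0.0, 1.0]
weights = [
    [[0.0, 2.0], [0.0, 0.0]],
    [[0.0, -2.0], [0.0, 0.0]],
]
next_state = second_order_step(state, symbol, weights)
```

With a one-hot input, the second order form effectively selects a separate state-to-state weight matrix per input symbol, which resembles the transition table of a finite state automaton; this is one plausible reading of why state machine extraction might be easier for that architecture.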

Spike timing-dependent plasticity in sparse recurrent neural networks

IEICE Proceeding Series ◽

10.15248/proc.1.485 ◽

2014 ◽

Vol 1 ◽

pp. 485-488

Author(s):

Hideyuki Kato ◽

Tohru Ikeguchi

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Spike Timing ◽

Spike Timing Dependent Plasticity ◽

Dependent Plasticity

Direct Adaptive Control of Process Systems Using Recurrent Neural Networks

1992 American Control Conference ◽

10.23919/acc.1992.4792020 ◽

1992 ◽

Author(s):

Sanjay Parthasarathy ◽

Alexander G. Parlos ◽

Amir F. Atiya

Keyword(s):

Neural Networks ◽

Adaptive Control ◽

Recurrent Neural Networks ◽

Process Systems ◽

Direct Adaptive Control

L2 approximation properties of recurrent neural networks

1997 European Control Conference (ECC) ◽

10.23919/ecc.1997.7082360 ◽

1997 ◽

Cited By ~ 1

Author(s):

A. Ruiz ◽

D.H. Owens ◽

S. Townley

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Approximation Properties

Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction

10.26434/chemrxiv.12562121 ◽

2020 ◽

Author(s):

Dean Sumner ◽

Jiazhen He ◽

Amol Thakkar ◽

Ola Engkvist ◽

Esben Jannik Bjerrum

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Deep Learning ◽

Recurrent Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Sequence Similarity ◽

Learning Models ◽

Underlying Network

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. When used to train the baseline models, Levenshtein augmentation demonstrated increased performance over both non-augmented data and data augmented with conventional SMILES randomization. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain: an enhancement in the pattern recognition capabilities of the underlying network with respect to molecular motifs.
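
As a rough illustration of the string-similarity measure the method is named after, the sketch below computes the Levenshtein (edit) distance between two strings and then picks, from a list of candidate strings, the one closest to a target. The abstract does not specify how training pairs are actually built, so `closest_variant` is a hypothetical helper, and the SMILES-like strings are placeholders rather than data from the paper.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute, or match at no cost
                )
            )
        prev = curr
    return prev[-1]


def closest_variant(candidates, target):
    """Hypothetical helper: candidate string with the smallest edit distance."""
    return min(candidates, key=lambda s: levenshtein(s, target))


# Placeholder strings standing in for alternative spellings of a reactant
# and for a product string; not taken from the paper.
reactant_variants = ["C(O)C", "OCC"]
product = "CCO"
best = closest_variant(reactant_variants, product)
```

A whole-string distance is used here only for brevity; the abstract refers to local sub-sequence similarity, which a faithful implementation would need to handle differently.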
