Multi-Zone Unit for Recurrent Neural Networks

Recurrent neural networks (RNNs) have been widely used to deal with sequence learning problems. The input-dependent transition function, which folds new observations into hidden states to sequentially construct fixed-length representations of arbitrary-length sequences, plays a critical role in RNNs. Based on single space composition, transition functions in existing RNNs often have difficulty in capturing complicated long-range dependencies. In this paper, we introduce a new Multi-zone Unit (MZU) for RNNs. The key idea is to design a transition function that is capable of modeling multiple space composition. The MZU consists of three components: zone generation, zone composition, and zone aggregation. Experimental results on multiple datasets of the character-level language modeling task and the aspect-based sentiment analysis task demonstrate the superiority of the MZU.

Download Full-text

A New Hyper-Parameter Optimization Method for Power Load Forecast Based on Recurrent Neural Networks

Algorithms ◽

10.3390/a14060163 ◽

2021 ◽

Vol 14 (6) ◽

pp. 163

Author(s):

Yaru Li ◽

Yulai Zhang ◽

Yongping Cai

Keyword(s):

Neural Networks ◽

Parameter Optimization ◽

Recurrent Neural Networks ◽

Critical Role ◽

Optimization Method ◽

Bayesian Optimization ◽

Power Load ◽

Gradient Calculation ◽

Truncated Newton ◽

Selection Of

The selection of the hyper-parameters plays a critical role in the task of prediction based on the recurrent neural networks (RNN). Traditionally, the hyper-parameters of the machine learning models are selected by simulations as well as human experiences. In recent years, multiple algorithms based on Bayesian optimization (BO) are developed to determine the optimal values of the hyper-parameters. In most of these methods, gradients are required to be calculated. In this work, the particle swarm optimization (PSO) is used under the BO framework to develop a new method for hyper-parameter optimization. The proposed algorithm (BO-PSO) is free of gradient calculation and the particles can be optimized in parallel naturally. So the computational complexity can be effectively reduced which means better hyper-parameters can be obtained under the same amount of calculation. Experiments are done on real world power load data,where the proposed method outperforms the existing state-of-the-art algorithms,BO with limit-BFGS-bound (BO-L-BFGS-B) and BO with truncated-newton (BO-TNC),in terms of the prediction accuracy. The errors of the prediction result in different models show that BO-PSO is an effective hyper-parameter optimization method.

Download Full-text

Recurrent Neural Networks with Small Weights Implement Definite Memory Machines

Neural Computation ◽

10.1162/08997660360675080 ◽

2003 ◽

Vol 15 (8) ◽

pp. 1897-1929 ◽

Cited By ~ 32

Author(s):

Barbara Hammer ◽

Peter Tiňo

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Experimental Studies ◽

Activation Function ◽

Transition Function ◽

Recurrent Network ◽

Point Of View ◽

Recurrent Networks ◽

Contraction Parameter ◽

Arbitrary Precision

Recent experimental studies indicate that recurrent neural networks initialized with “small” weights are inherently biased toward definite memory machines (Tiňno, Čerňanský, & Beňušková, 2002a, 2002b). This article establishes a theoretical counterpart: transition function of recurrent network with small weights and squashing activation function is a contraction. We prove that recurrent networks with contractive transition function can be approximated arbitrarily well on input sequences of unbounded length by a definite memory machine. Conversely, every definite memory machine can be simulated by a recurrent network with contractive transition function. Hence, initialization with small weights induces an architectural bias into learning with recurrent neural networks. This bias might have benefits from the point of view of statistical learning theory: it emphasizes one possible region of the weight space where generalization ability can be formally proved. It is well known that standard recurrent neural networks are not distribution independent learnable in the probably approximately correct (PAC) sense if arbitrary precision and inputs are considered. We prove that recurrent networks with contractive transition function with a fixed contraction parameter fulfill the so-called distribution independent uniform convergence of empirical distances property and hence, unlike general recurrent networks, are distribution independent PAC learnable.

Download Full-text

Evaluation of Recurrent Neural Network and its Variants for Intrusion Detection System (IDS)

International Journal of Information System Modeling and Design ◽

10.4018/ijismd.2017070103 ◽

2017 ◽

Vol 8 (3) ◽

pp. 43-63 ◽

Cited By ~ 17

Author(s):

R Vinayakumar ◽

K.P. Soman ◽

Prabaharan Poornachandran

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Intrusion Detection ◽

Recurrent Neural Networks ◽

Machine Learning Algorithms ◽

Time Range ◽

Sequence Information ◽

Time Lags ◽

Data Set ◽

Arbitrary Length

This article describes how sequential data modeling is a relevant task in Cybersecurity. Sequences are attributed temporal characteristics either explicitly or implicitly. Recurrent neural networks (RNNs) are a subset of artificial neural networks (ANNs) which have appeared as a powerful, principle approach to learn dynamic temporal behaviors in an arbitrary length of large-scale sequence data. Furthermore, stacked recurrent neural networks (S-RNNs) have the potential to learn complex temporal behaviors quickly, including sparse representations. To leverage this, the authors model network traffic as a time series, particularly transmission control protocol / internet protocol (TCP/IP) packets in a predefined time range with a supervised learning method, using millions of known good and bad network connections. To find out the best architecture, the authors complete a comprehensive review of various RNN architectures with its network parameters and network structures. Ideally, as a test bed, they use the existing benchmark Defense Advanced Research Projects Agency / Knowledge Discovery and Data Mining (DARPA) / (KDD) Cup ‘99' intrusion detection (ID) contest data set to show the efficacy of these various RNN architectures. All the experiments of deep learning architectures are run up to 1000 epochs with a learning rate in the range [0.01-0.5] on a GPU-enabled TensorFlow and experiments of traditional machine learning algorithms are done using Scikit-learn. Experiments of families of RNN architecture achieved a low false positive rate in comparison to the traditional machine learning classifiers. The primary reason is that RNN architectures are able to store information for long-term dependencies over time-lags and to adjust with successive connection sequence information. In addition, the effectiveness of RNN architectures are shown for the UNSW-NB15 data set.

Download Full-text

Inner-process visualization of hidden states in recurrent neural networks

Proceedings of the 13th International Symposium on Visual Information Communication and Interaction ◽

10.1145/3430036.3430047 ◽

2020 ◽

Author(s):

Rafael Garcia ◽

Daniel Weiskopf

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Process Visualization ◽

Hidden States

Download Full-text

Local stability conditions for discrete-time cascade locally recurrent neural networks

International Journal of Applied Mathematics and Computer Science ◽

10.2478/v10006-010-0002-x ◽

2010 ◽

Vol 20 (1) ◽

pp. 23-34 ◽

Cited By ~ 4

Author(s):

Krzysztof Patan

Keyword(s):

Neural Networks ◽

Discrete Time ◽

Recurrent Neural Networks ◽

Local Stability ◽

Learning Problems ◽

Stability Conditions ◽

Neuron Models ◽

Optimization Task ◽

Dynamic Type ◽

Locally Recurrent

Local stability conditions for discrete-time cascade locally recurrent neural networksThe paper deals with a specific kind of discrete-time recurrent neural network designed with dynamic neuron models. Dynamics are reproduced within each single neuron, hence the network considered is a locally recurrent globally feedforward. A crucial problem with neural networks of the dynamic type is stability as well as stabilization in learning problems. The paper formulates local stability conditions for the analysed class of neural networks using Lyapunov's first method. Moreover, a stabilization problem is defined and solved as a constrained optimization task. In order to tackle this problem, a gradient projection method is adopted. The efficiency and usefulness of the proposed approach are justified by using a number of experiments.

Download Full-text

CommNets: Communicating Neural Network Architectures for Resource Constrained Systems

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019909 ◽

2019 ◽

Vol 33 ◽

pp. 9909-9910

Author(s):

Prince M Abudu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Recurrent Neural Networks ◽

Network Architecture ◽

Resource Constraints ◽

Constrained Systems ◽

Detection Accuracy ◽

Network Architectures ◽

Operating Environments ◽

Hidden States

Applications that require heterogeneous sensor deployments continue to face practical challenges owing to resource constraints within their operating environments (i.e. energy efficiency, computational power and reliability). This has motivated the need for effective ways of selecting a sensing strategy that maximizes detection accuracy for events of interest using available resources and data-driven approaches. Inspired by those limitations, we ask a fundamental question: whether state-of-the-art Recurrent Neural Networks can observe different series of data and communicate their hidden states to collectively solve an objective in a distributed fashion. We realize our answer by conducting a series of systematic analyses of a Communicating Recurrent Neural Network architecture on varying time-steps, objective functions and number of nodes. The experimental setup we employ models tasks synonymous with those in Wireless Sensor Networks. Our contributions show that Recurrent Neural Networks can communicate through their hidden states and we achieve promising results.

Download Full-text

Convolutional Recurrent Neural Networks for Text Classification

Journal of Database Management ◽

10.4018/jdm.2021100105 ◽

2021 ◽

Vol 32 (4) ◽

pp. 65-82

Author(s):

Shengfei Lyu ◽

Jiaqi Liu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Recurrent Neural Network ◽

Text Classification ◽

Recurrent Neural Networks ◽

Superior Performance ◽

Novel Approach ◽

Hidden States ◽

Traditional Approaches

Recurrent neural network (RNN) and convolutional neural network (CNN) are two prevailing architectures used in text classification. Traditional approaches combine the strengths of these two networks by straightly streamlining them or linking features extracted from them. In this article, a novel approach is proposed to maintain the strengths of RNN and CNN to a great extent. In the proposed approach, a bi-directional RNN encodes each word into forward and backward hidden states. Then, a neural tensor layer is used to fuse bi-directional hidden states to get word representations. Meanwhile, a convolutional neural network is utilized to learn the importance of each word for text classification. Empirical experiments are conducted on several datasets for text classification. The superior performance of the proposed approach confirms its effectiveness.

Download Full-text

Visual analytics tool for the interpretation of hidden states in recurrent neural networks

Visual Computing for Industry Biomedicine and Art ◽

10.1186/s42492-021-00090-0 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Rafael Garcia ◽

Tanja Munz ◽

Daniel Weiskopf

Keyword(s):

Neural Networks ◽

Language Processing ◽

Recurrent Neural Networks ◽

Visual Analytics ◽

Short Term Memory ◽

Input Sequence ◽

Color Coding ◽

Nonlinear Projection ◽

Brushing And Linking ◽

Hidden States

AbstractIn this paper, we introduce a visual analytics approach aimed at helping machine learning experts analyze the hidden states of layers in recurrent neural networks. Our technique allows the user to interactively inspect how hidden states store and process information throughout the feeding of an input sequence into the network. The technique can help answer questions, such as which parts of the input data have a higher impact on the prediction and how the model correlates each hidden state configuration with a certain output. Our visual analytics approach comprises several components: First, our input visualization shows the input sequence and how it relates to the output (using color coding). In addition, hidden states are visualized through a nonlinear projection into a 2-D visualization space using t-distributed stochastic neighbor embedding to understand the shape of the space of the hidden states. Trajectories are also employed to show the details of the evolution of the hidden state configurations. Finally, a time-multi-class heatmap matrix visualizes the evolution of the expected predictions for multi-class classifiers, and a histogram indicates the distances between the hidden states within the original space. The different visualizations are shown simultaneously in multiple views and support brushing-and-linking to facilitate the analysis of the classifications and debugging for misclassified input sequences. To demonstrate the capability of our approach, we discuss two typical use cases for long short-term memory models applied to two widely used natural language processing datasets.

Download Full-text

Solving Partially Observable Reinforcement Learning Problems with Recurrent Neural Networks

Lecture Notes in Computer Science - Neural Networks: Tricks of the Trade ◽

10.1007/978-3-642-35289-8_38 ◽

2012 ◽

pp. 709-733 ◽

Cited By ~ 3

Author(s):

Siegmund Duell ◽

Steffen Udluft ◽

Volkmar Sterzing

Keyword(s):

Neural Networks ◽

Reinforcement Learning ◽

Recurrent Neural Networks ◽

Learning Problems ◽

Partially Observable

Download Full-text

Electricity Price Forecasting Using Recurrent Neural Networks

10.20944/preprints201804.0286.v1 ◽

2018 ◽

Author(s):

Umut Ugurlu ◽

Ilkay Oksuz ◽

Oktay Tas

Keyword(s):

Neural Networks ◽

Electricity Markets ◽

Recurrent Neural Networks ◽

Short Term Memory ◽

Electricity Price ◽

Learning Problems ◽

Price Forecasting ◽

Electricity Price Forecasting ◽

Gated Recurrent Units ◽

Dependent Learning

Accurate electricity price forecasting has become a substantial requirement since the liberalization of the electricity markets. Due to the challenging nature of the electricity prices, which includes high volatility, sharp price spikes and seasonality, various types of electricity price forecasting models still compete and can not outperform each other consistently. Neural Networks have been successfully used in machine learning problems and Recurrent Neural Networks (RNNs) have been proposed to address time-dependent learning problems. In particular, Long Short Term Memory and Gated Recurrent Units (GRU) are tailor-made for time series price estimation. In this paper, we propose to use Gated Recurrent Units as a new technique for electricity price forecasting. We have trained a variety of algorithms with rolling 3-year window and compared the results with the RNNs. In our experiments, 3-layered GRUs outperformed all other neural network structures and state of the art statistical techniques in a statistically significant manner in the Turkish day-ahead market.

Download Full-text