Inner-process visualization of hidden states in recurrent neural networks

Recurrent neural networks (RNNs) have been widely used to deal with sequence learning problems. The input-dependent transition function, which folds new observations into hidden states to sequentially construct fixed-length representations of arbitrary-length sequences, plays a critical role in RNNs. Based on single space composition, transition functions in existing RNNs often have difficulty in capturing complicated long-range dependencies. In this paper, we introduce a new Multi-zone Unit (MZU) for RNNs. The key idea is to design a transition function that is capable of modeling multiple space composition. The MZU consists of three components: zone generation, zone composition, and zone aggregation. Experimental results on multiple datasets of the character-level language modeling task and the aspect-based sentiment analysis task demonstrate the superiority of the MZU.

CommNets: Communicating Neural Network Architectures for Resource Constrained Systems

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019909 ◽

2019 ◽

Vol 33 ◽

pp. 9909-9910

Author(s):

Prince M Abudu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Recurrent Neural Networks ◽

Network Architecture ◽

Resource Constraints ◽

Constrained Systems ◽

Detection Accuracy ◽

Network Architectures ◽

Operating Environments ◽

Hidden States

Applications that require heterogeneous sensor deployments continue to face practical challenges owing to resource constraints within their operating environments (e.g., energy efficiency, computational power and reliability). This has motivated the need for effective ways of selecting a sensing strategy that maximizes detection accuracy for events of interest using available resources and data-driven approaches. Motivated by those limitations, we ask a fundamental question: can state-of-the-art Recurrent Neural Networks observe different series of data and communicate their hidden states to collectively solve an objective in a distributed fashion? We answer it by conducting a series of systematic analyses of a Communicating Recurrent Neural Network architecture with varying time-steps, objective functions and numbers of nodes. The experimental setup we employ models tasks analogous to those in Wireless Sensor Networks. Our results show that Recurrent Neural Networks can communicate through their hidden states, and we achieve promising results.
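The idea of nodes that each observe their own series and share hidden states can be illustrated with a small sketch. This is not the CommNets architecture itself: the averaging of the other nodes' previous hidden states as the "message", the weight names `w_x`, `w_h`, `w_c`, and the function `run_network` are all assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Node:
    """One recurrent node: h_t = tanh(W_x x_t + W_h h_{t-1} + W_c m_t).

    m_t is the message received from the other nodes (assumed form).
    """

    def __init__(self, input_size, hidden_size, rng):
        self.w_x = rand_matrix(hidden_size, input_size, rng)
        self.w_h = rand_matrix(hidden_size, hidden_size, rng)
        self.w_c = rand_matrix(hidden_size, hidden_size, rng)
        self.h = [0.0] * hidden_size

    def step(self, x, message):
        parts = zip(matvec(self.w_x, x),
                    matvec(self.w_h, self.h),
                    matvec(self.w_c, message))
        return [math.tanh(a + b + c) for a, b, c in parts]


def run_network(series_per_node, hidden_size=4, seed=0):
    """Run all nodes in lockstep; return each node's final hidden state."""
    rng = random.Random(seed)
    input_size = len(series_per_node[0][0])
    nodes = [Node(input_size, hidden_size, rng) for _ in series_per_node]
    for t in range(len(series_per_node[0])):
        prev = [node.h for node in nodes]  # snapshot: updates are synchronous
        new_states = []
        for i, node in enumerate(nodes):
            others = [h for j, h in enumerate(prev) if j != i]
            if others:
                # Assumed communication rule: mean of the other nodes' states.
                message = [sum(col) / len(others) for col in zip(*others)]
            else:
                message = [0.0] * len(prev[i])
            new_states.append(node.step(series_per_node[i][t], message))
        for node, h in zip(nodes, new_states):
            node.h = h
    return [node.h for node in nodes]
```

Calling `run_network` with one input series per node returns one final hidden state per node; a trained system would feed these states to a task-specific output layer.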

Convolutional Recurrent Neural Networks for Text Classification

Journal of Database Management ◽

10.4018/jdm.2021100105 ◽

2021 ◽

Vol 32 (4) ◽

pp. 65-82

Author(s):

Shengfei Lyu ◽

Jiaqi Liu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Recurrent Neural Network ◽

Text Classification ◽

Recurrent Neural Networks ◽

Superior Performance ◽

Novel Approach ◽

Hidden States ◽

Traditional Approaches

Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are the two prevailing architectures used in text classification. Traditional approaches combine the strengths of these two networks by chaining them directly or by joining the features extracted from them. In this article, a novel approach is proposed that preserves the strengths of both the RNN and the CNN to a great extent. In the proposed approach, a bi-directional RNN encodes each word into forward and backward hidden states. Then, a neural tensor layer is used to fuse the bi-directional hidden states into word representations. Meanwhile, a convolutional neural network is used to learn the importance of each word for text classification. Empirical experiments are conducted on several text classification datasets. The superior performance of the proposed approach confirms its effectiveness.
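The fusion of forward and backward hidden states by a neural tensor layer can be sketched as follows. The abstract does not give the layer's exact form, so this sketch assumes the commonly used formulation in which each output unit k combines a bilinear term, a linear term over the concatenated states, and a bias: tanh(h_fwd^T T_k h_bwd + V_k [h_fwd; h_bwd] + b_k). The function and parameter names are hypothetical.

```python
import math


def neural_tensor_fuse(h_fwd, h_bwd, tensor, v, bias):
    """Fuse forward and backward hidden states into one word representation.

    tensor: one (len(h_fwd) x len(h_bwd)) matrix per output unit
    v:      one weight vector of length len(h_fwd) + len(h_bwd) per output unit
    bias:   one scalar per output unit
    (Assumed layer form; not taken from the paper.)
    """
    concat = h_fwd + h_bwd
    fused = []
    for t_k, v_k, b_k in zip(tensor, v, bias):
        # Bilinear interaction between the two directions.
        bilinear = sum(h_fwd[i] * t_k[i][j] * h_bwd[j]
                       for i in range(len(h_fwd))
                       for j in range(len(h_bwd)))
        # Standard linear term over the concatenated states.
        linear = sum(w * x for w, x in zip(v_k, concat))
        fused.append(math.tanh(bilinear + linear + b_k))
    return fused
```

The bilinear term is what distinguishes this from plain concatenation: it lets every forward dimension interact multiplicatively with every backward dimension.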

Visual analytics tool for the interpretation of hidden states in recurrent neural networks

Visual Computing for Industry Biomedicine and Art ◽

10.1186/s42492-021-00090-0 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Rafael Garcia ◽

Tanja Munz ◽

Daniel Weiskopf

Keyword(s):

Neural Networks ◽

Language Processing ◽

Recurrent Neural Networks ◽

Visual Analytics ◽

Short Term Memory ◽

Input Sequence ◽

Color Coding ◽

Nonlinear Projection ◽

Brushing And Linking ◽

Hidden States

In this paper, we introduce a visual analytics approach aimed at helping machine learning experts analyze the hidden states of layers in recurrent neural networks. Our technique allows the user to interactively inspect how hidden states store and process information throughout the feeding of an input sequence into the network. The technique can help answer questions, such as which parts of the input data have a higher impact on the prediction and how the model correlates each hidden state configuration with a certain output. Our visual analytics approach comprises several components: First, our input visualization shows the input sequence and how it relates to the output (using color coding). In addition, hidden states are visualized through a nonlinear projection into a 2-D visualization space using t-distributed stochastic neighbor embedding to understand the shape of the space of the hidden states. Trajectories are also employed to show the details of the evolution of the hidden state configurations. Finally, a time-multi-class heatmap matrix visualizes the evolution of the expected predictions for multi-class classifiers, and a histogram indicates the distances between the hidden states within the original space. The different visualizations are shown simultaneously in multiple views and support brushing-and-linking to facilitate the analysis of the classifications and debugging for misclassified input sequences. To demonstrate the capability of our approach, we discuss two typical use cases for long short-term memory models applied to two widely used natural language processing datasets.
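One of the components described above, the histogram of distances between hidden states in the original (unprojected) space, can be sketched in a few lines. This covers only that component, not the t-SNE projection or the interactive views. It assumes the hidden states have already been recorded as a list of equal-length vectors; the Euclidean metric, the equal-width binning, and the function names are assumptions for illustration, not the tool's actual implementation.

```python
import math
from itertools import combinations


def pairwise_distances(hidden_states):
    """Euclidean distance between every pair of recorded hidden states."""
    return [math.dist(a, b) for a, b in combinations(hidden_states, 2)]


def histogram(values, num_bins):
    """Count values in equal-width bins spanning [0, max(values)]."""
    counts = [0] * num_bins
    if not values:
        return counts
    top = max(values)
    for value in values:
        if top == 0:
            index = 0  # all distances are zero
        else:
            # The maximum value is clamped into the last bin.
            index = min(int(value / top * num_bins), num_bins - 1)
        counts[index] += 1
    return counts
```

A histogram concentrated in the low bins would suggest that many time steps leave the hidden state nearly unchanged, which is the kind of observation such a view is meant to support.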

Spike timing-dependent plasticity in sparse recurrent neural networks

IEICE Proceeding Series ◽

10.15248/proc.1.485 ◽

2014 ◽

Vol 1 ◽

pp. 485-488

Author(s):

Hideyuki Kato ◽

Tohru Ikeguchi

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Spike Timing ◽

Spike Timing Dependent Plasticity ◽

Dependent Plasticity

Direct Adaptive Control of Process Systems Using Recurrent Neural Networks

1992 American Control Conference ◽

10.23919/acc.1992.4792020 ◽

1992 ◽

Author(s):

Sanjay Parthasarathy ◽

Alexander G. Parlos ◽

Amir F. Atiya

Keyword(s):

Neural Networks ◽

Adaptive Control ◽

Recurrent Neural Networks ◽

Process Systems ◽

Direct Adaptive Control

L2 approximation properties of recurrent neural networks

1997 European Control Conference (ECC) ◽

10.23919/ecc.1997.7082360 ◽

1997 ◽

Cited By ~ 1

Author(s):

A. Ruiz ◽

D.H. Owens ◽

S. Townley

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Approximation Properties

Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction

10.26434/chemrxiv.12562121 ◽

2020 ◽

Author(s):

Dean Sumner ◽

Jiazhen He ◽

Amol Thakkar ◽

Ola Engkvist ◽

Esben Jannik Bjerrum

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Deep Learning ◽

Recurrent Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Sequence Similarity ◽

Learning Models ◽

Underlying Network

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: transformer and sequence-to-sequence based recurrent neural networks with attention. Levenshtein augmentation demonstrated increased performance over non-augmented data and over conventional SMILES-randomization-augmented data when used for training of baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain, an enhancement in the pattern recognition capabilities of the underlying network to molecular motifs.
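The similarity measure behind the method's name can be made concrete with a short sketch. The `levenshtein` function below is the standard dynamic-programming edit distance. The `closest_variant` helper is one plausible reading of "considering similarity when creating training pairs": among several equivalent SMILES strings for a reactant, keep the one closest to the product string. That selection rule, the helper name, and the character-level comparison (a real pipeline would likely tokenize multi-character atoms such as `Cl`) are assumptions, not the authors' procedure.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # delete char_a
                            curr[j - 1] + 1,      # insert char_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_variant(candidates, product):
    """Pick the candidate string with the smallest edit distance to product."""
    return min(candidates, key=lambda smiles: levenshtein(smiles, product))


# Hand-written equivalent SMILES for ethanol, paired with acetic acid.
# In practice the candidates would come from a SMILES randomization tool.
reactant_variants = ["CCO", "OCC", "C(O)C"]
product = "CC(=O)O"
best = closest_variant(reactant_variants, product)
```

Only the rolling previous row is stored, so memory use is linear in the length of the second string.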
