Inner-process visualization of hidden states in recurrent neural networks

Author(s):  
Rafael Garcia ◽  
Daniel Weiskopf
2020 ◽  
Vol 34 (04) ◽  
pp. 5150-5157
Author(s):  
Fandong Meng ◽  
Jinchao Zhang ◽  
Yang Liu ◽  
Jie Zhou

Recurrent neural networks (RNNs) have been widely used to deal with sequence learning problems. The input-dependent transition function, which folds new observations into hidden states to sequentially construct fixed-length representations of arbitrary-length sequences, plays a critical role in RNNs. Based on single space composition, transition functions in existing RNNs often have difficulty in capturing complicated long-range dependencies. In this paper, we introduce a new Multi-zone Unit (MZU) for RNNs. The key idea is to design a transition function that is capable of modeling multiple space composition. The MZU consists of three components: zone generation, zone composition, and zone aggregation. Experimental results on multiple datasets of the character-level language modeling task and the aspect-based sentiment analysis task demonstrate the superiority of the MZU.


Author(s):  
Prince M Abudu

Applications that require heterogeneous sensor deployments continue to face practical challenges owing to resource constraints within their operating environments (i.e., energy efficiency, computational power and reliability). This has motivated the need for effective ways of selecting a sensing strategy that maximizes detection accuracy for events of interest using available resources and data-driven approaches. Given these limitations, we ask a fundamental question: can state-of-the-art Recurrent Neural Networks observe different series of data and communicate their hidden states to collectively solve an objective in a distributed fashion? We answer this question by conducting a series of systematic analyses of a Communicating Recurrent Neural Network architecture across varying time-steps, objective functions and numbers of nodes. The experimental setup we employ models tasks analogous to those in Wireless Sensor Networks. Our contributions show that Recurrent Neural Networks can communicate through their hidden states, with promising results.


2021 ◽  
Vol 32 (4) ◽  
pp. 65-82
Author(s):  
Shengfei Lyu ◽  
Jiaqi Liu

Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are two prevailing architectures used in text classification. Traditional approaches combine the strengths of these two networks by simply chaining them in sequence or by linking the features extracted from them. In this article, a novel approach is proposed that retains the strengths of both RNNs and CNNs to a great extent. In the proposed approach, a bi-directional RNN encodes each word into forward and backward hidden states. Then, a neural tensor layer fuses the bi-directional hidden states into word representations. Meanwhile, a convolutional neural network learns the importance of each word for text classification. Empirical experiments are conducted on several text classification datasets. The superior performance of the proposed approach confirms its effectiveness.
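
The fusion step described above can be illustrated with a minimal sketch. The abstract does not give the layer's exact parameterization, so the code below assumes the commonly used neural tensor form, tanh(f^T W[k] b + V[k] . [f; b] + bias[k]) for each output slice k. The function name `neural_tensor_fuse` and all weights are hypothetical and chosen only for illustration.

```python
import math


def neural_tensor_fuse(fwd, bwd, W, V, bias):
    """Fuse a forward and a backward hidden state into one word representation.

    Assumed form (not confirmed by the abstract), for each output slice k:
        out[k] = tanh(fwd^T W[k] bwd + V[k] . [fwd; bwd] + bias[k])
    W is a list of matrices, V a list of vectors, bias a list of scalars.
    """
    concat = list(fwd) + list(bwd)
    out = []
    for W_k, V_k, b_k in zip(W, V, bias):
        # Bilinear interaction between the two directions.
        bilinear = sum(
            fwd[i] * W_k[i][j] * bwd[j]
            for i in range(len(fwd))
            for j in range(len(bwd))
        )
        # Standard linear term on the concatenated states.
        linear = sum(v * x for v, x in zip(V_k, concat))
        out.append(math.tanh(bilinear + linear + b_k))
    return out


# Toy inputs: 2-dimensional hidden states and a single output slice.
fwd_state = [1.0, 0.0]
bwd_state = [0.0, 1.0]
W = [[[0.0, 1.0], [0.0, 0.0]]]  # one 2x2 slice
V = [[0.0, 0.0, 0.0, 0.0]]      # one weight vector over [fwd; bwd]
bias = [0.0]
word_repr = neural_tensor_fuse(fwd_state, bwd_state, W, V, bias)
```

With these toy weights only the bilinear term is active, so the single output equals tanh(1). In a trained model W, V and bias would be learned jointly with the bi-directional RNN.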


Author(s):  
Rafael Garcia ◽  
Tanja Munz ◽  
Daniel Weiskopf

In this paper, we introduce a visual analytics approach aimed at helping machine learning experts analyze the hidden states of layers in recurrent neural networks. Our technique allows the user to interactively inspect how hidden states store and process information throughout the feeding of an input sequence into the network. The technique can help answer questions, such as which parts of the input data have a higher impact on the prediction and how the model correlates each hidden state configuration with a certain output. Our visual analytics approach comprises several components: First, our input visualization shows the input sequence and how it relates to the output (using color coding). In addition, hidden states are visualized through a nonlinear projection into a 2-D visualization space using t-distributed stochastic neighbor embedding to understand the shape of the space of the hidden states. Trajectories are also employed to show the details of the evolution of the hidden state configurations. Finally, a time-multi-class heatmap matrix visualizes the evolution of the expected predictions for multi-class classifiers, and a histogram indicates the distances between the hidden states within the original space. The different visualizations are shown simultaneously in multiple views and support brushing-and-linking to facilitate the analysis of the classifications and debugging for misclassified input sequences. To demonstrate the capability of our approach, we discuss two typical use cases for long short-term memory models applied to two widely used natural language processing datasets.
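
As a rough illustration of the distance histogram mentioned above, the sketch below computes pairwise Euclidean distances between hidden states in their original space and bins them. The choice of Euclidean distance, the equal-width binning, and the helper names `pairwise_distances` and `histogram` are assumptions made for this example, not details taken from the paper. The t-SNE projection and the interactive views are not reproduced here.

```python
import math
from itertools import combinations


def pairwise_distances(hidden_states):
    """Euclidean distance between every pair of hidden-state vectors."""
    return [math.dist(a, b) for a, b in combinations(hidden_states, 2)]


def histogram(values, num_bins):
    """Count values in num_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width if all values are equal
    counts = [0] * num_bins
    for v in values:
        # The maximum value falls into the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


# Toy hidden states (one 2-D vector per time step), purely illustrative.
states = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
distances = pairwise_distances(states)
bins = histogram(distances, num_bins=2)
```

For real models the hidden states would have hundreds of dimensions and come from a recorded forward pass. Comparing such a histogram with the 2-D projection is one way to judge how much the projection distorts distances.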


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. When used to train baseline models, Levenshtein augmentation demonstrated increased performance over both non-augmented data and data augmented with conventional SMILES randomization. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain, an enhancement in the underlying network's ability to recognize molecular motifs.
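
The abstract does not spell out how training pairs are selected, so the following is only a sketch of one plausible reading: among several equivalent SMILES spellings of a reactant, keep the one with the smallest Levenshtein (edit) distance to the product string. The helper `closest_variant` and the example molecules are hypothetical. In practice the candidate spellings would come from a cheminformatics toolkit, which is not used here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def closest_variant(candidates, target):
    """Return the candidate string with the smallest edit distance to target."""
    return min(candidates, key=lambda s: levenshtein(s, target))


# Illustrative data: three equivalent SMILES spellings of ethanol as the
# reactant, and one spelling of acetaldehyde as the product.
reactant_variants = ["OCC", "CCO", "C(O)C"]
product = "CC=O"
best = closest_variant(reactant_variants, product)
```

Here `best` is `"CCO"`, which differs from the product string by a single inserted character. Pairing strings that share long sub-sequences is the kind of local similarity the abstract refers to, although the published method may select or weight pairs differently.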

