Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes

2018 ◽  
Vol 30 (4) ◽  
pp. 857-884 ◽  
Author(s):  
Caglar Gulcehre ◽  
Sarath Chandar ◽  
Kyunghyun Cho ◽  
Yoshua Bengio

We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains two separate vectors for each memory cell: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read from and write into a memory through experiments on the Facebook bAbI tasks, using both a feedforward and a GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential pMNIST, Stanford Natural Language Inference, associative recall, and copy tasks.
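
As a rough illustration of the two-vector scheme, the following NumPy sketch scores each memory cell by comparing a controller-emitted key against the concatenation of that cell's trainable address vector and its content vector, then reads either continuously (softmax attention) or discretely (sampling one cell). All dimensions and the exact key/similarity layout are assumptions for illustration, not the authors' configuration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dntm_read(address, content, key, beta):
    """Continuous read: score each cell by the cosine similarity between
    the key and the cell's [address; content] pair (illustrative layout)."""
    cells = np.concatenate([address, content], axis=1)  # (n_cells, d_a + d_c)
    sims = cells @ key / (np.linalg.norm(cells, axis=1)
                          * np.linalg.norm(key) + 1e-8)
    weights = softmax(beta * sims)        # continuous attention over cells
    return weights @ content, weights     # read vector, read weights

def dntm_read_discrete(address, content, key, beta, rng):
    """Discrete variant: sample a single cell from the attention weights
    (the paper trains this with REINFORCE-style estimators)."""
    _, weights = dntm_read(address, content, key, beta)
    i = rng.choice(len(weights), p=weights)
    return content[i], i

# Toy usage: 8 cells, trainable 4-d addresses, 16-d contents (sizes illustrative).
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 4))    # trainable address vectors
C = rng.normal(size=(8, 16))   # content vectors written by the controller
k = rng.normal(size=20)        # key over the concatenated [address; content]
r, w = dntm_read(A, C, k, beta=2.0)
```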

Electronics ◽  
2019 ◽  
Vol 8 (6) ◽  
pp. 681 ◽  
Author(s):  
Praveen Edward James ◽  
Hou Kit Mun ◽  
Chockalingam Aravind Vaithilingam

The purpose of this work is to develop a spoken language processing system for smart device troubleshooting using human-machine interaction. The system combines a software Bidirectional Long Short-Term Memory (BLSTM)-based speech recognizer and a hardware LSTM-based language processor for Natural Language Processing (NLP), connected through a serial RS232 interface. Mel Frequency Cepstral Coefficient (MFCC)-based feature vectors from the speech signal are input directly into a BLSTM network. A dropout layer is added to the BLSTM layer to reduce over-fitting and improve robustness. The speech recognition component combines an acoustic modeler, a pronunciation dictionary, and a BLSTM network to generate query text, and executes in real time with an 81.5% Word Error Rate (WER) and an average training time of 45 s. The language processor comprises a vectorizer, lookup dictionary, key encoder, Long Short-Term Memory (LSTM)-based training and prediction network, and dialogue manager, and transforms query intent to generate response text with a processing time of 0.59 s, 5% hardware utilization, and an F1 score of 95.2%. The proposed system shows a 4.17% decrease in accuracy compared with existing systems, which use parallel processing and high-speed cache memories to perform additional training that improves accuracy. However, the language processor achieves a 36.7% decrease in processing time and a 50% decrease in hardware utilization, making it suitable for troubleshooting smart devices.
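
A minimal PyTorch sketch of the recognizer stage as described, MFCC frames into a BLSTM followed by a dropout layer and per-frame label scores, might look as follows; layer sizes, label count, and dropout rate are illustrative assumptions, and the paper's acoustic modeler, pronunciation dictionary, and hardware interface are omitted.

```python
import torch
import torch.nn as nn

class BLSTMRecognizer(nn.Module):
    """Sketch: MFCC frames -> BLSTM -> dropout -> per-frame label scores.
    Sizes are illustrative, not the authors' configuration."""
    def __init__(self, n_mfcc=13, hidden=128, n_labels=40, p_drop=0.3):
        super().__init__()
        self.blstm = nn.LSTM(n_mfcc, hidden, batch_first=True,
                             bidirectional=True)
        self.drop = nn.Dropout(p_drop)    # reduces over-fitting, as in the paper
        self.out = nn.Linear(2 * hidden, n_labels)

    def forward(self, mfcc):              # mfcc: (batch, frames, n_mfcc)
        h, _ = self.blstm(mfcc)
        return self.out(self.drop(h))     # (batch, frames, n_labels)

x = torch.randn(2, 100, 13)               # 2 utterances, 100 frames of 13 MFCCs
scores = BLSTMRecognizer()(x)
```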


2019 ◽  
Vol 8 (4) ◽  
pp. 5659-5663

In many aging countries, where the population distribution has shifted toward older ages, automatic monitoring devices that help an elderly person when they fall are crucial. The smartphone is one of the best candidate devices for fall detection because its embedded accelerometer and gyroscope sensors respond to human movements. However, people carry their smartphones in arbitrary positions, which can make it difficult for a fall detection method to detect when a fall occurs. This research explored a model for unconstrained human fall detection that uses the sensors embedded in a smartphone as a carried/wearable-sensor-based method. We propose a robust model called Ans-Assist, which uses a modified Long Short-Term Memory cell as its fall recognition model and can detect a human fall from any (unconstrained) smartphone position. Experimental results show that Ans-Assist achieved a 0.95 (± 0.028) average accuracy over unconstrained smartphone positions. The model adapts to input from the accelerometer and gyroscope sensors, which are responsive when a human falls.
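
A hedged PyTorch sketch of the overall shape of such a detector follows: windows of 3-axis accelerometer plus 3-axis gyroscope readings pass through an LSTM and a binary fall/no-fall head. The paper's contribution is a modified LSTM cell; this sketch uses a stock cell, and the window length and layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class FallDetector(nn.Module):
    """Sketch of an Ans-Assist-style detector: sensor windows -> LSTM ->
    fall / no-fall logits. Uses a stock LSTM cell, not the paper's
    modified cell; sizes are illustrative."""
    def __init__(self, n_sensors=6, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_sensors, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)     # fall vs. normal activity

    def forward(self, window):               # window: (batch, time, 6 channels)
        _, (h, _) = self.lstm(window)
        return self.head(h[-1])              # logits from the final hidden state

logits = FallDetector()(torch.randn(4, 128, 6))  # 4 windows, 128 samples each
```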


2020 ◽  
Author(s):  
Andrew Larkin ◽  
Perry Hystad

Contact with nature has been linked to human health, but little information is available on how individuals utilize urban nature. We developed a bidirectional long short-term memory model for classifying whether tweets describe the proposed pathways through which nature influences health: exercise, aesthetic stimulation, stress reduction, safety, air pollution mediation, and/or social interaction. To adjust for regional variations in urban nature context, we integrated OpenStreetMap data on nature and non-nature features into each long short-term memory cell. Training (n = 63,073), development (n = 5,000), and test (n = 5,000) sets consisted of labeled tweets from Portland, Oregon. Tweets from New York City (NYC) (n = 5,000) were also labeled to test generalizability. The model was applied retrospectively to 20 million tweets from 2017 and continuously to Meetup posts for 7,708 cities in North America. F1 scores ranged from 0.54 to 0.82 on the NYC dataset, a 24% to 92% improvement over current methods. Precision ranged from 0.58 to 0.83, while recall ranged from 0.39 to 0.81. Adding OpenStreetMap features led to greater percent and absolute F1 score gains in NYC than in Portland. Average F1 scores were greater in models with a nature label in addition to human behavior labels (0.59 vs. 0.65), suggesting health behaviors are influenced by urban nature.
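
A rough PyTorch sketch of this kind of classifier: tweet token embeddings run through a bidirectional LSTM, with a per-tweet OpenStreetMap feature vector appended at every step as a simplification of the per-cell integration described above. The vocabulary size, feature dimension, and six pathway labels are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class NaturePathwayClassifier(nn.Module):
    """Sketch: tokens + location-level OSM features -> BLSTM -> multi-label
    pathway logits. Simplifies the paper's per-cell OSM integration."""
    def __init__(self, vocab=20000, emb=100, osm_dim=8, hidden=128, labels=6):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.blstm = nn.LSTM(emb + osm_dim, hidden, batch_first=True,
                             bidirectional=True)
        self.out = nn.Linear(2 * hidden, labels)   # six pathway scores

    def forward(self, tokens, osm):           # tokens: (B, T); osm: (B, osm_dim)
        e = self.embed(tokens)                 # (B, T, emb)
        osm_rep = osm.unsqueeze(1).expand(-1, e.size(1), -1)
        h, _ = self.blstm(torch.cat([e, osm_rep], dim=-1))
        return self.out(h.mean(dim=1))         # pooled multi-label logits

logits = NaturePathwayClassifier()(torch.randint(0, 20000, (2, 30)),
                                   torch.rand(2, 8))
```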


2020 ◽  
Vol 10 (20) ◽  
pp. 7181
Author(s):  
Donghyun Lee ◽  
Jeong-Sik Park ◽  
Myoung-Wan Koo ◽  
Ji-Hwan Kim

The performance of long short-term memory (LSTM) recurrent neural network (RNN)-based language models has steadily improved on language modeling benchmarks. Although recurrent layers are widely used, previous studies showed that an LSTM RNN-based language model (LM) cannot overcome the limitation of context length. To train LMs on longer sequences, attention mechanism-based models have recently been used. In this paper, we propose an LM using a neural Turing machine (NTM) architecture based on localized content-based addressing (LCA). The NTM architecture is one of the attention-based models. However, the NTM encounters a problem with content-based addressing because all memory addresses need to be accessed to calculate cosine similarities. To address this problem, we propose the LCA method. The LCA method searches for the maximum of all cosine similarities generated from all memory addresses. Next, a specific memory area including the selected memory address is normalized with the softmax function. The LCA method is applied to the pre-trained NTM-based LM during the test stage. The proposed architecture is evaluated on the Penn Treebank and enwik8 LM tasks. The experimental results indicate that the proposed approach outperforms the previous NTM architecture.
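
A minimal NumPy sketch of the LCA step as described: compute cosine similarities against every memory address, locate the maximum, and apply the softmax only to a window of cells that includes the selected address. The window size and sharpening factor are illustrative assumptions.

```python
import numpy as np

def lca_addressing(memory, key, window=3, beta=1.0):
    """Sketch of localized content-based addressing: argmax over all
    cosine similarities, then softmax over a local window only."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1)
                           * np.linalg.norm(key) + 1e-8)
    center = int(np.argmax(sims))                   # best-matching address
    lo = max(0, center - window)
    hi = min(len(sims), center + window + 1)
    scaled = beta * sims[lo:hi]
    local = np.exp(scaled - scaled.max())
    weights = np.zeros_like(sims)
    weights[lo:hi] = local / local.sum()            # softmax on the window only
    return weights

M = np.random.default_rng(1).normal(size=(64, 32))  # 64 addresses, 32-d cells
w = lca_addressing(M, M[10] + 0.1)                   # key close to address 10
```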


2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data, focusing on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in these methods, we show that prediction accuracy is improved.
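
As a hedged sketch of the SVR variant with the six predictors named above, using synthetic data in place of the Yahoo Finance series (the kernel and hyperparameters are illustrative, not the paper's):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Six predictors per day: open, high, low, close, adjusted close, volume.
# Random data stands in for the actual stock series.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                            # 500 trading days
y = X[:, 3] * 1.01 + rng.normal(scale=0.05, size=500)    # synthetic next close

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X[:-50], y[:-50])      # train on the earlier window
preds = model.predict(X[-50:])   # predict the held-out recent window
```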


2020 ◽  
Author(s):  
Abdolreza Nazemi ◽  
Johannes Jakubik ◽  
Andreas Geyer-Schulz ◽  
Frank J. Fabozzi
