Learning Simpler Language Models with the Differential State Framework

2017, Vol. 29 (12), pp. 3327-3352
Author(s):  
Alexander G. Ororbia II ◽  
Tomas Mikolov ◽  
David Reitter

Learning useful information across long time lags is a critical and difficult problem for temporal neural models in tasks such as language modeling. Existing architectures that address the issue are often complex and costly to train. The differential state framework (DSF) is a simple and high-performing design that unifies previously introduced gated neural models. DSF models maintain longer-term memory by learning to interpolate between a fast-changing data-driven representation and a slowly changing, implicitly stable state. Within the DSF, a new architecture is presented: the delta-RNN. This model requires hardly any more parameters than a classical, simple recurrent network. In language modeling at the word and character levels, the delta-RNN outperforms popular complex architectures, such as the long short-term memory (LSTM) and the gated recurrent unit (GRU), and, when regularized, performs comparably to several state-of-the-art baselines. At the subword level, the delta-RNN's performance is comparable to that of complex gated architectures.
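
The interpolation at the heart of the DSF can be illustrated with a minimal Python sketch, assuming a single shared input projection for both the proposal and the gate; the class and parameter names are illustrative, not the authors' notation or published code.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SimpleDeltaRNNCell:
    """Minimal sketch of the delta-RNN idea: each hidden unit mixes a
    fast, data-driven proposal with its slowly changing previous state
    through a learned per-unit gate."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_size)
        self.W_x = rng.uniform(-s, s, (hidden_size, input_size))
        self.W_h = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.b = np.zeros(hidden_size)
        self.b_gate = np.zeros(hidden_size)  # per-unit interpolation bias

    def step(self, x_t, h_prev):
        # Fast, data-driven proposal from the current input and the previous state.
        z_t = np.tanh(self.W_x @ x_t + self.W_h @ h_prev + self.b)
        # Gate r decides, per unit, how much of the slowly changing old
        # state to keep versus how much of the new proposal to admit.
        r = sigmoid(self.W_x @ x_t + self.b_gate)
        return (1.0 - r) * z_t + r * h_prev

Because the proposal reuses the machinery of a simple recurrent network and the gate here adds only a bias vector, the cell needs hardly any parameters beyond those of a classical RNN, which is the point of the design.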

Author(s):  
Tal Linzen ◽  
Emmanuel Dupoux ◽  
Yoav Goldberg

The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities. Linguistic regularities are often sensitive to syntactic structure; can such dependencies be captured by LSTMs, which do not have explicit structural representations? We begin addressing this question using number agreement in English subject-verb dependencies. We probe the architecture’s grammatical competence both using training objectives with an explicit grammatical target (number prediction, grammaticality judgments) and using language models. In the strongly supervised settings, the LSTM achieved very high overall accuracy (less than 1% errors), but errors increased when sequential and structural information conflicted. The frequency of such errors rose sharply in the language-modeling setting. We conclude that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.
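
As a rough illustration of the strongly supervised number-prediction objective described above, the following PyTorch sketch classifies the grammatical number of an upcoming verb from the LSTM's state over the sentence prefix. The hyperparameters and class names are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class NumberPredictionModel(nn.Module):
    """Read the words preceding a verb with an LSTM and predict whether
    that verb should be singular or plural."""

    def __init__(self, vocab_size, embed_dim=50, hidden_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, 2)  # {singular, plural}

    def forward(self, token_ids):
        # token_ids: (batch, prefix_length), the sentence up to but not
        # including the verb whose number is being predicted.
        embedded = self.embed(token_ids)
        _, (h_n, _) = self.lstm(embedded)
        return self.classifier(h_n[-1])  # logits over the two number classes

The language-modeling setting discussed in the abstract differs in that the network receives no explicit number label and must prefer the correctly inflected verb purely through its next-word probabilities.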


Author(s):  
Ahmad Ashril Rizal ◽  
Siti Soraya

The absence of natural resources such as oil and gas, forest products, or large-scale manufacturing on the island of Lombok has made tourism the leading sector in regional development. The tourism sector's contribution shows an increasing trend from year to year, and the positive impact of tourist spending is distributed across many parts of the economy. However, local governments generally prepare regional tourism only around local events, even though local events are not the only driver of tourist visits. Preparation by local government and tourism stakeholders is therefore essential for stabilizing tourist arrivals. This study examines the prediction of tourist arrivals using a Recurrent Neural Network with Long Short-Term Memory (RNN LSTM). An LSTM holds information outside the normal flow of the recurrent network in gated cells: the cells decide what to store and when to allow reading, writing, and erasure through gates that open and close. The gates pass information according to its strength, filtered by the gates' own weights, which, like the input and hidden-unit weights, are adjusted during the recurrent network's learning process. A tourist-arrival prediction model built with an RNN LSTM using multiple time steps achieved an RMSE of 6888.37 on the training data and 14684.33 on the test data.
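
A minimal Python sketch of the multi-time-step setup described above follows; the window length, network size, and preprocessing are assumptions, since the study's code is not reproduced here.

import numpy as np
import torch
import torch.nn as nn

def make_windows(series, n_steps):
    """Turn a 1-D series of monthly arrival counts into supervised
    (input window, next value) pairs, i.e., the multi-time-step inputs."""
    X, y = [], []
    for i in range(len(series) - n_steps):
        X.append(series[i:i + n_steps])
        y.append(series[i + n_steps])
    X = torch.tensor(np.array(X), dtype=torch.float32).unsqueeze(-1)
    y = torch.tensor(np.array(y), dtype=torch.float32)
    return X, y

class ArrivalForecaster(nn.Module):
    """Illustrative LSTM regressor for tourist-arrival forecasting."""

    def __init__(self, hidden_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):  # x: (batch, n_steps, 1)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)

def rmse(pred, target):
    # The evaluation metric reported in the study (RMSE).
    return torch.sqrt(torch.mean((pred - target) ** 2))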


2020, Vol. 34 (04), pp. 4989-4996
Author(s):  
Ekaterina Lobacheva ◽  
Nadezhda Chirkova ◽  
Alexander Markovich ◽  
Dmitry Vetrov

One of the most popular approaches for neural network compression is sparsification: learning sparse weight matrices. In structured sparsification, weights are set to zero by groups corresponding to structural units, e.g., neurons. We further develop the structured sparsification approach for gated recurrent neural networks, e.g., the Long Short-Term Memory (LSTM). Specifically, in addition to sparsifying individual weights and neurons, we propose sparsifying the preactivations of gates. This makes some gates constant and simplifies the LSTM structure. We test our approach on text classification and language modeling tasks. Our method improves the neuron-wise compression of the model in most of the tasks. We also observe that the resulting structure of gate sparsity depends on the task, and we connect the learned structures to the specifics of the particular tasks.
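
As an illustration of the idea only: the paper's own sparsification machinery is not reproduced here, but a group-Lasso-style stand-in conveys how driving all of a gate preactivation's incoming weights to zero leaves that gate at a constant value determined by its bias. The function below groups, per preactivation unit, all input-to-hidden and hidden-to-hidden weights feeding it; the regularization coefficient in the usage note is an arbitrary assumption.

import torch
import torch.nn as nn

def gate_preactivation_group_penalty(lstm: nn.LSTM) -> torch.Tensor:
    """Sum of per-row L2 norms over the concatenated LSTM weight
    matrices.  Each row holds every weight feeding one gate
    preactivation (PyTorch stacks the input, forget, cell, and output
    gates along dim 0), so zeroing a whole row makes that gate constant."""
    w = torch.cat([lstm.weight_ih_l0, lstm.weight_hh_l0], dim=1)
    return w.norm(dim=1).sum()

# Usage sketch: add the penalty to the task loss during training, e.g.
#   total_loss = task_loss + 1e-4 * gate_preactivation_group_penalty(lstm)
lstm = nn.LSTM(input_size=128, hidden_size=256, batch_first=True)
print(gate_preactivation_group_penalty(lstm))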

