Rotational Unit of Memory: A Novel Representation Unit for RNNs with Scalable Applications

Stacking long short-term memory (LSTM) cells or gated recurrent units (GRUs) as part of a recurrent neural network (RNN) has become a standard approach to solving a number of tasks ranging from language modeling to text summarization. Although LSTMs and GRUs were designed to model long-range dependencies more accurately than conventional RNNs, they nevertheless have problems copying or recalling information from the long distant past. Here, we derive a phase-coded representation of the memory state, Rotational Unit of Memory (RUM), that unifies the concepts of unitary learning and associative memory. We show experimentally that RNNs based on RUMs can solve basic sequential tasks such as memory copying and memory recall much better than LSTMs/GRUs. We further demonstrate that by replacing LSTM/GRU with RUM units we can apply neural networks to real-world problems such as language modeling and text summarization, yielding results comparable to the state of the art.

Download Full-text

An Optimized Abstractive Text Summarization Model Using Peephole Convolutional LSTM

Symmetry ◽

10.3390/sym11101290 ◽

2019 ◽

Vol 11 (10) ◽

pp. 1290 ◽

Cited By ~ 2

Author(s):

Rahman ◽

Siddiqui

Keyword(s):

Language Processing ◽

Short Term Memory ◽

State Of The Art ◽

Text Summarization ◽

Short Term ◽

Term Memory ◽

Semantic Coherence ◽

Long Short Term Memory ◽

Central Composite ◽

Convolutional Lstm

Abstractive text summarization that generates a summary by paraphrasing a long text remains an open significant problem for natural language processing. In this paper, we present an abstractive text summarization model, multi-layered attentional peephole convolutional LSTM (long short-term memory) (MAPCoL) that automatically generates a summary from a long text. We optimize parameters of MAPCoL using central composite design (CCD) in combination with the response surface methodology (RSM), which gives the highest accuracy in terms of summary generation. We record the accuracy of our model (MAPCoL) on a CNN/DailyMail dataset. We perform a comparative analysis of the accuracy of MAPCoL with that of the state-of-the-art models in different experimental settings. The MAPCoL also outperforms the traditional LSTM-based models in respect of semantic coherence in the output summary.

Download Full-text

A New Delay Connection for Long Short-Term Memory Networks

International Journal of Neural Systems ◽

10.1142/s0129065717500617 ◽

2018 ◽

Vol 28 (06) ◽

pp. 1750061 ◽

Cited By ~ 3

Author(s):

Jianyong Wang ◽

Lei Zhang ◽

Yuanyuan Chen ◽

Zhang Yi

Keyword(s):

Short Term Memory ◽

State Of The Art ◽

Language Modeling ◽

Sequence Classification ◽

Short Term ◽

Learning Capability ◽

Term Memory ◽

Negative Effect ◽

Classification Tasks ◽

Long Short Term Memory

Connections play a crucial role in neural network (NN) learning because they determine how information flows in NNs. Suitable connection mechanisms may extensively enlarge the learning capability and reduce the negative effect of gradient problems. In this paper, a new delay connection is proposed for Long Short-Term Memory (LSTM) unit to develop a more sophisticated recurrent unit, called Delay Connected LSTM (DCLSTM). The proposed delay connection brings two main merits to DCLSTM with introducing no extra parameters. First, it allows the output of the DCLSTM unit to maintain LSTM, which is absent in the LSTM unit. Second, the proposed delay connection helps to bridge the error signals to previous time steps and allows it to be back-propagated across several layers without vanishing too quickly. To evaluate the performance of the proposed delay connections, the DCLSTM model with and without peephole connections was compared with four state-of-the-art recurrent model on two sequence classification tasks. DCLSTM model outperformed the other models with higher accuracy and F1[Formula: see text]score. Furthermore, the networks with multiple stacked DCLSTM layers and the standard LSTM layer were evaluated on Penn Treebank (PTB) language modeling. The DCLSTM model achieved lower perplexity (PPL)/bit-per-character (BPC) than the standard LSTM model. The experiments demonstrate that the learning of the DCLSTM models is more stable and efficient.

Download Full-text

Forecasting Energy Consumption of Wastewater Treatment Plants with a Transfer Learning Approach for Sustainable Cities

Electronics ◽

10.3390/electronics10101149 ◽

2021 ◽

Vol 10 (10) ◽

pp. 1149

Author(s):

Pedro Oliveira ◽

Bruno Fernandes ◽

Cesar Analide ◽

Paulo Novais

Keyword(s):

Wastewater Treatment ◽

Energy Consumption ◽

Transfer Learning ◽

Short Term Memory ◽

Wastewater Treatment Plants ◽

Learning Approach ◽

Sustainable Cities ◽

Short Term ◽

Long Short Term Memory ◽

Gated Recurrent Units

A major challenge of today’s society is to make large urban centres more sustainable. Improving the energy efficiency of the various infrastructures that make up cities is one aspect being considered when improving their sustainability, with Wastewater Treatment Plants (WWTPs) being one of them. Consequently, this study aims to conceive, tune, and evaluate a set of candidate deep learning models with the goal being to forecast the energy consumption of a WWTP, following a recursive multi-step approach. Three distinct types of models were experimented, in particular, Long Short-Term Memory networks (LSTMs), Gated Recurrent Units (GRUs), and uni-dimensional Convolutional Neural Networks (CNNs). Uni- and multi-variate settings were evaluated, as well as different methods for handling outliers. Promising forecasting results were obtained by CNN-based models, being this difference statistically significant when compared to LSTMs and GRUs, with the best model presenting an approximate overall error of 630 kWh when on a multi-variate setting. Finally, to overcome the problem of data scarcity in WWTPs, transfer learning processes were implemented, with promising results being achieved when using a pre-trained uni-variate CNN model, with the overall error reducing to 325 kWh.

Download Full-text

A Two-Layer Long Short-Term Memory Network for Bottleneck Prediction in Multi-Job Manufacturing Systems

Volume 3: Manufacturing Equipment and Systems ◽

10.1115/msec2018-6678 ◽

2018 ◽

Cited By ~ 1

Author(s):

Xingjian Lai ◽

Huanyi Shui ◽

Jun Ni

Keyword(s):

Manufacturing Systems ◽

Short Term Memory ◽

Complex Dynamics ◽

State Of The Art ◽

Short Term ◽

Term Memory ◽

Future Production ◽

Effective Manner ◽

Long Short Term Memory ◽

Factory Floor

Throughput bottlenecks define and constrain the productivity of a production line. Prediction of future bottlenecks provides a great support for decision-making on the factory floor, which can help to foresee and formulate appropriate actions before production to improve the system throughput in a cost-effective manner. Bottleneck prediction remains a challenging task in literature. The difficulty lies in the complex dynamics of manufacturing systems. There are multiple factors collaboratively affecting bottleneck conditions, such as machine performance, machine degradation, line structure, operator skill level, and product release schedules. These factors impact on one another in a nonlinear manner and exhibit long-term temporal dependencies. State-of-the-art research utilizes various assumptions to simplify the modeling by reducing the input dimensionality. As a result, those models cannot accurately reflect complex dynamics of the bottleneck in a manufacturing system. To tackle this problem, this paper will propose a systematic framework to design a two-layer Long Short-Term Memory (LSTM) network tailored to the dynamic bottleneck prediction problem in multi-job manufacturing systems. This neural network based approach takes advantage of historical high dimensional factory floor data to predict system bottlenecks dynamically considering the future production planning inputs. The model is demonstrated with data from an automotive underbody assembly line. The result shows that the proposed method can achieve higher prediction accuracy compared with current state-of-the-art approaches.

Download Full-text

Arabic dialect sentiment analysis with ZERO effort. \\ Case study: Algerian dialect

INTELIGENCIA ARTIFICIAL ◽

10.4114/intartif.vol23iss65pp124-135 ◽

2020 ◽

Vol 23 (65) ◽

pp. 124-135

Author(s):

Imane Guellil ◽

Marcelo Mendoza ◽

Faical Azouaou

Keyword(s):

Sentiment Analysis ◽

Short Term Memory ◽

State Of The Art ◽

Short Term ◽

Term Memory ◽

Ongoing Work ◽

Long Short Term Memory ◽

Large Corpus ◽

Unique Condition

This paper presents an analytic study showing that it is entirely possible to analyze the sentiment of an Arabic dialect without constructing any resources. The idea of this work is to use the resources dedicated to a given dialect \textit{X} for analyzing the sentiment of another dialect \textit{Y}. The unique condition is to have \textit{X} and \textit{Y} in the same category of dialects. We apply this idea on Algerian dialect, which is a Maghrebi Arabic dialect that suffers from limited available tools and other handling resources required for automatic sentiment analysis. To do this analysis, we rely on Maghrebi dialect resources and two manually annotated sentiment corpus for respectively Tunisian and Moroccan dialect. We also use a large corpus for Maghrebi dialect. We use a state-of-the-art system and propose a new deep learning architecture for automatically classify the sentiment of Arabic dialect (Algerian dialect). Experimental results show that F1-score is up to 83% and it is achieved by Multilayer Perceptron (MLP) with Tunisian corpus and with Long short-term memory (LSTM) with the combination of Tunisian and Moroccan. An improvement of 15% compared to its closest competitor was observed through this study. Ongoing work is aimed at manually constructing an annotated sentiment corpus for Algerian dialect and comparing the results

Download Full-text

Language Modeling Using Part-of-speech and Long Short-Term Memory Networks

2019 9th International Conference on Computer and Knowledge Engineering (ICCKE) ◽

10.1109/iccke48569.2019.8964806 ◽

2019 ◽

Author(s):

Sanaz Saki Norouzi ◽

Ahmad Akbari ◽

Babak Nasersharif

Keyword(s):

Short Term Memory ◽

Language Modeling ◽

Short Term ◽

Term Memory ◽

Part Of Speech ◽

Long Short Term Memory

Download Full-text

Building robust models for Human Activity Recognition from raw accelerometers data using Gated Recurrent Units and Long Short Term Memory Neural Networks

2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) ◽

10.1109/embc.2019.8857288 ◽

2019 ◽

Cited By ~ 1

Author(s):

Jeremiah Okai ◽

Stylianos Paraschiakos ◽

Marian Beekman ◽

Arno Knobbe ◽

Claudio Rebelo de Sa

Keyword(s):

Neural Networks ◽

Activity Recognition ◽

Human Activity ◽

Short Term Memory ◽

Human Activity Recognition ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Gated Recurrent Units

Download Full-text

JAZZ MELODY GENERATION USING RECURRENT NETWORKS AND REINFORCEMENT LEARNING

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213006002849 ◽

2006 ◽

Vol 15 (04) ◽

pp. 623-650

Author(s):

JUDY A. FRANKLIN

Keyword(s):

Reinforcement Learning ◽

Dynamic Systems ◽

Recurrent Neural Networks ◽

Short Term Memory ◽

State Of The Art ◽

Recurrent Network ◽

Recurrent Networks ◽

Short Term ◽

Long Short Term Memory ◽

Lstm Network

Recurrent (neural) networks have been deployed as models for learning musical processes, by computational scientists who study processes such as dynamic systems. Over time, more intricate music has been learned as the state of the art in recurrent networks improves. One particular recurrent network, the Long Short-Term Memory (LSTM) network shows promise for learning long songs, and generating new songs. We are experimenting with a module containing two inter-recurrent LSTM networks to cooperatively learn several human melodies, based on the songs' harmonic structures, and on the feedback inherent in the network. We show that these networks can learn to reproduce four human melodies. We then present as input new harmonizations, so as to generate new songs. We describe the reharmonizations, and show the new melodies that result. We also present a hierarchical structure for using reinforcement learning to choose LSTM modules during the course of melody generation.

Download Full-text

Abstractive Arabic Text Summarization Based on Deep Learning

Computational Intelligence and Neuroscience ◽

10.1155/2022/1566890 ◽

2022 ◽

Vol 2022 ◽

pp. 1-14

Author(s):

Y.M. Wazery ◽

Marwa E. Saleh ◽

Abdullah Alharbi ◽

Abdelmgeid A. Ali

Keyword(s):

Short Term Memory ◽

Attention Mechanism ◽

Text Summarization ◽

Arabic Text ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Arabic Text Summarization ◽

Abstractive Summarization ◽

Research Studies

Text summarization (TS) is considered one of the most difficult tasks in natural language processing (NLP). It is one of the most important challenges that stand against the modern computer system’s capabilities with all its new improvement. Many papers and research studies address this task in literature but are being carried out in extractive summarization, and few of them are being carried out in abstractive summarization, especially in the Arabic language due to its complexity. In this paper, an abstractive Arabic text summarization system is proposed, based on a sequence-to-sequence model. This model works through two components, encoder and decoder. Our aim is to develop the sequence-to-sequence model using several deep artificial neural networks to investigate which of them achieves the best performance. Different layers of Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM) have been used to develop the encoder and the decoder. In addition, the global attention mechanism has been used because it provides better results than the local attention mechanism. Furthermore, AraBERT preprocess has been applied in the data preprocessing stage that helps the model to understand the Arabic words and achieves state-of-the-art results. Moreover, a comparison between the skip-gram and the continuous bag of words (CBOW) word2Vec word embedding models has been made. We have built these models using the Keras library and run-on Google Colab Jupiter notebook to run seamlessly. Finally, the proposed system is evaluated through ROUGE-1, ROUGE-2, ROUGE-L, and BLEU evaluation metrics. The experimental results show that three layers of BiLSTM hidden states at the encoder achieve the best performance. In addition, our proposed system outperforms the other latest research studies. Also, the results show that abstractive summarization models that use the skip-gram word2Vec model outperform the models that use the CBOW word2Vec model.

Download Full-text

Text Summarization on Telugu e-News based on Long-Short Term Memory with Rectified Adam Optimizer

International Journal of Computing and Digital Systems ◽

10.12785/ijcds/110130 ◽

2022 ◽

Vol 11 (1) ◽

pp. 355-368

Author(s):

Kishore Kumar Mamidala ◽

Suresh Kumar Sanampudi

Keyword(s):

Short Term Memory ◽

Text Summarization ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

Download Full-text