Long Short-Term Memory with Dynamic Skip Connections

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016481 ◽

2019 ◽

Vol 33 ◽

pp. 6481-6488 ◽

Cited By ~ 3

Author(s):

Tao Gui ◽

Qi Zhang ◽

Lujun Zhao ◽

Yaosong Lin ◽

Minlong Peng ◽

...

Keyword(s):

Language Processing ◽

Short Term Memory ◽

Training Data ◽

Sequential Data ◽

Short Term ◽

Term Memory ◽

Transition Functions ◽

Proposed Model ◽

Long Short Term Memory

In recent years, long short-term memory (LSTM) has been successfully used to model sequential data of variable length. However, LSTM can still experience difficulty in capturing long-term dependencies. In this work, we tried to alleviate this problem by introducing a dynamic skip connection, which can learn to directly connect two dependent words. Since there is no dependency information in the training data, we propose a novel reinforcement learning-based method to model the dependency relationship and connect dependent words. The proposed model computes the recurrent transition functions based on the skip connections, which provides a dynamic skipping advantage over RNNs that always tackle entire sentences sequentially. Our experimental results on three natural language processing tasks demonstrate that the proposed method can achieve better performance than existing methods. In the number prediction experiment, the proposed model outperformed LSTM with respect to accuracy by nearly 20%.

Download Full-text

An LSTM-Based Method with Attention Mechanism for Travel Time Prediction

Sensors ◽

10.3390/s19040861 ◽

2019 ◽

Vol 19 (4) ◽

pp. 861 ◽

Cited By ~ 21

Author(s):

Xiangdong Ran ◽

Zhiguang Shan ◽

Yufei Fang ◽

Chuang Lin

Keyword(s):

Short Term Memory ◽

Attention Mechanism ◽

Traffic Prediction ◽

Travel Time Prediction ◽

Short Term ◽

Term Memory ◽

Proposed Model ◽

Departure Time ◽

Long Short Term Memory

Traffic prediction is based on modeling the complex non-linear spatiotemporal traffic dynamics in road network. In recent years, Long Short-Term Memory has been applied to traffic prediction, achieving better performance. The existing Long Short-Term Memory methods for traffic prediction have two drawbacks: they do not use the departure time through the links for traffic prediction, and the way of modeling long-term dependence in time series is not direct in terms of traffic prediction. Attention mechanism is implemented by constructing a neural network according to its task and has recently demonstrated success in a wide range of tasks. In this paper, we propose an Long Short-Term Memory-based method with attention mechanism for travel time prediction. We present the proposed model in a tree structure. The proposed model substitutes a tree structure with attention mechanism for the unfold way of standard Long Short-Term Memory to construct the depth of Long Short-Term Memory and modeling long-term dependence. The attention mechanism is over the output layer of each Long Short-Term Memory unit. The departure time is used as the aspect of the attention mechanism and the attention mechanism integrates departure time into the proposed model. We use AdaGrad method for training the proposed model. Based on the datasets provided by Highways England, the experimental results show that the proposed model can achieve better accuracy than the Long Short-Term Memory and other baseline methods. The case study suggests that the departure time is effectively employed by using attention mechanism.

Download Full-text

An efficient sentiment analysis methodology based on long short-term memory networks

Complex & Intelligent Systems ◽

10.1007/s40747-021-00436-4 ◽

2021 ◽

Author(s):

J. Shobana ◽

M. Murali

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

Contextual Information ◽

Short Term ◽

Good Decision ◽

Term Memory ◽

Proposed Model ◽

Long Short Term Memory ◽

Current Research Interest

AbstractSentiment analysis is the process of determining the sentiment polarity (positivity, neutrality or negativity) of the text. As online markets have become more popular over the past decades, online retailers and merchants are asking their buyers to share their opinions about the products they have purchased. As a result, millions of reviews are generated daily, making it difficult to make a good decision about whether a consumer should buy a product. Analyzing these enormous concepts is difficult and time-consuming for product manufacturers. Deep learning is the current research interest in Natural language processing. In the proposed model, Skip-gram architecture is used for better feature extraction of semantic and contextual information of words. LSTM (long short-term memory) is used in the proposed model for understanding complex patterns in textual data. To improve the performance of the LSTM, weight parameters are optimized by the adaptive particle Swarm Optimization algorithm. Extensive experiments were conducted on four datasets proved that our proposed APSO-LSTM model secured higher accuracy over the classical methods such as traditional LSTM, ANN, and SVM. According to simulation results, the proposed model is outperforming other existing models in different metrics.

Download Full-text

LSTM based Ensemble Network to enhance the learning of long-term dependencies in chatbot

International Journal for Simulation and Multidisciplinary Design Optimization ◽

10.1051/smdo/2020019 ◽

2020 ◽

Vol 11 ◽

pp. 25

Author(s):

Shruti Patil ◽

Venkatesh M. Mudaliar ◽

Pooja Kamat ◽

Shilpa Gite

Keyword(s):

Short Term Memory ◽

Ensemble Methods ◽

Human Interaction ◽

Short Term ◽

Term Memory ◽

Ensemble Technique ◽

Specific Dimension ◽

Proposed Model ◽

Long Short Term Memory

A chatbot is a software that can reproduce a discussion portraying a specific dimension of articulation among people and machines utilizing Natural Human Language. With the advent of AI, chatbots have developed from being minor guideline-based models to progressively modern models. A striking highlight of the current chatbot frameworks is their capacity to maintain and support explicit highlights and settings of the discussions empowering them to have human interaction in real-time surroundings. The paper presents a detailed database concerning the models utilized to deal with the learning of long haul conditions in a chatbot. The paper proposes a novel crossbreed Long Short Term Memory based Ensemble model to retain the information in specific situations. The proposed model uses a characterized number of Long Short Term Memory Networks as a significant aspect of its working as one to create the aggregate forecast class for the information inquiry and conversation. We found that both of the ensemble methods LSTM and GRU work well in different dataset environments and the ensemble technique is an effective one in chatbot applications.

Download Full-text

A COMBINED DEEP LEARNING MODEL FOR PERSIAN SENTIMENT ANALYSIS

IIUM Engineering Journal ◽

10.31436/iiumej.v20i1.1036 ◽

2019 ◽

Vol 20 (1) ◽

pp. 129-139 ◽

Cited By ~ 2

Author(s):

Zahra Bokaee Nezhad ◽

Mohammad Ali Deihimi

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Proposed Model ◽

Long Short Term Memory ◽

Deep Learning Model

With increasing members in social media sites today, people tend to share their views about everything online. It is a convenient way to convey their messages to end users on a specific subject. Sentiment Analysis is a subfield of Natural Language Processing (NLP) that refers to the identification of users’ opinions toward specific topics. It is used in several fields such as marketing, customer services, etc. However, limited works have been done on Persian Sentiment Analysis. On the other hand, deep learning has recently become popular because of its successful role in several Natural Language Processing tasks. The objective of this paper is to propose a novel hybrid deep learning architecture for Persian Sentiment Analysis. According to the proposed model, local features are extracted by Convolutional Neural Networks (CNN) and long-term dependencies are learned by Long Short Term Memory (LSTM). Therefore, the model can harness both CNN's and LSTM's abilities. Furthermore, Word2vec is used for word representation as an unsupervised learning step. To the best of our knowledge, this is the first attempt where a hybrid deep learning model is used for Persian Sentiment Analysis. We evaluate the model on a Persian dataset that is introduced in this study. The experimental results show the effectiveness of the proposed model with an accuracy of 85%. ABSTRAK: Hari ini dengan ahli yang semakin meningkat di laman media sosial, orang cenderung untuk berkongsi pandangan mereka tentang segala-galanya dalam talian. Ini adalah cara mudah untuk menyampaikan mesej mereka kepada pengguna akhir mengenai subjek tertentu. Analisis Sentimen adalah subfield Pemprosesan Bahasa Semula Jadi yang merujuk kepada pengenalan pendapat pengguna ke arah topik tertentu. Ia digunakan dalam beberapa bidang seperti pemasaran, perkhidmatan pelanggan, dan sebagainya. Walau bagaimanapun, kerja-kerja terhad telah dilakukan ke atas Analisis Sentimen Parsi. Sebaliknya, pembelajaran mendalam baru menjadi popular kerana peranannya yang berjaya dalam beberapa tugas Pemprosesan Bahasa Asli (NLP). Objektif makalah ini adalah mencadangkan senibina pembelajaran hibrid yang baru dalam Analisis Sentimen Parsi. Menurut model yang dicadangkan, ciri-ciri tempatan ditangkap oleh Rangkaian Neural Convolutional (CNN) dan ketergantungan jangka panjang dipelajari oleh Long Short Term Memory (LSTM). Oleh itu, model boleh memanfaatkan kebolehan CNN dan LSTM. Selain itu, Word2vec digunakan untuk perwakilan perkataan sebagai langkah pembelajaran tanpa pengawasan. Untuk pengetahuan yang terbaik, ini adalah percubaan pertama di mana model pembelajaran mendalam hibrid digunakan untuk Analisis Sentimen Persia. Kami menilai model pada dataset Persia yang memperkenalkan dalam kajian ini. Keputusan eksperimen menunjukkan keberkesanan model yang dicadangkan dengan ketepatan 85%.

Download Full-text

Modeling Long-term Groundwater Levels By Exploring Deep Bidirectional Long Short-Term Memory using Hydro-climatic Data

Water Resources Management ◽

10.1007/s11269-021-02899-z ◽

2021 ◽

Author(s):

Sangita Dey ◽

Arabin Kumar Dey ◽

Rajesh Kumar Mall

Keyword(s):

Short Term Memory ◽

Climatic Data ◽

Short Term ◽

Term Memory ◽

Groundwater Levels ◽

Long Short Term Memory

Download Full-text

Air pollution forecasting application based on deep learning model and optimization algorithm

Clean Technologies and Environmental Policy ◽

10.1007/s10098-021-02080-5 ◽

2021 ◽

Author(s):

Azim Heydari ◽

Meysam Majidi Nezhad ◽

Davide Astiaso Garcia ◽

Farshid Keynia ◽

Livio De Santoli

Keyword(s):

Air Pollution ◽

Wind Speed ◽

Power Plant ◽

Air Temperature ◽

Short Term Memory ◽

Combined Cycle ◽

Short Term ◽

Term Memory ◽

Proposed Model ◽

Long Short Term Memory

AbstractAir pollution monitoring is constantly increasing, giving more and more attention to its consequences on human health. Since Nitrogen dioxide (NO2) and sulfur dioxide (SO2) are the major pollutants, various models have been developed on predicting their potential damages. Nevertheless, providing precise predictions is almost impossible. In this study, a new hybrid intelligent model based on long short-term memory (LSTM) and multi-verse optimization algorithm (MVO) has been developed to predict and analysis the air pollution obtained from Combined Cycle Power Plants. In the proposed model, long short-term memory model is a forecaster engine to predict the amount of produced NO2 and SO2 by the Combined Cycle Power Plant, where the MVO algorithm is used to optimize the LSTM parameters in order to achieve a lower forecasting error. In addition, in order to evaluate the proposed model performance, the model has been applied using real data from a Combined Cycle Power Plant in Kerman, Iran. The datasets include wind speed, air temperature, NO2, and SO2 for five months (May–September 2019) with a time step of 3-h. In addition, the model has been tested based on two different types of input parameters: type (1) includes wind speed, air temperature, and different lagged values of the output variables (NO2 and SO2); type (2) includes just lagged values of the output variables (NO2 and SO2). The obtained results show that the proposed model has higher accuracy than other combined forecasting benchmark models (ENN-PSO, ENN-MVO, and LSTM-PSO) considering different network input variables. Graphic abstract

Download Full-text

Sentence similarity evaluation using Sent2Vec and siamese neural network with parallel structure

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189593 ◽

2021 ◽

pp. 1-10

Author(s):

Hye-Jeong Song ◽

Tak-Sung Heo ◽

Jong-Dae Kim ◽

Chan-Young Park ◽

Yu-Seop Kim

Keyword(s):

Neural Network ◽

Language Processing ◽

Short Term Memory ◽

Parallel Structure ◽

Short Term ◽

Similarity Estimation ◽

Accurate Judgment ◽

Proposed Model ◽

Sentence Similarity ◽

Long Short Term Memory

Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. When two sentences are given, an accurate judgment should be made whether the meaning of the sentences is equivalent even if the words and contexts of the sentences are different. To this end, existing studies have measured the similarity of sentences by focusing on the analysis of words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec, a sentence embedding, as well as morpheme word embedding. Vectors representing words are input to the 1-dimension convolutional neural network (1D-CNN) with various sizes of kernels and bidirectional long short-term memory (Bi-LSTM). Self-attention is applied to the features transformed through Bi-LSTM. Subsequently, vectors undergoing 1D-CNN and self-attention are converted through global max pooling and global average pooling to extract specific values, respectively. The vectors generated through the above process are concatenated to the vector generated through Sent2Vec and are represented as a single vector. The vector is input to softmax layer, and finally, the similarity between the two sentences is determined. The proposed model can improve the accuracy by up to 5.42% point compared with the conventional sentence similarity estimation models.

Download Full-text

Urban micro-climate prediction through long short-term memory network with long-term monitoring for on-site building energy estimation

Sustainable Cities and Society ◽

10.1016/j.scs.2021.103227 ◽

2021 ◽

pp. 103227

Author(s):

Muxing Zhang ◽

Xiaosong Zhang ◽

Siyi Guo ◽

Xiaodong Xu ◽

Jiayu Chen ◽

...

Keyword(s):

Short Term Memory ◽

Building Energy ◽

Short Term ◽

Energy Estimation ◽

Term Memory ◽

Memory Network ◽

Long Short Term Memory ◽

Long Term Monitoring ◽

Term Monitoring

Download Full-text

Production Forecasting with the Interwell Interference by Integrating Graph Convolutional and Long Short-Term Memory Neural Network

SPE Reservoir Evaluation & Engineering ◽

10.2118/208596-pa ◽

2021 ◽

pp. 1-17

Author(s):

Enda Du ◽

Yuetian Liu ◽

Ziyan Cheng ◽

Liang Xue ◽

Jing Ma ◽

...

Keyword(s):

Neural Network ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Production Forecasting ◽

Temporal Correlations ◽

Proposed Model ◽

The Mean ◽

Long Short Term Memory ◽

The Impact

Summary Accurate production forecasting is an essential task and accompanies the entire process of reservoir development. With the limitation of prediction principles and processes, the traditional approaches are difficult to make rapid predictions. With the development of artificial intelligence, the data-driven model provides an alternative approach for production forecasting. To fully take the impact of interwell interference on production into account, this paper proposes a deep learning-based hybrid model (GCN-LSTM), where graph convolutional network (GCN) is used to capture complicated spatial patterns between each well, and long short-term memory (LSTM) neural network is adopted to extract intricate temporal correlations from historical production data. To implement the proposed model more efficiently, two data preprocessing procedures are performed: Outliers in the data set are removed by using a box plot visualization, and measurement noise is reduced by a wavelet transform. The robustness and applicability of the proposed model are evaluated in two scenarios of different data types with the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE). The results show that the proposed model can effectively capture spatial and temporal correlations to make a rapid and accurate oil production forecast.

Download Full-text

Ensemble-Based Feature Selection With Long Short-Term Memory for Classification of Network Intrusion

Advances in Social Networking and Online Communities - E-Collaboration Technologies and Strategies for Competitive Advantage Amid Challenging Times ◽

10.4018/978-1-7998-7764-6.ch008 ◽

2021 ◽

pp. 228-245

Author(s):

Preethi D. ◽

Neelu Khare

Keyword(s):

Feature Selection ◽

Performance Metrics ◽

Short Term Memory ◽

Short Term ◽

Chi Square ◽

Term Memory ◽

Network Intrusion ◽

Proposed Model ◽

Long Short Term Memory

This chapter presents an ensemble-based feature selection with long short-term memory (LSTM) model. A deep recurrent learning model is proposed for classifying network intrusion. This model uses ensemble-based feature selection (EFS) for selecting the appropriate features from the dataset and long short-term memory for the classification of network intrusions. The EFS combines five feature selection techniques, namely information gain, gain ratio, chi-square, correlation-based feature selection, and symmetric uncertainty-based feature selection. The experiments were conducted using the standard benchmark NSL-KDD dataset and implemented using tensor flow and python. The proposed model is evaluated using the classification performance metrics and also compared with all the 41 features without any feature selection as well as with each individual feature selection technique and classified using LSTM. The performance study showed that the proposed model performs better, with 99.8% accuracy, with a higher detection and lower false alarm rates.

Download Full-text