An Optimized Abstractive Text Summarization Model Using Peephole Convolutional LSTM

Symmetry ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 1290 ◽  
Author(s):  
Rahman ◽  
Siddiqui

Abstractive text summarization, which generates a summary by paraphrasing a long text, remains a significant open problem for natural language processing. In this paper, we present an abstractive text summarization model, the multi-layered attentional peephole convolutional LSTM (long short-term memory) (MAPCoL), which automatically generates a summary from a long text. We optimize the parameters of MAPCoL using central composite design (CCD) in combination with response surface methodology (RSM), which gives the highest accuracy in terms of summary generation. We record the accuracy of our model (MAPCoL) on the CNN/DailyMail dataset. We perform a comparative analysis of the accuracy of MAPCoL with that of state-of-the-art models in different experimental settings. MAPCoL also outperforms traditional LSTM-based models with respect to the semantic coherence of the output summary.
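The paper does not publish code, but the encoder side of such a model can be sketched in Keras. This is a minimal, hypothetical illustration: standard LSTM layers stand in for the authors' peephole variant, and all sizes (VOCAB, EMBED_DIM, UNITS, MAX_LEN) are assumptions, not the paper's settings.

```python
# Minimal sketch (not the authors' code) of a multi-layered
# convolutional-LSTM encoder with attention; sizes are assumed.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, EMBED_DIM, UNITS, MAX_LEN = 50_000, 128, 256, 400

inputs = layers.Input(shape=(MAX_LEN,))
x = layers.Embedding(VOCAB, EMBED_DIM)(inputs)
# Convolution extracts local n-gram features before the recurrent layers.
x = layers.Conv1D(UNITS, kernel_size=3, padding="same", activation="relu")(x)
# Stacked LSTMs; standard cells stand in for the peephole variant here.
x = layers.LSTM(UNITS, return_sequences=True)(x)
x = layers.LSTM(UNITS, return_sequences=True)(x)
# Self-attention over the encoded sequence.
context = layers.Attention()([x, x])
outputs = layers.Dense(VOCAB, activation="softmax")(context)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```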

Author(s):  
Xingjian Lai ◽  
Huanyi Shui ◽  
Jun Ni

Throughput bottlenecks define and constrain the productivity of a production line. Predicting future bottlenecks provides great support for decision-making on the factory floor, helping to foresee problems and formulate appropriate actions before production so as to improve system throughput in a cost-effective manner. Bottleneck prediction remains a challenging task in the literature. The difficulty lies in the complex dynamics of manufacturing systems: multiple factors jointly affect bottleneck conditions, such as machine performance, machine degradation, line structure, operator skill level, and product release schedules. These factors interact in a nonlinear manner and exhibit long-term temporal dependencies. State-of-the-art research uses various assumptions to simplify the modeling by reducing the input dimensionality; as a result, those models cannot accurately reflect the complex dynamics of the bottleneck in a manufacturing system. To tackle this problem, this paper proposes a systematic framework for designing a two-layer Long Short-Term Memory (LSTM) network tailored to the dynamic bottleneck prediction problem in multi-job manufacturing systems. This neural-network-based approach takes advantage of historical high-dimensional factory floor data to predict system bottlenecks dynamically while accounting for future production planning inputs. The model is demonstrated with data from an automotive underbody assembly line. The results show that the proposed method achieves higher prediction accuracy than current state-of-the-art approaches.
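As a concrete illustration, a two-layer LSTM of the kind described above can be set up in a few lines of Keras. The paper does not publish its exact configuration, so the station count, feature count, and lookback window below are hypothetical.

```python
# Hypothetical sketch of a two-layer LSTM bottleneck predictor; the
# station count, feature count, and lookback window are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

N_STATIONS, N_FEATURES, LOOKBACK = 12, 8, 48  # assumed dimensions

model = tf.keras.Sequential([
    layers.Input(shape=(LOOKBACK, N_STATIONS * N_FEATURES)),
    layers.LSTM(128, return_sequences=True),  # first recurrent layer
    layers.LSTM(64),                          # second recurrent layer
    # One probability per station: which machine is the next bottleneck.
    layers.Dense(N_STATIONS, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```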


2020 ◽  
Vol 23 (65) ◽  
pp. 124-135
Author(s):  
Imane Guellil ◽  
Marcelo Mendoza ◽  
Faical Azouaou

This paper presents an analytic study showing that it is entirely possible to analyze the sentiment of an Arabic dialect without constructing any resources. The idea of this work is to use the resources dedicated to a given dialect X for analyzing the sentiment of another dialect Y. The only condition is that X and Y belong to the same category of dialects. We apply this idea to the Algerian dialect, a Maghrebi Arabic dialect that suffers from a lack of the tools and other resources required for automatic sentiment analysis. To carry out this analysis, we rely on Maghrebi dialect resources and two manually annotated sentiment corpora, for the Tunisian and Moroccan dialects respectively. We also use a large corpus for the Maghrebi dialect. We use a state-of-the-art system and propose a new deep learning architecture for automatically classifying the sentiment of an Arabic dialect (Algerian). Experimental results show an F1-score of up to 83%, achieved by a Multilayer Perceptron (MLP) with the Tunisian corpus and by a Long Short-Term Memory (LSTM) network with the combination of the Tunisian and Moroccan corpora. An improvement of 15% over the closest competitor was observed in this study. Ongoing work aims at manually constructing an annotated sentiment corpus for the Algerian dialect and comparing the results.
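For orientation, the two classifier families compared above (an MLP over fixed-size features and an LSTM over token sequences) might be set up as follows in Keras; vocabulary size, sequence length, and layer widths are illustrative assumptions, not the paper's settings.

```python
# Minimal sketches of the two model families compared in the study;
# all sizes are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, MAX_LEN = 30_000, 100  # assumed

def lstm_classifier():
    # Sequence model: token ids -> embeddings -> LSTM -> sentiment.
    return tf.keras.Sequential([
        layers.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB, 100),
        layers.LSTM(128),
        layers.Dense(1, activation="sigmoid"),  # positive vs. negative
    ])

def mlp_classifier(input_dim):
    # Feed-forward model over a fixed-size text representation
    # (e.g. averaged word embeddings).
    return tf.keras.Sequential([
        layers.Input(shape=(input_dim,)),
        layers.Dense(256, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
```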


2018 ◽  
Vol 10 (11) ◽  
pp. 113 ◽  
Author(s):  
Yue Li ◽  
Xutao Wang ◽  
Pengjian Xu

Text classification is important in natural language processing, as massive volumes of text containing valuable information need to be classified into different categories for further use. In order to classify text better, our paper tries to build a deep learning model that achieves better classification results on Chinese text than other researchers' models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as the deep learning methods for classifying Chinese text. LSTM is a special kind of recurrent neural network (RNN) that is capable of processing serialized information through its recurrent structure. By contrast, CNNs have shown their ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated into our new model: the BLSTM-C model (BLSTM stands for bi-directional long short-term memory, while C stands for CNN). The BLSTM was responsible for obtaining a sequence output based on past and future contexts, which was then input to the convolutional layer for feature extraction. In our experiments, the proposed BLSTM-C model was evaluated in several ways. The results show that the model exhibits remarkable performance in text classification, especially on Chinese texts.
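The described stack (two bidirectional LSTM layers feeding one convolutional layer) translates naturally into Keras. The sketch below is an assumed configuration; embedding dimension, layer widths, kernel size, and class count are illustrative, not taken from the paper.

```python
# Sketch of the BLSTM-C idea: BLSTM layers encode past/future context,
# then a CNN layer extracts local features from that sequence output.
# All sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, MAX_LEN, N_CLASSES = 40_000, 200, 10  # assumed

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB, 128),
    # Two bidirectional LSTM layers.
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    # One convolutional layer over the BLSTM output sequence.
    layers.Conv1D(128, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```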


Author(s):  
Tao Gui ◽  
Qi Zhang ◽  
Lujun Zhao ◽  
Yaosong Lin ◽  
Minlong Peng ◽  
...  

In recent years, long short-term memory (LSTM) has been successfully used to model sequential data of variable length. However, LSTM can still experience difficulty in capturing long-term dependencies. In this work, we tried to alleviate this problem by introducing a dynamic skip connection, which can learn to directly connect two dependent words. Since there is no dependency information in the training data, we propose a novel reinforcement learning-based method to model the dependency relationship and connect dependent words. The proposed model computes the recurrent transition functions based on the skip connections, which provides a dynamic skipping advantage over RNNs that always tackle entire sentences sequentially. Our experimental results on three natural language processing tasks demonstrate that the proposed method can achieve better performance than existing methods. In the number prediction experiment, the proposed model outperformed LSTM with respect to accuracy by nearly 20%.
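To make the skip-connection idea concrete, the sketch below shows a recurrent step that may transition from any of the last K hidden states rather than only the immediately preceding one. A plain softmax scorer with a hard argmax stands in for the paper's reinforcement-learned skip policy (the hard choice is precisely what makes the transition non-differentiable and motivates RL in the paper); all names and sizes are hypothetical.

```python
# Conceptual sketch (not the authors' code): an LSTM step whose previous
# state is chosen from the last K states via a learned scorer. The hard
# argmax stands in for the paper's RL-trained skip agent.
import tensorflow as tf

K, UNITS = 5, 64  # assumed
cell = tf.keras.layers.LSTMCell(UNITS)
scorer = tf.keras.layers.Dense(1)  # scores each candidate past state

def skip_step(x_t, past_h, past_c):
    """x_t: (batch, input_dim); past_h, past_c: lists of K (batch, UNITS)."""
    scores = tf.concat([scorer(h) for h in past_h], axis=-1)  # (batch, K)
    choice = tf.argmax(scores, axis=-1)                       # hard skip
    h_prev = tf.gather(tf.stack(past_h, axis=1), choice, batch_dims=1)
    c_prev = tf.gather(tf.stack(past_c, axis=1), choice, batch_dims=1)
    out, (h_t, c_t) = cell(x_t, [h_prev, c_prev])
    return out, h_t, c_t
```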


Author(s):  
Satish Tirumalapudi

Abstract: Chatbots are software applications that help users communicate with a machine and obtain the required result; this is where Natural Language Processing (NLP) comes into the picture. Natural language processing is based on deep learning, which enables computers to acquire meaning from the inputs given by users. Natural language processing techniques make it possible to express ideas in natural language, thus drastically increasing accessibility. NLP engines rely on the elements of intent, utterance, entity, context, and session. In this project, we use deep learning techniques trained on a dataset containing categories, patterns, and responses. Long Short-Term Memory (LSTM) is a recurrent neural network architecture capable of learning order dependence in sequence prediction problems, and it is one of the most popular RNN approaches for identifying and controlling dynamic systems. We use an RNN to classify the category a user's message belongs to and then give a response from that category's list of responses. Keywords: NLP – Natural Language Processing, LSTM – Long Short-Term Memory, RNN – Recurrent Neural Networks.
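The classify-then-respond pipeline described above can be sketched in a few lines; the model shape and the reply helper below are hypothetical illustrations, with all sizes assumed.

```python
# Hypothetical sketch: an LSTM classifies the intent category of a user
# message; a response is then drawn from that category's canned replies.
import random
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, MAX_LEN, N_INTENTS = 2_000, 20, 8  # assumed

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB, 64),
    layers.LSTM(64),
    layers.Dense(N_INTENTS, activation="softmax"),
])

def reply(message_ids, responses_by_intent):
    """message_ids: padded token-id vector of length MAX_LEN."""
    intent = int(tf.argmax(model(message_ids[None, :]), axis=-1)[0])
    return random.choice(responses_by_intent[intent])
```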


2022 ◽  
Vol 2022 ◽  
pp. 1-14
Author(s):  
Y.M. Wazery ◽  
Marwa E. Saleh ◽  
Abdullah Alharbi ◽  
Abdelmgeid A. Ali

Text summarization (TS) is considered one of the most difficult tasks in natural language processing (NLP). It is one of the major challenges that stand against the capabilities of modern computer systems, despite all their recent improvements. Many papers and research studies in the literature address this task, but most of them concern extractive summarization; few tackle abstractive summarization, especially for the Arabic language, owing to its complexity. In this paper, an abstractive Arabic text summarization system is proposed, based on a sequence-to-sequence model. This model works through two components, an encoder and a decoder. Our aim is to develop the sequence-to-sequence model using several deep artificial neural networks to investigate which of them achieves the best performance. Different layers of Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM) have been used to develop the encoder and the decoder. In addition, the global attention mechanism has been used because it provides better results than the local attention mechanism. Furthermore, AraBERT preprocessing has been applied in the data preprocessing stage, which helps the model understand Arabic words and achieve state-of-the-art results. Moreover, a comparison between the skip-gram and continuous bag-of-words (CBOW) word2vec word embedding models has been made. We built these models using the Keras library and ran them on Google Colab Jupyter notebooks. Finally, the proposed system is evaluated with the ROUGE-1, ROUGE-2, ROUGE-L, and BLEU evaluation metrics. The experimental results show that three layers of BiLSTM hidden states at the encoder achieve the best performance. In addition, our proposed system outperforms the other latest research studies. The results also show that abstractive summarization models that use the skip-gram word2vec model outperform the models that use the CBOW word2vec model.
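Since the paper builds its models with Keras, the encoder-decoder wiring it describes can be sketched roughly as below: a BiLSTM encoder, an LSTM decoder, and Luong-style global attention. Layer sizes and the single-layer depth are assumptions for brevity (the paper's best model stacks three BiLSTM layers).

```python
# Condensed sketch of a BiLSTM encoder / LSTM decoder with global
# (Luong-style) attention in Keras; sizes and depth are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, EMBED, UNITS = 30_000, 128, 256  # assumed

enc_in = layers.Input(shape=(None,))
enc_emb = layers.Embedding(VOCAB, EMBED)(enc_in)
enc_seq = layers.Bidirectional(layers.LSTM(UNITS, return_sequences=True))(enc_emb)

dec_in = layers.Input(shape=(None,))
dec_emb = layers.Embedding(VOCAB, EMBED)(dec_in)
dec_seq = layers.LSTM(2 * UNITS, return_sequences=True)(dec_emb)

# Global attention: every decoder state attends over all encoder states.
context = layers.Attention()([dec_seq, enc_seq])
merged = layers.Concatenate()([dec_seq, context])
out = layers.Dense(VOCAB, activation="softmax")(merged)

model = tf.keras.Model([enc_in, dec_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```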


Author(s):  
Jing Wang ◽  
Yingwei Pan ◽  
Ting Yao ◽  
Jinhui Tang ◽  
Tao Mei

Image paragraph generation is the task of producing a coherent story (usually a paragraph) that describes the visual content of an image. The problem is nevertheless not trivial, especially when there are multiple descriptive and diverse gists to be considered for paragraph generation, as often happens in real images. A valid question is how to encapsulate such gists/topics that are worthy of mention from an image, and then describe the image from one topic to another, yet holistically, with a coherent structure. In this paper, we present a new design, Convolutional Auto-Encoding (CAE), that employs a purely convolutional and deconvolutional auto-encoding framework for topic modeling on the region-level features of an image. Furthermore, we propose an architecture, namely CAE plus Long Short-Term Memory (dubbed CAE-LSTM), that integrates the learnt topics in support of paragraph generation in a novel way. Technically, CAE-LSTM capitalizes on a two-level LSTM-based paragraph generation framework with an attention mechanism. The paragraph-level LSTM captures the inter-sentence dependency in a paragraph, while the sentence-level LSTM generates one sentence conditioned on each learnt topic. Extensive experiments are conducted on the Stanford image paragraph dataset, and superior results are reported compared to state-of-the-art approaches. More remarkably, CAE-LSTM increases CIDEr performance from 20.93% to 25.15%.
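The two-level generation scheme can be sketched as follows: a paragraph-level LSTM produces one state per sentence from the CAE topics, and a sentence-level LSTM decodes words conditioned on each state. This is a simplified illustration (no attention, teacher forcing, or beam search), with all dimensions assumed.

```python
# Simplified sketch of two-level paragraph generation: a paragraph-level
# LSTM emits one state per sentence, and a sentence-level LSTM decodes
# each sentence from it. Dimensions and names are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

N_TOPICS, TOPIC_DIM, UNITS, VOCAB, SENT_LEN = 6, 256, 512, 10_000, 20

topics_in = layers.Input(shape=(N_TOPICS, TOPIC_DIM))  # from the CAE
# Paragraph-level LSTM: one hidden state per sentence to be generated.
para_states = layers.LSTM(UNITS, return_sequences=True)(topics_in)
# Sentence-level LSTM decodes words for each sentence state; here the
# state is simply repeated across word positions as a conditioning input.
words = layers.TimeDistributed(
    layers.RepeatVector(SENT_LEN))(para_states)         # (B, S, L, UNITS)
sent_seq = layers.TimeDistributed(
    layers.LSTM(UNITS, return_sequences=True))(words)
logits = layers.Dense(VOCAB, activation="softmax")(sent_seq)
model = tf.keras.Model(topics_in, logits)
```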


Author(s):  
Yudi Widhiyasana ◽  
Transmissia Semiawan ◽  
Ilham Gibran Achmad Mudzakir ◽  
Muhammad Randi Noor

Text classification has become a widely researched field, particularly in relation to Natural Language Processing (NLP). Many methods can be used for text classification, one of which is deep learning. RNNs, CNNs, and LSTMs are among the deep learning methods commonly used to classify text. This paper aims to analyze the application of a combination of two deep learning methods, namely CNN and LSTM (C-LSTM). The combination of these two methods is used to classify Indonesian-language news texts. The data used are Indonesian news texts collected from Indonesian-language news portals. The collected data were grouped into three news categories based on their scope: "Nasional" (national), "Internasional" (international), and "Regional" (regional). In this paper, experiments were conducted on three research variables: the number of documents, the batch size, and the learning rate of the constructed C-LSTM. The experimental results show that the F1-score obtained from classification using the C-LSTM method is 93.27%. This F1-score is higher than that of CNN, at 89.85%, and of LSTM, at 90.87%. It can therefore be concluded that the combination of the two deep learning methods, CNN and LSTM (C-LSTM), performs better than CNN or LSTM alone.
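A C-LSTM of the kind evaluated above, ending in the three news categories, might look like the following Keras sketch; vocabulary size, sequence length, kernel size, and layer widths are assumptions.

```python
# Sketch of the C-LSTM combination: a CNN front end for local n-gram
# features, then an LSTM over the pooled feature sequence, ending in
# the three news categories. Sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, MAX_LEN = 20_000, 300  # assumed
CLASSES = 3  # "Nasional", "Internasional", "Regional"

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB, 128),
    layers.Conv1D(128, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.LSTM(128),
    layers.Dense(CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```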


2021 ◽  
Vol 25 (3) ◽  
pp. 1671-1687
Author(s):  
Andreas Wunsch ◽  
Tanja Liesch ◽  
Stefan Broda

Abstract. It is now well established that shallow artificial neural networks (ANNs) can provide accurate and reliable groundwater level forecasts, which are an important tool for sustainable groundwater management. However, we observe an increasing shift from conventional shallow ANNs to state-of-the-art deep-learning (DL) techniques, while a direct comparison of their performance is often lacking. Although they have already clearly proven their suitability, shallow recurrent networks frequently seem to be excluded from study designs due to the euphoria about new DL techniques and their successes in various disciplines. Therefore, we aim to provide an overview of the groundwater-level predictive ability of shallow conventional recurrent ANNs, namely non-linear autoregressive networks with exogenous input (NARX), and of popular state-of-the-art DL techniques such as long short-term memory (LSTM) and convolutional neural networks (CNNs). We compare the performance on both sequence-to-value (seq2val) and sequence-to-sequence (seq2seq) forecasting over a 4-year period while using only a few widely available and easy-to-measure meteorological input parameters, which makes our approach widely applicable. Further, we investigate the data dependency of the different ANN architectures in terms of time series length. For seq2val forecasts, NARX models perform best on average; however, CNNs are much faster and only slightly worse in terms of accuracy. For seq2seq forecasts, NARX mostly outperforms both DL models and almost reaches the speed of CNNs. However, NARX is the least robust against initialization effects, which can nevertheless be handled easily using ensemble forecasting. We showed that shallow neural networks such as NARX should not be neglected in comparison to DL techniques, especially when only small amounts of training data are available; there they can clearly outperform LSTMs and CNNs. With a larger dataset, where DL can really demonstrate its strengths, LSTMs and CNNs might perform substantially better, but such datasets are rarely available in the groundwater domain.
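For reference, seq2val forecasters of the two DL kinds compared above reduce to very small models: a lookback window of meteorological inputs maps to a single future groundwater level. The sketch below is illustrative only; the step count and input variables are assumptions.

```python
# Illustrative seq2val groundwater-level forecasters: a lookback window
# of meteorological inputs maps to one future level. Sizes are assumed.
import tensorflow as tf
from tensorflow.keras import layers

LOOKBACK, N_MET = 52, 4  # e.g. weekly steps; precipitation, temperature, ...

lstm_seq2val = tf.keras.Sequential([
    layers.Input(shape=(LOOKBACK, N_MET)),
    layers.LSTM(64),
    layers.Dense(1),  # one groundwater level value
])

cnn_seq2val = tf.keras.Sequential([
    layers.Input(shape=(LOOKBACK, N_MET)),
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(1),
])
```

In line with the abstract, the NARX models' sensitivity to initialization would in practice be handled by training several such models with different seeds and averaging their forecasts (ensemble forecasting).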

