Attention-Based LSTM with Filter Mechanism for Entity Relation Classification

Relation classification is an important research area in the field of natural language processing (NLP), which aims to recognize the relationship between two tagged entities in a sentence. The noise caused by irrelevant words and the word distance between the tagged entities may affect the relation classification accuracy. In this paper, we present a novel model, the multi-head attention long short-term memory (LSTM) network with filter mechanism (MALNet), to extract text features and classify the relation between two entities in a sentence. In particular, we combine LSTM with an attention mechanism to obtain shallow local information and introduce a filter layer based on the attention mechanism to strengthen the available information. In addition, we design a semantic rule for marking the key word between the target words and construct a key word layer to extract its semantic information. We evaluated the performance of our model on the SemEval-2010 Task 8 and KBP-37 datasets. We achieved an F1-score of 86.3% on SemEval-2010 Task 8 and 61.4% on KBP-37, which shows that our method is superior to the previous state-of-the-art methods.
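As an illustration of the general idea of attention over LSTM hidden states, the following is a minimal sketch in plain Python. It is not the MALNet architecture: the dot-product scoring, the `query` vector, and the toy hidden states are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Score each hidden state against `query` (dot product), normalize the
    scores with softmax, and return (weights, weighted sum of states)."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Three hypothetical 2-dimensional hidden states, one per token.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[1.0, 1.0])
```

Tokens whose hidden state aligns better with the query receive a larger weight, so irrelevant words contribute less to the pooled sentence representation.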

A Sentence-Level Joint Relation Classification Model Based on Reinforcement Learning

Computational Intelligence and Neuroscience ◽

10.1155/2021/5557184 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Zhen Liu ◽

XiaoQiang Di ◽

Wei Song ◽

WeiWu Ren

Keyword(s):

Reinforcement Learning ◽

Language Processing ◽

Semantic Processing ◽

Large Scale ◽

Short Term Memory ◽

Attention Mechanism ◽

Training Data ◽

Classification Model ◽

Sentence Level ◽

Relation Classification

Relation classification is an important semantic processing task in the field of natural language processing (NLP). Large-scale training data are generally generated automatically through distant supervision, which inevitably introduces label noise. Another challenge is that important information can appear anywhere in the sentence. This paper presents a sentence-level joint relation classification model. The model has two modules: a reinforcement learning (RL) agent and a joint network model. In particular, we combine bidirectional long short-term memory (Bi-LSTM) and an attention mechanism as a joint model to process the text features of sentences and classify the relation between two entities. At the same time, we introduce an attention mechanism to discover hidden information in sentences. The joint training of the two modules addresses the noise problem in relation extraction, sentence-level information extraction, and relation classification. Experimental results demonstrate that the model can effectively deal with data noise and achieve better relation classification performance at the sentence level.

LAR: A User Behavior Prediction Model in Server Log Based on LSTM-Attention Network and RSC Algorithm

Fuzzy Systems and Data Mining VI - Frontiers in Artificial Intelligence and Applications ◽

10.3233/faia200709 ◽

2020 ◽

Author(s):

Yingying Shang

Keyword(s):

Clustering Algorithm ◽

Short Term Memory ◽

User Behavior ◽

Absolute Error ◽

Research Area ◽

Behavior Prediction ◽

Attention Network ◽

Important Research Area ◽

User Access ◽

Single User

Using server log data to predict the URLs that a user is likely to visit is an important research area in user behavior prediction. In this paper, a predictive model (called LAR) based on the long short-term memory (LSTM) attention network and the reciprocal-nearest-neighbors supported clustering (RSC) algorithm is proposed for predicting URLs. First, the LSTM-attention network is used to predict the URL categories a user might visit, and the RSC algorithm is then used to cluster users. Subsequently, the URLs belonging to the same category are determined from the user clusters to predict the URLs that the user might visit. The proposed LAR model considers both the time sequence of the URLs a user accesses and the relationship between a single user and groups of users, which effectively improves the prediction accuracy. The experimental results demonstrate that the LAR model is feasible and effective for user behavior prediction. The mean absolute error and root mean square error of the LAR model are better than those of the other models compared in this study.
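For reference, the two error metrics named in the abstract can be computed as in the sketch below. The sample values are hypothetical and are not taken from the LAR experiments.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference between paired values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical targets and predictions; the absolute errors are 1, 0, 2, 1.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 6.0]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are often reported together.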

Predicting Outcomes of Business Process Executions Based on LSTM Neural Networks and Attention Mechanism

10.21203/rs.3.rs-260970/v1 ◽

2021 ◽

Author(s):

Jiaojiao Wang ◽

Dongjin Yu ◽

Chengfei Liu ◽

Xiaoxiao Sun

Keyword(s):

Short Term Memory ◽

Attention Mechanism ◽

Sequential Data ◽

Time Prediction ◽

Prediction Time ◽

Highly Sensitive ◽

Long Short Term Memory ◽

Early Decision ◽

Lstm Network ◽

Hidden Layer

Effectively predicting the outcome of an ongoing process instance helps in making early decisions, which plays an important role in so-called predictive process monitoring. Existing methods in this field are tailor-made for certain empirical operations such as prefix extraction, clustering, and encoding, so their relative accuracy is highly sensitive to the dataset. Moreover, they have limitations in real-time prediction applications due to their lengthy prediction time. Since the Long Short-Term Memory (LSTM) neural network provides high precision in the prediction of sequential data in several areas, this paper investigates LSTM and its enhancements and proposes three different approaches to build more effective and efficient models for outcome prediction. The first enhancement combines the original LSTM network in two directions, forward and backward, to capture more features from the completed cases. The second enhancement adds an attention mechanism after feature extraction in the hidden layer of the LSTM network to distinguish the features by their attention weights. A series of extensive experiments on twelve real datasets compares our approaches with others. The results show that our approaches outperform the state-of-the-art ones in terms of prediction effectiveness and time performance.

CWPC_BiAtt: Character–Word–Position Combined BiLSTM-Attention for Chinese Named Entity Recognition

Information ◽

10.3390/info11010045 ◽

2020 ◽

Vol 11 (1) ◽

pp. 45 ◽

Cited By ~ 1

Author(s):

Shardrom Johnson ◽

Sherlock Shen ◽

Yuanchen Liu

Keyword(s):

Language Processing ◽

Short Term Memory ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Attention Mechanism ◽

Entity Recognition ◽

Position Information ◽

Named Entity ◽

Pos Tagging ◽

Word Position

Usually taken as linguistic features by Part-Of-Speech (POS) tagging, Named Entity Recognition (NER) is a major task in Natural Language Processing (NLP). In this paper, we put forward a new comprehensive-embedding that considers three aspects, namely character-embedding, word-embedding, and pos-embedding, stitched together in that order to capture their dependencies. Based on it, we propose a new Character–Word–Position Combined BiLSTM-Attention (CWPC_BiAtt) model for the Chinese NER task. Passing the comprehensive-embedding through the Bidirectional Long Short-Term Memory (BiLSTM) layer captures the connection between historical and future information; the attention mechanism then captures the connection between the content of the sentence at the current position and that at any other location. Finally, we utilize a Conditional Random Field (CRF) to decode the entire tagging sequence. Experiments show that the proposed CWPC_BiAtt model is well qualified for the NER task on the Microsoft Research Asia (MSRA) dataset and the Weibo NER corpus. High precision and recall were obtained, which verified the stability of the model. Position-embedding in the comprehensive-embedding compensates for the attention mechanism by providing position information for the disordered sequence, which shows that the comprehensive-embedding has completeness. Looking at the entire model, our proposed CWPC_BiAtt has three distinct characteristics: completeness, simplicity, and stability. Our proposed CWPC_BiAtt model achieved the highest F-score, reaching state-of-the-art performance on the MSRA dataset and the Weibo NER corpus.
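The final CRF decoding step is commonly carried out with the Viterbi algorithm. The sketch below assumes that is the decoder in question (the abstract does not say) and uses hypothetical emission and transition scores over two tags; in a real model the emissions would come from the BiLSTM-attention layers and the transitions would be learned.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    num_tags = len(emissions[0])
    best = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for step in emissions[1:]:
        new_best, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for arriving at tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + step[j])
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag.
    tag = max(range(num_tags), key=lambda j: best[j])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path

# Hypothetical scores for 3 characters and 2 tags (0 = "O", 1 = "ENT").
emissions = [[2.0, 0.5], [1.0, 1.5], [0.5, 2.0]]
transitions = [[0.5, 0.0], [-1.0, 1.0]]
path = viterbi_decode(emissions, transitions)
```

Because the transition scores are part of the path score, the decoder picks the best sequence as a whole rather than the best tag at each position independently.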

Prediction of Hot Topics of Agricultural Public Opinion Based on Attention Mechanism LSTM Model

International Journal of Agricultural and Environmental Information Systems ◽

10.4018/ijaeis.289429 ◽

2021 ◽

Vol 12 (4) ◽

pp. 1-16

Author(s):

Lifang Fu ◽

Feifei Zhao

Keyword(s):

Public Opinion ◽

Short Term Memory ◽

Classical Swine Fever ◽

Attention Mechanism ◽

Evaluation Indexes ◽

The Public ◽

Changing Trend ◽

Lstm Network ◽

Long Term Trend

In order to analyze the focus and appeal of public opinion on the Internet in a timely and accurate manner, an LSTM-ATTN model was proposed to extract hot topics and predict their changing trend based on tens of thousands of news and commentary messages. First, an improved LDA model was used to extract hot words and classify the hot topics. To describe the detailed characteristics and long-term trend of topic popularity more accurately, a prediction model based on an attention-mechanism Long Short-Term Memory (LSTM) network, named the LSTM-ATTN model, is proposed. A large number of numerical experiments were carried out using the public opinion information of the "African classical swine fever" event in China. According to the results of the evaluation indexes, the relative superiority of the LSTM-ATTN model was demonstrated. It can capture and reflect the inherent characteristics and periodic fluctuations of the agricultural public opinion information. It also has higher convergence efficiency and prediction accuracy.

Vehicle Destination Prediction Using Bidirectional LSTM with Attention Mechanism

Sensors ◽

10.3390/s21248443 ◽

2021 ◽

Vol 21 (24) ◽

pp. 8443

Author(s):

Pietro Casabianca ◽

Yu Zhang ◽

Miguel Martínez-García ◽

Jiafu Wan

Keyword(s):

Language Processing ◽

Traffic Congestion ◽

Short Term Memory ◽

Satellite Navigation ◽

Attention Mechanism ◽

The Other ◽

Transportation Industry ◽

Navigation Data ◽

Average Accuracy ◽

Bidirectional Lstm

Satellite navigation has become ubiquitous for planning and tracking travel. Having access to a vehicle’s position enables the prediction of its destination. This opens up various benefits, such as early warnings of potential hazards, route diversions to bypass traffic congestion, and optimized fuel consumption for hybrid vehicles. Thus, reliably predicting destinations can bring benefits to the transportation industry. This paper investigates using deep learning methods for predicting a vehicle’s destination based on its journey history. With this aim, Dense Neural Networks (DNNs), Long Short-Term Memory (LSTM) networks, Bidirectional LSTM (BiLSTM), and networks with and without attention mechanisms are tested. LSTM and BiLSTM models with attention mechanisms are commonly used for natural language processing and text-classification-related applications. This paper, in contrast, demonstrates the viability of these techniques in the automotive and associated industrial domain, aimed at generating industrial impact. The results of using satellite navigation data show that the BiLSTM with an attention mechanism exhibits better destination prediction performance, achieving an average accuracy of 96% against the test set (4% higher than the average accuracy of the standard BiLSTM) and consistently outperforming the other models by maintaining robustness and stability during forecasting.

Sentimental Classification of News Headlines using Recurrent Neural Network

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f3573.049620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 207-210

Keyword(s):

Neural Network ◽

Language Processing ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Attention Mechanism ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Negative Comments ◽

News Headlines

Sentiment analysis combines natural language processing with text analysis to predict the sentiment of a text in terms of positive and negative comments. Nowadays, a tremendous volume of news originates from different webpages, and it is feasible to determine the opinion expressed in a particular news item. This work evaluates various machine learning techniques for classifying the sentiment of news headlines. In this project, we propose applying a Recurrent Neural Network with Long Short-Term Memory (LSTM) units, focusing on finding similar news headlines and predicting the opinion of news headlines from numerous sources. The main objective is to classify the sentiment of news headlines from various sources using a recurrent neural network. Interestingly, the proposed attention mechanism performs better than the more complex attention mechanism on a held-out set of articles.

Truncated attention mechanism and cascade loss for cross-modal person re-identification

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210382 ◽

2021 ◽

pp. 1-13

Author(s):

Shuo Shi ◽

Changwei Huo ◽

Yingchun Guo ◽

Stephen Lean ◽

Gang Yan ◽

...

Keyword(s):

Natural Language ◽

Short Term Memory ◽

Principal Component ◽

Image Features ◽

Attention Mechanism ◽

Text And Image ◽

Level Information ◽

Text Features ◽

Lstm Network ◽

Language Description

Person re-identification with natural language description is the process of retrieving the corresponding person’s image from an image dataset according to a text description of the person. The key challenge in this cross-modal task is to extract visual and text features and construct loss functions to achieve cross-modal matching between text and image. First, we designed a two-branch network framework for person re-identification with natural language description. In this framework, a Bi-directional Long Short-Term Memory (Bi-LSTM) network is used to extract text features, a truncated attention mechanism is proposed to select the principal component of the text features, and a MobileNet is used to extract image features. Second, we proposed a Cascade Loss Function (CLF), which includes a cross-modal matching loss and a single-modal classification loss, both using the relative entropy function, to fully exploit the identity-level information. The experimental results on the CUHK-PEDES dataset demonstrate that our method achieves better Top-5 and Top-10 results than 10 other current state-of-the-art algorithms.
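Relative entropy is the Kullback-Leibler (KL) divergence. The sketch below shows how two relative-entropy terms could be combined into one loss; the additive form, the `weight` parameter, and the toy distributions are assumptions for illustration and are not the paper's CLF definition.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D_KL(p || q) for two discrete distributions.
    `eps` guards against division by zero and log of zero."""
    return sum(p_i * math.log((p_i + eps) / (q_i + eps))
               for p_i, q_i in zip(p, q) if p_i > 0)

def combined_loss(match_target, match_pred, class_target, class_pred, weight=1.0):
    """Matching term plus weighted classification term, both relative entropies.
    The additive combination and `weight` are hypothetical choices."""
    return (kl_divergence(match_target, match_pred)
            + weight * kl_divergence(class_target, class_pred))

# Hypothetical distributions over three identities.
target = [0.0, 1.0, 0.0]
matching_scores = [0.2, 0.7, 0.1]
class_scores = [0.1, 0.8, 0.1]
loss = combined_loss(target, matching_scores, target, class_scores)
```

With a one-hot target, each term reduces to the negative log of the probability assigned to the correct identity, so the loss shrinks as both predictions concentrate on that identity.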

ASA: A framework for Arabic sentiment analysis

Journal of Information Science ◽

10.1177/0165551519849516 ◽

2019 ◽

Vol 46 (4) ◽

pp. 544-559 ◽

Cited By ~ 4

Author(s):

Ahmed Oussous ◽

Fatima-Zahra Benjelloun ◽

Ayoub Ait Lahcen ◽

Samir Belfkih

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Opinion Mining ◽

Short Term Memory ◽

Research Area ◽

Support Vector ◽

Learning Models ◽

Arabic Natural Language Processing ◽

Arabic Sentiment Analysis

Sentiment analysis (SA), also known as opinion mining, is an increasingly important research area. Generally, it helps to automatically determine whether a text expresses a positive, negative or neutral sentiment. It makes it possible to mine the large and growing body of shared opinions found on social networks, review sites and blogs. SA is used in many fields and for various languages such as English and Arabic. However, since Arabic is a highly inflectional and derivational language, it raises many challenges, and SA of Arabic text must handle this complex morphology. To better handle these challenges, we decided to provide the research community and Arabic users with a new efficient framework for Arabic Sentiment Analysis (ASA). Our primary goal is to improve the performance of ASA by exploiting deep learning while varying the preprocessing techniques. To that end, we implement and evaluate two deep learning models, namely convolutional neural network (CNN) and long short-term memory (LSTM) models. The framework offers various preprocessing techniques for ASA (including stemming, normalisation, tokenization and stop words). As a result of this work, we first provide a new rich and publicly available Arabic corpus called the Moroccan Sentiment Analysis Corpus (MSAC). Second, the proposed framework demonstrates improvement in ASA. The experimental results show that deep learning models perform better for ASA than classical approaches (support vector machines, naive Bayes classifiers and maximum entropy). They also show the key role of morphological features in Arabic Natural Language Processing (NLP).

Bidirectional Recurrent Neural Network Approach for Arabic Named Entity Recognition

Future Internet ◽

10.3390/fi10120123 ◽

2018 ◽

Vol 10 (12) ◽

pp. 123 ◽

Cited By ~ 7

Author(s):

Mohammed Ali ◽

Guanzheng Tan ◽

Aamir Hussain

Keyword(s):

Neural Network ◽

Language Processing ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Named Entity Recognition ◽

Recognition Task ◽

Word Embedding ◽

Entity Recognition ◽

Named Entity ◽

Lstm Network

Recurrent neural networks (RNNs) have achieved remarkable success in sequence labeling tasks that require memory. An RNN can remember previous information in a sequence and can thus be used to solve natural language processing (NLP) tasks. Named entity recognition (NER) is a common NLP task and can be considered a classification problem. We propose a bidirectional long short-term memory (LSTM) model for entity recognition in Arabic text. The LSTM network can process sequences and relate to each part of them, which makes it useful for the NER task. Moreover, we use pre-trained word embeddings for the inputs that are fed into the LSTM network. The proposed model is evaluated on a popular dataset called “ANERcorp.” Experimental results show that the model with word embedding achieves a high F-score of approximately 88.01%.
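The F-score reported above combines precision and recall. A minimal sketch of the computation follows; the counts are hypothetical and do not come from the ANERcorp experiments, and the `beta` parameter (1.0 gives the usual F1) is included only for generality.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from entity-level counts; beta=1.0 gives F1.
    Returns 0.0 when precision and recall are both zero."""
    predicted = true_positives + false_positives
    gold = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / gold if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: 8 entities found correctly, 2 spurious, 2 missed.
f1 = f_score(true_positives=8, false_positives=2, false_negatives=2)
```

Here precision and recall are both 0.8, so F1 is also 0.8; when the two differ, F1 falls closer to the lower of them.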
