Gated Hierarchical LSTMs for Target-Based Sentiment Analysis

2018 ◽  
Vol 28 (11n12) ◽  
pp. 1719-1737
Author(s):  
Hao Wang ◽  
Xiaofang Zhang ◽  
Bin Liang ◽  
Qian Zhou ◽  
Baowen Xu

In target-based sentiment analysis, deep neural models that incorporate attention mechanisms have been remarkably successful. In current research, the attention mechanism is commonly combined with Long Short-Term Memory (LSTM) networks. However, such architectures generally rely on complex computation and focus on only a single target. In this paper, we propose a gated hierarchical LSTM (GH-LSTMs) model, which combines a regional LSTM and a sentence-level LSTM via a gated operation for the task of target-based sentiment analysis. Through the regional LSTM, this approach can distinguish the different sentiment polarities of different targets in the same sentence. Furthermore, through the sentence-level LSTM, it can capture the long-distance dependencies of a target across the whole sentence. Our experiments on multi-domain datasets in two languages from SemEval 2016 indicate that our approach yields better performance than a Support Vector Machine (SVM) and several typical neural network models. A case study of some typical examples further supports this conclusion.
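The abstract does not give the form of the gated operation, so the following is only a minimal sketch of one common way to blend two feature vectors with an element-wise sigmoid gate. The function name `gated_fusion`, the per-dimension weights, and the bias are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_region, h_sentence, w_region, w_sentence, bias):
    """Blend two equal-length feature vectors with an element-wise gate.

    Assumption: each gate value depends only on the matching dimension of
    the two inputs. A real model would typically use full weight matrices.
    """
    fused = []
    for r, s, wr, ws, b in zip(h_region, h_sentence, w_region, w_sentence, bias):
        g = sigmoid(wr * r + ws * s + b)      # gate in (0, 1)
        fused.append(g * r + (1.0 - g) * s)   # convex mix of the two views
    return fused


# With zero weights and zero bias the gate is 0.5, so the result is the average.
print(gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
```

A gate close to 1 keeps the regional representation, and a gate close to 0 keeps the sentence-level one.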

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Xiaodi Wang ◽  
Xiaoliang Chen ◽  
Mingwei Tang ◽  
Tian Yang ◽  
Zhen Wang

The aim of aspect-level sentiment analysis is to identify the sentiment polarity of a given target term in a sentence. Existing neural network models provide a useful account of how to judge the polarity. However, the position of context words relative to the target terms is often ignored because of the limitations of training datasets. Incorporating position features between words into the models can improve the accuracy of sentiment classification. Hence, this study proposes an improved classification model that combines a multilevel interactive bidirectional Gated Recurrent Unit (GRU), attention mechanisms, and position features (MI-biGRU). Firstly, the position features of words in a sentence are initialized to enrich the word embeddings. Secondly, the approach extracts the features of target terms and context by using a well-constructed multilevel interactive bidirectional neural network. Thirdly, an attention mechanism is introduced so that the model can pay greater attention to those words that are important for sentiment analysis. Finally, four classic sentiment classification datasets are used to evaluate the model on aspect-level tasks. Experimental results indicate that there is a correlation between the multilevel interactive attention network and the position features. MI-biGRU can clearly improve classification performance.
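As an illustration of what a position feature can look like, the sketch below computes each token's distance to the target span. This is one simple, commonly used definition and is an assumption here; the abstract does not specify how MI-biGRU encodes positions, and the function name `relative_positions` is hypothetical.

```python
def relative_positions(tokens, target_start, target_end):
    """Distance from each token to the target span [target_start, target_end).

    Tokens inside the target get 0; other tokens get the number of steps
    to the nearest edge of the target.
    """
    positions = []
    for i in range(len(tokens)):
        if i < target_start:
            positions.append(target_start - i)      # token is left of the target
        elif i >= target_end:
            positions.append(i - target_end + 1)    # token is right of the target
        else:
            positions.append(0)                     # token is part of the target
    return positions


tokens = "the battery life is great but the screen is dim".split()
# The target term "battery life" covers token indices 1 and 2.
print(relative_positions(tokens, 1, 3))
```

Such distances would typically be mapped to embeddings or weights before being combined with the word embeddings.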


2019 ◽  
Vol 53 (1) ◽  
pp. 2-19 ◽  
Author(s):  
Erion Çano ◽  
Maurizio Morisio

Purpose: The strong results of convolutional neural networks in image-related tasks have attracted the attention of researchers in text mining, sentiment analysis and other areas of text analysis. It is, however, difficult to find enough data to feed such networks, to optimize their parameters, and to make the right design choices when constructing network architectures. The purpose of this paper is to present the steps taken to create two big data sets of song emotions. The authors also explore the use of convolution and max-pooling neural layers on song lyrics, product review and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared.
Design/methodology/approach: The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices that lead to high-performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations.
Findings: The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, the max-pooling region size should be adapted to the length of the text documents to produce the best feature maps.
Originality/value: The top results the authors obtained used feature maps of lengths 6–18. Future neural network models for sentiment analysis could be improved by predicting the sentiment polarity of a document from an aggregation of predictions on smaller excerpts of the entire text.
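The two findings (parallel convolutions with short filters, and a pooling region tied to document length) can be sketched on a toy one-dimensional signal. This is a deliberate simplification: real text models convolve over word-embedding matrices, and the function names and all-ones kernels below are hypothetical, not the authors' architecture.

```python
def conv1d(sequence, kernel):
    """Valid (no padding) 1-D convolution of a scalar sequence with a kernel."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]


def max_pool(feature_map, region_size):
    """Non-overlapping max-pooling; the last region may be shorter."""
    return [
        max(feature_map[i:i + region_size])
        for i in range(0, len(feature_map), region_size)
    ]


def parallel_conv_features(sequence, kernels, region_size):
    """Apply every kernel in parallel, pool each result, and concatenate."""
    features = []
    for kernel in kernels:
        features.extend(max_pool(conv1d(sequence, kernel), region_size))
    return features


# Filters of lengths 1, 2 and 3 applied to the same input.
kernels = [[1], [1, 1], [1, 1, 1]]
print(parallel_conv_features([1, 2, 3, 4, 5, 6], kernels, region_size=2))
```

Increasing `region_size` for longer inputs keeps the pooled feature map at a manageable length, which is the kind of adaptation the findings describe.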


2018 ◽  
Vol 8 (8) ◽  
pp. 1290 ◽  
Author(s):  
Beata Mrugalska

Increasing expectations of industrial system reliability require development of more effective and robust fault diagnosis methods. The paper presents a framework for quality improvement on the neural model applied for fault detection purposes. In particular, the proposed approach starts with an adaptation of the modified quasi-outer-bounding algorithm towards non-linear neural network models. Subsequently, its convergence is proven using quadratic boundedness paradigm. The obtained algorithm is then equipped with the sequential D-optimum experimental design mechanism allowing gradual reduction of the neural model uncertainty. Finally, an emerging robust fault detection framework on the basis of the neural network uncertainty description as the adaptive thresholds is proposed.


2019 ◽  
Vol 9 (19) ◽  
pp. 3945 ◽  
Author(s):  
Houssem Gasmi ◽  
Jannik Laval ◽  
Abdelaziz Bouras

Extracting cybersecurity entities and the relationships between them from online textual resources such as articles, bulletins, and blogs and converting these resources into more structured and formal representations has important applications in cybersecurity research and is valuable for professional practitioners. Previous works to accomplish this task were mainly based on utilizing feature-based models. Feature-based models are time-consuming and need labor-intensive feature engineering to describe the properties of entities, domain knowledge, entity context, and linguistic characteristics. Therefore, to alleviate the need for feature engineering, we propose the usage of neural network models, specifically the long short-term memory (LSTM) models to accomplish the tasks of Named Entity Recognition (NER) and Relation Extraction (RE). We evaluated the proposed models on two tasks. The first task is performing NER and evaluating the results against the state-of-the-art Conditional Random Fields (CRFs) method. The second task is performing RE using three LSTM models and comparing their results to assess which model is more suitable for the domain of cybersecurity. The proposed models achieved competitive performance with less feature-engineering work. We demonstrate that exploiting neural network models in cybersecurity text mining is effective and practical.


2020 ◽  
Vol 22 (4) ◽  
pp. 900-915 ◽  
Author(s):  
Xiao-ying Bi ◽  
Bo Li ◽  
Wen-long Lu ◽  
Xin-zhi Zhou

Accurate daily runoff prediction plays an important role in the management and utilization of water resources. In order to improve the accuracy of prediction, this paper proposes a deep neural network (CAGANet) composed of a convolutional layer, an attention mechanism, a gated recurrent unit (GRU) neural network, and an autoregressive (AR) model. Given that the daily runoff sequence is abrupt and unstable, it is difficult for either a single model or a combined model to obtain high-precision daily runoff predictions directly. Therefore, this paper uses a linear interpolation method to enhance the stability of the hydrological data and applies the augmented data to the CAGANet model, the support vector machine (SVM) model, the long short-term memory (LSTM) neural network and the attention-mechanism-based LSTM model (AM-LSTM). The comparison results show that, among the four models based on data augmentation, the CAGANet model proposed in this paper has the best prediction accuracy. Its Nash–Sutcliffe efficiency can reach 0.993. Therefore, the CAGANet model based on data augmentation is a feasible daily runoff forecasting scheme.
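For readers unfamiliar with the metric quoted above, the Nash–Sutcliffe efficiency compares the squared prediction error with the variance of the observations around their mean: a value of 1 is a perfect fit, and 0 means the model is no better than always predicting the observed mean. The sketch below uses this standard definition; the function name and the sample values are illustrative and are not data from the paper.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    Assumes the observations are not all identical, otherwise the
    denominator is zero.
    """
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_sum


observed = [1.0, 2.0, 3.0]
print(nash_sutcliffe_efficiency(observed, observed))         # perfect prediction
print(nash_sutcliffe_efficiency(observed, [2.0, 2.0, 2.0]))  # predicting the mean
```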


2011 ◽  
Vol 403-408 ◽  
pp. 3805-3812 ◽  
Author(s):  
Kong Hui Guo ◽  
Xian Yun Wang

Nonparametric models of a hydraulic damper based on support vector regression (SVR) are developed. These models are then compared with two kinds of neural network models: a backpropagation neural network (BPNN) model and a radial basis function neural network (RBFNN) model. Comparisons are carried out on both a virtual damper and an actual damper. The force-velocity relation of the virtual damper is obtained from a rheological model, and these data are used to identify the characteristics of the virtual damper. The dynamometer measurements of an actual displacement-dependent damper are obtained by experiment, and these data are used to identify the characteristics of the actual damper. The comparisons show that the BPNN model is best at identifying the characteristics of the virtual damper, whereas the SVR model is best at identifying the characteristics of the actual damper. The reason is that all experimental data contain some noise. When the amplitude of the noise is smaller than the parameter of SVR, the noise cannot affect the construction of the resulting model. So when training a model on experimental data, SVR is superior to the neural network methods.
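The noise argument is easiest to see through the epsilon-insensitive loss that standard SVR minimizes. The abstract only says "the parameter of SVR", so reading it as the tube width epsilon is an assumption; the function name and numbers below are illustrative.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard SVR loss: errors inside the epsilon tube cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# A deviation of 0.25 is inside a tube of width 0.5, so it is ignored.
print(epsilon_insensitive_loss(0.0, 0.25, epsilon=0.5))
# A deviation of 2.0 exceeds the tube, so only the excess is penalized.
print(epsilon_insensitive_loss(0.0, 2.0, epsilon=0.5))
```

Under this reading, measurement noise whose amplitude stays below epsilon contributes zero loss and therefore does not shape the fitted model.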


Author(s):  
Osama A. Osman ◽  
Hesham Rakha

Distracted driving (i.e., engaging in secondary tasks) is an epidemic that threatens the lives of thousands every year. Data collected from vehicular sensor technologies and through connectivity provide comprehensive information that, if used to detect driver engagement in secondary tasks, could save thousands of lives and millions of dollars. This study investigates the possibility of achieving this goal using promising deep learning tools. Specifically, two deep neural network models (a multilayer perceptron neural network model and a long short-term memory networks [LSTMN] model) were developed to identify three secondary tasks: cellphone calling, cellphone texting, and conversation with adjacent passengers. The Second Strategic Highway Research Program Naturalistic Driving Study (SHRP 2 NDS) time series data, collected using vehicle sensor technology, were used to train and test the models. The results show excellent performance for the developed models, with a slight improvement for the LSTMN model and overall classification accuracies ranging between 95% and 96%. Specifically, the models are able to identify the different types of secondary tasks with high accuracies of 100% for calling, 96%–97% for texting, 90%–91% for conversation, and 95%–96% for normal driving. Based on this performance, the developed models improve on the results of a previous model developed by the author to classify the same three secondary tasks, which had an accuracy of 82%. The models are promising for use in in-vehicle driving assistance technology to report engagement in unlawful tasks or alert drivers to take over control in level 1 and 2 automated vehicles.


2022 ◽  
Vol 2161 (1) ◽  
pp. 012005
Author(s):  
C R Karthik ◽  
Raghunandan ◽  
B Ashwath Rao ◽  
N V Subba Reddy

A time series is a sequence of observations ordered in time. The prime objective of time series analysis is to build mathematical models that provide reasonable descriptions of the training data. The goal of time series forecasting is to predict the forthcoming values of a series based on the history of the same series. Forecasting stock markets is a challenging problem because of the number of possible variables, as well as the volatile noise, that may contribute to stock prices. However, the capability to analyze stock market trends could be vital to investors, traders and researchers, and hence has been of continued interest. Many mathematical and machine learning techniques have been explored for stock analysis and forecasting. In this paper, we perform a comparative study of two very capable artificial neural network models, i) a Deep Neural Network (DNN) and ii) Long Short-Term Memory (LSTM), a type of recurrent neural network (RNN), in predicting the daily variance of NIFTYIT in the BSE (Bombay Stock Exchange) and NSE (National Stock Exchange) markets. The DNN was chosen for its capability to handle complex data with substantial performance and better generalization without becoming saturated. The LSTM model was chosen because it contains intermediary memory that can hold historic patterns, and each prediction depends on the values that preceded it. With both networks, measures were taken to reduce overfitting. Daily predictions of the NIFTYIT index were made to test the generalizability of the models. Both networks performed well at making daily predictions, and both generalized well to the NIFTYIT data. The LSTM-RNN outperformed the DNN in forecasting and thus holds more potential for making longer-term estimates.
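Forecasting the next value from the history of the same series is usually framed as a supervised problem by sliding a fixed-length window over the data. The sketch below shows that framing only. The window length and the function name `make_windows` are hypothetical, and the abstract does not state how the authors prepared their inputs.

```python
def make_windows(series, window):
    """Turn a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])   # the `window` most recent values
        targets.append(series[i + window])    # the value to be predicted
    return inputs, targets


inputs, targets = make_windows([1, 2, 3, 4, 5], window=3)
print(inputs)
print(targets)
```

Either a feed-forward network or an LSTM could then be trained on these pairs, with the LSTM consuming each window one step at a time.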

