Integrating word status for joint detection of sentiment and aspect in reviews

2018 ◽  
Vol 45 (6) ◽  
pp. 736-755 ◽  
Author(s):  
Ayoub Bagheri

A crucial task in sentiment analysis is aspect detection: the step of selecting the aspects on which opinions are expressed. This step precedes the step of determining whether the opinions on aspects are positive or negative. This article proposes a novel probabilistic generative topic model for aspect-based sentiment analysis which is able to discover the latent structure of a large collection of review documents. The proposed joint sentiment-aspect detection model (SAM) is a generative topic model that incorporates the structure of review sentences for detecting aspects and sentiments simultaneously. The intuitions behind SAM are generating documents from latent single- and multi-word topics, modelling the word distribution for each topic, and learning the prior distribution over topics in the sentences of documents. SAM introduces word status so that the model can decide when to sample from a bigram distribution or a unigram distribution, and it integrates all these components into one combined model for aspect-based sentiment analysis. We evaluate SAM both qualitatively and quantitatively to show that the model is indeed able to perform the task effectively and improves significantly over standard joint sentiment-aspect models. The proposed model can easily be transferred between domains or languages and can detect the polarity of text data at various levels. However, for the quantitative analysis, we mainly focus on presenting the results for document-level sentiment classification.
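The word-status idea can be illustrated with a small sampling sketch. This is not the SAM model itself: the function name `sample_word`, the fixed `status_prob`, and the toy distributions are all assumptions made for illustration, and the conditioning on aspect and sentiment that the abstract describes is left out.

```python
import random

def sample_word(prev_word, status_prob, unigram, bigram, rng):
    """Draw the next word; a binary 'status' switch picks the distribution.

    Hypothetical helper: status_prob is the assumed probability that the
    word continues a two-word phrase started by prev_word.
    """
    # The bigram branch is only available if prev_word has a bigram table.
    use_bigram = prev_word in bigram and rng.random() < status_prob
    dist = bigram[prev_word] if use_bigram else unigram
    words = list(dist)
    weights = [dist[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0], use_bigram

# Toy distributions (assumed, not taken from the paper).
unigram = {"good": 1.0}
bigram = {"battery": {"life": 1.0}}
rng = random.Random(0)
word, from_bigram = sample_word("battery", 1.0, unigram, bigram, rng)
```

With `status_prob` set to 1.0 the draw after "battery" always comes from the bigram table, which is how a multi-word aspect such as "battery life" would be produced in this toy setting.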

Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1336
Author(s):  
Gihyeon Choi ◽  
Shinhyeok Oh ◽  
Harksoo Kim

Previous researchers have considered sentiment analysis as a document classification task, in which input documents are classified into predefined sentiment classes. Although some sentences in a document provide important evidence for sentiment analysis and others do not, they have treated the document as a bag of sentences. In other words, they have not considered the importance of each sentence in the document. To effectively determine the polarity of a document, each sentence in the document should be treated with a different degree of importance. To address this problem, we propose a document-level sentence classification model based on deep neural networks, in which the importance degrees of sentences in documents are automatically determined through gate mechanisms. To verify our new sentiment analysis model, we conducted experiments using sentiment datasets in four different domains: movie reviews, hotel reviews, restaurant reviews, and music reviews. In the experiments, the proposed model outperformed previous state-of-the-art models that do not consider importance differences of sentences in a document. The experimental results show that the importance of sentences should be considered in a document-level sentiment classification task.
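A gate that scales each sentence by a learned importance can be sketched as below. The abstract does not give the gate's form, so the sigmoid gate, the function name `gated_document_vector`, and the hand-picked `gate_weights` are assumptions; in a trained network the weights would be learned.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_document_vector(sentence_vectors, gate_weights, gate_bias=0.0):
    """Sum sentence vectors after scaling each by a gate value in (0, 1).

    Hypothetical sketch: the gate is a sigmoid of a linear score of the
    sentence vector; the paper's actual gate mechanism may differ.
    """
    gates = [
        sigmoid(sum(w * v for w, v in zip(gate_weights, vec)) + gate_bias)
        for vec in sentence_vectors
    ]
    dim = len(sentence_vectors[0])
    doc = [
        sum(g * vec[i] for g, vec in zip(gates, sentence_vectors))
        for i in range(dim)
    ]
    return doc, gates

# Two toy sentence vectors; the first scores higher under the gate.
doc, gates = gated_document_vector([[2.0, 0.0], [0.0, 0.0]], [1.0, 0.0])
```

Unlike a bag of sentences, where every sentence contributes equally, the gate lets sentences with a low score contribute less to the document representation.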


2021 ◽  
Vol 11 (22) ◽  
pp. 10774
Author(s):  
Hongchan Li ◽  
Yu Ma ◽  
Zishuai Ma ◽  
Haodong Zhu

With the rapid increase of public opinion data, the technology of Weibo text sentiment analysis plays an increasingly significant role in monitoring network public opinion. Due to the sparseness and high dimensionality of text data and the complex semantics of natural language, sentiment analysis tasks face tremendous challenges. To solve these problems, this paper proposes a new model based on BERT and deep learning for Weibo text sentiment analysis. Specifically, the model first uses BERT to represent the text with dynamic word vectors and uses the processed sentiment dictionary to enhance the sentiment features of the vectors; it then adopts a BiLSTM to extract the contextual features of the text, and the processed vector representation is weighted by the attention mechanism. After weighting, a CNN extracts the important local sentiment features in the text, and finally the processed sentiment feature representation is classified. A comparative experiment was conducted on a Weibo text dataset collected during the COVID-19 epidemic; the results showed that the performance of the proposed model was significantly improved compared with other similar models.
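The attention weighting step in such a pipeline can be sketched in a few lines. The abstract does not specify the scoring function, so the dot-product score against a `query` vector and the name `attention_pool` are assumptions; the hidden states stand in for BiLSTM outputs.

```python
import math

def attention_pool(hidden_states, query):
    """Weight hidden states by softmax(dot(h, query)) and sum them.

    Hypothetical sketch of attention weighting; the scoring function
    used in the paper is not stated in the abstract.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return context, weights

# Toy hidden states for a two-token text.
context, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
```

The weights always sum to one, so tokens whose hidden states align with the query dominate the pooled representation.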


2020 ◽  
Vol 38 (3) ◽  
pp. 633-657
Author(s):  
Ammara Zamir ◽  
Hikmat Ullah Khan ◽  
Waqar Mehmood ◽  
Tassawar Iqbal ◽  
Abubakker Usman Akram

Purpose: This research study proposes a feature-centric spam email detection model (FSEDM) based on content, sentiment, semantic, user and spam-lexicon feature sets. The purpose of this study is to exploit the role of sentiment features along with other proposed features to evaluate the classification accuracy of machine learning algorithms for spam email detection. Design/methodology/approach: Existing studies primarily exploit a content-based feature engineering approach; however, only a limited number of features is considered. In this regard, this research study proposes a feature-centric framework (FSEDM) based on existing and novel features of an email data set, which are extracted after pre-processing. Afterwards, diverse supervised learning techniques are applied to the proposed features in conjunction with feature selection techniques such as information gain, gain ratio and Relief-F to rank the most prominent features and classify the emails as spam or ham (not spam). Findings: Analysis and experimental results indicated that the proposed model with sentiment analysis is a competitive approach for spam email detection. Using the proposed model, a deep neural network applied with sentiment features outperformed other classifiers, with classification accuracy of up to 97.2%. Originality/value: This research is novel in that no previous research focuses on sentiment analysis in conjunction with other email features for the detection of spam emails.
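Ranking features by information gain, one of the selection techniques named above, can be sketched as follows. The feature names and toy labels are made up for illustration and are not from the study's data set.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy data: hypothetical binary features for four emails.
labels = ["spam", "spam", "ham", "ham"]
features = {
    "has_spam_word": [1, 1, 0, 0],       # separates the classes perfectly
    "negative_sentiment": [1, 0, 1, 0],  # carries no information here
}
ranking = sorted(features, key=lambda f: information_gain(features[f], labels), reverse=True)
```

Gain ratio divides this quantity by the entropy of the feature's own value distribution, which penalises features with many distinct values.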


Author(s):  
Jiajia Li ◽  
Phayung Meesad

Because of the indirect relationship between information and stock trends, information such as comments and tweets can be used for stock trend prediction. When conducting classification on text data, feature sparsity issues occur during conversion between tweets and word vectors. Another problem is the unreliability of average sentiment scores as an indicator of one day’s sentiment. This is especially caused by the imbalance between positive and negative counts within one day, so a large bias between sentiment and stock trend arises. In addition, information has social attributes when created and diffused in social networks, and biases reflecting people’s beliefs in social networks become socialization bias. In order to solve these problems, this work proposes a sentiment analysis based prediction model and an inverse bias algorithm. Instead of applying sentiment analysis to add sentiment-related features, this work uses SentiWordNet to give an additional weight to the selected features, and applies two kinds of sentiment analysis to invert the socialization bias. To label the tweets into sentiment-related groups and help find socialization bias, this work also proposes an extended wordlist based on a semi-supervised Naïve Bayes classification algorithm. After the socialization bias was inverted, stock trends were used to label example sets. Different classification algorithms were compared in this work. The proposed model with a linear SVM yields an accuracy of 90.33% at its best performance.


2020 ◽  
Author(s):  
Pathikkumar Patel ◽  
Bhargav Lad ◽  
Jinan Fiaidhi

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.


2019 ◽  
Vol 13 (1) ◽  
pp. 20-27 ◽  
Author(s):  
Srishty Jindal ◽  
Kamlesh Sharma

Background: With the tremendous increase in the use of social networking sites for sharing emotions, views, preferences, etc., a huge volume of data and text is available on the internet. This creates the need to understand the text and analyse the data to determine the exact intent behind it for a greater good. This process of understanding the text and data involves many analytical methods, several phases and multiple techniques. Efficient use of these techniques is important for an effective and relevant understanding of the text/data. This analysis can in turn be very helpful in e-commerce for targeting audiences, in social media monitoring for anticipating foul elements in society and taking proactive action to avoid unethical and illegal activities, and in business analytics, market positioning, etc. Method: The goal is to understand the basic steps involved in analysing text data, which can be helpful in determining the sentiments behind it. This review provides a detailed description of the steps involved in sentiment analysis together with recent research. Patents related to sentiment analysis and classification are reviewed to shed some light on the work done in the field. Results: Sentiment analysis determines the polarity behind the text data/review. This analysis helps in increasing business revenue, in e-health, or in determining the behaviour of a person. Conclusion: This study helps in understanding the basic steps involved in natural language understanding. At each step there are multiple techniques that can be applied to the data. Different classifiers provide variable accuracy depending upon the data set and classification technique used.


Author(s):  
Seema Rani ◽  
Avadhesh Kumar ◽  
Naresh Kumar

Background: Duplicate content often corrupts the filtering mechanism in online question answering. Moreover, as users are usually more comfortable asking questions in their native language, transliteration adds to the challenges in detecting duplicate questions. This compromises response time and increases answer overload. Thus, it has now become crucial to build intelligent semantic filters which semantically match linguistically disparate questions. Objective: Most of the research on duplicate question detection has been done on mono-lingual, mostly English, Q&A platforms. The aim is to build a model which extends the cognitive capabilities of machines to interpret, comprehend and learn features for semantic matching in transliterated bi-lingual Hinglish (Hindi + English) data acquired from different Q&A platforms. Method: In the proposed DQDHinglish (Duplicate Question Detection) model, language transformation (transliteration and translation) is first performed to convert the bi-lingual transliterated question into mono-lingual, English-only text. Next, a hybrid of a Siamese neural network containing two identical Long Short-Term Memory (LSTM) models and a multi-layer perceptron network is proposed to detect semantically similar question pairs. The Manhattan distance function is used as the similarity measure. Result: A dataset was prepared by scraping 100 question pairs from various social media platforms, such as Quora and TripAdvisor. The performance of the proposed model was evaluated on the basis of accuracy and F-score. The proposed DQDHinglish achieves a validation accuracy of 82.40%. Conclusion: A deep neural model was introduced to find a semantic match between an English question and a Hinglish (Hindi + English) question such that questions with similar intent can be combined to enable fast and efficient information processing and delivery. A dataset was created and the proposed model was evaluated on the basis of performance accuracy. To the best of our knowledge, this work is the first reported study on transliterated Hinglish semantic question matching.
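A Manhattan-distance similarity between two encoded questions can be sketched as below. Mapping the distance to a score with `exp(-distance)` is a common choice in Siamese LSTM work and is an assumption here, since the abstract only names the distance function; the toy vectors stand in for the two LSTM encodings.

```python
import math

def manhattan_distance(a, b):
    """L1 distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def manhattan_similarity(a, b):
    """Map the L1 distance into (0, 1]; 1.0 means identical encodings.

    Assumption: exp(-distance) is one common mapping, not necessarily
    the one used in the paper.
    """
    return math.exp(-manhattan_distance(a, b))

# Toy encodings of two questions (hypothetical values).
question_a = [0.2, 0.7, 0.1]
question_b = [0.2, 0.6, 0.3]
score = manhattan_similarity(question_a, question_b)
```

Because both branches of a Siamese network share weights, two phrasings of the same question should map to nearby encodings and therefore score close to 1.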


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 708
Author(s):  
Wenbo Liu ◽  
Fei Yan ◽  
Jiyong Zhang ◽  
Tao Deng

The quality of detected lane lines has a great influence on the driving decisions of unmanned vehicles. However, during unmanned vehicle driving, changes in the driving scene cause much trouble for lane detection algorithms. Unclear and occluded lane lines cannot be clearly detected by most existing lane detection models in many complex driving scenes, such as crowded scenes and poor lighting conditions. In view of this, we propose a robust lane detection model using vertical spatial features and contextual driving information in complex driving scenes. The more effective use of contextual information and vertical spatial features enables the proposed model to detect unclear and occluded lane lines more robustly through two designed blocks: a feature merging block and an information exchange block. The feature merging block can provide increased contextual information to pass to the subsequent network, which enables the network to learn more feature details to help detect unclear lane lines. The information exchange block is a novel block that combines the advantages of spatial convolution and dilated convolution to enhance the process of information transfer between pixels. The addition of spatial information allows the network to better detect occluded lane lines. Experimental results show that our proposed model can detect lane lines more robustly and precisely than state-of-the-art models in a variety of complex driving scenarios.
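Dilated convolution, one of the two ingredients of the information exchange block, can be illustrated in one dimension. This toy function is an assumption for illustration only: the paper's block operates on 2D feature maps, and `dilated_conv1d` is a hypothetical name.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D convolution with gaps between kernel taps.

    As in most deep learning libraries, the kernel is not flipped, so
    this is technically cross-correlation.
    """
    span = (len(kernel) - 1) * dilation  # distance covered by the kernel
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

signal = [1, 2, 3, 4, 5]
dense = dilated_conv1d(signal, [1, 1], dilation=1)   # adjacent pairs
sparse = dilated_conv1d(signal, [1, 1], dilation=2)  # pairs two steps apart
```

With the same two-tap kernel, a dilation of 2 reaches one position further than a dilation of 1, which is how dilation enlarges the receptive field without adding parameters.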


2021 ◽  
pp. 1-17
Author(s):  
J. Shobana ◽  
M. Murali

Text sentiment analysis is the process of predicting whether a segment of text has opinionated or objective content and analyzing the polarity of the text’s sentiment. Understanding the needs and behavior of the target customer plays a vital role in the success of a business, so sentiment analysis helps the marketer to improve the quality of the product and helps the shopper to buy the correct product. Due to its automatic learning capability, deep learning is of current research interest in natural language processing. A skip-gram architecture is used in the proposed model for better extraction of the semantic relationships as well as the contextual information of words. However, the main contribution of this work is an Adaptive Particle Swarm Optimization (APSO) algorithm based LSTM for sentiment analysis. The LSTM is used in the proposed model for understanding complex patterns in textual data. To improve the performance of the LSTM, its weight parameters are tuned by the Adaptive PSO algorithm. The opposition-based learning (OBL) method combined with the PSO algorithm forms the Adaptive Particle Swarm Optimization (APSO) classifier, which assists the LSTM in selecting optimal weights in fewer iterations. The ability of APSO-LSTM to adjust attributes such as optimal weights and learning rates, combined with good hyperparameter choices, leads to improved accuracy and reduced loss. Extensive experiments conducted on four datasets showed that our proposed APSO-LSTM model achieved higher accuracy than classical methods such as traditional LSTM, ANN, and SVM. According to simulation results, the proposed model outperforms other existing models.
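Opposition-based learning evaluates each candidate together with its mirror image inside the search bounds. The sketch below applies that idea to swarm initialisation only; the abstract does not say where OBL enters the authors' APSO, so `obl_initialise`, the sphere fitness function and the bounds are assumptions.

```python
import random

def opposite(position, lower, upper):
    """Opposite point of a position within per-dimension bounds."""
    return [lo + hi - x for x, lo, hi in zip(position, lower, upper)]

def obl_initialise(fitness, lower, upper, swarm_size, rng):
    """Keep the best swarm_size points among random points and their opposites.

    Hypothetical sketch: fitness is minimised, and OBL is used only at
    initialisation.
    """
    candidates = []
    for _ in range(swarm_size):
        point = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        candidates.append(point)
        candidates.append(opposite(point, lower, upper))
    candidates.sort(key=fitness)
    return candidates[:swarm_size]

def sphere(position):
    """Toy objective standing in for the LSTM's validation loss."""
    return sum(x * x for x in position)

swarm = obl_initialise(sphere, [-5.0, -5.0], [5.0, 5.0], 4, random.Random(0))
```

Starting from the better half of the doubled candidate pool is one way such a method could reach good weights in fewer iterations than a purely random start.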


Agronomy ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1307
Author(s):  
Haoriqin Wang ◽  
Huaji Zhu ◽  
Huarui Wu ◽  
Xiaomin Wang ◽  
Xiao Han ◽  
...  

In the question-and-answer (Q&A) communities of the “China Agricultural Technology Extension Information Platform”, thousands of rice-related Chinese questions are newly added every day. The rapid detection of questions with the same semantics is the key to the success of a rice-related intelligent Q&A system. To allow the fast and automatic detection of rice-related questions with the same semantics, we propose a new method based on the Coattention-DenseGRU (Gated Recurrent Unit). According to the rice-related question characteristics, we applied Word2vec with the TF-IDF (Term Frequency–Inverse Document Frequency) method to process and analyze the text data and compared it with the Word2vec, GloVe, and TF-IDF methods. Combined with the agricultural word segmentation dictionary, we applied Word2vec with the TF-IDF method, effectively solving the problem of high dimension and sparse data in the rice-related text. Each network layer employed the connection information of features and all previous recursive layers’ hidden features. To alleviate the problem of feature vector size increasing due to dense splicing, an autoencoder was used after dense concatenation. The experimental results show that rice-related question similarity matching based on Coattention-DenseGRU can improve the utilization of text features, reduce the loss of features, and achieve fast and accurate similarity matching of the rice-related question dataset. The precision and F1 values of the proposed model were 96.3% and 96.9%, respectively. Compared with seven other kinds of question similarity matching models, we present a new state-of-the-art method with our rice-related question dataset.
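One plausible way to combine word vectors with TF-IDF is to average a question's word vectors using TF-IDF weights. The abstract does not state how the two are combined, so this weighting scheme, the smoothed IDF formula, and the toy two-dimensional vectors (standing in for trained Word2vec embeddings) are all assumptions.

```python
import math
from collections import Counter

def idf_table(documents):
    """Inverse document frequency with +1 smoothing (an assumed variant)."""
    n = len(documents)
    df = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}

def tfidf_weighted_vector(doc, vectors, idf):
    """Average the word vectors of doc, weighting each term by TF-IDF."""
    tf = Counter(doc)
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for term, count in tf.items():
        if term not in vectors or term not in idf:
            continue  # skip out-of-vocabulary terms
        weight = (count / len(doc)) * idf[term]
        weight_sum += weight
        for i, value in enumerate(vectors[term]):
            total[i] += weight * value
    return [t / weight_sum for t in total] if weight_sum else total

# Toy segmented questions and hypothetical 2-d word vectors.
questions = [["rice", "blast"], ["rice", "yield"]]
vectors = {"rice": [1.0, 0.0], "blast": [0.0, 1.0], "yield": [0.0, 1.0]}
idf = idf_table(questions)
question_vector = tfidf_weighted_vector(questions[0], vectors, idf)
```

In this toy corpus "rice" appears in every question, so its weight is lower and the rarer term "blast" dominates the question vector, yielding a dense, low-dimensional representation instead of a sparse TF-IDF vector.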

