Evaluation of Sentiment Analysis via Word Embedding and RNN Variants for Amazon Online Reviews

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Najla M. Alharbi ◽  
Norah S. Alghamdi ◽  
Eman H. Alkhammash ◽  
Jehad F. Al Amri

Consumer feedback is highly valuable to businesses for assessing their performance, and it also benefits customers by giving them an idea of what to expect from new products. This research evaluates different deep learning approaches for accurately predicting customer opinion from mobile phone reviews obtained from Amazon.com. The prediction is based on analysing these reviews and categorizing them as positive, negative, or neutral. Several deep learning algorithms have been implemented and evaluated: simple RNN and its four variants, namely, Long Short-Term Memory Networks (LRNN), Group Long Short-Term Memory Networks (GLRNN), gated recurrent unit (GRNN), and update recurrent unit (UGRNN). All evaluated algorithms are combined with word embedding as the feature extraction approach for sentiment analysis, including GloVe, word2vec, and FastText with skip-grams. The five algorithms with the three feature extraction methods are evaluated on accuracy, recall, precision, and F1-score for both balanced and unbalanced datasets. For the unbalanced dataset, the GLRNN algorithm with FastText feature extraction scored the highest accuracy of 93.75%. This is the highest accuracy on this dataset when compared with other methods mentioned in the literature. For the balanced dataset, the highest accuracy achieved was 88.39%, by the LRNN algorithm.
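The four evaluation measures named in this abstract can be sketched for a three-class task as follows. This is only an illustration: the abstract does not say how per-class scores were averaged, so macro-averaging is an assumption here, and the function name `macro_report` and the toy labels are hypothetical.

```python
from collections import Counter

def macro_report(y_true, y_pred, labels):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging (an unweighted mean over classes) is an assumption;
    the abstract does not state which averaging scheme was used.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        prec = tp[label] / predicted if predicted else 0.0
        rec = tp[label] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Toy example with hypothetical labels (not data from the paper).
report = macro_report(
    ["pos", "pos", "neg", "neu"],
    ["pos", "neg", "neg", "neu"],
    labels=["pos", "neg", "neu"],
)
```

On an unbalanced dataset, accuracy can be dominated by the majority class, which is one reason to report the per-class measures alongside it.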

Author(s):  
Dimple Tiwari ◽  
Bharti Nagpal

Sentiment analysis is applied to extensive collections of reviews to predict people's opinions towards a particular topic, which is helpful for decision-makers. Machine learning and deep learning are standard techniques that have made sentiment analysis simpler and more popular. In this research, deep learning is used to analyze people's sentiments. It can perform automatic feature extraction, which provides better performance, richer representations, and more reliable results than conventional feature-based techniques. Traditional approaches were based on complicated manual feature extraction that could not provide reliable results. Therefore, the presented study aimed to improve the performance of the deep learning approach by combining automatic feature extraction with manual feature extraction techniques. An enhanced model, ELSTM, is proposed, which applies hyper-parameter tuning to the earlier Long Short-Term Memory (LSTM) model to get better results. Based on the results, a novel sentiment analysis model is proposed to set a benchmark in the field of textual classification, and a novel algorithm is proposed to describe the procedure of the developed model. The results of the ELSTM model are presented as training and testing accuracy curves. Finally, a comparative study confirms that the proposed ELSTM model performs best.


2021 ◽  
pp. 016555152110065
Author(s):  
Rahma Alahmary ◽  
Hmood Al-Dossari

Sentiment analysis (SA) aims to extract users’ opinions automatically from their posts and comments. Almost all prior works have used machine learning algorithms. Recently, SA research has shown promising performance using the deep learning approach. However, deep learning is data-hungry and requires large datasets to learn, so data annotation takes more time. In this research, we propose a semiautomatic approach using Naïve Bayes (NB) to annotate a new dataset in order to reduce the human effort and time spent on the annotation process. We created a dataset for training and testing the classifier by collecting Saudi dialect tweets. The accuracy achieved by the NB classifier was 83%. The trained semiautomatic model was used to annotate the new dataset, which was then used to train and test deep learning classifiers for Saudi dialect SA. The three deep learning classifiers tested in this research were convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM). Support vector machine (SVM) was used as the baseline for comparison. Overall, the performance of the deep learning classifiers exceeded that of SVM. CNN reported the highest performance, Bi-LSTM outperformed both LSTM and SVM, and LSTM outperformed SVM. The proposed semiautomatic annotation approach is usable and promising for increasing speed and saving time and effort in the annotation process.
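The idea of letting a Naïve Bayes classifier pre-annotate unlabeled text can be sketched as below. This is a minimal illustration and not the authors' pipeline: the whitespace tokenizer, the add-one smoothing, the confidence threshold of 0.8, and all function names are assumptions introduced here, and the toy English sentences stand in for real tweets.

```python
import math
from collections import Counter, defaultdict

def train_nb(texts, labels):
    """Fit a multinomial Naive Bayes model (word counts per class)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_proba(model, text):
    """Return class probabilities using add-one (Laplace) smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for token in text.lower().split():
            if token in vocab:  # words unseen in training are ignored
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Normalize in a numerically stable way (log-sum-exp).
    top = max(log_scores.values())
    exps = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exps.values())
    return {label: value / norm for label, value in exps.items()}

def annotate(model, texts, threshold=0.8):
    """Auto-label confident predictions; route the rest to human review.

    The threshold is a hypothetical design choice, not from the paper.
    """
    auto_labeled, needs_review = [], []
    for text in texts:
        probs = predict_proba(model, text)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            auto_labeled.append((text, label))
        else:
            needs_review.append(text)
    return auto_labeled, needs_review

# Toy seed set (hypothetical data).
model = train_nb(
    ["good great service", "great happy", "bad terrible service", "terrible sad"],
    ["pos", "pos", "neg", "neg"],
)
auto_labeled, needs_review = annotate(model, ["great great", "service"])
```

Routing low-confidence items to a human is one way to keep annotation effort low without letting every classifier mistake flow into the training set for the downstream deep learning models.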


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Hyejin Cho ◽  
Hyunju Lee

Background: In biomedical text mining, named entity recognition (NER) is an important task used to extract information from biomedical articles. Previously proposed methods for NER are dictionary- or rule-based methods and machine learning approaches. However, these traditional approaches are heavily reliant on large-scale dictionaries, target-specific rules, or well-constructed corpora. These NER methods have been superseded by the deep learning-based approach, which is independent of hand-crafted features. However, although such NER methods employ additional conditional random fields (CRF) to capture important correlations between neighboring labels, they often do not incorporate all the contextual information from text into the deep learning layers. Results: We propose herein an NER system for biomedical entities that incorporates n-grams with bi-directional long short-term memory (BiLSTM) and CRF; this system is referred to as contextual long short-term memory networks with CRF (CLSTM). We assess the CLSTM model on three corpora: the disease corpus of the National Center for Biotechnology Information (NCBI), the BioCreative II Gene Mention corpus (GM), and the BioCreative V Chemical Disease Relation corpus (CDR). Our framework was compared with several deep learning approaches, such as BiLSTM, BiLSTM with CRF, GRAM-CNN, and BERT. On the NCBI corpus, our model recorded an F-score of 85.68% for the NER of diseases, an improvement of 1.50% over previous methods. Moreover, although BERT used transfer learning by incorporating more than 2.5 billion words, our system showed performance similar to BERT, with an F-score of 81.44% for gene NER on the GM corpus, and outperformed it with an F-score of 86.44% for the NER of chemicals and diseases on the CDR corpus. We conclude that our method significantly improves performance on biomedical NER tasks. Conclusion: The proposed approach is robust in recognizing biological entities in text.
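As a rough illustration of what "incorporating n-grams" can mean at the input side, the sketch below builds, for every token, the n-gram window centered on it. The abstract does not describe how CLSTM constructs or consumes its n-grams, so the centered window, the padding symbol, and the function name `token_ngrams` are all assumptions made for this example.

```python
def token_ngrams(tokens, n, pad="<PAD>"):
    """Return, for each token, the n-gram window centered on it.

    n must be odd so the token sits in the middle; sentence edges are
    padded. This is an illustrative construction, not the CLSTM design.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    half = n // 2
    padded = [pad] * half + list(tokens) + [pad] * half
    return [tuple(padded[i:i + n]) for i in range(len(tokens))]

# Hypothetical sentence; each token gets its left and right neighbor.
windows = token_ngrams(["BRCA1", "mutations", "cause", "cancer"], n=3)
```

Each window could then be embedded and fed to a sequence model alongside the token itself, giving every position explicit access to its local context.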


2021 ◽  
Vol 8 (1) ◽  
pp. 64
Author(s):  
Dedi Tri Hermanto ◽  
Arief Setyanto ◽  
Emha Taufiq Luthfi

Online media produce many kinds of news, covering economics, politics, health, sports, and science. Among these, economics is one of the most interesting topics to discuss. The economy has a direct impact on citizens and companies, and even traditional markets depend on the economic conditions in a country. The sentiment contained in the news can influence people's views on an issue or on government policy. Economics is an interesting topic for research because it has a direct impact on Indonesian society. However, there are still few studies that apply deep learning methods, namely Long Short-Term Memory and CNN, to sentiment analysis of finance articles in Indonesia. This study aims to classify Indonesian-language news headlines by positive and negative sentiment using the LSTM, LSTM-CNN, and CNN-LSTM methods. The dataset consists of Indonesian-language article titles taken from the Detik Finance website. The test results show that the LSTM, LSTM-CNN, and CNN-LSTM methods achieve accuracies of 62%, 65%, and 74%, respectively. Keywords: LSTM, sentiment analysis, CNN


2021 ◽  
Vol 7 (2) ◽  
pp. 113-121
Author(s):  
Firman Pradana Rachman

Everyone has an opinion about a product, a public figure, or a government policy, and these opinions are spread across social media. The processing of such opinion data is called sentiment analysis. Processing these large volumes of opinion data can rely not only on machine learning but also on deep learning combined with NLP (Natural Language Processing) techniques. This study compares several deep learning models, such as CNN (Convolutional Neural Network), RNN (Recurrent Neural Networks), LSTM (Long Short-Term Memory), and several of their variants, for sentiment analysis of product reviews from Amazon and Yelp.


2021 ◽  
Author(s):  
Usha Devi G ◽  
Priyan M K ◽  
Gokulnath Chandra Babu ◽  
Gayathri Karthick

Twitter sentiment analysis is an automated process of analyzing text data to determine the opinion or feeling expressed in public tweets from various fields. For example, in marketing and politics, a huge number of tweets with hashtags are posted every moment from one user to another over the internet. Sentiment analysis is a challenging task for researchers, mainly because the context must be interpreted correctly: for certain tweet words it is difficult to judge what is truly a negative or a positive statement within a huge corpus of tweet data. This problem undermines the integrity of the system and can significantly reduce user reliability. In this paper, we identify each tweet word and assign a meaning to it. The feature work combines tweet words, word2vec, and stop words and is integrated into two deep learning techniques, a convolutional neural network model and long short-term memory; these algorithms can identify the pattern of stop word counts with their own strategy. The two models are trained and applied to the IMDB dataset, which contains 50,000 movie reviews. A huge amount of Twitter data is also processed to predict the sentiment of tweets for classification. With the proposed methodology, samples collected experimentally from a real-time environment can be discriminated well, and the efficacy of the system is improved. The deep learning algorithms aim to rate review tweets and are also able to classify movie reviews, with testing accuracies of 87.74% and 88.02%.


Author(s):  
Riszki Wijayatun Pratiwi ◽  
Yunita Sari ◽  
Yohanes Suyanto

Research on sentiment analysis has increased in recent years. However, there are still few ideas in this research about the handling of negation, including in Indonesian sentences. As a result, the exact polarity of sentences that contain negation words is often not found. The purpose of this research is to analyze the effect of negation words in Indonesian on classification into positive, neutral, and negative classes, using attention-based Long Short Term Memory and the word2vec feature extraction method with the continuous bag-of-words (CBOW) architecture. The dataset used is data from Twitter. Model performance is measured by accuracy. Using word2vec with the CBOW architecture and adding an attention layer, the Long Short Term Memory (LSTM) method obtained an accuracy of 78.16% and the Bidirectional Long Short Term Memory (BiLSTM) method an accuracy of 79.68%, whereas the FSW algorithm obtained 73.50% and FWL 73.79%. It can be concluded that attention-based BiLSTM has the highest accuracy, but the addition of the attention layer to the Long Short Term Memory method is not very significant for negation handling, because the attention layer cannot determine the words that should receive attention.
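For context, one common rule-based way to handle negation is to flag a fixed number of tokens after each negation word so that a classifier sees them as distinct features. The abstract does not define its FSW or FWL baselines, so the sketch below should not be read as either of them; the word list, the window size of 2, the `NOT_` prefix, and the function name are assumptions chosen for illustration.

```python
# Illustrative, non-exhaustive set of Indonesian negation words (assumption).
NEGATION_WORDS = {"tidak", "bukan", "belum", "jangan", "tak"}

def mark_negation(tokens, window=2):
    """Prefix up to `window` tokens after a negation word with 'NOT_'.

    A generic fixed-window heuristic, not the method from the paper.
    """
    marked = []
    remaining = 0  # how many upcoming tokens are still in negation scope
    for token in tokens:
        if token.lower() in NEGATION_WORDS:
            marked.append(token)
            remaining = window  # (re)open the negation scope
        elif remaining > 0:
            marked.append("NOT_" + token)
            remaining -= 1
        else:
            marked.append(token)
    return marked

# "saya tidak suka film ini" means "I do not like this film".
result = mark_negation("saya tidak suka film ini".split(), window=2)
```

A fixed window is simple but blunt: it can flag words outside the true scope of the negation or miss words beyond the window, which is the kind of limitation that motivates learned approaches such as attention.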

