Unsupervised Aspect Extraction Algorithm for Opinion Mining using Topic Modeling

Author(s): Azizkhan F Pathan, Chetana Prakash
Information, 2020, Vol 11 (11), pp. 535
Author(s): Alejandro Ramón-Hernández, Alfredo Simón-Cuevas, María Matilde García Lorenzo, Leticia Arco, Jesús Serrano-Guerrero

Opinion mining and summarization of the growing volume of user-generated content on digital platforms (e.g., news platforms) play a significant role in the success of government programs and digital governance initiatives, because they allow citizens' sentiments to be extracted and analyzed for decision-making. Opinion mining extracts the sentiment expressed in the content, whereas summarization condenses the most relevant information. However, most reported opinion summarization methods are designed to produce generic summaries, and the context that gives rise to the opinions (e.g., the news) is usually not considered. In this paper, we present a context-aware opinion summarization model for monitoring the opinions generated by news. In this approach, topic modeling and the news content are combined to determine the "importance" of opinionated sentences. The effectiveness of several settings of our model was evaluated through experiments on Spanish news and opinions collected from a real news platform. The results show that our model can generate opinion summaries that focus on essential aspects of the news and also cover the main topics of the opinionated texts well. The integration of term clustering, word embeddings, and similarity-based sentence-to-news scoring proved to be the most promising and effective setting of our model.
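The similarity-based sentence-to-news scoring mentioned above can be illustrated with a minimal sketch. It is not the authors' model: it replaces word embeddings and term clustering with plain bag-of-words cosine similarity, and the function names (`tokenize`, `cosine`, `rank_opinions`) and the English sample texts are assumptions made only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_opinions(news, opinions, top_k=2):
    """Return the top_k (score, sentence) pairs most similar to the news text."""
    news_vec = Counter(tokenize(news))
    scored = [(cosine(Counter(tokenize(s)), news_vec), s) for s in opinions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical news item and opinion sentences (not from the paper's corpus).
news = "The city council approved a new budget for public transport."
opinions = [
    "The new transport budget is far too small.",
    "I had pasta for lunch today.",
    "Public transport in this city needs more funding.",
]
summary = rank_opinions(news, opinions)
```

Here `summary` keeps the two sentences that share the most vocabulary with the news item, so the off-topic sentence is left out. A setting closer to the one described would compare embedding vectors rather than raw term counts.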


The explosion of Web 2.0 has made social media platforms such as Facebook, Twitter, and blogs a data hub for data mining. Sentiment analysis, or opinion mining, is an automated process for understanding the opinions expressed by customers. Using data mining techniques, sentiment analysis helps determine the polarity (positive, negative, or neutral) of the views expressed by end users. Terabytes of data are now available on almost any topic, whether advertising, politics, or survey companies. Customer satisfaction (CSAT) is the key factor for these survey companies. In this paper, we apply topic modeling, using the LDA algorithm, to find topics in social media data. We used datasets of 900 records for the analysis. From the Survey/Response dataset we identified three important topics: Customers, Agents, and Product/Services. The results report the CSAT score broken down by positive, negative, and neutral responses. Topic modeling is a statistical modeling technique for categorizing text documents into topics. This approach supports better summarization of the data by identifying topics and classifying the polarity of the sentiments expressed.
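To make the topic-modeling step concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. The abstract does not say which LDA implementation or hyperparameters were used, so the sampler, the values of `alpha` and `beta`, the iteration count, and the toy survey responses are all assumptions for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc_topic counts, topic_word counts, vocabulary list).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []                                            # topic of every token

    # Start from a random topic assignment for each token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(row)

    topics = range(n_topics)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """List the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Hypothetical, already-tokenized survey responses (not the paper's data).
docs = [
    "agent was polite and agent solved issue".split(),
    "customer waited long and customer was unhappy".split(),
    "product quality is great and service is fast".split(),
    "agent ignored customer complaint about product".split(),
]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=3)
print(top_words(topic_word, vocab))
```

On a corpus this small the resulting topics are not meaningful; the sketch only shows the mechanics of assigning each word to a topic and reading off the most frequent words per topic.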


2021
Author(s): Adebayo Abayomi-Alli, Olusola Abayomi-Alli, Sanjay Misra, Luis Fernandez-Sanz

Abstract. Background: Social media opinion has become a medium for quickly accessing large, valuable, and richly detailed information on any subject matter within a short period. Twitter, a social microblogging site, generates over 330 million tweets monthly across different countries. Analyzing trending topics on Twitter presents opportunities to extract meaningful insight into different opinions on various issues. Aim: This study aims to gain insights into the trending yahoo-yahoo topic on Twitter using content analysis of selected historical tweets. Methodology: The widgets and workflow engine in the Orange Data Mining toolbox were employed for all the text mining tasks. 5500 tweets were collected from Twitter using the 'yahoo yahoo' hashtag. The corpus was pre-processed using a pre-trained tweet tokenizer; Valence Aware Dictionary for Sentiment Reasoning (VADER) was used for sentiment and opinion mining; Latent Dirichlet Allocation (LDA) and Latent Semantic Indexing (LSI) were used for topic modeling; and multidimensional scaling (MDS) was used to visualize the modeled topics. Results: "yahoo" appeared in the corpus 9555 times, and 175 unique tweets were returned after duplicate removal. Contrary to expectation, Spain had the highest number of participants tweeting on the 'yahoo yahoo' topic within the period. VADER sentiment analysis returned 35.85%, 24.53%, 15.09%, and 24.53% negative, neutral, no-zone, and positive sentiment tweets, respectively. The word "yahoo" was highly representative of LDA topics 1, 3, 4, and 6 and of LSI topic 1. Conclusion: Emojis convey the sentiment of tweets more readily than the textual content. Also, despite popular belief, a significant number of youths regard cybercrime as a detriment to society.
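The percentage breakdown reported above can be illustrated with a small sketch that buckets VADER-style compound scores into sentiment classes and computes their shares. It assumes the conventional VADER cut-offs of +0.05 and -0.05 on the compound score; the abstract does not state the thresholds used or define the "no-zone" class, so that class is omitted here, and the scores below are hypothetical.

```python
from collections import Counter


def label(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment class (assumed cut-offs)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def distribution(scores):
    """Return the percentage share of each sentiment class, rounded to 2 places."""
    classes = ("negative", "neutral", "positive")
    if not scores:
        return {name: 0.0 for name in classes}
    counts = Counter(label(s) for s in scores)
    return {name: round(100 * counts[name] / len(scores), 2) for name in classes}


# Hypothetical compound scores, one per tweet (not the study's data).
scores = [-0.62, -0.31, 0.0, 0.02, 0.44, 0.71, -0.05, 0.05]
shares = distribution(scores)
```

In practice the compound scores would come from a sentiment analyzer such as VADER rather than being typed in by hand.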

