Effectiveness of Domain-Based Lexicons vis-à-vis General Lexicon for Aspect-Level Sentiment Analysis: A Comparative Analysis

2019 ◽  
Vol 18 (03) ◽  
pp. 1950033
Author(s):  
Madan Lal Yadav ◽  
Basav Roychoudhury

Sentiment analysis can be undertaken using either machine learning techniques or lexicons. Machine learning techniques include text classification algorithms such as SVM, naive Bayes, decision trees and logistic regression, whereas lexicon-based sentiment analysis uses either general or domain-based lexicons. In this paper, we investigate the effectiveness of domain lexicons vis-à-vis a general lexicon by performing aspect-level sentiment analysis on data from three different domains, viz. car, guitar and book. While it may seem intuitive that domain lexicons will always perform better than general lexicons, actual performance may depend on the richness of the domain lexicon concerned as well as on the text analysed. We used the general lexicon SentiWordNet and the corresponding domain lexicons in the aforesaid domains to compare their relative performance. The results indicate that a domain lexicon used along with a general lexicon performs better than either lexicon used alone. They also suggest that the performance of domain lexicons depends on the text content, and on whether the language involves technical or non-technical words in the domain concerned. This paper makes a case for developing domain lexicons across various domains for improved performance, while noting that they might not always perform better. It further highlights that the importance of general lexicons should not be underestimated: the best results for aspect-level sentiment analysis are obtained, as per this paper, when both the domain and general lexicons are used side by side.
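
The idea of using a domain lexicon together with a general lexicon can be sketched as follows. This is a minimal illustration, not the authors' implementation: the word scores, the `lookup` and `score_aspect` helpers, the fixed context window, and the rule that a domain entry overrides the general one are all assumptions made for the example.

```python
# Hypothetical toy lexicons. Real scores would come from SentiWordNet and
# from a lexicon built for the domain (here: guitars).
GENERAL_LEXICON = {"good": 0.6, "bad": -0.6, "warm": 0.2}
DOMAIN_LEXICON = {"warm": 0.7, "muddy": -0.6}


def lookup(word, domain=None, general=None):
    """Return the polarity of a word, preferring the domain lexicon (assumed rule)."""
    if domain and word in domain:
        return domain[word]
    if general and word in general:
        return general[word]
    return 0.0


def score_aspect(tokens, aspect, domain=None, general=None, window=2):
    """Sum the polarity of words within `window` tokens of each aspect mention."""
    total = 0.0
    for i, token in enumerate(tokens):
        if token != aspect:
            continue
        start = max(0, i - window)
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        for neighbour in neighbours:
            total += lookup(neighbour, domain, general)
    return total


review = "the tone is warm but the bass sounds muddy".split()

general_only = score_aspect(review, "bass", general=GENERAL_LEXICON)
combined = score_aspect(review, "bass",
                        domain=DOMAIN_LEXICON, general=GENERAL_LEXICON)
```

In this toy review the general lexicon has no entry for "muddy", so the "bass" aspect scores as neutral with the general lexicon alone and as negative once the domain lexicon is added.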

Author(s):  
Padmavathi .S ◽  
M. Chidambaram

Text classification has become more significant in managing and organizing text data owing to the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified using manually defined rules. In the machine learning approach, the classification rules, or classifier, are derived automatically from example documents. The machine learning approach has higher recall and is a quicker process. This paper presents an investigation of text classification using different machine learning techniques.
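
The contrast between the two approaches can be sketched in a few lines. The categories, keywords, training sentences and function names below are hypothetical, and the "learned" classifier is a deliberately simple word-count model rather than any technique examined in the paper.

```python
from collections import Counter, defaultdict

# Rule-based: a person writes the keyword rules by hand.
MANUAL_RULES = {
    "sports": {"match", "goal", "team"},
    "finance": {"stock", "market", "bank"},
}


def classify_by_rules(text, rules=MANUAL_RULES):
    """Pick the category whose hand-written keywords match most often."""
    words = text.lower().split()
    hits = {label: sum(w in keywords for w in words)
            for label, keywords in rules.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


def learn_word_counts(examples):
    """Learning-based: derive per-category word counts from labelled examples."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def classify_by_counts(text, counts):
    """Pick the category whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


examples = [
    ("the team won the match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock prices fell as the market closed", "finance"),
]
model = learn_word_counts(examples)
```

Here `classify_by_rules("interest rates fell")` returns `None` because no hand-written keyword matches, while `classify_by_counts("interest rates fell", model)` returns `"finance"` because those words appeared in the labelled examples.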


2018 ◽  
Vol 34 (3) ◽  
pp. 569-581 ◽  
Author(s):  
Sujata Rani ◽  
Parteek Kumar

In this article, an innovative approach to sentiment analysis (SA) is presented. The proposed system handles Romanized or abbreviated text and spelling variations in the text when performing sentiment analysis. The training data set of 3,000 movie reviews and tweets has been manually labeled by native speakers of Hindi in three classes, i.e. positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques, i.e. Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system has been tested on 100 movie reviews and tweets, and it has been observed that SVM performed best among the classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results of the proposed system are very promising and can be used in emerging applications such as SA of product reviews and social media analysis. Additionally, the proposed system can be used for other cultural or social benefits, such as predicting or countering riots.
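
The step of turning string data into a numerical matrix can be illustrated with a plain bag-of-words count. This is a sketch in Python rather than WEKA, whose own filter is not reproduced here. The Romanized example strings and the helper names are made up for illustration.

```python
def build_vocabulary(documents):
    """Map every distinct word to a column index, in alphabetical order."""
    words = sorted({w for doc in documents for w in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}


def vectorize(documents, vocabulary):
    """Return one row of word counts per document."""
    matrix = []
    for doc in documents:
        row = [0] * len(vocabulary)
        for word in doc.lower().split():
            if word in vocabulary:
                row[vocabulary[word]] += 1
        matrix.append(row)
    return matrix


# Hypothetical Romanized snippets, not taken from the paper's data set.
docs = ["movie achhi thi", "movie bekar thi", "achhi achhi movie"]
vocab = build_vocabulary(docs)
matrix = vectorize(docs, vocab)
```

Each row of `matrix` can then be passed to a classifier such as NB, J48 or SVM in place of the raw text.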


2018 ◽  
Vol 7 (2.32) ◽  
pp. 462
Author(s):  
G Krishna Chaitanya ◽  
Dinesh Reddy Meka ◽  
Vakalapudi Surya Vamsi ◽  
M V S Ravi Karthik

The sentiment or emotion behind a tweet from Twitter or a post from Facebook can help us understand what opinions or feedback a person has. With the growth of user-generated blogs, posts and reviews across social media and online retail sites, an understanding of this user data acts as a catalyst in building recommender systems and driving business plans. User reviews on online retail stores influence the buying behavior of customers and thus add to the ever-growing need for sentiment analysis. Machine learning helps us read between the lines of tweets by providing us with various algorithms such as Naïve Bayes and SVM. Sentiment analysis uses machine learning and natural language processing (NLP) to extract, classify and analyze tweets for sentiments (emotions). There are various packages and frameworks in R and Python that aid in sentiment analysis or text mining in general.
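
As a rough illustration of how Naïve Bayes classifies a tweet, the sketch below implements a small multinomial Naïve Bayes classifier with add-one (Laplace) smoothing using only the standard library. The training tweets and function names are hypothetical, and words unseen during training are simply skipped, which is an assumption of this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count documents per class and words per class from labelled tweets."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(text, model):
    """Return the class with the highest log-probability for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # skip words never seen in training
            # Add-one smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


tweets = [
    ("love this phone", "positive"),
    ("great battery and great screen", "positive"),
    ("hate the slow updates", "negative"),
    ("terrible battery life", "negative"),
]
model = train_naive_bayes(tweets)
```

With this toy training set, `predict("great phone", model)` favours the positive class and `predict("terrible slow updates", model)` the negative one.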


Author(s):  
V Umarani ◽  
A Julian ◽  
J Deepa

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedback, or critiques. The sentiment analysis process can be automated using machine learning techniques, which analyse text patterns faster. Supervised machine learning is the most widely used mechanism for sentiment analysis. The proposed work discusses the flow of the sentiment analysis process and investigates common supervised machine learning techniques such as multinomial naive Bayes, Bernoulli naive Bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, and decision tree, as well as deep learning techniques such as Long Short-Term Memory and Convolutional Neural Network. The work examines these learning methods using a standard data set. The experimental results demonstrate the performance of the various classifiers in terms of precision, recall, F1-score, ROC curve, accuracy, running time and k-fold cross-validation. They help in appreciating the novelty of the several deep learning techniques and give the user an overview for choosing the right technique for their application.
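
Several of the evaluation measures named above follow directly from counts of true positives, false positives and false negatives. The sketch below computes precision, recall and F1-score for one class and produces index splits for k-fold cross-validation. The labels, the contiguous (unshuffled) folds and the function names are assumptions for illustration, not the paper's experimental setup.

```python
def precision_recall_f1(y_true, y_pred, positive="positive"):
    """Compute precision, recall and F1-score for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
folds = list(k_fold_indices(5, 2))
```

In practice the data would usually be shuffled before splitting, and each classifier would be trained on the `train` indices and scored on the `test` indices of every fold.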


Software engineering is an important area that deals with the development and maintenance of software. After developing software, it is always important to track its performance and to check whether it functions according to customer requirements. To ensure this, faulty and non-faulty modules must be identified. For this purpose, one can make use of a model for binary classification of faults. The outputs of different techniques differ in one way or another with respect to the fault dataset used, complexity, the classification algorithm implemented, etc. Various machine learning techniques can be used for this purpose. This paper, however, deals with the best classification algorithms available to date, namely decision tree, random forest, naive Bayes and logistic regression (tree-based techniques and Bayesian-based techniques). The motive behind developing such a project is to identify the faulty modules within a piece of software before the actual software testing takes place. As a result, the time spent by testers, and their workload, can be reduced to an extent. This work is useful to those working in the software industry and also to those carrying out research in software engineering where the software development lifecycle is discussed.
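
Binary classification of modules as faulty or non-faulty can be illustrated with the simplest tree-based model, a decision stump (a decision tree of depth one). The metric values, labels and function names below are hypothetical, and the stump assumes that a higher metric value indicates a faulty module. It is not one of the models evaluated in the paper.

```python
def fit_stump(values, labels):
    """Find the threshold on one metric that misclassifies the fewest modules.

    Modules whose metric is above the threshold are predicted faulty.
    """
    best_threshold, best_errors = None, len(labels) + 1
    for threshold in sorted(set(values)):
        errors = sum((v > threshold) != faulty
                     for v, faulty in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


def predict_faulty(value, threshold):
    """Classify a single module from its metric value."""
    return value > threshold


# Hypothetical complexity metric per module and whether a fault was found.
complexity = [3, 5, 8, 12, 15, 20]
faulty = [False, False, False, True, False, True]

threshold, training_errors = fit_stump(complexity, faulty)
```

A full decision tree or random forest repeats this kind of split over many metrics and thresholds.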

