Twitter Sentiment Analysis Using Binary Classification Technique

Author(s):  
B. N. Supriya ◽  
Vish Kallimani ◽  
S. Prakash ◽  
C. B. Akki
Algorithms ◽  
2020 ◽  
Vol 13 (4) ◽  
pp. 83 ◽  
Author(s):  
Giannis Haralabopoulos ◽  
Ioannis Anagnostopoulos ◽  
Derek McAuley

Sentiment analysis usually refers to the analysis of human-generated content via a polarity filter. Affective computing deals with the exact emotions conveyed through information. Emotional information most frequently cannot be accurately described by a single emotion class. Multilabel classifiers can categorize human-generated content in multiple emotional classes. Ensemble learning can improve the statistical, computational and representation aspects of such classifiers. We present a baseline stacked ensemble and propose a weighted ensemble. Our proposed weighted ensemble can use multiple classifiers to improve classification results without hyperparameter tuning or data overfitting. We evaluate our ensemble models with two datasets. The first dataset is from Semeval2018-Task 1 and contains almost 7000 Tweets, labeled with 11 sentiment classes. The second dataset is the Toxic Comment Dataset with more than 150,000 comments, labeled with six different levels of abuse or harassment. Our results suggest that ensemble learning improves classification results by 1.5 % to 5.4 % .


2017 ◽  
Vol 24 (1) ◽  
pp. 3-37 ◽  
Author(s):  
SANDRA KÜBLER ◽  
CAN LIU ◽  
ZEESHAN ALI SAYYED

AbstractWe investigate feature selection methods for machine learning approaches in sentiment analysis. More specifically, we use data from the cooking platform Epicurious and attempt to predict ratings for recipes based on user reviews. In machine learning approaches to such tasks, it is a common approach to use word or part-of-speech n-grams. This results in a large set of features, out of which only a small subset may be good indicators for the sentiment. One of the questions we investigate concerns the extension of feature selection methods from a binary classification setting to a multi-class problem. We show that an inherently multi-class approach, multi-class information gain, outperforms ensembles of binary methods. We also investigate how to mitigate the effects of extreme skewing in our data set by making our features more robust and by using review and recipe sampling. We show that over-sampling is the best method for boosting performance on the minority classes, but it also results in a severe drop in overall accuracy of at least 6 per cent points.


2018 ◽  
Vol 9 (2) ◽  
pp. 111-120
Author(s):  
Argha Roy ◽  
Shyamali Guria ◽  
Suman Halder ◽  
Sayani Banerjee ◽  
Sourav Mandal

Recently, the web has been crowded with growing volumes of various texts on every aspect of human life. It is difficult to rapidly access, analyze, and compose important decisions using efficient methods for raw textual data in the form of social media, blogs, feedback, reviews, etc., which receive textual inputs directly. It proposes an efficient method for summarization of various reviews of tourists on a specific tourist spot towards analyzing their sentiments towards the place. A classification technique automatically arranges documents into predefined categories and a summarization algorithm produces the exact condensed input such that output is most significant concepts of source documents. Finally, sentiment analysis is done in summarized opinion using NLP and text analysis techniques to show overall sentiment about the spot. Therefore, interested tourists can plan to visit the place do not go through all the reviews, rather they go through summarized documents with the overall sentiment about target place.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Murtuza Shahzad ◽  
Hamed Alhoori

Abstract Purpose Social media users share their ideas, thoughts, and emotions with other users. However, it is not clear how online users would respond to new research outcomes. This study aims to predict the nature of the emotions expressed by Twitter users toward scientific publications. Additionally, we investigate what features of the research articles help in such prediction. Identifying the sentiments of research articles on social media will help scientists gauge a new societal impact of their research articles. Design/methodology/approach Several tools are used for sentiment analysis, so we applied five sentiment analysis tools to check which are suitable for capturing a tweet's sentiment value and decided to use NLTK VADER and TextBlob. We segregated the sentiment value into negative, positive, and neutral. We measure the mean and median of tweets’ sentiment value for research articles with more than one tweet. We next built machine learning models to predict the sentiments of tweets related to scientific publications and investigated the essential features that controlled the prediction models. Findings We found that the most important feature in all the models was the sentiment of the research article title followed by the author count. We observed that the tree-based models performed better than other classification models, with Random Forest achieving 89% accuracy for binary classification and 73% accuracy for three-label classification. Research limitations In this research, we used state-of-the-art sentiment analysis libraries. However, these libraries might vary at times in their sentiment prediction behavior. Tweet sentiment may be influenced by a multitude of circumstances and is not always immediately tied to the paper's details. In the future, we intend to broaden the scope of our research by employing word2vec models. Practical implications Many studies have focused on understanding the impact of science on scientists or how science communicators can improve their outcomes. Research in this area has relied on fewer and more limited measures, such as citations and user studies with small datasets. There is currently a critical need to find novel methods to quantify and evaluate the broader impact of research. This study will help scientists better comprehend the emotional impact of their work. Additionally, the value of understanding the public's interest and reactions helps science communicators identify effective ways to engage with the public and build positive connections between scientific communities and the public. Originality/value This study will extend work on public engagement with science, sociology of science, and computational social science. It will enable researchers to identify areas in which there is a gap between public and expert understanding and provide strategies by which this gap can be bridged.


Author(s):  
Abdessamad Benlahbib ◽  
El Habib Nfaoui

Customer reviews are a valuable source of information from which we can extract very useful data about different online shopping experiences. For trendy items (products, movies, TV shows, hotels, services . . . ), the number of available users and customers’ opinions could easily surpass thousands. Therefore, online reputation systems could aid potential customers in making the right decision (buying, renting, booking . . . ) by automatically mining textual reviews and their ratings. This paper presents MTVRep, a movie and TV show reputation system that incorporates fine-grained opinion mining and semantic analysis to generate and visualize reputation toward movies and TV shows. Differently from previous studies on reputation generation that treat the task of sentiment analysis as a binary classification problem (positive, negative), the proposed system identifies the sentiment strength during the phase of sentiment classification by using fine-grained sentiment analysis to separate movie and TV show reviews into five discrete classes: strongly negative, weakly negative, neutral, weakly positive and strongly positive. Besides, it employs embeddings from language models (ELMo) representations to extract semantic relations between reviews. The contribution of this paper is threefold. First, movie and TV show reviews are separated into five groups based on their sentiment orientation. Second, a custom score is computed for each opinion group. Finally, a numerical reputation value is produced toward the target movie or TV show. The efficacy of the proposed system is illustrated by conducting several experiments on a real-world movie and TV show dataset.


Sign in / Sign up

Export Citation Format

Share Document