A Check on Annotation in Sentiment Research

2019 ◽  
Vol 8 (2S8) ◽  
pp. 1346-1350

The research literature on sentiment analysis methodologies has grown exponentially in recent years. In any research area where new concepts and techniques are constantly introduced, it is of interest to analyze the latest trends in the literature. We focus primarily on the literature of the last five years on annotation methodologies, including frequently used datasets and the sources from which they were obtained. Based on the survey, researchers mostly rely on manual annotation when building sentiment corpora. As for datasets, English-language text taken from social media such as Twitter still dominates. Much in this area remains to be explored, such as semi-automatic annotation methods, which researchers still rarely use. Also, less popular languages, such as Malay, Korean, Japanese, and so on, still require corpora for sentiment analysis research.

Author(s):  
Normi Sham Awang Abu Bakar ◽  
Ros Aziehan Rahmat ◽  
Umar Faruq Othman

The popularity of social media channels has increased researchers' interest in sentiment analysis (SA). One aspect of SA research is determining the polarity of comments on social media, i.e. positive, negative, or neutral. However, Malay sentiment analysis tools are scarce because most work in the literature discusses polarity classification tools for English. This paper presents the development of a polarity classification tool called the Malay Polarity Classification Tool (MaCT). The tool is based on the AFINN sentiment lexicon for the English language. We attempted to translate each word in AFINN to its Malay equivalent and then used the lexicon to collect sentiment data from Twitter. The Twitter data are then classified as positive, negative, or neutral. For validation, we collected 400 positive tweets, 400 negative tweets, and 200 neutral tweets, ran them through our sentiment lexicon, and obtained a score of 90% for precision, recall, and accuracy. Our main contributions are the new AFINN translation for the Malay language and the classification of the sentiment data.
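As a rough illustration of how a lexicon-based polarity classifier of this kind can work, the sketch below sums per-word scores and maps the total to a label. The lexicon entries, their scores, and the function name `classify_polarity` are hypothetical and chosen only for illustration; they are not taken from AFINN or from MaCT.

```python
import re

# Hypothetical toy lexicon: the words are Malay, but the scores are
# illustrative assumptions, not values from AFINN or from MaCT.
TOY_LEXICON = {"bagus": 3, "gembira": 3, "buruk": -3, "sedih": -2}


def classify_polarity(text, lexicon=TOY_LEXICON):
    """Label text as positive, negative, or neutral from summed word scores."""
    tokens = re.findall(r"\w+", text.lower())
    # Words missing from the lexicon contribute a score of zero.
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A text with no lexicon words falls through to "neutral", which is one simple design choice; a real tool might instead use a threshold around zero.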


Author(s):  
Sujata Patil ◽  
Bhavesh Wagh ◽  
Aditya Bhinge ◽  
Aakash Sahal ◽  
Prof. Madhav Ingale

Social media monitoring is growing day by day, so analyzing social data plays an important role in understanding people's behavior. We analyze social data, specifically Twitter tweets, using sentiment analysis to gauge people's opinions of schemes announced by the Central Government. This paper is based on Twitter datasets for particular schemes and the polarity of the sentiments they contain. The popularity of the Internet has increased rapidly. Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. The growing importance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social networks. Most users write their opinions and thoughts on blogs, social media sites, e-commerce sites, etc., and this user-generated content is difficult to analyze or summarize. It is nevertheless very important for individuals, industry, government, and researchers in making decisions. Sentiment analysis and opinion mining is an active research area within natural language processing. We count and plot the numbers of positive, negative, and neutral tweets for each event.


Author(s):  
Shailendra Kumar Singh ◽  
Manoj Kumar Sachan

The rapid growth of internet facilities has greatly increased the volume of comments, posts, blogs, feedback, etc., on social networking sites. These social media data are available in an unstructured form, which includes images, text, and videos. Processing these data is difficult, but some sentiment analysis, information retrieval, and recommender systems are used to process them. To extract the opinions and sentiments of internet users from their written social media text, a sentiment analysis system that can work on both monolingual and bilingual phonetic text needs to be developed. Therefore, a sentiment analysis (SA) system is developed that performs well on datasets from different domains. The system is tested on four datasets and improves accuracy over state-of-the-art techniques by 3% on social media datasets, 1.5% on movie reviews, 1.35% on Amazon product reviews, and 4.56% on large Amazon product reviews. A stemmer for English verbs (StemVerb) is also proposed, which improves the SA system's performance.


Computers ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 3
Author(s):  
Ihsan Ullah Khan ◽  
Aurangzeb Khan ◽  
Wahab Khan ◽  
Mazliham Mohd Su’ud ◽  
Muhammad Mansoor Alam ◽  
...  

Research efforts in the field of sentiment analysis have increased exponentially in the last few years due to its applicability in areas such as online product purchasing, marketing, and reputation management. Social media and online shopping sites have become a rich source of user-generated data. Manufacturing, sales, and marketing organizations are increasingly turning to this source to get worldwide feedback on their activities and products. Millions of sentences in Urdu and Roman Urdu are posted daily on social sites, such as Facebook, Instagram, Snapchat, and Twitter. Disregarding people’s opinions in Urdu and Roman Urdu and considering only the resource-rich English language leads to the loss of this vast amount of data. Our research focused on collecting research papers related to the Urdu and Roman Urdu languages and analyzing them in terms of preprocessing, feature extraction, and classification techniques. This paper contains a comprehensive study of research conducted on Roman Urdu and Urdu text for product reviews. The study is divided into categories, such as collection of relevant corpora, data preprocessing, feature extraction, classification platforms and approaches, limitations, and future work. The comparison was based on evaluating different research factors, such as corpus, lexicon, and opinions. Each reviewed paper was evaluated according to the provided benchmarks and categorized accordingly. Based on the results obtained and the comparisons made, we suggest some helpful steps for future studies.


In sentiment analysis, annotation is the crucial data processing step that labels reviews or sentences as positive, negative, or neutral. Annotation is usually performed using one of three key approaches: (i) manual, (ii) crowdsourcing, and (iii) automated annotation. Manual annotation is preferred in most of the literature, and crowdsourcing tools are used in some works. This indicates a scarcity of automatic annotation, which is essential to support more systematic research in sentiment analysis. Manual procedures mostly depend on external annotators, resulting in costly and time-consuming processes. Thus, we propose a method for automatic annotation that uses a web search model to dynamically label reviews (positive, negative, or neutral) that are not available in the dictionary. Some research works consider product opinions from e-commerce sites, where most of the text may not be available in the dictionary; in such cases, the web search model can be used instead of manual annotation. A large-scale Opinosis dataset is used to evaluate the accuracy of the algorithms and the feasibility of the model. The experimental results indicate that this model outperforms conventional methodologies, and we therefore believe it will be useful for current researchers in the field of opinion mining.


2022 ◽  
Vol 2022 ◽  
pp. 1-10
Author(s):  
Abdullah Al-Hashedi ◽  
Belal Al-Fuhaidi ◽  
Abdulqader M. Mohsen ◽  
Yousef Ali ◽  
Hasan Ali Gamal Al-Kaf ◽  
...  

Sentiment analysis has recently become increasingly important with the massive increase in online content. It is associated with the analysis of textual data generated by social media that can be easily accessed, obtained, and analyzed. With the emergence of COVID-19, most published studies related to COVID-19 conspiracy theories were surveys of people's sentiments and opinions that studied the impact of the pandemic on their lives. Only a few studies applied sentiment analysis to social media using a machine learning approach. These studies focused on sentiment analysis of English-language tweets and paid little attention to other languages such as Arabic. This study proposes a machine learning model to analyze Arabic tweets from Twitter. In this model, we apply Word2Vec for word embedding, which forms the main source of features. Two pretrained continuous bag-of-words (CBOW) models are investigated, and Naïve Bayes is used as a baseline classifier. Several single-based and ensemble-based machine learning classifiers are used with and without SMOTE (synthetic minority oversampling technique). The experimental results show that applying word embedding with an ensemble and SMOTE achieves a good improvement in average F1 score compared to the baseline classifier and to other classifiers (single-based and ensemble-based) without SMOTE.
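For readers unfamiliar with SMOTE, the core idea can be sketched as follows: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a minimal standard-library sketch under stated assumptions, not the implementation used in the study; the function name `smote` and its parameters are hypothetical.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length tuples of floats (at least two samples).
    k: number of nearest minority neighbours considered for each base sample.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of base by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays inside the region the original samples already cover.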


2020 ◽  
Vol 17 (9) ◽  
pp. 4535-4542
Author(s):  
Ramneet ◽  
Deepali Gupta ◽  
Mani Madhukar

For the past few years, sentiment analysis has been growing rapidly, and with the abundance of computational power and the plethora of machine learning algorithms, it has found numerous applications and acceptance as a research area in machine learning. This paper reviews sentiment analysis across different application areas such as customer reviews, product reviews, film reviews, emotion detection, market research, and many more. To conduct sentiment analysis, data is extracted from various social media platforms like Twitter, Facebook, etc. The data available on these platforms is primarily unstructured; to analyze it, the data must be pre-processed, feature vectors must be identified, and models must then be trained and tested with different algorithms. Several algorithms, such as SVM, Naïve Bayes, K-means, KNN, decision tree, and random forest, are evaluated and combined into hybrids to improve the efficiency and accuracy of the model.


2019 ◽  
Vol 8 (2S11) ◽  
pp. 3571-3576

Social media is the most popular platform on which users can share their views, reviews, and knowledge about various topics, news, products, etc. Identifying users' sentiments or opinions is valuable for many e-commerce companies, hotels, e-learning providers, etc. This opinion analysis helps companies improve their services and products. With the increase in web users across the globe, users post their views freely over the internet. Many different languages are spoken across the globe, and the multilingual nature of social media makes analysis of such text difficult. Sentiment analysis can be conducted on videos, images, and text; text sentiment analysis is the most popular form because of freely available content in the form of blogs, reviews, comments, etc. Because social media platforms let people post comments in any language, there is a need for multilingual sentiment analysis. The sentiment analysis task involves phases such as data collection, pre-processing, sentiment classification, and polarity identification. The multilingual setting requires script identification on the input text, labelling the different words used in the text along with the scripts used to write them. The languages used in the text are identified, and Hindi text written in Romanized script is transliterated to Devanagari script. The text is then translated completely into English, and POS (parts of speech) tagging is performed on the result. The aim of this study is to survey different techniques for multilingual sentiment analysis and for language identification of the source text, where the n-gram model outperforms all others.


Author(s):  
Ibrahim Moge Noor

Social media sites have recently become popular and clearly have a major influence on society; almost one third of the world's population is on social media. Social media has become a platform where people express their feelings, share their ideas and wisdom, and give feedback on an event or a product, and new technology makes it possible to analyse this content easily. Twitter is one of these sites, full of people's opinions, where one can track the sentiment expressed about many kinds of topics. Instead of spending time and energy on long surveys, advances in sentiment analysis now let us collect a large volume of people's opinions. Sentiment analysis is one of the major research areas of interest today. In this paper we focus on sentiment regarding the 2019 Kenya currency replacement. The Kenyan government announced that the country's currency was to be replaced with a new generation of banknotes and ordered Kenyan citizens to return the old 1000 shilling notes ($10) to banks by 1st October 2019, in a bid to fight corruption and money laundering. Kenyan citizens expressed their reactions to the new banknotes. We perform sentiment analysis of the tweets using the Multinomial Naïve Bayes algorithm on data from Twitter: during this period of demonetization, we collected 1122 tweets using a web scraper with the help of Twitter advanced search.
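To make the classifier mentioned above concrete, here is a minimal sketch of Multinomial Naïve Bayes with Laplace smoothing, written with the standard library only. It is not the authors' code: the function names `train_mnb` and `predict_mnb`, the whitespace tokenization, and the four training sentences in `TOY_TWEETS` are all invented for illustration and are not drawn from the collected tweets.

```python
import math
from collections import Counter, defaultdict

# Invented toy examples for illustration only; not data from the study.
TOY_TWEETS = [
    ("new notes look great", "positive"),
    ("great move against corruption", "positive"),
    ("this deadline is terrible", "negative"),
    ("terrible queues at the bank", "negative"),
]


def train_mnb(docs, alpha=1.0):
    """Count documents per class and words per class; alpha is the smoothing term."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, alpha


def predict_mnb(model, text):
    """Return the class with the highest log prior plus summed log likelihoods."""
    class_counts, word_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the smoothing term keeps a word unseen in one class from forcing that class's probability to zero.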

