Sentiment Classification

Author(s):  
Jalel Akaichi

In this work, we apply text mining and sentiment analysis techniques to Tunisian users' status updates on Facebook. We aim to extract useful information about their sentiment and behavior, especially during the “Arab Spring” era. To this end, we describe a method for sentiment analysis that uses Support Vector Machine and Naïve Bayes algorithms and applies a combination of more than two features. The output of this work consists, on the one hand, of a sentiment lexicon built from the emoticon and acronym lexicons that we developed from the extracted status updates and, on the other hand, of detailed comparative experiments between the two algorithms, carried out by creating a training model for sentiment classification.

Author(s):  
Midde Venkateswarlu Naik ◽  
D. Vasumathi ◽  
A.P. Siva Kumar

Aims: The proposed research presents an evolutionary enhanced method for sentiment or emotion classification of unstructured review text in the big data field. Sentiment analysis plays a vital role in helping the current generation extract valid decision points about aspects such as movie ratings, educational institute ratings, or political ratings. The proposed hybrid approach combines optimal feature selection using Particle Swarm Optimization (PSO) with sentiment classification through a Support Vector Machine (SVM). Its performance is evaluated with statistical measures such as precision, recall, sensitivity, and specificity, and is compared with existing approaches. Earlier authors have reported sentiment classifier accuracy of up to 94% on English text. In the proposed scheme, the average accuracy of the sentiment classifier across different datasets reached 99%, obtained by tuning SVM parameters such as the constant C and the kernel gamma value in association with the PSO technique. The method used three publicly available datasets: airline sentiment, weather, and global warming. The results were trained and tested using 10-fold cross-validation (FCV), with a confusion matrix used to estimate sentiment classifier accuracy.
Background: Sentiment Analysis (SA), or opinion mining, has attracted scientific interest as a research domain. A key area is sentiment classification of semi-structured or unstructured data in different languages, which has become a major research topic. User-Generated Content (UGC) from different sources has grown significantly with the rapid growth of the web. The huge volume of user-generated data on social media provides substantial value for discovering hidden knowledge, correlations, patterns, and trends, and for extracting sentiment about any specific entity. SA is a computational analysis that determines the opinion about an entity as expressed in text. It is also described as the computation of emotional polarity expressed on social media as natural text in various languages. Usually, the best-performing automatic sentiment classifier model depends on both feature selection and classification algorithms.
Methods: The proposed work uses a Support Vector Machine as the classification technique and Particle Swarm Optimization for feature selection. We tune various combinations of parameters, with and without a kernel, to obtain the desired sentiment classification results on three datasets (airline, global warming, and weather sentiment) that are freely hosted for research use.
Results: The proposed method outperformed other machine learning techniques, classifying sentiment on the different datasets with an average accuracy of 99.2%. This high accuracy in classifying sentiment or opinion in review text demonstrates greater effectiveness than existing sentiment classifiers.
Conclusion: Sentiment classifier accuracy, the objective of this research, was improved with a kernel-based Support Vector Machine (SVM) using parameter optimization. The optimal features for classifying sentiment or opinion in review documents were determined with a particle swarm optimization approach.
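The evaluation measures named in this abstract (precision, recall or sensitivity, specificity, and accuracy) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' implementation; the function names and the toy labels are illustrative assumptions and do not reproduce any figure reported above.

```python
def confusion_counts(y_true, y_pred, positive="positive"):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Derive the usual measures from the confusion-matrix cells.

    Assumes each denominator is non-zero (true for the toy data below).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy labels, for illustration only.
y_true = ["positive"] * 4 + ["negative"] * 5
y_pred = ["positive", "positive", "positive", "negative",
          "negative", "negative", "negative", "negative", "positive"]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
print(counts, metrics)
```

In a 10-fold cross-validation setting, the same counts would typically be accumulated over the ten held-out folds before the measures are computed.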


2020 ◽  
Vol 11 (2) ◽  
pp. 66-81
Author(s):  
Badia Klouche ◽  
Sidi Mohamed Benslimane ◽  
Sakina Rim Bennabi

Sentiment analysis is an emerging research area in sentiment polarity classification and text mining, particularly given the considerable number of opinions available on social media. The Algerian telephone operator Ooredoo, like other operators, seeks in its new strategy to win new customers by exploiting their opinions through sentiment analysis. The purpose of this work is to set up a system called “Ooredoo Rayek”, whose objective is to collect, transliterate, translate, and classify the textual data expressed by Ooredoo's customers. This article develops a set of rules for transliterating Algerian Arabizi into the Algerian dialect. Furthermore, the authors use Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers to assign polarity tags to Facebook comments from Ooredoo's official pages, written in a multilingual and multi-dialect context. Experimental results show that the system performs well, with 83% accuracy.


2021 ◽  
Vol 13 (3) ◽  
pp. 128-133
Author(s):  
Attala Rafid Abelard ◽  
Yuliant Sibaroni

Among the many film streaming platforms that have sprung up, Netflix has the most subscribers. However, not all reviews provided by Netflix users are good reviews. In this study, reviews written on the Google Play Store are analyzed with the Latent Dirichlet Allocation (LDA) method to determine which aspects users review. Then, classification with the Support Vector Machine (SVM) method determines whether each review belongs to the positive or negative class (sentiment analysis). Two scenarios were carried out in this study. The first found that the best number of LDA topics is 40, and the second found that using a filtering process in the preprocessing stage reduces the F1-score. Thus, the best performance in LDA and SVM testing, a score of 78.15%, was obtained with 40 topics and without the filtering process.


Author(s):  
Daniel Febrian Sengkey ◽  
Agustinus Jacobus ◽  
Fabian Johanes Manoppo

Support vector machine (SVM) is a well-known method for supervised learning in sentiment analysis, and many studies have used SVM to classify sentiment in lecturer evaluations. SVM has various parameters that can be tuned and kernels that can be chosen to improve classifier accuracy. However, not all options have been explored. Therefore, in this study we compared four SVM kernels (radial, linear, polynomial, and sigmoid) to discover how each kernel influences the accuracy of the classifier. To make a proper assessment, we used our labeled dataset of students’ evaluations of the lecturer. The dataset was split into one part for training the classifier and another for testing the model. In addition, we used several different training:testing ratios. The split ratios range from 0.5 to 0.95 in increments of 0.05. Because the dataset was split randomly, the splitting, training, and testing process was repeated 1,000 times for each kernel and split ratio. At the end of the experiment, we therefore had 40,000 accuracy measurements. We then applied statistical methods to see whether the differences were significant. Based on the statistical test, we found that in this particular case the linear kernel has significantly higher accuracy than the other kernels. However, there is a tradeoff: the results vary more as a higher proportion of the data is used for training.
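The figure of 40,000 accuracy values follows from the experimental grid: 4 kernels, 10 split ratios, and 1,000 repetitions. The sketch below enumerates that grid and shows one seeded random split. It is an illustration under assumptions, not the authors' code: the names `experiment_grid` and `split_dataset` are hypothetical, the ratio is assumed to be the training share, and the training and scoring of the SVM itself is left out.

```python
import random
from itertools import product

KERNELS = ("radial", "linear", "polynomial", "sigmoid")
# Ratios 0.50, 0.55, ..., 0.95, built from integers to avoid float drift.
SPLIT_RATIOS = tuple(pct / 100 for pct in range(50, 100, 5))
REPETITIONS = 1000


def experiment_grid():
    """Return an iterator of (kernel, ratio, repetition) triples, one per run."""
    return product(KERNELS, SPLIT_RATIOS, range(REPETITIONS))


def split_dataset(items, ratio, seed):
    """Shuffle a copy of items and cut it into (train, test) parts.

    Assumption: ratio is the share of the data used for training.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


total_runs = sum(1 for _ in experiment_grid())
train, test = split_dataset(range(100), 0.8, seed=0)
print(len(SPLIT_RATIOS), total_runs, len(train), len(test))
```

Seeding each repetition separately keeps every split reproducible while still giving a different random partition per run.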


Author(s):  
Mohd Suhairi Md Suhaimin ◽  
Mohd Hanafi Ahmad Hijazi ◽  
Rayner Alfred ◽  
Frans Coenen

Sentiment analysis is directed at identifying people's opinions, beliefs, views and emotions in the context of the entities and attributes that appear in text. The presence of sarcasm, however, can significantly hamper sentiment analysis. In this paper a sentiment classification framework is presented that incorporates sarcasm detection. The framework was evaluated using a non-linear Support Vector Machine and Malay social media data. The results obtained demonstrated that the proposed sarcasm detection process could successfully detect the presence of sarcasm in that better sentiment classification performance was recorded. A best average F-measure score of 0.905 was recorded using the framework; a significantly better result than when sentiment classification was performed without sarcasm detection.


SINERGI ◽  
2020 ◽  
Vol 24 (2) ◽  
pp. 87
Author(s):  
Mona Cindo ◽  
Dian Palupi Rini ◽  
Ermatita Ermatita

With the advancement and growth of social media, a great deal of data is available for research in social mining. Twitter is one microblogging service that can be used. Many companies use Twitter data to analyze their customers' satisfaction with product quality. At the same time, many users use social media to express their daily emotions. This can be developed into research that serves both to improve product quality and to analyze opinion on certain events. Such research is often called sentiment analysis or opinion mining. Previous research identified particularly useful features for sentiment analysis, but its performance was still lacking. Furthermore, it used Support Vector Machine as the classification method, whereas most researchers have found another classification method, such as Maximum Entropy, that is considered more efficient. This research therefore uses two datasets: general opinion data and airline opinion data. For feature extraction, we employ four feature types: pragmatic, lexical-grams, pos-grams, and sentiment lexical. For classification, we use both Support Vector Machine and Maximum Entropy to find the best result. In the end, the best result is achieved by Maximum Entropy, with 85.8% accuracy on the general opinion data and 92.6% accuracy on the airline opinion data.


2019 ◽  
Vol 6 (1) ◽  
pp. 138-149
Author(s):  
Ukhti Ikhsani Larasati ◽  
Much Aziz Muslim ◽  
Riza Arifudin ◽  
Alamsyah Alamsyah

Data processing can be done with text mining techniques. Processing large text data requires a machine to explore opinions, including positive and negative opinions. Sentiment analysis is a process that applies text mining methods to determine whether the content of a text dataset is positive or negative. Support vector machine is one of the classification algorithms that can be used for sentiment analysis. However, support vector machine works less well on large data. In addition, the text mining process has constraints, one of which is the number of attributes used. A large number of attributes reduces the performance of the classifier and thus yields low accuracy. The purpose of this research is to increase support vector machine accuracy by implementing feature selection and feature weighting. Feature selection removes a large number of irrelevant attributes. In this study, the top K = 500 features are selected. Once the relevant attributes are selected, feature weighting is performed to calculate the weight of each selected attribute. The feature selection method is the chi-square statistic, and the feature weighting method is Term Frequency-Inverse Document Frequency (TFIDF). In experiments using Matlab R2017b with 10-fold cross-validation, integrating support vector machine with the chi-square statistic and TFIDF increased accuracy by 11.5 percentage points: the support vector machine achieved 68.7% accuracy without the chi-square statistic and TFIDF, and 80.2% with them.
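The two preprocessing steps described above, chi-square feature selection followed by TFIDF weighting, can be sketched in a few lines. The study itself used Matlab R2017b with K = 500; the Python below is only an illustrative sketch on a hypothetical four-document corpus with K = 2, using the standard 2x2 chi-square formula and the plain tf × log(N/df) weighting, which may differ from the exact variants the authors applied. All function names are assumptions.

```python
import math
from collections import Counter


def chi_square(term, docs, labels, positive):
    """Chi-square statistic of a term against the positive class (2x2 table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        is_pos = label == positive
        if has_term and is_pos:
            a += 1          # term present, positive class
        elif has_term:
            b += 1          # term present, other class
        elif is_pos:
            c += 1          # term absent, positive class
        else:
            d += 1          # term absent, other class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def select_top_k(docs, labels, positive, k):
    """Keep the k terms with the highest chi-square score (ties: alphabetical)."""
    vocab = sorted({t for tokens in docs for t in tokens})
    ranked = sorted(vocab, key=lambda t: (-chi_square(t, docs, labels, positive), t))
    return ranked[:k]


def tfidf(docs, features):
    """Weight each selected feature as tf * log(N / df) per document."""
    n = len(docs)
    df = Counter(t for tokens in docs for t in set(tokens) if t in features)
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append([tf[t] * math.log(n / df[t]) if df[t] else 0.0
                        for t in features])
    return vectors


# Hypothetical toy corpus, for illustration only.
docs = [["good", "service", "fast"], ["good", "food"],
        ["bad", "service", "slow"], ["bad", "food"]]
labels = ["pos", "pos", "neg", "neg"]

selected = select_top_k(docs, labels, "pos", k=2)
vectors = tfidf(docs, selected)
print(selected, vectors)
```

The resulting vectors are what a classifier such as a support vector machine would then be trained on.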


2021 ◽  
Vol 04 (01) ◽  
Author(s):  
Mahmood Umar ◽  

Nowadays, social media platforms, blogs, and e-commerce sites are commonly used to express opinions on politics, movies, products, and education, which in turn support election forecasting, business growth, and the improvement of teaching and learning. As a result, data generation has become easier, producing big data that requires appropriate techniques and tools to analyse easily, accurately, and in a timely manner. This makes sentiment analysis a research area in high demand. This study investigates on what basis (sentiment classification level) or in which area of application (data source) supervised machine learning approaches, particularly Support Vector Machine (SVM), Naïve Bayes, and Maximum Entropy algorithms, and the lexicon-based approach give the best results in sentiment analysis. Based on the review of the literature, there is a contradiction regarding the claim that SVM produces the best result in analyzing student sentiment at the document level. This study also finds that sentiment analysis differs from system to system depending on polarity (the types of classes to predict: positive or negative, subjective or objective), the level of classification (sentence, phrase, or document level), and the language being processed. This research produces a taxonomy that serves as a guide for choosing techniques in sentiment analysis. The taxonomy covers the sentiment classification levels and data preprocessing stages. It also organises sentiment analysis techniques into three groups: machine learning, lexicon-based, and hybrid (or combination). The machine learning techniques are sub-grouped into two: supervised and unsupervised. The supervised techniques are organised into two: classification and regression. Unsupervised machine learning techniques include clustering and association. The clustering techniques include k-means.
The decision tree, a classification-based technique under supervised machine learning, includes random forest (Akinkunmi, 2019), while the rule-based classifiers consist of the confidence criterion and the support criterion. The commonly used tools are Weka, Python, and the R programming tool.


2014 ◽  
Vol 596 ◽  
pp. 263-270
Author(s):  
Jia Neng Yang ◽  
Ai Min Yang ◽  
Yong Mei Zhou

A method was proposed to build a Chinese sentiment lexicon based on semantics. The sentiment intensity of a word was calculated automatically by decomposing it into multiple English semantic units (Esu). A lexicon proofreading method was used to optimize the sentiment intensity of words. The proposed lexicon was applied to the task of sentiment analysis, in which a support vector machine was used to build the sentiment classifier. The experimental results showed that the built sentiment lexicon was more effective than a general polar sentiment lexicon.


2020 ◽  
Vol 1477 ◽  
pp. 022023
Author(s):  
Imamah ◽  
Husni ◽  
Eka Malasari Rachman ◽  
Ika Oktavia Suzanti ◽  
Fifin Ayu Mufarroha
