Sentiment Analysis of Movie Reviews Using Support Vector Machine Classifier with Linear Kernel Function

Author(s):  
A. Sheik Abdullah ◽  
K. Akash ◽  
J. ShaminThres ◽  
S. Selvakumar
Author(s):  
Daniel Febrian Sengkey ◽  
Agustinus Jacobus ◽  
Fabian Johanes Manoppo

Support vector machine (SVM) is a well-known method for supervised learning in sentiment analysis, and many studies have used SVM to classify sentiment in lecturer evaluations. SVM has various parameters that can be tuned and kernels that can be chosen to improve classifier accuracy. However, not all options have been explored. Therefore, in this study we compared four SVM kernels (radial, linear, polynomial, and sigmoid) to discover how each kernel influences the accuracy of the classifier. To make a proper assessment, we used our labeled dataset of students’ evaluations of their lecturers. The dataset was split into two parts, one for training the classifier and one for testing the model. In addition, we used several training:testing split ratios, ranging from 0.5 to 0.95 in increments of 0.05. Because the dataset was split randomly, the splitting-training-testing process was repeated 1,000 times for each kernel and split ratio, which yielded 40,000 accuracy values (4 kernels × 10 ratios × 1,000 repetitions). We then applied statistical tests to see whether the differences were significant. The tests showed that, in this particular case, the linear kernel has significantly higher accuracy than the other kernels. However, there is a tradeoff: the results become more varied as a higher proportion of the data is used for training.
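The experimental design described in this abstract can be sketched as a small grid enumeration. This is only an illustration of the counting and the random split, not the authors' code: the names `KERNELS`, `SPLIT_RATIOS`, `random_split`, and `experiment_grid` are hypothetical, and treating each ratio as the training proportion is an assumption drawn from the abstract's closing sentence.

```python
import random
from itertools import product

# Kernel names as listed in the abstract.
KERNELS = ["radial", "linear", "polynomial", "sigmoid"]
# Ratios 0.50, 0.55, ..., 0.95; rounding avoids floating-point drift.
SPLIT_RATIOS = [round(0.50 + 0.05 * i, 2) for i in range(10)]
REPETITIONS = 1000


def random_split(items, train_ratio, rng):
    """Shuffle a copy of the items and cut it into (train, test) parts.

    Assumption: train_ratio is the share of the data used for training.
    """
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(round(train_ratio * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def experiment_grid():
    """Return an iterator over every (kernel, ratio, repetition) run."""
    return product(KERNELS, SPLIT_RATIOS, range(REPETITIONS))


# 4 kernels * 10 ratios * 1000 repetitions
total_runs = sum(1 for _ in experiment_grid())
print(total_runs)  # 40000
```

Each run would call `random_split` with a fresh random state, train an SVM with the given kernel on the first part, and record the accuracy on the second part.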


2008 ◽  
pp. 1269-1279
Author(s):  
Xiuju Fu ◽  
Lipo Wang ◽  
GihGuang Hung ◽  
Liping Goh

Classification decisions based on linguistic rules are more desirable than the complex mathematical formulas of support vector machine (SVM) classifiers because linguistic rules explain themselves explicitly. Linguistic rule extraction has attracted much attention as a way to explain knowledge hidden in data. In this chapter, we show that the decisions of an SVM classifier can be decoded into linguistic rules using the information provided by the support vectors and the decision function. Given a support vector of a certain class, we first search for the cross points between the SVM decision hyper-curve and each line extended from the support vector along an axis. A hyper-rectangular rule is derived from these cross points. The hyper-rectangle is then adjusted in a tuning phase to exclude out-class data points. Finally, redundant rules are merged to produce a compact rule set. At the same time, important attributes can be highlighted in the extracted rules. The rules extracted by our proposed method follow the SVM classifier's decisions very well. We compare rule extraction results from SVMs with an RBF kernel function and with a linear kernel function. Experimental results show that rules extracted from the SVM with the nonlinear RBF kernel are more accurate than rules extracted from the SVM with the linear kernel. We also compare our method with other rule extraction methods on several benchmark data sets. Our method obtains higher rule accuracy with fewer premises in each rule.
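The first step of the procedure, finding cross points along each axis and forming a hyper-rectangle from them, might look roughly like the sketch below. It is not the authors' implementation: the bisection search, the toy circular decision function, and the names `find_crossing` and `hyper_rectangle` are all assumptions made for illustration, and the tuning and merging phases are omitted.

```python
def find_crossing(decision, point, axis, bound, tol=1e-6):
    """Locate where `decision` changes sign between `point` and `bound`.

    Moves only along one axis. Assumes decision(point) > 0 and at most
    one sign change on the segment; returns `bound` if there is none.
    """
    far = list(point)
    far[axis] = bound
    if decision(far) > 0:
        return bound  # still inside the class at the edge of the domain
    inside, outside = point[axis], bound
    while abs(outside - inside) > tol:
        mid = (inside + outside) / 2
        probe = list(point)
        probe[axis] = mid
        if decision(probe) > 0:
            inside = mid
        else:
            outside = mid
    return (inside + outside) / 2


def hyper_rectangle(decision, support_vector, bounds):
    """Return one (low, high) interval per attribute from the cross points."""
    return [
        (find_crossing(decision, support_vector, axis, low),
         find_crossing(decision, support_vector, axis, high))
        for axis, (low, high) in enumerate(bounds)
    ]


def toy_decision(x):
    """Hypothetical decision function: positive inside a circle of radius 2."""
    return 1.0 - (x[0] ** 2 + x[1] ** 2) / 4.0


rule = hyper_rectangle(toy_decision, [0.5, 0.0], [(-5.0, 5.0), (-5.0, 5.0)])
for axis, (low, high) in enumerate(rule):
    print(f"{low:.3f} <= x{axis} <= {high:.3f}")
```

In this toy case the corners of the resulting rectangle fall outside the circle, which illustrates why a tuning phase that excludes out-class points is needed afterwards.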


2020 ◽  
Vol 10 (3) ◽  
pp. 1125 ◽  
Author(s):  
Kai-Xu Han ◽  
Wei Chien ◽  
Chien-Ching Chiu ◽  
Yu-Ting Cheng

At present, mainstream sentiment analysis methods, represented by the Support Vector Machine, do not adequately consider the vocabulary and latent semantic information in the text, and they depend too heavily on the statistics of sentiment words. This paper therefore proposes a Fisher kernel function based on Probabilistic Latent Semantic Analysis for sentiment analysis with a Support Vector Machine. The Fisher kernel function is derived from the Probabilistic Latent Semantic Analysis model. With this method, latent semantic information carrying probabilistic characteristics can be used as classification features, which improves the classification performance of the support vector machine and addresses the problem of ignoring latent semantic characteristics in text sentiment analysis. The results show that the proposed method clearly outperforms the comparison method.


2021 ◽  
Vol 13 (2) ◽  
pp. 168-174
Author(s):  
Rifqatul Mukarramah ◽  
Dedy Atmajaya ◽  
Lutfi Budi Ilmawan

Sentiment analysis is a technique for extracting information about a person’s perception, called sentiment, of an issue or event. This study uses sentiment analysis to classify society’s response to the COVID-19 virus, as posted on Twitter, into 4 classes: happy, sad, angry, and scared. The classification technique is the support vector machine (SVM) method, and the study compares the classification performance of 2 kernel functions, linear and polynomial. The data consist of 400 tweets, 100 for each sentiment class. Under k-fold cross-validation, the accuracy of the linear kernel function is 0.28 with unigram features and 0.36 with trigram features. These figures are lower than the accuracy of the polynomial kernel, which is 0.34 and 0.48 for unigram and trigram features, respectively. In addition, testing with a confusion matrix suggests that the highest performance is obtained with the polynomial kernel, with an accuracy of 0.51, precision of 0.43, recall of 0.45, and f-measure of 0.51.
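For readers unfamiliar with how accuracy, precision, recall, and f-measure are read off a multi-class confusion matrix, a minimal sketch follows. The abstract does not say how the per-class scores were averaged, so macro-averaging (an unweighted mean over classes) is an assumption here, and the matrix values are invented for illustration and are not the study's data.

```python
def metrics_from_confusion(cm):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] counts items whose true class is i and predicted class is j.
    Macro-averaging is an assumption; other averaging schemes exist.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = cm[k][k] / predicted_k if predicted_k else 0.0
        r = cm[k][k] / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical 4-class matrix (happy, sad, angry, scared); not real results.
example_cm = [
    [6, 2, 1, 1],
    [2, 5, 2, 1],
    [1, 2, 4, 3],
    [1, 1, 3, 5],
]
accuracy, precision, recall, f_measure = metrics_from_confusion(example_cm)
print(accuracy, precision, recall, f_measure)
```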


2019 ◽  
Vol 5 (2) ◽  
pp. 90-99
Author(s):  
Putroue Keumala Intan

The maternal mortality rate during childbirth can be reduced through the medical team's efforts to determine the childbirth process that must be undertaken immediately. Machine learning classification of childbirth can help the medical team make that determination. One classification method that can be used is the Support Vector Machine (SVM), which determines a hyperplane that forms a good decision boundary so that data are classified appropriately. In SVM, a kernel function handles non-linear classification cases by transforming the data to a higher dimension. In this study, four kernel functions (Linear, Radial Basis Function (RBF), Polynomial, and Sigmoid) are used to classify childbirth in order to determine which kernel function produces the highest accuracy. The results show that the SVM with the linear kernel function achieves higher accuracy than the other kernel functions.
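The four kernel functions named in this abstract are commonly written in the forms sketched below. The parameter names `gamma`, `coef0`, and `degree` and their default values are assumptions chosen for illustration; the abstract does not report the settings used in the study.

```python
import math


def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """K(x, z) = <x, z>"""
    return dot(x, z)


def polynomial_kernel(x, z, gamma=1.0, coef0=1.0, degree=3):
    """K(x, z) = (gamma * <x, z> + coef0) ** degree"""
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """K(x, z) = exp(-gamma * ||x - z||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    """K(x, z) = tanh(gamma * <x, z> + coef0)"""
    return math.tanh(gamma * dot(x, z) + coef0)


x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z))      # 11.0
print(polynomial_kernel(x, z))  # 1728.0
```

Each function returns a similarity score between two samples; the SVM uses these scores in place of an explicit mapping to the higher-dimensional space.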

