Optimized Swarm Search-Based Feature Selection for Text Mining in Sentiment Analysis

At present, a huge quantity of data is recorded in a variety of forms such as text, image, video, and audio, and this volume is expected to grow. The major text-related tasks, such as entity extraction, information extraction, entity relation modeling, and document summarization, are performed using text mining. This paper focuses on document clustering, a subtask of text mining, and on measuring the performance of different clustering techniques. We use an enhanced feature selection method for clustering text documents and aim to show that it produces better results than traditional feature selection.

Comparison of Naïve Bayes Algorithm with Genetic Algorithm and Particle Swarm Optimization as Feature Selection for Sentiment Analysis Review of Digital Learning Application

Journal of Physics Conference Series ◽

10.1088/1742-6596/1641/1/012040 ◽

2020 ◽

Vol 1641 ◽

pp. 012040

Author(s):

Siti Ernawati ◽

Risa Wati ◽

Nuzuliarini Nuris ◽

Lita Sari Marita ◽

Eka Rini Yulia

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Particle Swarm Optimization ◽

Sentiment Analysis ◽

Naive Bayes ◽

Particle Swarm ◽

Digital Learning ◽

Swarm Optimization ◽

Selection For ◽

Bayes Algorithm

Comparison of Naive Bayes Algorithm and Support Vector Machine using PSO Feature Selection for Sentiment Analysis on E-Wallet Review

Journal of Physics Conference Series ◽

10.1088/1742-6596/1641/1/012085 ◽

2020 ◽

Vol 1641 ◽

pp. 012085

Author(s):

Dwi Andini Putri ◽

Dinar Ajeng Kristiyanti ◽

Elly Indrayuni ◽

Acmad Nurhadi ◽

Denda Rinaldi Hadinata

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Sentiment Analysis ◽

Naive Bayes ◽

Support Vector ◽

Selection For ◽

Bayes Algorithm

An approach to feature selection for sentiment analysis

2011 15th IEEE International Conference on Intelligent Engineering Systems ◽

10.1109/ines.2011.5954773 ◽

2011 ◽

Cited By ~ 20

Author(s):

Peter Koncz ◽

Jan Paralic

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Selection For

Feature Selection for Highly Skewed Sentiment Analysis Tasks

10.3115/v1/w14-5902 ◽

2014 ◽

Cited By ~ 3

Author(s):

Can Liu ◽

Sandra Kübler ◽

Ning Yu

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Selection For

Performance Analysis of Feature Selection for Twitter Sentiment Analysis: Classification approach

IJARCCE ◽

10.17148/ijarcce.2018.7815 ◽

2018 ◽

Vol 7 (8) ◽

pp. 67-74

Author(s):

Shweta V. Raut ◽

Madhu M. Nashipudimath

Keyword(s):

Feature Selection ◽

Performance Analysis ◽

Sentiment Analysis ◽

Classification Approach ◽

Selection For

Improve the Accuracy of Support Vector Machine Using Chi Square Statistic and Term Frequency Inverse Document Frequency on Movie Review Sentiment Analysis

Scientific Journal of Informatics ◽

10.15294/sji.v6i1.14244 ◽

2019 ◽

Vol 6 (1) ◽

pp. 138-149

Author(s):

Ukhti Ikhsani Larasati ◽

Much Aziz Muslim ◽

Riza Arifudin ◽

Alamsyah Alamsyah

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Text Mining ◽

Sentiment Analysis ◽

Feature Weighting ◽

Support Vector ◽

Chi Square ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency

Text mining techniques can be used for data processing. Processing large volumes of text requires a machine that can explore opinions, both positive and negative. Sentiment analysis applies text mining methods to determine whether the content of a text dataset is positive or negative. Support vector machine (SVM) is one of the classification algorithms that can be used for sentiment analysis. However, SVM works less well on large datasets. In addition, one constraint in the text mining process is the number of attributes used: a large number of attributes reduces classifier performance and leads to low accuracy. The purpose of this research is to increase SVM accuracy by applying feature selection and feature weighting. Feature selection removes a large number of irrelevant attributes. In this study, the top K = 500 features are selected. Feature weighting is then applied to the selected attributes to calculate the weight of each one. The feature selection method is the chi-square statistic, and feature weighting uses Term Frequency Inverse Document Frequency (TFIDF). In experiments run in Matlab R2017b with 10-fold cross-validation, integrating SVM with the chi-square statistic and TFIDF raised accuracy by 11.5 percentage points: SVM without the chi-square statistic and TFIDF reached 68.7% accuracy, while SVM with them reached 80.2%.
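The select-then-weight pipeline described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' Matlab implementation: the function names, the four-document toy corpus, and the choice of k = 4 (instead of K = 500) are assumptions made purely for illustration, and the SVM training and cross-validation steps are omitted. The sketch scores each term with the chi-square statistic of its 2x2 term/class contingency table, keeps the top k terms, and then weights the kept terms with a plain TF x log(N/DF) scheme.

```python
import math
from collections import Counter

def chi_square_scores(docs, labels):
    """Chi-square statistic of each term against a binary label (1 = positive)."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        for term in set(tokens):  # document frequency, not raw counts
            (df_pos if y == 1 else df_neg)[term] += 1
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive docs containing the term
        b = df_neg[term]   # negative docs containing the term
        c = n_pos - a      # positive docs without the term
        d = n_neg - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]

def tfidf_vectors(docs, vocab):
    """Weight the selected terms with TF * log(N / DF)."""
    n = len(docs)
    df = Counter(term for tokens in docs for term in set(tokens))
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append([tf[t] * math.log(n / df[t]) if df[t] else 0.0
                        for t in vocab])
    return vectors

# Hypothetical toy corpus (invented for illustration): 1 = positive, 0 = negative.
docs = [
    ["great", "movie", "loved", "it"],
    ["great", "acting", "loved", "plot"],
    ["terrible", "movie", "hated", "it"],
    ["terrible", "plot", "hated", "acting"],
]
labels = [1, 1, 0, 0]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, k=4)       # the study uses K = 500
vectors = tfidf_vectors(docs, selected)    # rows would feed a classifier such as an SVM
```

On this toy corpus, terms that occur equally often in both classes (such as "movie") receive a chi-square score of zero and are dropped, while class-specific terms (such as "great" or "terrible") are kept and then weighted.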
