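Penerapan Algoritma Support Vector Machine (SVM) dengan TF-IDF N-Gram untuk Text Classification [Application of the Support Vector Machine (SVM) Algorithm with TF-IDF N-Gram for Text Classification]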

Sentiment analysis aims to capture the overall opinion expressed in content that carries a subjective view, such as online reviews, comments on blog posts, and film ratings. These reviews and blogs can be classified into polarity groups, such as negative, positive, and neutral, in order to extract information from the input dataset. Supervised machine learning methods are used to classify these reviews. In this paper, three machine learning algorithms, Support Vector Machine (SVM), Maximum Entropy (ME), and Naive Bayes (NB), are considered for classifying human opinions. The accuracy of the methods is critically examined in order to assess their performance on the basis of parameters such as accuracy, recall, F-measure, and precision.
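The evaluation parameters named in this abstract can be sketched in a few lines. The example below is only an illustration: the function name `classification_metrics`, the choice of "positive" as the target class, and the label lists are hypothetical and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive="positive"):
    """Return accuracy over all classes, plus precision, recall,
    and F-measure for one target class."""
    tp = fp = fn = 0
    correct = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            correct += 1
        if pred == positive and truth == positive:
            tp += 1  # target class predicted correctly
        elif pred == positive:
            fp += 1  # target class predicted, but the truth differs
        elif truth == positive:
            fn += 1  # target class missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = correct / len(y_true) if y_true else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical gold labels and classifier outputs, for illustration only.
y_true = ["positive", "negative", "positive", "neutral",
          "positive", "negative", "negative", "neutral"]
y_pred = ["positive", "positive", "positive", "neutral",
          "negative", "negative", "negative", "negative"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision, recall, and F-measure are computed per class here; reporting them for a three-class problem would normally involve repeating the calculation for each class and averaging.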

Artificial bee colony algorithm for feature selection and improved support vector machine for text classification

Information Discovery and Delivery ◽ 10.1108/idd-09-2018-0045 ◽ 2019 ◽ Vol 47 (3) ◽ pp. 154-170

Author(s): Janani Balakumar ◽ S. Vijayarani Mohan

Keyword(s): Support Vector Machine ◽ Feature Selection ◽ Text Classification ◽ Support Vector ◽ Data Sets ◽ Selection Algorithm ◽ Data Set ◽ Content Type ◽ Benchmark Data ◽ Bee Colony

Purpose: Owing to the huge volume of documents available on the internet, text classification has become a necessary task for handling them. To achieve good classification results, feature selection, an important stage, is used to reduce the dimensionality of text documents by choosing suitable features. The main purpose of this research is to classify documents stored on a personal computer based on their content.

Design/methodology/approach: This paper proposes a new feature selection algorithm based on the artificial bee colony (ABCFS) to improve text classification accuracy. ABCFS is evaluated on real and benchmark data sets and compared with existing feature selection approaches such as information gain and the χ2 statistic. To assess the proposed algorithm, a support vector machine (SVM) and an improved SVM classifier are used.

Findings: The experiment was conducted on real and benchmark data sets. The real data set consisted of documents stored on a personal computer, and the benchmark data sets were taken from the Reuters and 20 Newsgroups corpora. The results show that the proposed feature selection algorithm improves text document classification accuracy.

Originality/value: This paper proposes a new ABCFS algorithm for feature selection, evaluates its efficiency, and improves the support vector machine. Here the ABCFS algorithm is used to select features from text (unstructured) documents, whereas in existing work the artificial bee colony approach has been used to select features from structured data rather than text. The proposed algorithm classifies documents automatically based on their content.
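For context, the χ2 statistic that the abstract names as an existing feature selection approach can be sketched as follows. This is not the ABCFS algorithm, whose details the abstract does not give; the function names `chi_square` and `select_features` and the four-document corpus are hypothetical and serve only as an illustration.

```python
def chi_square(docs, labels, term, target):
    """Chi-square score of `term` for class `target`, computed from the
    2x2 contingency table of term presence against class membership."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, document in class
        elif has_term:
            b += 1  # term present, document in another class
        elif in_class:
            c += 1  # term absent, document in class
        else:
            d += 1  # term absent, document in another class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def select_features(docs, labels, target, k):
    """Keep the k terms with the highest chi-square score (ties broken
    alphabetically so the result is deterministic)."""
    vocab = sorted({term for tokens in docs for term in tokens})
    scored = [(chi_square(docs, labels, term, target), term) for term in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical tokenised documents, for illustration only.
docs = [
    {"stock", "market", "rises"},
    {"market", "falls", "today"},
    {"team", "wins", "match"},
    {"team", "loses", "match", "today"},
]
labels = ["finance", "finance", "sports", "sports"]
top_terms = select_features(docs, labels, target="finance", k=3)
print(top_terms)
```

A term such as "today", which appears equally often in both classes, scores zero and is discarded, which is how this kind of filter reduces dimensionality before a classifier such as an SVM is trained.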

A Hybrid Text Classification Method Based on K-Congener-Nearest-Neighbors and Hypersphere Support Vector Machine

2013 International Conference on Information Technology and Applications ◽ 10.1109/ita.2013.120 ◽ 2013 ◽ Cited By ~ 2

Author(s): Y.H. Chen ◽ Y.F. Zheng ◽ J.F. Pan ◽ N. Yang

Keyword(s): Support Vector Machine ◽ Text Classification ◽ Nearest Neighbors ◽ Classification Method ◽ Support Vector

Improving the Accuracy of Text Classification using Stemming Method, A Case of Non-formal Indonesian Conversation

10.21203/rs.3.rs-41431/v2 ◽ 2020

Author(s): Rianto Rianto ◽ Achmad Benny Mutiara ◽ Eri Prasetyo Wibowo ◽ Paulus Insap Santosa

Keyword(s): Support Vector Machine ◽ Information Retrieval ◽ Text Classification ◽ Experimental Evaluation ◽ Hate Speech ◽ Text Processing ◽ High Accuracy ◽ Support Vector ◽ Support Vector Machine Algorithm ◽ Text Data

Abstract: Stemming, which reduces affixed words to their root words, has long been used in data pre-processing for information retrieval. However, there are few stemming methods for non-formal Indonesian text processing. The existing stemming method has high accuracy for formal Indonesian but low accuracy for non-formal Indonesian, so a stemming method that yields high accuracy in non-formal Indonesian classifier models remains an open challenge. This study introduces a new stemming method for pre-processing non-formal Indonesian text data. It also aims to provide comprehensive research on improving the accuracy of text classifier models by strengthening the stemming step. A text classifier model is developed using the Support Vector Machine algorithm, and its accuracy is checked. The experimental evaluation was done by testing 550 datasets in Indonesian using two different stemming methods. The results show that the text classifier model achieves an accuracy of 0.85 with the proposed stemming method, compared with 0.73 for the existing method. In the future, the proposed stemming method can be used to develop Indonesian text classifier models for various purposes, including text clustering, summarization, hate speech detection, and other text processing applications.

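Studi Komparatif Metode Ekstraksi Fitur pada Analisis Sentimen Maskapai Penerbangan Menggunakan Support Vector Machine dan Maximum Entropy [Comparative Study of Feature Extraction Methods in Airline Sentiment Analysis Using Support Vector Machine and Maximum Entropy]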

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽ 10.29207/resti.v3i3.1159 ◽ 2019 ◽ Vol 3 (3) ◽ pp. 402-407 ◽ Cited By ~ 1

Author(s): Mona Cindo ◽ Dian Palupi Rini ◽ Ermatita

Keyword(s): Social Media ◽ Support Vector Machine ◽ Sentiment Analysis ◽ Maximum Entropy ◽ Entropy Method ◽ Support Vector ◽ Machine Method ◽ N Gram ◽ Almost All

Almost all companies use social media to improve their product services and provide after-sales services that allow customers to review the quality of their products. Twitter has become an important source for tracking sentiment. Sentiment analysis is one of the most popular research topics today; with it, companies can analyze customer satisfaction to improve their services. This study aims to analyze airline sentiment with five different feature types (pragmatic, lexical n-gram, POS, sentiment, and LDA) using the Support Vector Machine and Maximum Entropy methods. The best result was obtained with the Maximum Entropy method using all extracted features, with an accuracy of 92.7%, while the Support Vector Machine method achieved an accuracy of 89.2%.
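As an illustration of the lexical n-gram feature type mentioned above, the sketch below extracts word unigrams and bigrams and weights them with TF-IDF. It is a minimal example under stated assumptions: whitespace tokenisation, relative term frequency, and idf = log(N / df) are choices made here and are not described in the abstract; the function names and the two sample tweets are hypothetical.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Lowercase, split on whitespace, and return all word n-grams
    for n_min <= n <= n_max (shorter n-grams first)."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(corpus, n_min=1, n_max=2):
    """Return one {n-gram: weight} dict per document, where
    weight = relative term frequency * log(N / document frequency)."""
    counts_per_doc = [Counter(word_ngrams(doc, n_min, n_max)) for doc in corpus]
    # Document frequency: number of documents containing each n-gram.
    doc_freq = Counter(gram for counts in counts_per_doc for gram in counts)
    n_docs = len(corpus)
    weights = []
    for counts in counts_per_doc:
        total = sum(counts.values())
        weights.append({
            gram: (count / total) * math.log(n_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return weights


# Hypothetical airline tweets, for illustration only.
tweets = ["flight was delayed again", "great flight and great crew"]
features = tfidf(tweets)
print(features[0])
```

With this idf variant, an n-gram that occurs in every document (here "flight") receives a weight of zero, so the vectors passed to a classifier emphasise the n-grams that distinguish one document from another.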
