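Penggunaan Feature Selection di Algoritma Support Vector Machine untuk Sentimen Analisis Komisi Pemilihan Umum (Using Feature Selection in the Support Vector Machine Algorithm for Sentiment Analysis of the General Election Commission)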

2019 ◽  
Vol 3 (3) ◽  
pp. 364-370
Imam Santoso ◽  
Windu Gata ◽  
Atik Budi Paryanti

Sentiment analysis is now widely used to gauge public sentiment toward an object. The object can be of many kinds, for example a product and its reception by consumers, or an agency or institution and its performance. This study analyzes sentiment toward a state institution, the General Election Commission (KPU), concerning the conduct of the simultaneous elections and their results, both of which have been discussed by netizens on social media. The research uses retweet data and comments from Twitter users. The algorithm used is Support Vector Machine (SVM), optimized with Weight by Correlation Feature Selection (FS). Under cross-validation, SVM without FS achieves 66.49% accuracy and an AUC of 0.716, whereas SVM with FS achieves 81.18% accuracy and an AUC of 0.943. Weight by Correlation Feature Selection thus raises accuracy by 14.69 percentage points and AUC by 0.227.
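The abstract does not describe how the correlation weights are computed. As a minimal sketch, assuming each feature is weighted by the absolute Pearson correlation between its column and a binary sentiment label, and that the top-weighted features are kept, the idea can be illustrated as follows. The toy matrix, the function names, and the choice of `k` are all hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def correlation_weights(rows, labels):
    """Weight each feature column by |correlation with the label|."""
    n_features = len(rows[0])
    return [abs(pearson([row[j] for row in rows], labels)) for j in range(n_features)]

def select_top_k(weights, k):
    """Indices of the k highest-weighted features, returned in column order."""
    ranked = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return sorted(ranked[:k])

# Toy term-count matrix (hypothetical): one row per tweet, one column per term.
rows = [
    [2, 0, 1],
    [1, 0, 1],
    [0, 2, 1],
    [0, 1, 1],
]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

weights = correlation_weights(rows, labels)
selected = select_top_k(weights, 2)
print(weights, selected)
```

The third column is constant, so it carries no information about the label and receives weight 0; only the first two columns would be passed on to the classifier.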

2020 ◽  
Vol 8 (2) ◽  
pp. 91-100
Muhamad Azhar ◽  
Noor Hafidz ◽  
Biktra Rudianto ◽  
Windu Gata

Abstract: The adoption of technology in online marketplaces has drawn researchers to analyze customer reviews. The Klik Indomaret application page on Google Play is one source from which review data can be collected. However, extracting consumer opinion from reviews is not easy and requires a method for grouping reviews into categories, i.e. positive or negative. Sentiment analysis studies of application reviews on Google Play are still rare. This paper therefore analyzes customer sentiment toward the Klik Indomaret app using the Naive Bayes (NB) classifier, compares it with Support Vector Machine (SVM), and optimizes Feature Selection (FS) using Particle Swarm Optimization. Without FS, NB reached 69.74% accuracy and an Area Under Curve (AUC) of 0.518, while SVM reached 81.21% accuracy and an AUC of 0.896. With FS under cross-validation, NB reached 75.21% accuracy and an AUC of 0.598, and SVM reached 81.84% accuracy and an AUC of 0.898. Particle Swarm Optimization feature selection therefore improved both models, and SVM scored higher than NB on the dataset used in this study. Keywords: Naive Bayes, Particle Swarm Optimization, Support Vector Machine, Feature Selection, Consumer Review.
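Accuracy and AUC measure different things, which is why a model can report roughly 70% accuracy alongside an AUC close to 0.5. The sketch below computes both on made-up scores; it assumes the pairwise-ranking definition of AUC and a 0.5 decision threshold, neither of which is stated in the abstract.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical classifier scores for six reviews (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

model_auc = auc(scores, labels)
model_accuracy = accuracy(predictions, labels)
print(model_auc, model_accuracy)
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank positives against negatives.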

2020 ◽  
Vol 4 (3) ◽  
pp. 504-512
Faried Zamachsari ◽  
Gabriel Vangeran Saragih ◽  
Susafa'ati ◽  
Windu Gata

The decision to move Indonesia's capital city to East Kalimantan received mixed responses on social media. A still-high poverty rate and the country's strained finances are among the reasons for disapproval of the relocation. Twitter, one of the popular social media platforms, is used by the public to express these opinions. This study aims to determine the tendency of public responses to the relocation of the national capital and to carry out sentiment analysis of that public opinion using feature selection with the Naive Bayes and Support Vector Machine algorithms to obtain the highest accuracy. The data are Indonesian-language tweets crawled from Twitter using the search terms #IbuKotaBaru and #PindahIbuKota. The research stages consist of data collection from Twitter, polarity labeling, and preprocessing, which comprises case transformation, cleansing, tokenizing, filtering, and stemming. Feature selection is then applied to increase accuracy, and the data are split into training and testing sets at predetermined ratios. Finally, Support Vector Machine and Naive Bayes are compared to determine which is more accurate. In the period studied, 24.26% of sentiment toward the relocation was positive and 75.74% was negative. Using RapidMiner, the best accuracy for Naive Bayes with feature selection was 88.24% at a 9:1 ratio, while the best accuracy for Support Vector Machine with feature selection was 78.77% at a 5:5 ratio.
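The preprocessing stages named above (case transformation, cleansing, tokenizing, filtering, stemming) can be sketched as a chain of small functions. This is an illustration only: the study used RapidMiner, the stopword list here is a tiny hypothetical sample, and `toy_stem` is a placeholder that strips a few Indonesian particles rather than a real stemmer.

```python
import re

# Tiny illustrative stopword list; a real study would use a full Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu"}

def transform_case(text):
    return text.lower()

def cleanse(text):
    # Expects lowercased input (run transform_case first).
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[@#]\w+", " ", text)        # drop mentions and hashtags
    return re.sub(r"[^a-z\s]", " ", text)       # keep letters and whitespace only

def tokenize(text):
    return text.split()

def filter_tokens(tokens):
    # Remove stopwords and single-character tokens.
    return [t for t in tokens if t not in STOPWORDS and len(t) > 1]

def toy_stem(token):
    # Placeholder stemmer (assumption): strips a few common particles/suffixes only.
    for suffix in ("lah", "kah", "nya"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    tokens = filter_tokens(tokenize(cleanse(transform_case(text))))
    return [toy_stem(t) for t in tokens]

# Made-up example tweet.
tokens = preprocess("Pindah ibu kota itu biayanya BESAR! #PindahIbuKota https://t.co/xyz")
print(tokens)
```

Whether hashtags should be dropped or kept as tokens is a design choice; here they are dropped because the search hashtags appear in every collected tweet and so cannot separate the classes.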

Midde Venkateswarlu Naik ◽  
D. Vasumathi ◽  
A.P. Siva Kumar

Aims: This work proposes an evolutionary method for sentiment or emotion classification of unstructured review text in the big data field. Sentiment analysis helps people extract decision points about subjects such as movie ratings, educational institutions, or politics. The proposed hybrid approach combines optimal feature selection using Particle Swarm Optimization (PSO) with sentiment classification through Support Vector Machine (SVM). Its performance is evaluated with statistical measures such as precision, recall, sensitivity, and specificity, and is compared with existing approaches. Earlier authors have reported sentiment classifier accuracy on English text of up to 94%. The proposed scheme reaches an average accuracy of about 99% across different datasets by tuning SVM parameters, such as the constant C and the kernel gamma value, together with PSO.
Background: Sentiment Analysis (SA), or opinion mining, has become an active research domain. A key area is sentiment classification on semi-structured or unstructured data in different languages. User-Generated Content (UGC) from different sources has grown significantly with the rapid growth of the web. The large volume of user-generated data on social media provides substantial value for discovering hidden knowledge, correlations, patterns, and trends, or for extracting sentiment about a specific entity. SA is a computational analysis that determines the opinion about an entity expressed in text; it is also described as the computation of emotional polarity expressed on social media as natural text in various languages. An automatic sentiment classifier model usually depends on its feature selection and classification algorithms.
Methods: The proposed work uses Support Vector Machine as the classification technique and Particle Swarm Optimization for feature selection. Various parameter combinations are tuned, with and without a kernel, to obtain the desired sentiment classification results on three datasets freely hosted for research: airline, global warming, and weather sentiment.
Results: The proposed method achieved an average accuracy of 99.2% in classifying sentiment across the different datasets, outperforming other machine learning techniques. This high accuracy on review text indicates greater effectiveness than existing sentiment classifiers. The models were trained and tested with 10-fold cross-validation (FCV), and classifier accuracy was assessed with a confusion matrix.
Conclusion: Sentiment classifier accuracy was raised with a kernel-based Support Vector Machine (SVM) whose parameters were optimized. The optimal features for classifying sentiment or opinion in review documents were determined with a particle swarm optimization approach. The method used three freely available datasets to produce its results: airline sentiment data, weather sentiment data, and global warming data.
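The evaluation measures named in this abstract (precision, recall, sensitivity, specificity, accuracy) all derive from the confusion matrix, and 10-fold cross-validation only needs a partition of the sample indices. The following is a minimal sketch under simple assumptions: binary labels, contiguous unshuffled folds, and made-up predictions. It does not reproduce the authors' PSO or SVM steps.

```python
def confusion_counts(predictions, labels, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(predictions, labels))
    tp = sum(p == positive and y == positive for p, y in pairs)
    fp = sum(p == positive and y != positive for p, y in pairs)
    fn = sum(p != positive and y == positive for p, y in pairs)
    tn = sum(p != positive and y != positive for p, y in pairs)
    return tp, fp, fn, tn

def metrics(tp, fp, fn, tn):
    # Assumes every denominator is non-zero (both classes occur and are predicted).
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # recall and sensitivity are the same quantity
        "specificity": tn / (tn + fp),
    }

def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# Made-up predictions and labels for eight reviews.
predictions = [1, 1, 0, 0, 1, 0, 1, 0]
labels      = [1, 1, 1, 0, 0, 0, 1, 0]
counts = confusion_counts(predictions, labels)
scores = metrics(*counts)
folds = k_fold_indices(25, k=10)
print(counts, scores, [len(f) for f in folds])
```

In each cross-validation round one fold serves as the test set and the remaining nine as training data; in practice the indices are usually shuffled, and often stratified by class, before splitting.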

2019 ◽  
Vol 11 (2) ◽  
pp. 144
Danar Wido Seno ◽  
Arief Wibowo

The growth of social media content produces many new words and abbreviations on Twitter, which makes it increasingly difficult for sentiment analysis to reach high accuracy on Twitter text. In this study, the authors analyze sentiment toward the pairs of candidates for President and Vice President of Indonesia in the 2019 elections. To obtain higher accuracy and accommodate the evolving vocabulary on Twitter, the authors combine an unsupervised method, namely a lexicon-based method, with a supervised method, Support Vector Machine. The study used 800 tweets collected in October 2018 with search keywords consisting of the names of each candidate pair. The best accuracy was 92.5%, obtained with 80% training data and 20% testing data, with per-class precision between 85.7% and 97.2% and per-class recall between 78.2% and 93.5%. Because the lexicon-based method labels the dataset, labeling the data for the Support Vector Machine is no longer done manually, and the lexicon dictionary can be extended as content on Twitter develops.
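The lexicon-based labeling step can be illustrated with a simple word-count scorer whose output labels would then serve as training targets for a supervised classifier. This is a sketch under assumptions: the word lists, the three-way label set, and the scoring rule are hypothetical and are not taken from the paper.

```python
# Hypothetical miniature lexicon; the paper's dictionary is not given in the abstract.
POSITIVE_WORDS = {"bagus", "hebat", "mantap", "dukung"}
NEGATIVE_WORDS = {"buruk", "gagal", "bohong", "kecewa"}

def lexicon_score(tokens):
    """Count positive words minus negative words."""
    return sum(t in POSITIVE_WORDS for t in tokens) - sum(t in NEGATIVE_WORDS for t in tokens)

def lexicon_label(tokens):
    score = lexicon_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def split_train_test(items, train_percent=80):
    """Split a list into train/test parts by an integer percentage."""
    cut = len(items) * train_percent // 100
    return items[:cut], items[cut:]

# Made-up tokenized tweets.
tweets = [
    ["program", "bagus", "dukung"],
    ["janji", "bohong", "kecewa"],
    ["debat", "malam", "ini"],
    ["kerja", "hebat"],
    ["hasil", "buruk", "tapi", "mantap"],
]
labeled = [(tokens, lexicon_label(tokens)) for tokens in tweets]
train, test = split_train_test(labeled, train_percent=80)
print([label for _, label in labeled], len(train), len(test))
```

Adding a new slang term to one of the word sets changes the labels without any manual re-annotation, which is the maintenance benefit the abstract describes.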

Karteek Ramalinga Ponnuru ◽  
Rashik Gupta ◽  
Shrawan Kumar Trivedi

Firms are turning to social media analytics to learn what people are really saying about them or their products. With the huge amount of buzz created online about anything and everything, social media has become 'the' platform for understanding what the public as a whole says about a particular product, and the process of converting all that talk into valuable information is called sentiment analysis. Sentiment analysis is the process of identifying and categorizing a piece of text as positive or negative so as to understand the sentiment of its users. This chapter takes the reader from basic techniques such as word clouds, commonality clouds, dendrograms, and comparison clouds to advanced algorithms such as K-Nearest Neighbour, Naïve Bayes, and Support Vector Machine.

2020 ◽  
Vol 11 (2) ◽  
pp. 66-81
Badia Klouche ◽  
Sidi Mohamed Benslimane ◽  
Sakina Rim Bennabi

Sentiment analysis is an emerging research area concerned with classifying sentiment polarity and with text mining, driven in particular by the considerable number of opinions available on social media. The Algerian telephone operator Ooredoo, like other operators, includes in its new strategy for winning customers the exploitation of their opinions through sentiment analysis. The purpose of this work is to set up a system called “Ooredoo Rayek”, which collects, transliterates, translates, and classifies the textual data expressed by Ooredoo's customers. The article develops a set of rules for transliterating Algerian Arabizi into Algerian dialect. Furthermore, the authors use Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers to assign polarity tags to Facebook comments from Ooredoo's official pages, written in a multilingual and multi-dialect context. Experimental results show that the system performs well, with 83% accuracy.

2021 ◽  
Vol 4 (1) ◽  
pp. 1-8
Shafira Shalehanny ◽  
Agung Triayudi ◽  
Endah Tri Esti Handayani

Technology keeps evolving with the times. Social media is already part of everyone's daily life and serves as a place to write opinions, whether reviews of or responses to products and services that people have used. Twitter is one of the popular social media platforms in Indonesia; according to Statista, it has reached 17.55 million users. For online businesses, knowing the sentiment score is important for improving their business. Sentiment analysis is the use of machine learning, NLP (Natural Language Processing), and text mining to find the real meaning of the opinion words given by customers. Two methods are used: the first is lexicon-based and the second is Support Vector Machine (SVM). The data for sentiment analysis come from the keywords ‘ShopeeFood’ and ‘syopifud’. The analysis yields an accuracy of 87%, precision of 81%, recall of 75%, and F1-score of 78%.
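The reported F1-score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short check below assumes all three figures refer to the same class or the same averaging scheme, which the abstract does not state.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
f1 = f1_score(0.81, 0.75)
print(round(f1, 2))  # 0.78, consistent with the reported F1-score of 78%
```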

Teknika ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 18-26
Hendry Cipta Husada ◽  
Adi Suryaputra Paramita

Current technological developments have made it easier for many people to obtain and spread information on various social media platforms. Twitter is one of the media frequently used to express opinions as a reaction to something. Opinions on Twitter can be used by airlines as a key parameter for gauging public satisfaction and as evaluation material for the company. A method is therefore needed that can automatically classify opinions into positive, negative, or neutral categories through sentiment analysis. The sentiment analysis process consists of data preprocessing, term weighting using the TF-IDF method, application of the algorithm, and discussion of the classification results. Opinions are classified with a machine learning approach using the multi-class Support Vector Machine (SVM) algorithm. The data used in this study are English-language opinions from Twitter users about airlines. Based on the tests performed, the best classification results were obtained using SVM with the RBF kernel at parameter values C (complexity) = 10 and γ (gamma) = 1, with an accuracy of 84.37%, and 80.41% when using 10-fold cross-validation.
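TF-IDF weighting and the RBF kernel are the two numerical ingredients named in this abstract. The sketch below uses one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency) and the standard RBF form K(x, y) = exp(-γ‖x − y‖²) with γ = 1 as reported; the exact TF-IDF variant used in the study is not stated, and the example tweets are made up.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: list of non-empty token lists. Returns (vocabulary, dense TF-IDF vectors)."""
    vocabulary = sorted({t for doc in documents for t in doc})
    n_docs = len(documents)
    doc_freq = Counter(t for doc in documents for t in set(doc))
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocabulary}
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append([counts[t] / len(doc) * idf[t] for t in vocabulary])
    return vocabulary, vectors

def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Made-up tokenized airline tweets.
documents = [
    ["flight", "delayed", "again"],
    ["great", "flight"],
    ["flight", "cancelled"],
]
vocabulary, vectors = tf_idf(documents)
similarity = rbf_kernel(vectors[0], vectors[1], gamma=1.0)
print(vocabulary, similarity)
```

The word "flight" occurs in every document, so its inverse document frequency is log(1) = 0 and it contributes nothing to any vector. A larger γ makes the kernel more local, so only very similar tweets count as close.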

2018 ◽  
Vol 5 (5) ◽  
pp. 537 ◽  
Oman Somantri ◽  
Dyah Apriliani

Abstract: Customers want decision support when choosing a place to eat or a culinary venue that matches their wishes, for example in Tegal City. Sentiment analysis is used to provide a solution to this problem by applying the Support Vector Machine (SVM) algorithm. The purpose of this research is to optimize the resulting model by applying feature selection using the Information Gain (IG) and Chi Square algorithms to the best model produced by SVM for classifying customer satisfaction with food stalls and restaurants in Tegal City, so that the accuracy of the model increases. The results show that the best accuracy was produced by the SVM-IG model at 72.45%, an increase of about 3.08% from the initial 69.36%. The average difference after optimizing SVM with feature selection is a 2.51% increase in accuracy. Feature selection using Information Gain (SVM-IG) achieves better accuracy than plain SVM and Chi Squared (SVM-CS), so the proposed model improves on the accuracy produced by SVM.
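Information Gain scores a feature by how much knowing its value reduces the entropy of the class labels. The sketch below assumes binary term-presence features and made-up labels; it illustrates only the scoring, not the authors' SVM-IG pipeline.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * math.log2(p)
    return result

def information_gain(feature_values, labels):
    """IG = H(labels) - sum over feature values v of P(v) * H(labels | v)."""
    n = len(labels)
    gain = entropy(labels)
    for value in set(feature_values):
        subset = [y for f, y in zip(feature_values, labels) if f == value]
        gain -= len(subset) / n * entropy(subset)
    return gain

# Made-up data: 1 = satisfied, 0 = not satisfied; features mark whether a term is present.
labels = [1, 1, 0, 0]
term_a = [1, 1, 0, 0]  # perfectly aligned with the label
term_b = [1, 0, 1, 0]  # unrelated to the label

gain_a = information_gain(term_a, labels)
gain_b = information_gain(term_b, labels)
print(gain_a, gain_b)
```

Terms are then ranked by their gain and only the highest-scoring ones are kept as classifier inputs; the cutoff is a tuning choice.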
