Supervised term weighting for sentiment analysis

With advances in information and communication technology, social networking and microblogging sites have become a vital source of information. On these platforms, individuals express their opinions, grievances, feelings, and attitudes about a variety of topics, including current events and products. Sentiment analysis is a significant research area in natural language processing that aims to determine the orientation of the sentiment contained in source materials. Twitter is one of the most popular microblogging sites on the internet, with millions of users publishing over one hundred million text messages (referred to as tweets) daily. Choosing an appropriate term representation scheme for such short text messages is critical, and term weighting schemes are a key part of representing text documents in the vector space model. In this paper, we present a comprehensive analysis of Turkish sentiment analysis using nine supervised and unsupervised term weighting schemes. The predictive performance of the term weighting schemes is investigated using four supervised learning algorithms (Naive Bayes, support vector machines, the k-nearest neighbor algorithm, and logistic regression) and three ensemble learning methods (AdaBoost, Bagging, and Random Subspace). The empirical evidence suggests that supervised term weighting models can outperform unsupervised term weighting models.
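
The difference between unsupervised and supervised term weighting can be sketched in a few lines. The abstract does not list the nine schemes it evaluates, so the two weights below are illustrative assumptions only: a plain inverse document frequency (which ignores class labels) and a smoothed log ratio of class-wise document frequencies (in the spirit of delta-IDF style schemes). The function names, the smoothing constants, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Unsupervised weight: log(N / df(t)). Class labels are never consulted."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def class_ratio_weights(docs, labels):
    """Supervised weight: smoothed log2 ratio of a term's document frequency
    in positive (label 1) versus negative (label 0) documents.
    The 0.5 / 1.0 smoothing constants are an assumption for this sketch."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    df_pos, df_neg = Counter(), Counter()
    for doc, y in zip(docs, labels):
        (df_pos if y == 1 else df_neg).update(set(doc))
    weights = {}
    for term in set(df_pos) | set(df_neg):
        p_pos = (df_pos[term] + 0.5) / (n_pos + 1.0)
        p_neg = (df_neg[term] + 0.5) / (n_neg + 1.0)
        weights[term] = math.log2(p_pos / p_neg)
    return weights


# Toy corpus (hypothetical): two positive and two negative tokenized texts.
docs = [["good", "movie"], ["great", "movie"], ["bad", "movie"], ["awful", "movie"]]
labels = [1, 1, 0, 0]

unsupervised = idf_weights(docs)
supervised = class_ratio_weights(docs, labels)
for term in sorted(unsupervised):
    print(f"{term:6s} idf={unsupervised[term]:.3f} class_ratio={supervised[term]:+.3f}")
```

In this toy corpus, "movie" appears in every document and receives zero weight under both schemes. "good" and "bad" receive the same IDF, but the supervised weight gives them opposite signs, which is the kind of class information an unsupervised scheme cannot encode.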

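Aspect Category Classification dengan Pendekatan Machine Learning Menggunakan Dataset Bahasa Indonesia [Aspect Category Classification with a Machine Learning Approach Using an Indonesian-Language Dataset]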

Jurnal Nasional Teknik Elektro dan Teknologi Informasi (JNTETI) ◽

10.22146/jnteti.v10i3.1819 ◽

2021 ◽

Vol 10 (3) ◽

pp. 229-235

Author(s):

Syaifulloh Amien Pandega Perdana ◽

Teguh Bharata Aji ◽

Ridi Ferdiana

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Sentiment Analysis ◽

Support Vector ◽

Term Weighting ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency ◽

Bahasa Indonesia

Customer reviews are opinions about the quality of goods or services as experienced by consumers. They contain information that is useful to both consumers and providers of goods or services. The large volume of customer reviews available on websites calls for a framework that extracts sentiment automatically. A single customer review often covers many aspects, so Aspect Based Sentiment Analysis (ABSA) must be used to determine the polarity of each aspect. One important task in ABSA is Aspect Category Detection. Machine learning methods for Aspect Category Detection have been applied widely in English-language domains, but only rarely in Indonesian-language domains. This paper compares the performance of three machine learning algorithms, namely Naïve Bayes (NB), Support Vector Machine (SVM), and Random Forest (RF), on Indonesian-language customer reviews, using Term Frequency–Inverse Document Frequency (TF-IDF) as the term weighting. The results show that RF outperforms NB and SVM in three different domains, namely restaurants, hotels, and e-commerce, with f1-scores of 84.3%, 85.7%, and 89.3%, respectively.
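
For reference, one common textbook form of the TF-IDF weight is given below. The abstract does not say which variant (smoothing, log-scaled term frequency, normalization) the authors used, so the exact form and the symbols are assumptions: tf_{t,d} is the count of term t in review d, N is the number of reviews, and df_t is the number of reviews containing t.

```latex
w_{t,d} = \mathrm{tf}_{t,d} \cdot \log\frac{N}{\mathrm{df}_t}
```

Under this form, a term that occurs in every review gets weight zero, while a term concentrated in few reviews is weighted up.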

Improving of Imbalanced Data in Multiclass Classification for Sentiment Analysis using Supervised Term Weighting

10.1109/ri2c51727.2021.9559797 ◽

2021 ◽

Author(s):

Jantima Polpinij ◽

Khanista Namee

Keyword(s):

Sentiment Analysis ◽

Imbalanced Data ◽

Multiclass Classification ◽

Term Weighting

Normalization of Term Weighting Scheme for Sentiment Analysis

Human Language Technology Challenges for Computer Science and Linguistics - Lecture Notes in Computer Science ◽

10.1007/978-3-319-14120-6_10 ◽

2014 ◽

pp. 116-128 ◽

Cited By ~ 1

Author(s):

Alexander Pak ◽

Patrick Paroubek ◽

Amel Fraisse ◽

Gil Francopoulo

Keyword(s):

Sentiment Analysis ◽

Weighting Scheme ◽

Term Weighting

Performance Analysis of Multiple Classifiers using different Term Weighting Schemes for Sentiment Analysis

2019 International Conference on Intelligent Computing and Control Systems (ICCS) ◽

10.1109/iccs45141.2019.9065895 ◽

2019 ◽

Author(s):

Aiman Abdullah Anees ◽

Harsh Prakash Gupta ◽

Aditya Prashant Dalvi ◽

Suhas Gopinath ◽

Biju R Mohan

Keyword(s):

Performance Analysis ◽

Sentiment Analysis ◽

Term Weighting ◽

Weighting Schemes ◽

Multiple Classifiers

Credibility Adjusted Term Frequency: A Supervised Term Weighting Scheme for Sentiment Analysis and Text Classification

10.3115/v1/w14-2614 ◽

2014 ◽

Cited By ~ 8

Author(s):

Yoon Kim ◽

Owen Zhang

Keyword(s):

Sentiment Analysis ◽

Text Classification ◽

Weighting Scheme ◽

Term Weighting ◽

Term Frequency

A Comparison of Term Weighting Schemes for Text Classification and Sentiment Analysis with a Supervised Variant of tf.idf

Communications in Computer and Information Science - Data Management Technologies and Applications ◽

10.1007/978-3-319-30162-4_4 ◽

2016 ◽

pp. 39-58 ◽

Cited By ~ 5

Author(s):

Giacomo Domeniconi ◽

Gianluca Moro ◽

Roberto Pasolini ◽

Claudio Sartori

Keyword(s):

Sentiment Analysis ◽

Text Classification ◽

Term Weighting ◽

Weighting Schemes

Exploring performance of clustering methods on document sentiment analysis

Journal of Information Science ◽

10.1177/0165551515617374 ◽

2016 ◽

Vol 43 (1) ◽

pp. 54-74 ◽

Cited By ~ 14

Author(s):

Baojun Ma ◽

Hua Yuan ◽

Ye Wu

Keyword(s):

Sentiment Analysis ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Experimental Studies ◽

Experimental Results ◽

Clustering Methods ◽

Term Weighting ◽

Weighting Method ◽

Clustering Techniques

Clustering is a powerful unsupervised tool for sentiment analysis of text. However, the clustering results may be affected by any step of the clustering process, such as the data pre-processing strategy, the term weighting method in the Vector Space Model, and the clustering algorithm. This paper presents the results of an experimental study of some common clustering techniques applied to the task of sentiment analysis. Unlike previous studies, we investigate the combined effects of these factors through a series of comprehensive experiments. The experimental results indicate that, first, K-means-type clustering algorithms show clear advantages on balanced review datasets but perform rather poorly on unbalanced datasets in terms of clustering accuracy. Second, the more recently designed weighting models are better than the traditional weighting models for sentiment clustering on both balanced and unbalanced datasets. Furthermore, extracting adjectives and adverbs offers clear improvements in clustering performance, while stemming and stopword removal have a negative effect on sentiment clustering. The experimental results should be valuable for both the study and the use of clustering methods in online review sentiment analysis.

The impact of term weighting method on Twitter sentiment analysis

Pamukkale University Journal of Engineering Sciences ◽

10.5505/pajes.2016.50480 ◽

2018 ◽

Vol 24 (2) ◽

pp. 283-291

Author(s):

Önder Çoban ◽

Gülşah Tümüklü Özyer

Keyword(s):

Sentiment Analysis ◽

Term Weighting ◽

Weighting Method
