Self-Supervised Learning based on Sentiment Analysis with Word Weight Calculation

2021 ◽  
Author(s):  
Dongcheol Son ◽  
Youngjoong Ko
2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Huu-Thanh Duong ◽  
Tram-Anh Nguyen-Thi

Abstract: In the literature, machine learning-based studies of sentiment analysis usually rely on supervised learning, which requires pre-labeled datasets that are large enough for the target domain. Such datasets are tedious, expensive and time-consuming to build, and the resulting models handle unseen data poorly. This paper takes a semi-supervised learning approach to Vietnamese sentiment analysis, for which datasets are limited. We summarize the preprocessing techniques used to clean and normalize the data, together with negation handling and intensification handling, to improve performance. We also present data augmentation techniques, which generate new data from the original data to enrich the training set without user intervention. In the experiments, we evaluate several aspects and obtain competitive results that may motivate future proposals.


Author(s):  
Yuhao Pan ◽  
Zhiqun Chen ◽  
Yoshimi Suzuki ◽  
Fumiyo Fukumoto ◽  
Hiromitsu Nishizaki

2016 ◽  
Vol 49 (1) ◽  
pp. 1-26 ◽  
Author(s):  
Nadia Felix F. Da Silva ◽  
Luiz F. S. Coletta ◽  
Eduardo R. Hruschka

2020 ◽  
pp. 016555152091003
Author(s):  
Gyeong Taek Lee ◽  
Chang Ouk Kim ◽  
Min Song

Sentiment analysis plays an important role in understanding individual opinions expressed in websites such as social media and product review sites. The common approaches to sentiment analysis use the sentiments carried by words that express opinions and are based on either supervised or unsupervised learning techniques. The unsupervised learning approach builds a word-sentiment dictionary, but it requires lengthy time periods and high costs to build a reliable dictionary. The supervised learning approach uses machine learning models to learn the sentiment scores of words; however, training a classifier model requires large amounts of labelled text data to achieve a good performance. In this article, we propose a semisupervised approach that performs well despite having only small amounts of labelled data available for training. The proposed method builds a base sentiment dictionary from a small training dataset using a lasso-based ensemble model with minimal human effort. The scores of words not in the training dataset are estimated using an adaptive instance-based learning model. In a pretrained word2vec model space, the sentiment values of the words in the dictionary are propagated to the words that did not exist in the training dataset. Through two experiments, we demonstrate that the performance of the proposed method is comparable to that of supervised learning models trained on large datasets.
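
The propagation step described in this abstract can be illustrated with a small sketch. It is not the authors' adaptive instance-based learning model: it assumes a plain similarity-weighted nearest-neighbour average, and the two-dimensional vectors, the seed scores and the name `propagate_score` are hypothetical stand-ins for a pretrained word2vec space and a lasso-derived base dictionary.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def propagate_score(word, embeddings, seed_scores, k=2):
    """Estimate a sentiment score for `word` from its k most similar seed words.

    Assumption: a similarity-weighted average over nearest neighbours,
    used here only to illustrate propagation in an embedding space.
    """
    target = embeddings[word]
    neighbours = sorted(
        ((cosine(target, embeddings[w]), score)
         for w, score in seed_scores.items() if w in embeddings),
        reverse=True,
    )[:k]
    # Ignore neighbours that are not positively similar to the target.
    neighbours = [(sim, score) for sim, score in neighbours if sim > 0]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return 0.0
    return sum(sim * score for sim, score in neighbours) / total

# Toy embedding space (hypothetical values, not real word2vec vectors).
EMBEDDINGS = {
    "good": [1.0, 0.1],
    "great": [0.9, 0.2],
    "bad": [-1.0, 0.1],
    "awful": [-0.9, 0.2],
    "superb": [0.95, 0.15],    # not in the seed dictionary
    "dreadful": [-0.95, 0.15],  # not in the seed dictionary
}
# Hypothetical base dictionary built from a small labelled set.
SEED_SCORES = {"good": 1.0, "great": 0.8, "bad": -1.0, "awful": -0.9}

superb_score = propagate_score("superb", EMBEDDINGS, SEED_SCORES)
dreadful_score = propagate_score("dreadful", EMBEDDINGS, SEED_SCORES)
```

With these toy values, `superb` inherits a positive score from `good` and `great`, while `dreadful` inherits a negative one from `bad` and `awful`.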


2012 ◽  
Vol 92 ◽  
pp. 98-115 ◽  
Author(s):  
Jonathan Ortigosa-Hernández ◽  
Juan Diego Rodríguez ◽  
Leandro Alzate ◽  
Manuel Lucania ◽  
Iñaki Inza ◽  
...  

2021 ◽  
Vol 26 (5) ◽  
pp. 501-506
Author(s):  
Anuj Kumar Singh ◽  
Sandeep Kumar ◽  
Shashi Bhushan ◽  
Pramod Kumar ◽  
Arun Vashishtha

When someone looks for a freely available online course, the first and best-known name that comes up is MOOC. In this article, we collect the comments left by users enrolled in specified MOOC courses and apply sentiment analysis to that data. The contribution of the article is an efficient sentiment analysis algorithm with high predictive performance on MOOC courses, developed by combining various supervised learning methods and comparing how supervised machine learning algorithms perform on sentiment analysis of MOOC data. Several research questions on sentiment analysis of MOOC data are addressed. For the evaluation, we investigated a large number of MOOC courses with different supervised learning methods and assessed the results using metrics such as precision, recall and F1 score. From the results, we conclude that when the bigram model was applied to logistic regression, the Multilayer Perceptron (MLP) outperformed the other algorithms, such as SVM and Naive Bayes, and achieved an accuracy of 92.44 percent. To determine the sentiment polarity of a sentence, the suggested method uses term frequency (the number of positive and negative terms in the text). We use a logistic regression function to classify the comments in the data as positive or negative.
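
The term-frequency polarity rule mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the word lists `POSITIVE` and `NEGATIVE`, the tokenizer and the `neutral` tie-breaking label are all hypothetical.

```python
import re

# Hypothetical mini-lexicons; a real system would use a much larger resource.
POSITIVE = {"good", "great", "clear", "helpful"}
NEGATIVE = {"bad", "boring", "confusing", "poor"}

def polarity(text):
    """Label a comment by comparing counts of positive and negative terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # assumption: ties and empty counts are treated as neutral
```

For example, `polarity("The lectures were clear and helpful")` counts two positive terms and no negative ones, so the comment is labelled positive. Such counts could also serve as input features to a classifier like logistic regression.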

