Toward multi-label sentiment analysis: a transfer learning based approach

AbstractSentiment analysis is recognized as one of the most important sub-areas in Natural Language Processing (NLP) research, where understanding implicit or explicit sentiments expressed in social media contents is valuable to customers, business owners, and other stakeholders. Researchers have recognized that the generic sentiments extracted from the textual contents are inadequate, thus, Aspect Based Sentiment Analysis (ABSA) was coined to capture aspect sentiments expressed toward specific review aspects. Existing ABSA methods not only treat the analytical problem as single-label classification that requires a fairly large amount of labelled data for model training purposes, but also underestimate the entity aspects that are independent of certain sentiments. In this study, we propose a transfer learning based approach tackling the aforementioned shortcomings of existing ABSA methods. Firstly, the proposed approach extends the ABSA methods with multi-label classification capabilities. Secondly, we propose an advanced sentiment analysis method, namely Aspect Enhanced Sentiment Analysis (AESA) to classify text into sentiment classes with consideration of the entity aspects. Thirdly, we extend two state-of-the-art transfer learning models as the analytical vehicles of multi-label ABSA and AESA tasks. We design an experiment that includes data from different domains to extensively evaluate the proposed approach. The empirical results undoubtedly exhibit that the proposed approach outperform all the baseline approaches.

Download Full-text

A Tweet Sentiment Classification Approach Using a Hybrid Stacked Ensemble Technique

Information ◽

10.3390/info12090374 ◽

2021 ◽

Vol 12 (9) ◽

pp. 374

Author(s):

Babacar Gaye ◽

Dezheng Zhang ◽

Aziguli Wulamu

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

State Of The Art ◽

Accuracy Score ◽

Learning Models ◽

Proposed Model

With the extensive availability of social media platforms, Twitter has become a significant tool for the acquisition of peoples’ views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement where the accuracy and efficacy of the system are concerned. This study proposes a novel approach that exploits the advantages of the lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob using a stacked ensemble of three long short-term memory (LSTM) as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as LSTM extracts features without any human intervention. We also compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest. We also included state-of-the-art deep learning models in comparison with the proposed model. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 Score. Empirical results showed that our proposed approach manifested state-of-the-art results by achieving an accuracy score of 99%.

Download Full-text

Sentiment analysis via dually-born-again network and sample selection

Intelligent Data Analysis ◽

10.3233/ida-194909 ◽

2020 ◽

Vol 24 (6) ◽

pp. 1257-1271

Author(s):

Pinlong Zhao ◽

Zefeng Han ◽

Qing Yin ◽

Shuxiao Li ◽

Ou Wu

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

State Of The Art ◽

Sample Selection ◽

Learning Framework ◽

Training Samples ◽

Knowledge Distillation ◽

Model Training ◽

Born Again ◽

Text Sentiment Analysis

Text sentiment analysis is an important natural language processing (NLP) task and has received considerable attention in recent years. Numerous deep-learning based methods have been proposed in previous literature in terms of new deep neural networks (DNN) including new embedding strategies, new attention mechanisms, and new encoding layers. In this study, an alternative technical path is investigated to further improve the state-of-the-art performance of text sentiment analysis. An new effective learning framework is proposed that combines knowledge distillation and sample selection. A dually-born-again network (DBAN) is presented in which the teacher network and the student network are simultaneously trained through an iterative approach. A selection gate is defined to deal with training samples which are useless or even harmful for model training. Moreover, both the DBAN and sample selection are further improved by ensemble. The proposed framework can improve the existing state-of-the-art DNN models in sentiment analysis. Experimental results indicate that the proposed framework enhances the performances of existing networks. In addition, DBAN outperforms existing born-again network.

Download Full-text

Lexicon based Chinese language sentiment analysis method

Computer Science and Information Systems ◽

10.2298/csis181015013c ◽

2019 ◽

Vol 16 (2) ◽

pp. 639-655

Author(s):

Jinyan Chen ◽

Susanne Becken ◽

Bela Stantic

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Language Processing ◽

Chinese Language ◽

Big Data Analytics ◽

Limited Attention ◽

Analysis Method ◽

Media Platform ◽

Chinese Social Media

The growing number of social media users and vast volume of posts could provide valuable information about the sentiment toward different locations, services as well as people. Recent advances in Big Data analytics and natural language processing often means to automatically calculate sentiment in these posts. Sentiment analysis is challenging and computationally demanding task due to the volume of data, misspelling, emoticons as well as abbreviations. While significant work was directed toward the sentiment analysis of English text there is limited attention in literature toward the sentiment analytic of Chinese language. In this work we propose method to identify the sentiment in Chinese social media posts and to test our method we rely on posts sent by visitors of Great Barrier Reef by users of most popular Chinese social media platform Sina Weibo. We elaborate process of capturing of weibo posts, describe a creation of lexicon as well as develop and explain algorithm for sentiment calculation. In case study, related to sentiment toward the different GBR destinations, we demonstrate that the proposed method is effective in obtaining the information and is suitable to monitor visitors? opinion.

Download Full-text

Deep Learning for text in limted data settings

10.36227/techrxiv.12100692 ◽

2020 ◽

Author(s):

Pathikkumar Patel ◽

Bhargav Lad ◽

Jinan Fiaidhi

Keyword(s):

Machine Learning ◽

Time Series ◽

Deep Learning ◽

Sentiment Analysis ◽

Transfer Learning ◽

Text Classification ◽

State Of The Art ◽

Time Series Forecasting ◽

Text Data ◽

Performance Levels

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.

Download Full-text

Learning emotional word embeddings for sentiment analysis

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201993 ◽

2021 ◽

pp. 1-13

Author(s):

Qingtian Zeng ◽

Xishi Zhao ◽

Xiaohui Hu ◽

Hua Duan ◽

Zhongying Zhao ◽

...

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

State Of The Art ◽

Research Problem ◽

Emotional Word ◽

Classification Model ◽

Data Sets ◽

Word Embeddings ◽

Real World Data ◽

Text Documents

Word embeddings have been successfully applied in many natural language processing tasks due to its their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve the above problem, we propose an emotional word embedding (EWE) model for sentiment analysis in this paper. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performances on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.

Download Full-text

Sentiment Analysis Techniques Applied to Raw-Text Data from a Csq-8 Questionnaire about Mindfulness in Times of COVID-19 to Improve Strategy Generation

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18126408 ◽

2021 ◽

Vol 18 (12) ◽

pp. 6408

Author(s):

Mario Jojoa Acosta ◽

Gema Castillo-Sánchez ◽

Begonya Garcia-Zapirain ◽

Isabel de la Torre Díez ◽

Manuel Franco-Martín

Keyword(s):

Health Care ◽

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Transfer Learning ◽

Language Processing ◽

Health Care Professionals ◽

Ground Truth ◽

Relevant Information ◽

Free Text

The use of artificial intelligence in health care has grown quickly. In this sense, we present our work related to the application of Natural Language Processing techniques, as a tool to analyze the sentiment perception of users who answered two questions from the CSQ-8 questionnaires with raw Spanish free-text. Their responses are related to mindfulness, which is a novel technique used to control stress and anxiety caused by different factors in daily life. As such, we proposed an online course where this method was applied in order to improve the quality of life of health care professionals in COVID 19 pandemic times. We also carried out an evaluation of the satisfaction level of the participants involved, with a view to establishing strategies to improve future experiences. To automatically perform this task, we used Natural Language Processing (NLP) models such as swivel embedding, neural networks, and transfer learning, so as to classify the inputs into the following three categories: negative, neutral, and positive. Due to the limited amount of data available—86 registers for the first and 68 for the second—transfer learning techniques were required. The length of the text had no limit from the user’s standpoint, and our approach attained a maximum accuracy of 93.02% and 90.53%, respectively, based on ground truth labeled by three experts. Finally, we proposed a complementary analysis, using computer graphic text representation based on word frequency, to help researchers identify relevant information about the opinions with an objective approach to sentiment. The main conclusion drawn from this work is that the application of NLP techniques in small amounts of data using transfer learning is able to obtain enough accuracy in sentiment analysis and text classification stages.

Download Full-text

Triage and diagnosis of COVID-19 from medical social media (Preprint)

10.2196/preprints.30397 ◽

2021 ◽

Author(s):

Abul Hasan ◽

Mark Levene ◽

David Weston ◽

Renate Fromson ◽

Nicolas Koslover ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Models ◽

Rule Based ◽

Additional Information ◽

Processing Pipeline ◽

Machine Learning Models

BACKGROUND The COVID-19 pandemic has created a pressing need for integrating information from disparate sources, in order to assist decision makers. Social media is important in this respect, however, to make sense of the textual information it provides and be able to automate the processing of large amounts of data, natural language processing methods are needed. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could allow for a better understanding of what social media may offer in this respect. OBJECTIVE This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts, in order to provide researchers and other interested parties with additional information on the symptoms, severity and prevalence of the disease. METHODS The text processing pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients’ posts using conditional random fields. An unsupervised rule-based algorithm is then applied to establish relations between concepts in the next step of the pipeline. The extracted concepts and relations are subsequently used to construct two different vector representations of each post. These vectors are applied separately to build support vector machine learning models to triage patients into three categories and diagnose them for COVID-19. RESULTS We report that Macro- and Micro-averaged F_{1\ }scores in the range of 71-96% and 61-87%, respectively, for the triage and diagnosis of COVID-19, when the models are trained on human labelled data. Our experimental results indicate that similar performance can be achieved when the models are trained using predicted labels from concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. Also, we highlight important features uncovered by our diagnostic machine learning models and compare them with the most frequent symptoms revealed in another COVID-19 dataset. In particular, we found that the most important features are not always the most frequent ones. CONCLUSIONS Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from natural language narratives using a machine learning pipeline, in order to provide additional information on the severity and prevalence of the disease through the eyes of social media.

Download Full-text

Análise de discursos em notícias sobre homofobia, racismo e sexismo em comentários de portais brasileiros de notícias

10.14210/cotb.v12.p467-474 ◽

2021 ◽

Author(s):

Lucas Rodrigues ◽

Antonio Jacob Junior ◽

Fábio Lobato

Keyword(s):

Social Media ◽

Natural Language Processing ◽

Sentiment Analysis ◽

Data Visualization ◽

Language Processing ◽

Topic Modeling ◽

Hate Speech ◽

Psychological Impact ◽

Internet Service ◽

General Law

Posts with defamatory content or hate speech are constantly foundon social media. The results for readers are numerous, not restrictedonly to the psychological impact, but also to the growth of thissocial phenomenon. With the General Law on the Protection ofPersonal Data and the Marco Civil da Internet, service providersbecame responsible for the content in their platforms. Consideringthe importance of this issue, this paper aims to analyze the contentpublished (news and comments) on the G1 News Portal with techniquesbased on data visualization and Natural Language Processing,such as sentiment analysis and topic modeling. The results showthat even with most of the comments being neutral or negative andclassified or not as hate speech, the majority of them were acceptedby the users.

Download Full-text

SENTIMENT ANALYSIS FOR SARCASTIC MESSAGES IN SOCIAL MEDIA USING DEEP LEARNING TECHNIQUES - AN EMPIRICAL STUDY

INFORMATION TECHNOLOGY IN INDUSTRY ◽

10.17762/itii.v9i2.451 ◽

2021 ◽

Vol 9 (2) ◽

pp. 1051-1052

Author(s):

K. Kavitha, Et. al.

Keyword(s):

Social Media ◽

Deep Learning ◽

Empirical Study ◽

Sentiment Analysis ◽

Learning Strategies ◽

Language Processing ◽

The People ◽

Learning Techniques ◽

Key Terms ◽

Learning Frameworks

Sentiments is the term of opinion or views about any topic expressed by the people through a source of communication. Nowadays social media is an effective platform for people to communicate and it generates huge amount of unstructured details every day. It is essential for any business organization in the current era to process and analyse the sentiments by using machine learning and Natural Language Processing (NLP) strategies. Even though in recent times the deep learning strategies are becoming more familiar due to higher capabilities of performance. This paper represents an empirical study of an application of deep learning techniques in Sentiment Analysis (SA) for sarcastic messages and their increasing scope in real time. Taxonomy of the sentiment analysis in recent times and their key terms are also been highlighted in the manuscript. The survey concludes the recent datasets considered, their key contributions and the performance of deep learning model applied with its primary purpose like sarcasm detection in order to describe the efficiency of deep learning frameworks in the domain of sentimental analysis.

Download Full-text

Discovery of Sustainable Transport Modes Underlying TripAdvisor Reviews With Sentiment Analysis

Advances in Business Information Systems and Analytics - Natural Language Processing for Global and Local Business ◽

10.4018/978-1-7998-4240-8.ch008 ◽

2021 ◽

pp. 180-199

Author(s):

Ainhoa Serna ◽

Jon Kepa Gerrikagoitia

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Language Processing ◽

Predictive Analytics ◽

Data Gathering ◽

Point Of View ◽

Training Data ◽

Complete Analysis ◽

Sustainable Transport ◽

Transport Modes

In recent years, digital technology and research methods have developed natural language processing for better understanding consumers and what they share in social media. There are hardly any studies in transportation analysis with TripAdvisor, and moreover, there is not a complete analysis from the point of view of sentiment analysis. The aim of study is to investigate and discover the presence of sustainable transport modes underlying in non-categorized TripAdvisor texts, such as walking mobility in order to impact positively in public services and businesses. The methodology follows a quantitative and qualitative approach based on knowledge discovery techniques. Thus, data gathering, normalization, classification, polarity analysis, and labelling tasks have been carried out to obtain sentiment labelled training data set in the transport domain as a valuable contribution for predictive analytics. This research has allowed the authors to discover sustainable transport modes underlying the texts, focused on walking mobility but extensible to other means of transport and social media sources.

Download Full-text