VLSP SHARED TASK: SENTIMENT ANALYSIS

Sentiment analysis is a natural language processing (NLP) task of identifying or extracting the sentiment content of a text unit. It has been an active research topic since the early 2000s. During the last two editions of the VLSP workshop series, a shared task on Sentiment Analysis (SA) for Vietnamese was organized to provide an objective evaluation of the performance (quality) of sentiment analysis tools, to encourage the development of Vietnamese sentiment analysis systems, and to provide benchmark datasets for this task. The first campaign, in 2016, focused only on sentiment polarity classification, with a dataset containing reviews of electronic products. The second campaign, in 2018, addressed Aspect Based Sentiment Analysis (ABSA) for Vietnamese, providing two datasets containing reviews in the restaurant and hotel domains. These data are accessible for research purposes via the VLSP website vlsp.org.vn/resources. This paper describes the datasets built as well as the evaluation results of the systems participating in these campaigns.

Sentiment of App with Word Vectors

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1416.0986s319 ◽

2019 ◽

Vol 8 (6S3) ◽

pp. 2156-2159

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Text Data ◽

Vector Representations ◽

Text Sentiment Analysis

Vector representations for language have been shown to be useful in a number of Natural Language Processing tasks. In this paper, we investigate the effectiveness of word vector representations for the problem of Sentiment Analysis. In particular, we target three sub-tasks: sentiment word extraction, polarity detection for sentiment words, and text sentiment prediction. We investigate the effectiveness of vector representations over different text data and evaluate the quality of domain-dependent vectors. Vector representations are used to compute various vector-based features, and systematic experiments are conducted to demonstrate their effectiveness. Using simple vector-based features can achieve better results for text sentiment analysis of apps.
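
One way to picture a "simple vector-based feature" is the sketch below. It is an illustration only, not the paper's method: the three-dimensional vectors, the seed words, and the function names (`cosine`, `mean_vector`, `polarity_score`) are all hypothetical. The score is the cosine similarity of the averaged text vector to a positive seed centroid minus its similarity to a negative seed centroid.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# Real experiments would load vectors trained on a large corpus.
VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "awful": [-0.8, 0.2, 0.1],
    "app": [0.0, 0.9, 0.3],
    "crashes": [-0.6, 0.5, 0.2],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(tokens):
    """Average the vectors of all known tokens; None if no token is known."""
    known = [VECTORS[t] for t in tokens if t in VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def polarity_score(tokens, pos_seeds=("good", "great"), neg_seeds=("bad", "awful")):
    """Similarity to the positive seed centroid minus similarity to the negative one."""
    doc = mean_vector(tokens)
    if doc is None:
        return 0.0
    return cosine(doc, mean_vector(pos_seeds)) - cosine(doc, mean_vector(neg_seeds))


print(polarity_score(["great", "app"]))    # positive score
print(polarity_score(["app", "crashes"]))  # negative score
```

A score like this could serve as one input feature to a classifier; texts with no known tokens fall back to a neutral 0.0.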

Multi-neural network-based sentiment analysis of food reviews based on character and word embeddings

International Journal of Electrical Engineering Education ◽

10.1177/0020720920928492 ◽

2020 ◽

pp. 002072092092849

Author(s):

Yong Li ◽

Qingyu Jin ◽

Min Zuo ◽

Haisheng Li ◽

Xiaojun Yang ◽

...

Keyword(s):

Neural Network ◽

Sentiment Analysis ◽

Language Processing ◽

Chinese Character ◽

Semantic Features ◽

Word Embeddings ◽

Emotional Information ◽

Related Sequence ◽

Active Research ◽

Multi Neural Network

Sentiment analysis has become one of the most active research hotspots in natural language processing in recent years. However, present deep learning models cannot fully and effectively use emotional information. A single Chinese character has different meanings in different words, so character embeddings are combined with word embeddings to extract more precise meaning. In this paper, both single Chinese characters and words are used as input units for training. Based on BLSTM, an attention mechanism based on the vocabulary semantics of the food domain is introduced to extract distance-related sequence semantic features. A CNN is then used to classify sentiment from these sequence semantic features. On this basis, a multi-neural-network model for sentiment information extraction and analysis is proposed. Experiments show that the model performs well in sentiment analysis and obtains high accuracy and F value.

A Fuzzy Computing Model for Identifying Polarity of Chinese Sentiment Words

Computational Intelligence and Neuroscience ◽

10.1155/2015/525437 ◽

2015 ◽

Vol 2015 ◽

pp. 1-13 ◽

Cited By ~ 7

Author(s):

Bingkun Wang ◽

Yongfeng Huang ◽

Xian Wu ◽

Xing Li

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

State Of The Art ◽

Negative Polarity ◽

Fuzzy Classifier ◽

Important Indicator ◽

Computing Model ◽

Active Research ◽

Better Than ◽

Fuzzy Computing

With the spurt of online user-generated content on the web, sentiment analysis has become a very active research issue in data mining and natural language processing. As the most important indicator of sentiment, sentiment words, which convey positive and negative polarity, are instrumental for sentiment analysis. However, most existing methods for identifying the polarity of sentiment words only treat positive and negative polarity as a Cantor set, and pay no attention to the fuzziness of the polarity intensity of sentiment words. In order to improve performance, we propose a fuzzy computing model to identify the polarity of Chinese sentiment words. There are three major contributions in this paper. Firstly, we propose a method to compute the polarity intensity of sentiment morphemes and sentiment words. Secondly, we construct a fuzzy sentiment classifier and propose two different methods to compute the parameter of the fuzzy classifier. Thirdly, we conduct extensive experiments on four sentiment word datasets and three review datasets, and the experimental results indicate that our model performs better than the state-of-the-art methods.

Aspect-based sentiment analysis of reviews in the domain of higher education

The Electronic Library ◽

10.1108/el-06-2019-0140 ◽

2020 ◽

Vol 38 (1) ◽

pp. 44-64

Author(s):

Nikola Nikolić ◽

Olivera Grljević ◽

Aleksandar Kovačević

Keyword(s):

Higher Education ◽

Sentiment Analysis ◽

Language Processing ◽

Higher Education Institutions ◽

Educational Institution ◽

Teaching Staff ◽

Free Text ◽

Content Type ◽

F Measure

Purpose Student recruitment and retention are important issues for all higher education institutions. Constant monitoring of student satisfaction levels is therefore crucial. Traditionally, students voice their opinions through official surveys organized by the universities. In addition, social media and review websites such as “Rate my professors” are now rich sources of opinions that should not be ignored. Automated mining of students’ opinions can be realized via aspect-based sentiment analysis (ABSA). ABSA is a sub-discipline of natural language processing (NLP) that focusses on the identification of sentiments (negative, neutral, positive) and aspects (sentiment targets) in a sentence. The purpose of this paper is to introduce a system for ABSA of free text reviews expressed in student opinion surveys in the Serbian language. Sentiment analysis was carried out at the finest level of text granularity – the level of sentence segment (phrase and clause). Design/methodology/approach The presented system relies on NLP techniques, machine learning models, rules and dictionaries. The corpora collected and annotated for system development and evaluation comprise students’ reviews of teaching staff at the Faculty of Technical Sciences, University of Novi Sad, Serbia, and a corpus of publicly available reviews from the Serbian equivalent of the “Rate my professors” website. Findings The research results indicate that positive sentiment can successfully be identified with an F-measure of 0.83, while negative sentiment can be detected with an F-measure of 0.94. The F-measure for aspects ranges between 0.49 and 0.89, depending on their frequency in the corpus. Furthermore, the authors have concluded that the quality of ABSA depends on the source of the reviews (official students’ surveys vs review websites).
Practical implications The system for ABSA presented in this paper could improve the quality of service provided by Serbian higher education institutions through a more effective search and summary of students’ opinions. For example, a particular educational institution could easily find out which aspects of its service the students are not satisfied with and which aspects deserve more attention. Originality/value To the best of the authors’ knowledge, this is the first study of ABSA carried out at the level of sentence segment for the Serbian language. The methodology and findings presented in this paper provide a much-needed basis for further work on sentiment analysis for the Serbian language, which is under-resourced and under-researched in this area.

Using Ensemble Models to Classify the Sentiment Expressed in Suicide Notes

Biomedical Informatics Insights ◽

10.4137/bii.s8931 ◽

2012 ◽

Vol 5s1 ◽

pp. BII.S8931 ◽

Cited By ~ 5

Author(s):

James A. McCart ◽

Dezon K. Finch ◽

Jay Jarman ◽

Edward Hickling ◽

Jason D. Lind ◽

...

Keyword(s):

Natural Language Processing ◽

Text Mining ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Regular Expression ◽

Shared Task ◽

Suicide Notes ◽

The Mean ◽

The U.S

In 2007, suicide was the tenth leading cause of death in the U.S. Given the significance of this problem, suicide was the focus of the 2011 Informatics for Integrating Biology and the Bedside (i2b2) Natural Language Processing (NLP) shared task competition (track two). Specifically, the challenge concentrated on sentiment analysis, predicting the presence or absence of 15 emotions (labels) simultaneously in a collection of suicide notes spanning over 70 years. Our team explored multiple approaches combining regular expression-based rules, statistical text mining (STM), and an approach that applies weights to text while accounting for multiple labels. Our best submission used an ensemble of both rules and STM models to achieve a micro-averaged F1 score of 0.5023, slightly above the mean from the 26 teams that competed (0.4875).
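
The micro-averaged F1 score reported above pools true positives, false positives, and false negatives over all notes and labels before computing precision and recall. The sketch below shows that calculation for multi-label predictions; the function name `micro_f1` and the example label sets are hypothetical and are not taken from the shared task data.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over multi-label predictions.

    gold and predicted are parallel sequences of label collections,
    one collection per document.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)  # labels predicted and correct
        fp += len(p - g)  # labels predicted but not in the gold set
        fn += len(g - p)  # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for three documents (illustration only).
gold = [{"hopelessness", "love"}, {"guilt"}, set()]
pred = [{"love"}, {"guilt", "anger"}, set()]
print(micro_f1(gold, pred))
```

In this toy case the pooled counts are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3. Because counts are pooled, frequent labels influence the micro-average more than rare ones.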

A TRANSFORMATION METHOD FOR ASPECT-BASED SENTIMENT ANALYSIS

Journal of Computer Science and Cybernetics ◽

10.15625/1813-9663/34/4/13162 ◽

2019 ◽

Vol 34 (4) ◽

pp. 323-333 ◽

Cited By ~ 3

Author(s):

Thin Van Dang ◽

Vu Duc Nguyen ◽

Nguyen Van Kiet ◽

Nguyen Luu Thuy Ngan

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Speech Processing ◽

International Workshop ◽

Transformation Method ◽

Support Vector ◽

The Internet ◽

Shared Task ◽

Research Topics ◽

User Reviews

Along with the explosion of user reviews on the Internet, sentiment analysis has become one of the trending research topics in the field of natural language processing. In the last five years, many shared tasks were organized to keep track of the progress of sentiment analysis for various languages. In the Fifth International Workshop on Vietnamese Language and Speech Processing (VLSP 2018), the Sentiment Analysis shared task was the first evaluation campaign for the Vietnamese language. In this paper, we describe our system for this shared task. We employ a supervised learning method based on the Support Vector Machine classifiers combined with a variety of features. We obtained the F1-score of 61% for both domains, which was ranked highest in the shared task. For the aspect detection subtask, our method achieved 77% and 69% in F1-score for the restaurant domain and the hotel domain respectively.

A Comprehensive Guideline for Bengali Sentiment Annotation

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3474363 ◽

2022 ◽

Vol 21 (2) ◽

pp. 1-19

Author(s):

Md. Saddam Hossain Mukta ◽

Md. Adnanul Islam ◽

Faisal Ahamed Khan ◽

Afjal Hossain ◽

Shuvanon Razik ◽

...

Keyword(s):

Data Mining ◽

Information Retrieval ◽

Sentiment Analysis ◽

Computational Linguistics ◽

Language Processing ◽

Web Mining ◽

English Language ◽

Research Work ◽

Bengali Language

Sentiment Analysis (SA) is a Natural Language Processing (NLP) and Information Extraction (IE) task that primarily aims to identify the writer’s feelings, expressed as positive or negative, by analyzing a large number of documents. SA is also widely studied in the fields of data mining, web mining, text mining, and information retrieval. The fundamental task in sentiment analysis is to classify the polarity of a given content as Positive, Negative, or Neutral. Although extensive research has been conducted in this area of computational linguistics, most of the research work has been carried out in the context of the English language. However, Bengali sentiment expression has varying degrees of sentiment labels, which can plausibly be distinct from those of the English language. Therefore, it is important that sentiment assessment for the Bengali language be developed and executed properly. In sentiment analysis, the predictive potential of an automatic model is completely dependent on the quality of dataset annotation. Bengali sentiment annotation is a challenging task due to the diversified structures (syntax) of the language and its different degrees of innate sentiment (i.e., weakly and strongly positive/negative sentiments). Thus, in this article, we propose a novel and precise guideline for researchers, linguistic experts, and referees to annotate Bengali sentences accurately, with a view to building effective datasets for automatic sentiment prediction.

Sentiment Analysis Techniques Applied to Raw-Text Data from a CSQ-8 Questionnaire About Mindfulness in Times of Covid-19 to Improve Strategy Generation

10.20944/preprints202106.0053.v1 ◽

2021 ◽

Author(s):

Mario Jojoa ◽

Gema Castillo-Sánchez ◽

Begonya Garcia-Zapirain ◽

Isabel De la Torre Diez ◽

Manuel Franco-Martín

Keyword(s):

Sentiment Analysis ◽

Transfer Learning ◽

Language Processing ◽

Health Care Professionals ◽

Ground Truth ◽

Relevant Information ◽

Free Text ◽

Text Data ◽

Learning Techniques

The aim of this study was to build a tool to analyze, using artificial intelligence, the sentiment perception of users who answered two questions from the CSQ-8 questionnaire with raw Spanish free text. Their responses are related to mindfulness, which is a novel technique used to control stress and anxiety caused by different factors in daily life. As such, we proposed an online course where this method was applied in order to improve the quality of life of health care professionals during the COVID-19 pandemic. We also carried out an evaluation of the satisfaction level of the participants involved, with a view to establishing strategies to improve future experiences. To automatically perform this task, we used Natural Language Processing (NLP) models such as swivel embedding, neural networks and transfer learning, so as to classify the inputs into the following 3 categories: negative, neutral and positive. Due to the limited amount of data available (86 registers for the first question and 68 for the second), transfer learning techniques were required. The length of the text had no limit from the user’s standpoint, and our approach attained a maximum accuracy of 93.02% and 90.53% respectively, based on ground truth labeled by 3 experts. Finally, we proposed a complementary analysis, using computer graphic text representation based on word frequency, to help researchers identify relevant information about the opinions with an objective approach to sentiment. The main conclusion drawn from this work is that applying NLP techniques to small amounts of data using transfer learning can achieve sufficient accuracy in the sentiment analysis and text classification stages.

Machine Learning Based Sentiment Text Classification for Evaluating Treatment Quality of Discharge Summary

Information ◽

10.3390/info11050281 ◽

2020 ◽

Vol 11 (5) ◽

pp. 281

Author(s):

Samer Abdulateef Waheeb ◽

Naseer Ahmed Khan ◽

Bolin Chen ◽

Xuequn Shang

Keyword(s):

Natural Language ◽

Sentiment Analysis ◽

Statistical Methods ◽

Language Processing ◽

False Positive Rate ◽

Discharge Summary ◽

Treatment Quality ◽

Positive Rate ◽

Discharge Summaries

Patients’ discharge summaries (documents) are health sensors that are used for measuring the quality of treatment in medical centers. However, automatically extracting information from discharge summaries written in unstructured natural language is considered challenging. These kinds of documents include various aspects of patient information that could be used to test treatment quality and improve medical-related decisions. One of the significant techniques in the literature for discharge summary classification is feature extraction from text data, drawn from the domain of natural language processing. We propose a novel sentiment analysis method for discharge summary classification that relies on vector space models, statistical methods, association rules, and an extreme learning machine autoencoder (ELM-AE). Our novel hybrid model is based on statistical methods that build the lexicon in a domain related to health and medical records. Meanwhile, our method examines treatment quality based on an idea inspired by sentiment analysis. Experiments show that our proposed method obtains a higher F1 value of 0.89, with good TPR (True Positive Rate) and FPR (False Positive Rate) values, compared with various well-known state-of-the-art methods across different sizes of training and testing datasets. The results also show that our method provides a flexible and effective technique to examine treatment quality based on positive, negative, and neutral terms at the sentence level in each discharge summary.
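
For reference, the TPR and FPR mentioned above can be computed from gold and predicted labels as in the sketch below. It treats one class as positive and every other class as negative; the function name `tpr_fpr` and the sample labels are hypothetical and do not come from the paper.

```python
def tpr_fpr(gold, predicted, positive="positive"):
    """Return (true positive rate, false positive rate) for one target class.

    Every label other than `positive` is treated as the negative class.
    """
    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        if p == positive:
            if g == positive:
                tp += 1
            else:
                fp += 1
        else:
            if g == positive:
                fn += 1
            else:
                tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # recall on actual positives
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # false alarms on actual negatives
    return tpr, fpr


# Hypothetical sentence-level labels (illustration only).
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "positive", "neutral"]
print(tpr_fpr(gold, pred))
```

Here one of two actual positives is found and one of two actual negatives is wrongly flagged, so both rates are 0.5. A good classifier pairs a high TPR with a low FPR.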

Exogenous approach to improve topic segmentation

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-01-2016-0001 ◽

2016 ◽

Vol 9 (2) ◽

pp. 165-178 ◽

Cited By ~ 1

Author(s):

Marwa Naili ◽

Anja Habacha Chaibi ◽

Henda Hajjami Ben Ghezala

Keyword(s):

Language Processing ◽

Digital Libraries ◽

Semantic Knowledge ◽

Domain Ontology ◽

Topic Segmentation ◽

Content Type ◽

Research Fields ◽

External Resource ◽

Active Research

Purpose – Topic segmentation is one of the active research fields in natural language processing, and many topic segmenters have been proposed. However, the current challenge for researchers is to improve these segmenters by using external resources. Therefore, the purpose of this paper is to integrate, study and evaluate a new external semantic resource in topic segmentation. Design/methodology/approach – New topic segmenters (TSS-Onto and TSB-Onto) are proposed based on the two well-known segmenters C99 and TextTiling. The proposed segmenters integrate semantic knowledge into the segmentation process by using a domain ontology as an external resource. Subsequently, an evaluation is made to study the effect of this resource on the quality of topic segmentation, along with a comparative study with related works. Findings – Based on this study, the authors showed that adding semantic knowledge, which is extracted from a domain ontology, improves the quality of topic segmentation. Moreover, TSS-Onto outperforms TSB-Onto in terms of quality of topic segmentation. Research limitations/implications – The main limitation of this study is that the test corpus used for the evaluation is not a benchmark. However, we used a collection of scientific papers from well-known digital libraries (ArXiv and ACM). Practical implications – The proposed topic segmenters can be useful in different NLP applications such as information retrieval and text summarization. Originality/value – The primary original contribution of this paper is the improvement of topic segmentation based on semantic knowledge. This knowledge is extracted from an ontological external resource.
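
As background for the segmenters named above, the sketch below illustrates the general idea of lexical-cohesion segmentation: a topic boundary is placed where adjacent sentences share little vocabulary. It is a heavily simplified illustration, assuming whitespace tokenization and a fixed threshold. It is not the published C99 or TextTiling algorithm and does not use an ontology as TSS-Onto and TSB-Onto do. The names `cosine` and `boundaries` and the threshold value are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def boundaries(sentences, threshold=0.1):
    """Return indices i such that a topic boundary falls before sentences[i]."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [
        i + 1
        for i in range(len(bags) - 1)
        if cosine(bags[i], bags[i + 1]) < threshold
    ]


text = [
    "cats chase mice",
    "mice fear cats",
    "stocks fell sharply",
    "stocks rose later",
]
print(boundaries(text))  # → [2]
```

An external semantic resource, as the paper proposes, would aim to let such a similarity measure recognize related terms even when the surface words differ.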
