Semi-Supervised Aspect-Based Sentiment Analysis for Case-Related Microblog Reviews Using Case Knowledge Graph Embedding

One for “All”: a unified model for fine-grained sentiment analysis under three tasks

PeerJ Computer Science ◽

10.7717/peerj-cs.816 ◽

2021 ◽

Vol 7 ◽

pp. e816

Author(s):

Heng-yang Lu ◽

Jun Yang ◽

Cong Hu ◽

Wei Fang

Keyword(s):

Sentiment Analysis ◽

Data Augmentation ◽

Language Model ◽

Unified Model ◽

Training Data ◽

Low Resource ◽

Fine Grained ◽

Questions And Answers ◽

Resource Conditions ◽

Media Data

Background: Fine-grained sentiment analysis is used to interpret consumers’ sentiments, from their written comments, towards specific entities on specific aspects. Previous researchers have introduced three main tasks in this field (ABSA, TABSA, MEABSA), covering all kinds of social media data (e.g., review specific, questions and answers, and community-based). In this paper, we identify and address two common challenges encountered in these three tasks: the low-resource problem and the sentiment polarity bias. Methods: We propose a unified model called PEA, which integrates data augmentation with a pre-trained language model and is suitable for all of the ABSA, TABSA and MEABSA tasks. Two data augmentation methods, entity replacement and dual noise injection, are introduced to address both challenges at the same time. An ensemble method is also introduced to combine the results of the basic RNN-based and BERT-based models. Results: PEA shows significant improvements on all three fine-grained sentiment analysis tasks when compared with state-of-the-art models. It also achieves results comparable to those of the baseline models while using only 20% of their training data, which demonstrates its strong performance under extreme low-resource conditions.
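To make the idea of entity replacement concrete, the following is a minimal sketch and not the PEA implementation. It assumes a sample is a `(text, entity, aspect, label)` tuple and that the sentiment label does not depend on the entity name; the helper `entity_replacement`, the sample layout and the entity pool are all hypothetical.

```python
import random


def entity_replacement(sample, entity_pool, rng):
    """Return a new (text, entity, aspect, label) sample with the entity swapped.

    Hypothetical helper: the sample layout and the pool are assumptions.
    """
    text, entity, aspect, label = sample
    # Only swap in entities that differ from the current one.
    candidates = [e for e in entity_pool if e != entity]
    if not candidates or entity not in text:
        return sample  # nothing to replace; keep the original sample
    new_entity = rng.choice(candidates)
    # The sentiment label is kept, assuming it does not depend on the entity name.
    return (text.replace(entity, new_entity), new_entity, aspect, label)


rng = random.Random(0)
original = ("The screen of PhoneA is dim", "PhoneA", "screen", "negative")
augmented = entity_replacement(original, ["PhoneA", "PhoneB", "PhoneC"], rng)
print(augmented)
```

Each augmented sample keeps the aspect and label of its source, so a small labeled set can be expanded without extra annotation.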

Zero-Shot Object Detection with Textual Descriptions

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018690 ◽

2019 ◽

Vol 33 ◽

pp. 8690-8697 ◽

Cited By ~ 6

Author(s):

Zhihui Li ◽

Lina Yao ◽

Xiaoqin Zhang ◽

Xianzhi Wang ◽

Salil Kanhere ◽

...

Keyword(s):

Object Detection ◽

Training Data ◽

Challenging Problem ◽

Learning Framework ◽

Word Level ◽

Proposed Model ◽

Real World Applications ◽

Fair Comparison ◽

Benchmark Datasets ◽

Novel Concept

Object detection is important in real-world applications. Existing methods mainly focus on object detection with sufficient labelled training data or zero-shot object detection with only concept names. In this paper, we address the challenging problem of zero-shot object detection with natural language description, which aims to simultaneously detect and recognize novel concept instances with textual descriptions. We propose a novel deep learning framework to jointly learn visual units, visual-unit attention and word-level attention, which are combined to achieve word-proposal affinity by an element-wise multiplication. To the best of our knowledge, this is the first work on zero-shot object detection with textual descriptions. Since there is no directly related work in the literature, we investigate plausible solutions based on existing zero-shot object detection for a fair comparison. We conduct extensive experiments on three challenging benchmark datasets. The extensive experimental results confirm the superiority of the proposed model.

Get full-text (via PubEx)

A review: preprocessing techniques and data augmentation for sentiment analysis

Computational Social Networks ◽

10.1186/s40649-020-00080-x ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Huu-Thanh Duong ◽

Tram-Anh Nguyen-Thi

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Supervised Learning ◽

Data Augmentation ◽

Original Data ◽

Training Data ◽

Unseen Data ◽

Augmentation Techniques ◽

User Intervention

In the literature, machine learning-based studies of sentiment analysis usually rely on supervised learning, which requires sufficiently large pre-labeled datasets in the target domain. Such datasets are tedious, expensive and time-consuming to build, and they make it hard to handle unseen data. This paper applies semi-supervised learning to Vietnamese sentiment analysis, for which datasets are limited. We summarize many preprocessing techniques used to clean and normalize data, together with negation handling and intensification handling, to improve performance. Moreover, data augmentation techniques, which generate new data from the original data to enrich the training data without user intervention, are also presented. In experiments, we evaluate various aspects and obtain competitive results which may motivate further proposals.
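Negation handling is often done by marking the tokens that fall within the scope of a negation word. The sketch below shows that common technique on an English sentence. It is an illustration only, not the paper's Vietnamese pipeline; the negation list, the clause-ending punctuation set and the function name `mark_negation` are assumptions.

```python
import re

NEGATIONS = {"not", "no", "never"}  # illustrative English list, an assumption
CLAUSE_END = {".", ",", ";", "!", "?"}


def mark_negation(text):
    """Prefix tokens that follow a negation word with NOT_ until the clause ends."""
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    out, negated = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False  # punctuation closes the negation scope
            out.append(tok)
        elif tok in NEGATIONS:
            negated = True  # open a negation scope
            out.append(tok)
        else:
            out.append("NOT_" + tok if negated else tok)
    return out


print(mark_negation("The food was not good, but the staff were friendly."))
```

A bag-of-words classifier then sees `NOT_good` as a different feature from `good`, which is the point of the transformation.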

Aspect-level sentiment analysis merged with knowledge graph and graph convolutional neural network

Journal of Physics Conference Series ◽

10.1088/1742-6596/2083/4/042044 ◽

2021 ◽

Vol 2083 (4) ◽

pp. 042044

Author(s):

Zuhua Dai ◽

Yuanyuan Liu ◽

Shilong Di ◽

Qi Fan

Keyword(s):

Neural Network ◽

Sentiment Analysis ◽

Structural Information ◽

Knowledge Graph ◽

Convolutional Network ◽

Text Data ◽

Short Text ◽

Fine Grained ◽

Syntactic Information ◽

Text Information

Aspect-level sentiment analysis is a form of fine-grained sentiment analysis that has attracted extensive research in academic circles in recent years. For this task, the recurrent neural network (RNN) model is usually used for feature extraction, but the model cannot effectively obtain the structural information of the text. Recent studies have begun to use the graph convolutional network (GCN) to model the syntactic dependency tree of the text to solve this problem. For short text data, the text information is not enough to accurately determine the emotional polarity of the aspect words, and the knowledge graph is not effectively used as external knowledge that can enrich the semantic information. In order to solve the above problems, this paper proposes a graph convolutional neural network (GCN) model that can process syntactic information, knowledge graphs and text semantic information. The model works on the “syntax-knowledge” graph to extract syntactic information and common sense information at the same time. Compared with the latest model, the model in this paper can effectively improve the accuracy of aspect-level sentiment classification on two datasets.
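For readers unfamiliar with graph convolution, the sketch below shows a single generic GCN propagation step over a toy dependency graph. It uses a simplified row-normalized form, ReLU(D^-1 (A + I) H W), and is not the paper's "syntax-knowledge" model; the graph, features and weights are made-up illustrative values.

```python
def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1 (A + I) H W), with row normalization."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    out = []
    for i in range(n):
        degree = sum(a_hat[i])
        # Average the features of node i and its neighbours.
        agg = [sum(a_hat[i][j] * feats[j][k] for j in range(n)) / degree
               for k in range(len(feats[0]))]
        # Linear projection followed by ReLU.
        out.append([max(0.0, sum(agg[k] * weight[k][m] for k in range(len(weight))))
                    for m in range(len(weight[0]))])
    return out


# Toy dependency graph over three tokens: 0 - 1 - 2 (a chain).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, -1.0], [0.5, 1.0]]
hidden = gcn_layer(adj, feats, weight)
print(hidden)
```

Stacking such layers lets each token's representation absorb information from tokens several dependency edges away, which a sequential RNN does not capture directly.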

Fine-Grained Named Entity Typing over Distantly Supervised Data Based on Refined Representations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6234 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7391-7398

Author(s):

Muhammad Asif Ali ◽

Yifang Sun ◽

Bing Li ◽

Wei Wang

Keyword(s):

Language Processing ◽

Training Data ◽

Specific Context ◽

Fine Grained ◽

Named Entity ◽

Distant Supervision ◽

Proposed Model ◽

Wide Range ◽

Relative Score ◽

Noisy Labels

Fine-Grained Named Entity Typing (FG-NET) is a key component in Natural Language Processing (NLP). It aims at classifying an entity mention into a wide range of entity types. Due to the large number of entity types, distant supervision is used to collect training data for this task, which noisily assigns type labels to entity mentions irrespective of the context. In order to alleviate the noisy labels, existing approaches to FG-NET analyze the entity mentions entirely independently of each other and assign type labels solely based on each mention's sentence-specific context. This is inadequate for highly overlapping and/or noisy type labels, as it hinders information passing across sentence boundaries. To address this, we propose an edge-weighted attentive graph convolution network that refines the noisy mention representations by attending over corpus-level contextual clues prior to the end classification. Experimental evaluation shows that the proposed model outperforms existing research by a relative score of up to 10.2% and 8.3% for macro-F1 and micro-F1, respectively.

Deep Persian sentiment analysis: Cross-lingual training for low-resource languages

Journal of Information Science ◽

10.1177/0165551520962781 ◽

2020 ◽

pp. 016555152096278

Author(s):

Rouzbeh Ghasemi ◽

Seyed Arad Ashrafi Asli ◽

Saeedeh Momtazi

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Training Data ◽

Target Language ◽

Low Resource ◽

Proposed Model ◽

Significant Difference ◽

Cross Lingual

With the advent of deep neural models in natural language processing tasks, having a large amount of training data plays an essential role in achieving accurate models. Creating valid training data, however, is a challenging issue in many low-resource languages. This problem results in a significant difference between the accuracy of available natural language processing tools for low-resource languages and those for resource-rich languages. To address this problem in the sentiment analysis task in the Persian language, we propose a cross-lingual deep learning framework to benefit from available English training data. We deploy cross-lingual embedding to model sentiment analysis as a transfer learning problem, in which a model is transferred from a rich-resource language to low-resource ones. Our model can use any cross-lingual word embedding model and any deep architecture for text classification. Our experiments on the English Amazon dataset and the Persian Digikala dataset, using two different embedding models and four different classification networks, show the superiority of the proposed model compared with the state-of-the-art monolingual techniques. Based on our experiments, the performance of Persian sentiment analysis improves by 22% with static embedding and 9% with dynamic embedding. Our proposed model is general and language-independent; that is, it can be used for any low-resource language, once a cross-lingual embedding is available for the source–target language pair. Moreover, by benefitting from word-aligned cross-lingual embedding, the only data required for a reliable cross-lingual embedding is a bilingual dictionary, which is available between almost all languages and English as a potential source language.

An adaptive sentimental analysis using ontology for retail market

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.2.10666 ◽

2017 ◽

Vol 7 (1.2) ◽

pp. 176

Author(s):

J Mannar Mannan ◽

Jayavel J

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Domain Knowledge ◽

Opinion Mining ◽

Identification System ◽

Present System ◽

3D Space ◽

User Data ◽

New Type ◽

Analysis System

The growth of digital documents on the web has made them a massive source for broad-level online market analysis. Online market research now incorporates a new parameter, sentiment analysis. Sentiment analysis plays a crucial role in identifying customer behavior by applying natural language processing to customer feedback about products or services. Opinion mining draws on user data from web activities such as search history, blog activity, forums and comments on social networks, in which users express opinions about a concept or product and offer suggestions or recommendations. The present system is a non-adaptive relation identification system that works on an existing, predetermined set of relations and cannot identify new types of relations for opinion mining. Existing systems also neglect the static sentiments of users. This paper proposes an ontology-based adaptive sentiment analysis system for extracting new features added to the user space. In our work, an ontology and a 3D space clustering framework allow the incorporation of domain knowledge for predicting sentiment via opinion mining.

A domain knowledge graph construction method based on Wikipedia

Journal of Information Science ◽

10.1177/0165551520932510 ◽

2020 ◽

pp. 016555152093251

Author(s):

Haoze Yu ◽

Haisheng Li ◽

Dianhui Mao ◽

Qiang Cai

Keyword(s):

Domain Knowledge ◽

Multiple Scale ◽

Construction Method ◽

Knowledge Graph ◽

Relationship Extraction ◽

Proposed Model ◽

Extraction Algorithm ◽

Structured Knowledge ◽

Extraction Model ◽

Domain Independent

In order to achieve real-time updating of the domain knowledge graph and improve the relationship extraction ability in the construction process, a domain knowledge graph construction method is proposed. Based on the structured knowledge in Wikipedia’s classification system, we acquire concepts and instances contained in subject areas. A relationship extraction algorithm based on co-word analysis is intended to extract the classification relationships in semi-structured open labels. A Bi-GRU remote supervised relationship extraction model based on a multiple-scale attention mechanism and an improved cross-entropy loss function is proposed to obtain the non-classification relationships of concepts in unstructured texts. Experiments show that the proposed model performs better than the existing methods. Based on the obtained concepts, instances and relationships, a domain knowledge graph is constructed and the domain-independent nodes and relationships contained in them are removed through a vector variance algorithm. The effectiveness of the proposed method is verified by constructing a food domain knowledge graph based on Wikipedia.
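Co-word analysis, in its general form, starts from counting how often two terms appear together. The sketch below shows only that counting step on made-up label sets. It is not the paper's relationship extraction algorithm, and the function name `coword_counts` and the sample pages are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coword_counts(label_sets):
    """Count how often each unordered pair of labels appears on the same page."""
    counts = Counter()
    for labels in label_sets:
        # Sort so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(labels)), 2):
            counts[pair] += 1
    return counts


pages = [
    {"fruit", "food", "apple"},
    {"fruit", "food", "banana"},
    {"food", "bread"},
]
counts = coword_counts(pages)
print(counts[("food", "fruit")])
```

Pairs with high counts are candidates for a classification relationship; deciding the direction of that relationship would need a further step that is not sketched here.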

An adaptive sentimental analysis using ontology for retail market

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.3.10676 ◽

2017 ◽

Vol 7 (1.3) ◽

pp. 176 ◽

Cited By ~ 1

Author(s):

J. Mannar Mannan ◽

Jayavel .J

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Domain Knowledge ◽

Opinion Mining ◽

Identification System ◽

Present System ◽

3D Space ◽

User Data ◽

New Type ◽

Analysis System

The growth of digital documents on the web has made them a massive source for broad-level online market analysis. Online market research now incorporates a new parameter, sentiment analysis. Sentiment analysis plays a crucial role in identifying customer behavior by applying natural language processing to customer feedback about products or services. Opinion mining draws on user data from web activities such as search history, blog activity, forums and comments on social networks, in which users express opinions about a concept or product and offer suggestions or recommendations. The present system is a non-adaptive relation identification system that works on an existing, predetermined set of relations and cannot identify new types of relations for opinion mining. Existing systems also neglect the static sentiments of users. This paper proposes an ontology-based adaptive sentiment analysis system for extracting new features added to the user space. In our work, an ontology and a 3D space clustering framework allow the incorporation of domain knowledge for predicting sentiment via opinion mining.

MIGAN: Malware Image Synthesis Using GANs

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.330110033 ◽

2019 ◽

Vol 33 ◽

pp. 10033-10034 ◽

Cited By ~ 1

Author(s):

Abhishek Singh ◽

Debojyoti Dutta ◽

Amit Saha

Keyword(s):

Language Processing ◽

Domain Knowledge ◽

Data Augmentation ◽

Image Synthesis ◽

Substantial Improvement ◽

Training Data ◽

Malware Analysis ◽

Training Procedure ◽

Original Dataset ◽

Augmentation Techniques

The majority of the advances in deep learning (DL) have occurred in domains such as computer vision and natural language processing, where abundant training data is available. A major obstacle in leveraging DL techniques for malware analysis is the lack of sufficiently large labeled datasets. In this paper, we take the first steps towards building a model that can synthesize a labeled dataset of malware images using a GAN. Such a model can be used to perform data augmentation for training a classifier. Furthermore, the model can be shared publicly so that the community can reap the benefits of the dataset without the original dataset being shared. First, we show the underlying idiosyncrasies of malware images and why existing data augmentation techniques as well as traditional GAN training fail to produce quality artificial samples. Next, we propose a new method for training a GAN in which we explicitly embed prior domain knowledge about the dataset into the training procedure. We show improvements in training stability and sample quality assessed on different metrics. Our experiments show substantial improvement over baselines and promise for using such a generative model in malware visualization systems.
