VistaNet: Visual Aspect Attention Network for Multimodal Sentiment Analysis

Author(s):  
Quoc-Tuan Truong
Hady W. Lauw

Detecting the sentiment expressed by a document is a key task for many applications, e.g., modeling user preferences, monitoring consumer behaviors, assessing product quality. Traditionally, the sentiment analysis task primarily relies on textual content. Fueled by the rise of mobile phones that are often the only cameras on hand, documents on the Web (e.g., reviews, blog posts, tweets) are increasingly multimodal in nature, with photos in addition to textual content. A question arises whether the visual component could be useful for sentiment analysis as well. In this work, we propose Visual Aspect Attention Network or VistaNet, leveraging both textual and visual components. We observe that in many cases, with respect to sentiment detection, images play a supporting role to text, highlighting the salient aspects of an entity, rather than expressing sentiments independently of the text. Therefore, instead of using visual information as features, VistaNet relies on visual information as alignment for pointing out the important sentences of a document using attention. Experiments on restaurant reviews showcase the effectiveness of visual aspect attention, vis-à-vis visual features or textual attention.
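
For illustration, here is a minimal sketch of the visual aspect attention idea (not the authors' released code): pooled image features act as a query that scores the sentences of a review, so the image points out which sentences matter rather than being fused in as an extra feature. The class name, dimensions, and the additive attention form are assumptions made for this example.

```python
# Illustrative sketch only: image-guided attention over sentence vectors.
# Dimensions and layer choices are assumptions, not VistaNet's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualAspectAttention(nn.Module):
    def __init__(self, sent_dim=256, img_dim=512, att_dim=128):
        super().__init__()
        self.proj_sent = nn.Linear(sent_dim, att_dim)   # project sentence vectors
        self.proj_img = nn.Linear(img_dim, att_dim)     # project image features
        self.score = nn.Linear(att_dim, 1, bias=False)  # scalar attention score

    def forward(self, sents, img):
        # sents: (batch, num_sentences, sent_dim); img: (batch, img_dim)
        q = self.proj_img(img).unsqueeze(1)                    # (batch, 1, att_dim)
        k = self.proj_sent(sents)                              # (batch, S, att_dim)
        e = self.score(torch.tanh(k + q)).squeeze(-1)          # additive attention scores
        alpha = F.softmax(e, dim=-1)                           # importance of each sentence
        doc = torch.bmm(alpha.unsqueeze(1), sents).squeeze(1)  # weighted document vector
        return doc, alpha

# Toy usage: 2 reviews, 6 sentences each, one pooled CNN image feature per review.
sents = torch.randn(2, 6, 256)
img = torch.randn(2, 512)
doc_vec, weights = VisualAspectAttention()(sents, img)
print(doc_vec.shape, weights.shape)  # torch.Size([2, 256]) torch.Size([2, 6])
```

In this sketch the attention weights, not the image features themselves, carry the visual signal into the document representation, which matches the abstract's framing of images as alignment rather than as features.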

Author(s):  
Nan Xu
Wenji Mao
Guandan Chen

As a fundamental task of sentiment analysis, aspect-level sentiment analysis aims to identify the sentiment polarity of a specific aspect in its context. Previous work on aspect-level sentiment analysis has been text-based. With the prevalence of multimodal user-generated content (e.g., text and image) on the Internet, multimodal sentiment analysis has attracted increasing research attention in recent years. In the context of aspect-level sentiment analysis, multimodal data are often more informative than text-only data and exhibit various correlations, including the influence the aspect exerts on both text and image as well as the interactions between text and image. However, no related work has so far been carried out at the intersection of aspect-level and multimodal sentiment analysis. To fill this gap, we are among the first to put forward the new task of aspect-based multimodal sentiment analysis, and we propose a novel Multi-Interactive Memory Network (MIMN) model for this task. Our model includes two interactive memory networks that supervise the textual and visual information with the given aspect, and it learns not only the interactive influences between cross-modality data but also the self-influences within single-modality data. We provide a new publicly available multimodal aspect-level sentiment dataset to evaluate our model, and the experimental results demonstrate the effectiveness of the proposed model for this new task.
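
As a rough illustration of the interactive-memory idea (assumptions throughout, not the MIMN release), the sketch below runs aspect-guided attention hops over a textual memory and a visual memory, letting each modality's query be updated by the other's summary before classification. The layer names, hop count, and fusion form are hypothetical.

```python
# Illustrative sketch: two aspect-guided memories with cross-modal interaction.
import torch
import torch.nn as nn
import torch.nn.functional as F

def attend(memory, query):
    # memory: (batch, n, d); query: (batch, d) -> attended summary (batch, d)
    scores = torch.bmm(memory, query.unsqueeze(-1)).squeeze(-1)  # dot-product scores
    alpha = F.softmax(scores, dim=-1)
    return torch.bmm(alpha.unsqueeze(1), memory).squeeze(1)

class MultiInteractiveMemory(nn.Module):
    def __init__(self, dim=128, hops=2, num_classes=3):
        super().__init__()
        self.hops = hops
        self.fuse_t = nn.Linear(2 * dim, dim)  # mix text summary with visual summary
        self.fuse_v = nn.Linear(2 * dim, dim)  # mix visual summary with text summary
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, text_mem, img_mem, aspect):
        qt, qv = aspect, aspect                 # both queries start from the aspect
        for _ in range(self.hops):
            st = attend(text_mem, qt)           # aspect-guided text summary
            sv = attend(img_mem, qv)            # aspect-guided image summary
            qt = torch.tanh(self.fuse_t(torch.cat([st, sv], dim=-1)))  # cross influence on text
            qv = torch.tanh(self.fuse_v(torch.cat([sv, st], dim=-1)))  # cross influence on image
        return self.classifier(torch.cat([qt, qv], dim=-1))

# Toy usage: 4 samples, 10 text positions, 5 image regions, hidden size 128, 3 polarities.
logits = MultiInteractiveMemory()(torch.randn(4, 10, 128),
                                  torch.randn(4, 5, 128),
                                  torch.randn(4, 128))
print(logits.shape)  # torch.Size([4, 3])
```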


2019
Vol 9 (1)
pp. 1-15
Author(s):
Akshi Kumar
Arunima Jaiswal
Shikhar Garg
Shobhit Verma
Siddhant Kumar

Selecting the optimal set of features to determine sentiment in online textual content is imperative for superior classification results. Optimal feature selection is a computationally hard task and fosters the need for novel techniques to improve classifier performance. In this work, a binary adaptation of cuckoo search (a nature-inspired, meta-heuristic algorithm), known as Binary Cuckoo Search, is proposed for optimal feature selection in sentiment analysis of online textual content. Baseline supervised learning techniques such as SVM were first implemented with the traditional tf-idf model and then with the proposed feature-optimization model. A benchmark Kaggle dataset comprising a collection of tweets is used to report the results, which are assessed on the basis of classification accuracy. Empirical analysis validates that the proposed binary cuckoo search for feature-selection optimization in a sentiment analysis task outperforms the elementary supervised algorithms based on the conventional tf-idf score.
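
A minimal, self-contained sketch of how a binary cuckoo search can drive feature selection is given below; it scores candidate feature masks with cross-validated SVM accuracy on synthetic numeric features standing in for tf-idf vectors. The population size, abandonment probability, iteration count, and the sigmoid transfer of Lévy-flight steps are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch: Binary Cuckoo Search over a feature mask, scored by
# cross-validated SVM accuracy. Synthetic data stands in for tf-idf features.
import numpy as np
from math import gamma, sin, pi
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=40, n_informative=8, random_state=0)

def fitness(mask):
    # Cross-validated accuracy of a linear SVM on the selected feature columns.
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(LinearSVC(dual=False), X[:, mask.astype(bool)], y, cv=3).mean()

def levy(size, beta=1.5):
    # Mantegna's algorithm for Levy-distributed step lengths.
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)

n_nests, n_feats, pa, iters = 10, X.shape[1], 0.25, 20
nests = rng.integers(0, 2, (n_nests, n_feats))
scores = np.array([fitness(n) for n in nests])

for _ in range(iters):
    best = nests[scores.argmax()].copy()
    for i in range(n_nests):
        # Levy-flight step relative to the best nest, squashed into bit-flip probabilities.
        step = levy(n_feats) * (nests[i] - best)
        prob = 1 / (1 + np.exp(-step))
        cand = (rng.random(n_feats) < prob).astype(int)
        if rng.random() < pa:                     # a fraction pa of nests is abandoned
            cand = rng.integers(0, 2, n_feats)
        f = fitness(cand)
        if f > scores[i]:                         # greedy replacement
            nests[i], scores[i] = cand, f

best_mask = nests[scores.argmax()]
print("selected features:", int(best_mask.sum()), "CV accuracy:", round(scores.max(), 3))
```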


2020
Vol 32 (24)
Author(s):
Meng Cao
Yonghua Zhu
Wenjing Gao
Mengyao Li
Shaoxiu Wang

Author(s):  
Qi Zhang
Jiawen Wang
Haoran Huang
Xuanjing Huang
Yeyun Gong

In microblogging services, authors can use hashtags to mark keywords or topics. Many social media applications (e.g., microblog retrieval and classification) can benefit greatly from these manually labeled tags. However, only a small portion of microblogs contain hashtags entered by their authors. Moreover, many microblog posts contain not only textual content but also images. These visual resources provide valuable information that may not be included in the textual content and can therefore help recommend hashtags more accurately. Motivated by the successful use of the attention mechanism, we propose a co-attention network that incorporates textual and visual information to recommend hashtags for multimodal tweets. Experimental results on data collected from Twitter demonstrate that the proposed method achieves better performance than state-of-the-art methods that use textual information only.
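
The following is a minimal sketch of a parallel co-attention layer for hashtag recommendation, written under stated assumptions rather than from the paper's implementation: word vectors and image-region vectors score each other through a learned affinity matrix, each modality is summarized by the resulting attention weights, and the fused vector is mapped to hashtag scores. The vocabulary size, dimensions, and max-pooled affinity are hypothetical choices.

```python
# Illustrative sketch: parallel co-attention over tweet words and image regions,
# followed by a hashtag scorer. Not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoAttentionHashtag(nn.Module):
    def __init__(self, dim=128, num_hashtags=500):
        super().__init__()
        self.affinity = nn.Linear(dim, dim, bias=False)  # bilinear word-region affinity
        self.out = nn.Linear(2 * dim, num_hashtags)      # scores over the hashtag vocabulary

    def forward(self, words, regions):
        # words: (batch, W, dim); regions: (batch, R, dim)
        aff = torch.bmm(self.affinity(words), regions.transpose(1, 2))  # (batch, W, R)
        att_w = F.softmax(aff.max(dim=2).values, dim=-1)  # word importance, guided by image
        att_r = F.softmax(aff.max(dim=1).values, dim=-1)  # region importance, guided by text
        t = torch.bmm(att_w.unsqueeze(1), words).squeeze(1)    # attended text vector
        v = torch.bmm(att_r.unsqueeze(1), regions).squeeze(1)  # attended image vector
        return self.out(torch.cat([t, v], dim=-1))             # hashtag scores

# Toy usage: 2 tweets, 20 words each, 49 image regions (e.g., a 7x7 CNN grid).
scores = CoAttentionHashtag()(torch.randn(2, 20, 128), torch.randn(2, 49, 128))
print(scores.topk(5, dim=-1).indices)  # top-5 recommended hashtag ids per tweet
```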


2019
Author(s):
Spoorthi C
Dr. Pushpa Ravikumar
Mr. Adarsh M.J
