Associating Textual Features with Visual Ones to Improve Affective Image Classification

Author(s):  
Ningning Liu ◽  
Emmanuel Dellandréa ◽  
Bruno Tellez ◽  
Liming Chen

Author(s):  
Po-Ming Lee ◽  
Tzu-Chien Hsiao

Abstract
Recent studies have utilized color, texture, and composition information of images to achieve affective image classification. However, features in the spatial-frequency domain that have proven useful for traditional pattern recognition have not yet been tested in this field. Furthermore, the experiments conducted in previous studies are not internationally comparable because of the experimental paradigms adopted. In addition, recent methodological advances, namely the Hilbert-Huang Transform (HHT) (i.e., Empirical Mode Decomposition (EMD) and the Hilbert Transform (HT)), have improved the resolution of frequency analysis. Hence, the goals of this research are to perform the affective image-classification task under a standard experimental paradigm introduced by psychologists, so as to produce internationally comparable and reproducible results, and to explore the hidden affective patterns of images in the spatial-frequency domain. To accomplish these goals, multiple human-subject experiments were conducted in a laboratory setting. The Extended Classifier System (XCS) was used for model building because XCS has been applied to a wide range of classification tasks and has proved competitive in pattern recognition. To exploit the information in the spatial-frequency domain, the traditional EMD was extended to a two-dimensional version. In summary, the model built using the XCS achieves an Area Under Curve (AUC) of 0.91 and an accuracy rate over 86%. The XCS results were compared with those of traditional machine-learning algorithms commonly used for classification tasks, such as the Radial-Basis Function Network (RBF Network). Owing to the proper selection of features for model building, user-independent findings were obtained; for example, horizontal visual stimulation was found to contribute more to emotion elicitation than vertical visual stimulation. The effects of hue, saturation, and brightness are also presented.
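The abstract reports classifier quality as AUC = 0.91. As a hedged illustration (not code from the paper), the AUC of a binary scorer can be computed directly from its definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, counting ties as one half. The scores below are invented toy values.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (positive, negative) pairs
    where the positive example outscores the negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# toy classifier scores for positive- and negative-class images
pos = [0.9, 0.8, 0.7, 0.6]
neg = [0.5, 0.4, 0.75]
print(round(auc(pos, neg), 3))  # prints 0.833
```

This pairwise form is O(n*m) but is equivalent to the area under the ROC curve and is convenient for small evaluation sets.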


Author(s):  
Qiusha Zhu ◽  
Lin Lin ◽  
Mei-Ling Shyu ◽  
Dianting Liu

Traditional image classification relies on text information such as tags, which requires substantial human effort to annotate. Therefore, recent work focuses more on training classifiers directly on visual features extracted from image content. The performance of content-based classification is improving steadily, but it remains far below users’ expectations. Moreover, in a web environment, the HTML surrounding texts associated with images naturally serve as context information and are complementary to content information. This paper proposes a novel two-stage image classification framework that aims to improve the performance of content-based image classification by utilizing the context information of web-based images. A new TF*IDF weighting scheme is proposed to extract discriminative textual features from HTML surrounding texts. Both content-based and context-based classifiers are built by applying multiple correspondence analysis (MCA). Experiments on web-based images from the Microsoft Research Asia multimedia (MSRA-MM) dataset show that the proposed framework achieves promising results.
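The abstract proposes a new TF*IDF weighting scheme whose details are not given here. For context, the following is a minimal sketch of the classical TF*IDF baseline that such schemes build on: each term in a document is weighted by its in-document frequency, scaled down by how many documents contain it. The example documents are invented.

```python
import math

def tfidf(docs):
    """Classical TF*IDF: term frequency * log inverse document frequency."""
    n = len(docs)
    # document frequency: number of docs containing each term
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    weights = []
    for doc in docs:
        tf = {}
        for term in doc:
            tf[term] = tf.get(term, 0) + 1
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights

# toy "surrounding texts": rarer terms get higher weights
docs = [["cat", "pet", "cat"], ["dog", "pet"], ["car", "engine"]]
w = tfidf(docs)
```

In the toy corpus, "cat" appears in only one document and so outweighs "pet", which appears in two; discriminative schemes like the paper's refine exactly this kind of weighting, e.g. by incorporating class information.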

