A pornographic web page detecting method based on SVM model using text and image features

We present a multi-modal genre recognition framework that considers the modalities audio, text, and image by features extracted from audio signals, album cover images, and lyrics of music tracks. In contrast to pure learning of features by a neural network as done in the related work, handcrafted features designed for a respective modality are also integrated, allowing for higher interpretability of created models and further theoretical analysis of the impact of individual features on genre prediction. Genre recognition is performed by binary classification of a music track with respect to each genre based on combinations of elementary features. For feature combination a two-level technique is used, which combines aggregation into fixed-length feature vectors with confidence-based fusion of classification results. Extensive experiments have been conducted for three classifier models (Naïve Bayes, Support Vector Machine, and Random Forest) and numerous feature combinations. The results are presented visually, with data reduction for improved perceptibility achieved by multi-objective analysis and restriction to non-dominated data. Feature- and classifier-related hypotheses are formulated based on the data, and their statistical significance is formally analyzed. The statistical analysis shows that the combination of two modalities almost always leads to a significant increase of performance and the combination of three modalities in several cases.
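
As a rough illustration of the two-level combination described in this abstract, the Python sketch below first aggregates variable-length frame features into a fixed-length vector and then fuses the outputs of per-modality binary classifiers by confidence. The function names, the per-dimension mean, and the rule of trusting the classifier whose positive-class probability lies farthest from 0.5 are assumptions made for illustration; the abstract does not state which aggregation or fusion rules the authors used.

```python
from statistics import mean


def aggregate(frames):
    """Level 1 (assumed): collapse a variable number of frame-level
    feature vectors into one fixed-length vector via per-dimension mean."""
    return [mean(column) for column in zip(*frames)]


def fuse_by_confidence(probabilities):
    """Level 2 (assumed): given one positive-class probability per
    modality classifier, return the one farthest from 0.5."""
    return max(probabilities, key=lambda p: abs(p - 0.5))


def predict_genre(probabilities, threshold=0.5):
    """Binary decision for a single genre from the fused probability."""
    return fuse_by_confidence(probabilities) >= threshold


# Hypothetical outputs of audio, text, and image classifiers for one genre.
fused = fuse_by_confidence([0.55, 0.10, 0.70])
```

Here the text classifier is the most confident (0.10 is farthest from 0.5), so its vote decides the outcome. A weighted average of the probabilities would be an equally plausible reading of "confidence-based fusion".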

Text-Image Retrieval With Salient Features

Journal of Database Management ◽ 10.4018/jdm.2021100101 ◽ 2021 ◽ Vol 32 (4) ◽ pp. 1-13

Author(s): Xia Feng ◽ Zhiyi Hu ◽ Caihua Liu ◽ W. H. Ip ◽ Huiying Chen

Keyword(s): Image Retrieval ◽ Recall Rate ◽ Image Features ◽ Feature Representation ◽ Image Feature ◽ Text And Image ◽ Retrieval Task ◽ Retrieval Method ◽ Salient Features ◽ Object Level

In recent years, deep learning has achieved remarkable results in the text-image retrieval task. However, only global image features are considered and vital local information is ignored, so the image fails to match the text well. Considering that object-level image features can help the matching between text and image, this article proposes a text-image retrieval method that fuses salient image feature representations. Fusing salient features at the object level can improve the understanding of image semantics and thus the performance of text-image retrieval. The experimental results show that the proposed method is comparable to the latest methods, and that its recall rate on some retrieval results is better than that of current work.

Identifying Creative Content at the Page Level in the HathiTrust Digital Library Using Machine Learning Methods on Text and Image Features

Diversity, Divergence, Dialogue - Lecture Notes in Computer Science ◽ 10.1007/978-3-030-71292-1_37 ◽ 2021 ◽ pp. 478-489

Author(s): Nikolaus Nova Parulian ◽ Glen Worthey

Keyword(s): Machine Learning ◽ Digital Library ◽ Image Features ◽ Text And Image ◽ Learning Methods ◽ Machine Learning Methods

Graphical Figure Classification Using Data Fusion for Integrating Text and Image Features

2013 12th International Conference on Document Analysis and Recognition ◽ 10.1109/icdar.2013.142 ◽ 2013 ◽ Cited By ~ 3

Author(s): Beibei Cheng ◽ R. Joe Stanley ◽ Sameer Antani ◽ George R. Thoma

Keyword(s): Data Fusion ◽ Image Features ◽ Text And Image ◽ Using Data

Fusion of Text and Image Features: A New Approach to Image Spam Filtering

Advances in Intelligent and Soft Computing - Practical Applications of Intelligent Systems ◽ 10.1007/978-3-642-25658-5_15 ◽ 2011 ◽ pp. 129-140 ◽ Cited By ~ 1

Author(s): Congfu Xu ◽ Kevin Chiew ◽ Yafang Chen ◽ Juxin Liu

Keyword(s): Image Features ◽ Spam Filtering ◽ Text And Image ◽ New Approach ◽ Image Spam

Spam Email Image Classification Based on Text and Image Features

2019 First International Conference of Computer and Applied Sciences (CAS) ◽ 10.1109/cas47993.2019.9075725 ◽ 2019

Author(s): Estqlal Hammad Dhah ◽ Mohammed Abdullah Naser ◽ Suhad A. Ali

Keyword(s): Image Classification ◽ Image Features ◽ Text And Image

Feature Pair Index Graph for Clustering

Journal of Intelligent Systems ◽ 10.1515/jisys-2018-0338 ◽ 2019 ◽ Vol 29 (1) ◽ pp. 1179-1187

Author(s): N. Karthika ◽ B. Janet

Keyword(s): Image Data ◽ Image Features ◽ Index Structure ◽ Text And Image ◽ Text Documents ◽ Structural Pattern ◽ Local Optima ◽ Data Share ◽ Feature Pair ◽ Cluster Methods

Text documents are significant arrangements of various words, while images are significant arrangements of various pixels/features. In addition, text and image data share a similar semantic structural pattern. In this research, a feature pair is defined as a pair of adjacent image features. The feature pair index graph (FPIG) is built from the selected unique feature pairs using an inverted index structure. The constructed FPIG is helpful in clustering, classifying, and retrieving image data. The proposed FPIG method is validated against the traditional KMeans++, KMeans, and Farthest First cluster methods, which have the serious drawbacks of initial centroid selection and local optima. The FPIG method is analyzed using Iris flower image data, and the analysis yields 88% better results than Farthest First and 28.97% better results than conventional KMeans in terms of sum of squared errors. The paper also discusses the scope for further research in the proposed methodology.
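
To make the idea of indexing adjacent feature pairs concrete, here is a minimal Python sketch of an inverted index keyed by feature pairs. It assumes each image has already been reduced to an ordered sequence of discrete feature labels; that preprocessing step, the function name `build_pair_index`, and the sample labels are hypothetical and not taken from the paper, which may construct its graph differently.

```python
from collections import defaultdict


def build_pair_index(items):
    """Map each adjacent feature pair to the set of item ids containing it.

    `items` is a dict of item id -> ordered list of feature labels
    (an assumed representation, chosen for illustration).
    """
    index = defaultdict(set)
    for item_id, features in items.items():
        # zip(features, features[1:]) yields every pair of neighbours.
        for pair in zip(features, features[1:]):
            index[pair].add(item_id)
    return index


# Hypothetical items described by short feature sequences.
items = {
    "img_a": ["x", "y", "z"],
    "img_b": ["y", "z", "w"],
}
index = build_pair_index(items)
shared = index[("y", "z")]  # items that share the pair (y, z)
```

Items that appear under the same pair are candidates for the same cluster, which avoids choosing initial centroids. How the paper turns these postings into a graph and into final clusters is not described in the abstract.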

Truncated attention mechanism and cascade loss for cross-modal person re-identification

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-210382 ◽ 2021 ◽ pp. 1-13

Author(s): Shuo Shi ◽ Changwei Huo ◽ Yingchun Guo ◽ Stephen Lean ◽ Gang Yan ◽ ...

Keyword(s): Natural Language ◽ Short Term Memory ◽ Principal Component ◽ Image Features ◽ Attention Mechanism ◽ Text And Image ◽ Level Information ◽ Text Features ◽ Lstm Network ◽ Language Description

Person re-identification with natural language description is the process of retrieving the corresponding person's image from an image dataset according to a text description of the person. The key challenge in this cross-modal task is to extract visual and text features and to construct loss functions that achieve cross-modal matching between text and image. First, we design a two-branch network framework for person re-identification with natural language description. In this framework, a Bi-directional Long Short-Term Memory (Bi-LSTM) network is used to extract text features, a truncated attention mechanism is proposed to select the principal component of the text features, and a MobileNet is used to extract image features. Second, we propose a Cascade Loss Function (CLF), which includes a cross-modal matching loss and a single-modal classification loss, both based on the relative entropy function, to fully exploit the identity-level information. The experimental results on the CUHK-PEDES dataset demonstrate that our method achieves better Top-5 and Top-10 results than 10 other current state-of-the-art algorithms.
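
The abstract states that both loss terms use relative entropy (Kullback-Leibler divergence). The Python sketch below shows one way such a two-term loss could be assembled. The function names, the softmax normalisation of raw scores, and the unweighted sum of the two terms are assumptions for illustration only; the paper's exact formulation is not given in the abstract.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relative_entropy(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; terms with p_i = 0 add 0."""
    return sum(pi * math.log(pi / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def cascade_loss(match_target, match_scores, id_target, id_logits):
    """Assumed combination: matching term plus classification term."""
    matching = relative_entropy(match_target, softmax(match_scores))
    classification = relative_entropy(id_target, softmax(id_logits))
    return matching + classification


# Hypothetical batch of two candidates and two identities, with
# uninformative scores, so each term is close to log(2).
loss = cascade_loss([1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 0.0])
```

In this reading, `match_target` marks which candidate images truly match the text, and `id_target` is the one-hot identity label. A weighting factor between the two terms would be a natural extension.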

Retrieving Images Using Cross-Language Text and Image Features

Accessing Multilingual Information Repositories - Lecture Notes in Computer Science ◽ 10.1007/11878773_80 ◽ 2006 ◽ pp. 733-736 ◽ Cited By ~ 1

Author(s): Mirna Adriani ◽ Framadhan Arnely

Keyword(s): Image Features ◽ Text And Image ◽ Cross Language ◽ Language Text
