Modified Bag of Visual Words Model for Image Classification

2021 ◽  
Vol 24 (2) ◽  
pp. 78-86
Author(s):  
Zainab N. Sultani ◽  
Ban N. Dhannoon

Image classification is acknowledged as one of the most critical and challenging tasks in computer vision. The bag of visual words (BoVW) model has proven very effective for image classification, since it can represent distinctive image features compactly in vector space. In this paper, BoVW using the Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) descriptors is adapted for image classification. We propose a novel image classification system that uses local feature information obtained from both the SIFT and ORB descriptors. The resulting SO-BoVW model yields highly discriminative features, enhancing classification performance. Experiments on the Caltech-101 and Flowers datasets demonstrate the effectiveness of the proposed method.
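The core of such a pipeline is quantising local descriptors against a codebook and fusing the resulting histograms. A minimal sketch in plain NumPy, with random arrays standing in for real SIFT/ORB descriptors and k-means codebooks; the paper's exact fusion scheme may differ:

```python
import numpy as np

def encode_bovw(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword and
    return an L1-normalised histogram of codeword counts."""
    # pairwise squared distances between descriptors and codewords: (n, k)
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def fused_representation(sift_desc, orb_desc, sift_codebook, orb_codebook):
    """Concatenate the per-descriptor-type BoVW histograms (a simple
    late-fusion scheme; the paper's SO-BoVW construction may differ)."""
    return np.concatenate([encode_bovw(sift_desc, sift_codebook),
                           encode_bovw(orb_desc, orb_codebook)])

rng = np.random.default_rng(0)
sift_desc = rng.normal(size=(50, 128))   # stand-ins for 128-D SIFT descriptors
orb_desc = rng.integers(0, 2, size=(40, 256)).astype(float)  # ORB bit strings
sift_cb = rng.normal(size=(10, 128))     # toy codebooks; real ones come
orb_cb = rng.integers(0, 2, size=(10, 256)).astype(float)    # from k-means
vec = fused_representation(sift_desc, orb_desc, sift_cb, orb_cb)
print(vec.shape)  # (20,)
```

Each per-type histogram sums to one, so the fused vector keeps the two modalities on a comparable scale before classification.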

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Bahar Hatipoglu Yilmaz ◽  
Cemal Kose

Emotion is one of the most complex and difficult expressions to predict. Many recognition systems based on classification methods have been applied to different emotion recognition problems. In this paper, we propose a multimodal fusion method between electroencephalography (EEG) and electrooculography (EOG) signals for emotion recognition. Before the feature extraction stage, we apply angle-amplitude transformations to the EEG–EOG signals; these transformations take arbitrary time-domain signals and convert them into two-dimensional images called Angle-Amplitude Graphs (AAGs). We then extract image-based features with the scale-invariant feature transform (SIFT), fuse the features originating from EEG and EOG, and finally classify them with support vector machines. To verify the validity of the proposed method, we performed experiments on the multimodal DEAP dataset, a benchmark widely used for emotion analysis with physiological signals, applying the proposed procedure to the arousal-valence dimensions. After fusion, we achieved 91.53% accuracy for the arousal space and 90.31% for the valence space. The results show that combining AAG image features of the EEG–EOG signals in the baseline angle-amplitude transformation approach enhances classification performance on the DEAP dataset.
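The abstract does not specify the exact angle-amplitude mapping, so the sketch below assumes one plausible reading: each sample is plotted in polar coordinates (angle from its position in the sequence, radius from its normalised amplitude) and rasterised into a small image. All names and the mapping itself are illustrative, not the paper's definition:

```python
import numpy as np

def angle_amplitude_graph(signal, size=32):
    """Rasterise a 1-D signal into a 2-D 'angle-amplitude' image.
    ASSUMPTION: angle = sample position mapped onto [0, 2*pi),
    amplitude = min-max normalised sample value; the paper's AAG
    transform may be defined differently."""
    sig = np.asarray(signal, float)
    amp = (sig - sig.min()) / (np.ptp(sig) + 1e-12)       # radius in [0, 1]
    ang = np.linspace(0.0, 2 * np.pi, len(sig), endpoint=False)
    # polar -> Cartesian pixel coordinates inside a size x size grid
    x = ((amp * np.cos(ang) + 1) / 2 * (size - 1)).round().astype(int)
    y = ((amp * np.sin(ang) + 1) / 2 * (size - 1)).round().astype(int)
    img = np.zeros((size, size))
    np.add.at(img, (y, x), 1.0)                           # accumulate hits
    return img

t = np.linspace(0, 1, 256)
aag = angle_amplitude_graph(np.sin(2 * np.pi * 5 * t))
print(aag.shape)  # (32, 32)
```

Once the signal is an image, standard image descriptors such as SIFT can be applied to it, which is the point of the transformation.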


Author(s):  
R. Ponnusamy ◽  
S. Sathiamoorthy ◽  
R. Visalakshi

Digital images captured by Wireless Capsule Endoscopy (WCE) of a patient's gastrointestinal (GI) tract are used to detect abnormalities. The large volume of WCE images means that reviewing one patient's GI tract for illnesses can take about two hours, which is highly time consuming and considerably increases healthcare costs. To address this problem, we propose a Visual Bag of Features (VBOF) method that incorporates the Scale-Invariant Feature Transform (SIFT), the Center-Symmetric Local Binary Pattern (CS-LBP) and the Auto Color Correlogram (ACC). This combination of features captures the interest-point, texture and color information in an image. The features computed for each image form a high-dimensional descriptor. The proposed feature descriptors are clustered by k-means into visual words, and a Support Vector Machine (SVM) is used to automatically classify multiple disease abnormalities in the GI tract. Finally, a post-processing scheme is applied to the classification results to validate the performance of multi-abnormality frame detection.
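Of the three descriptors, CS-LBP has a particularly compact definition: each interior pixel is coded by comparing the four centre-symmetric neighbour pairs of its 8-neighbourhood, giving 16 possible codes. A NumPy sketch of that texture channel (the full VBOF method also adds SIFT and ACC features):

```python
import numpy as np

def cs_lbp(img, threshold=0.0):
    """Center-Symmetric LBP: for each interior pixel, compare the four
    centre-symmetric neighbour pairs of the 8-neighbourhood; each
    comparison contributes one bit, giving codes in [0, 16)."""
    img = np.asarray(img, float)
    # the four centre-symmetric pairs (N vs S, NE vs SW, E vs W, SE vs NW)
    pairs = [
        (img[:-2, 1:-1], img[2:, 1:-1]),
        (img[:-2, 2:],   img[2:, :-2]),
        (img[1:-1, 2:],  img[1:-1, :-2]),
        (img[2:, 2:],    img[:-2, :-2]),
    ]
    code = np.zeros((img.shape[0] - 2, img.shape[1] - 2), dtype=int)
    for bit, (a, b) in enumerate(pairs):
        code |= ((a - b) > threshold).astype(int) << bit
    return code

def cs_lbp_histogram(img):
    """16-bin L1-normalised CS-LBP texture histogram."""
    codes = cs_lbp(img)
    hist = np.bincount(codes.ravel(), minlength=16).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(1)
h = cs_lbp_histogram(rng.random((64, 64)))
print(h.shape)  # (16,)
```

With only 16 bins (versus 256 for plain LBP), CS-LBP histograms stay compact, which matters when they are concatenated with SIFT and ACC features into one descriptor.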


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Huadong Sun ◽  
Xu Zhang ◽  
Xiaowei Han ◽  
Xuesong Jin ◽  
Zhijie Zhao

With the increasing scale of e-commerce, the complexity of image content makes commodity image classification face great challenges. Image feature extraction often determines the quality of the final classification results. At present, image feature extraction mainly covers underlying visual features and intermediate semantic features. The intermediate semantics of an image act as a bridge between its underlying features and its advanced semantics, which can narrow the semantic gap to a certain extent and is strongly robust. As a typical intermediate semantic representation method, the bag-of-visual-words (BoVW) model has received extensive attention in image classification. However, the traditional BoVW model loses the location information of local features, and its local feature descriptors mainly capture the texture and shape information of local regions while lacking any expression of color information. Therefore, this paper presents an improved bag-of-visual-words model with three improvements: (1) multiscale local region extraction; (2) local feature description by speeded-up robust features (SURF) and a color vector angle histogram (CVAH); and (3) a diagonal concentric rectangular pattern. Experimental results show that the three improvements to the BoVW model are complementary; compared with the traditional BoVW and the BoVW adopting SURF + SPM, the classification accuracy of the improved BoVW increases by 3.60% and 2.33%, respectively.
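Improvement (2) adds colour information via colour vector angles. One common formulation measures the angle between the RGB vectors of adjacent pixels; the sketch below bins those angles into a histogram (the paper's CVAH may bin and normalise differently):

```python
import numpy as np

def color_vector_angle_hist(img, bins=16):
    """Histogram of angles between the RGB vectors of horizontally
    adjacent pixels -- a sketch of the colour-vector-angle idea, not
    the paper's exact CVAH descriptor."""
    img = np.asarray(img, float)
    a = img[:, :-1, :].reshape(-1, 3)          # left pixel of each pair
    b = img[:, 1:, :].reshape(-1, 3)           # right pixel of each pair
    dot = (a * b).sum(1)
    norm = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    ang = np.arccos(np.clip(dot / norm, -1.0, 1.0))   # radians in [0, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, np.pi))
    return hist / hist.sum()

rng = np.random.default_rng(2)
cvah = color_vector_angle_hist(rng.random((32, 32, 3)))
print(cvah.shape)  # (16,)
```

The angle between colour vectors is insensitive to uniform intensity scaling, which is why it complements the intensity-based SURF description.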


2019 ◽  
Vol 1 (3) ◽  
pp. 871-882 ◽  
Author(s):  
Mario Manzo ◽  
Simone Pellino

In recent years, researchers have worked to understand image content in computer vision. In particular, the bag of visual words (BoVW) model, which describes images in terms of a frequency histogram of visual words, is the most widely adopted paradigm. Its main drawback is the lack of information about the locations of features and the relationships between them. For this purpose, we propose a new paradigm called bag of ARSRG (attributed relational SIFT (scale-invariant feature transform) regions graph) words (BoAW). A digital image is described as a vector in terms of a frequency histogram of graphs: through a sequence of steps, images are mapped into a vector space via a graph transformation. BoAW is evaluated in an image classification context on standard datasets, and its effectiveness is demonstrated through experimental comparison with well-known competitors.
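The first step of such a pipeline is turning a set of local feature regions into a graph. A simplified stand-in (spatial-proximity adjacency only; the paper's ARSRG also attaches SIFT descriptors as node attributes and uses richer relations):

```python
import numpy as np

def region_graph(keypoints, radius):
    """Build an adjacency matrix over local-feature regions: two
    regions are linked when their keypoints lie within `radius` of
    each other. A simplified stand-in for an attributed relational
    graph over SIFT regions."""
    pts = np.asarray(keypoints, float)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    adj = (d <= radius) & (d > 0)              # no self-loops
    return adj.astype(int)

pts = [(0, 0), (3, 0), (10, 0)]
A = region_graph(pts, radius=5.0)
print(A)
```

In the full BoAW scheme, such graphs (rather than raw descriptors) are quantised into a codebook, so the final histogram counts recurring local structures instead of isolated features.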


2020 ◽  
Author(s):  
Harith Al-Sahaf ◽  
A Song ◽  
K Neshatian ◽  
Mengjie Zhang

Image classification is a complex but important task, especially in areas of machine vision and image analysis such as remote sensing and face recognition. One challenge in image classification is finding an optimal set of features for a particular task, because the choice of features has a direct impact on classification performance. However, the goodness of a feature is highly problem-dependent, and domain knowledge is often required. To address these issues, we introduce a Genetic Programming (GP) based image classification method, Two-Tier GP, which operates directly on raw pixels rather than on features. The first tier in a classifier automatically defines features from the raw image input, while the second tier makes the decision. Compared with conventional feature-based image classification methods, Two-Tier GP achieved better accuracies on a range of tasks. Furthermore, by using the features defined by the first tier of these Two-Tier GP classifiers, conventional classification methods obtained higher accuracies than when classifying on manually designed features. Analysis of evolved Two-Tier image classifiers shows that genuine features are captured in the programs, and the mechanism by which high accuracy is achieved can be revealed. The Two-Tier GP method has clear advantages in image classification, such as high accuracy, good interpretability and the removal of the explicit feature extraction process. © 2012 IEEE.
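The two-tier structure can be illustrated with hand-written stand-ins for what GP would evolve: a feature-construction tier over raw pixels feeding a decision tier. A toy sketch (the regions, operators and threshold below are illustrative, not the evolved programs):

```python
import numpy as np

def tier_one_features(img):
    """Feature-construction tier: statistics over fixed pixel regions.
    An evolved program would choose the regions and operators itself;
    these are hand-picked stand-ins."""
    top, bottom = img[: img.shape[0] // 2], img[img.shape[0] // 2 :]
    return np.array([top.mean(), bottom.mean(), img.std()])

def tier_two_decide(features):
    """Decision tier: a simple comparator over tier-one outputs
    (an evolved decision tree would replace this)."""
    return int(features[0] - features[1] > 0.1)

img = np.zeros((8, 8)); img[:4] = 1.0   # toy image with a bright top half
print(tier_two_decide(tier_one_features(img)))  # 1
```

The key point is that the tier boundary is explicit: tier-one outputs can be handed to any conventional classifier, which is how the paper's cross-over experiments work.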


2019 ◽  
Vol 8 (2) ◽  
pp. 6053-6057

Telugu is one of the most widely spoken Indian languages in the world. Since it has an old heritage, Telugu literature and newspaper publications can be scanned to identify individual words. Identification of Telugu word images poses serious problems owing to the script's complex structure and large set of individual characters. This paper develops a novel methodology for this task using SIFT (Scale-Invariant Feature Transform) features of Telugu words and classifying these features with a BoVW (bag of visual words) model. The features are clustered with k-means to create a dictionary; these visual words form a codebook for the word images, and classification is performed with an SVM (Support Vector Machine).
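The dictionary-building step described here is ordinary k-means over local descriptors. A minimal Lloyd's-algorithm sketch with random stand-ins for the SIFT descriptors (a real system would use a library implementation and far more data):

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Minimal Lloyd's k-means: cluster local descriptors into a
    k-word visual dictionary by alternating nearest-centre assignment
    and centre re-estimation."""
    rng = np.random.default_rng(seed)
    centres = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign every descriptor to its nearest centre
        d2 = ((descriptors[:, None] - centres[None]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # move each centre to the mean of its members
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centres[j] = members.mean(0)
    return centres

rng = np.random.default_rng(3)
desc = rng.normal(size=(200, 128))   # stand-ins for 128-D SIFT descriptors
codebook = build_codebook(desc, k=8)
print(codebook.shape)  # (8, 128)
```

Each word image is then encoded as a histogram of its descriptors' nearest codewords, and those histograms are what the SVM is trained on.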


2011 ◽  
Author(s):  
Jie Feng ◽  
L. C. Jiao ◽  
Xiangrong Zhang ◽  
Ruican Niu

2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Mengxi Xu ◽  
Yingshu Lu ◽  
Xiaobin Wu

Conventional image classification models commonly adopt a single feature vector to represent image content. However, a single-feature system can hardly extract all of the information contained in images, and traditional encoding methods lose much feature information. To solve this problem, this paper proposes a feature fusion-based image classification model. The model combines the principal component analysis (PCA) algorithm, processed scale-invariant feature transform (P-SIFT) features and color naming (CN) features to generate mutually independent image representation factors. At the encoding stage of the scale-invariant feature transform (SIFT) feature, the bag-of-visual-words (BoVW) model is used for feature reconstruction. To introduce spatial information into the extracted features, a rotation-invariant spatial pyramid mapping method is used for P-SIFT and CN feature division and representation. At the feature fusion stage, we adopt the support vector machine with two kernels (SVM-2K) algorithm, which divides the training process into two stages and learns from the corresponding kernel matrices to improve classification performance. Experiments show that the proposed method effectively improves the accuracy of image description and the precision of image classification.
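The fusion stage can be approximated, at its simplest, by reducing each feature view with PCA and concatenating the results; the paper's SVM-2K learns the combination instead of concatenating. A sketch with random stand-ins for the P-SIFT and CN feature matrices:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components,
    computed via SVD of the mean-centred data."""
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def fuse(shape_feats, color_feats, n_components=8):
    """Decorrelate each feature view with PCA, then concatenate.
    NOTE: plain concatenation is only the simplest baseline; the
    paper learns the fusion with SVM-2K's two-kernel training."""
    return np.hstack([pca_reduce(shape_feats, n_components),
                      pca_reduce(color_feats, n_components)])

rng = np.random.default_rng(4)
# rows = images; columns = toy shape-feature and colour-feature dims
fused = fuse(rng.normal(size=(30, 128)), rng.normal(size=(30, 11)))
print(fused.shape)  # (30, 16)
```

Reducing each view before fusion keeps the two feature families from swamping each other when their raw dimensionalities differ, which is the motivation for the "mutually independent representation factors" above.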

