Modified Bag of Visual Words Model for Image Classification

2021 ◽  
Vol 24 (2) ◽  
pp. 78-86
Author(s):  
Zainab N. Sultani ◽  
Ban N. Dhannoon

Image classification is acknowledged as one of the most critical and challenging tasks in computer vision. The bag of visual words (BoVW) model has proven very effective for image classification, since it can represent distinctive image features compactly in vector space. In this paper, BoVW using the Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) descriptors is adapted for image classification. We propose a novel image classification system that uses local feature information obtained from both the SIFT and ORB descriptors. The resulting SO-BoVW model yields highly discriminative features, enhancing classification performance. Experiments on the Caltech-101 and Flowers datasets demonstrate the effectiveness of the proposed method.
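The core of such a pipeline is quantising local descriptors against a codebook and fusing the resulting histograms. A minimal sketch in plain NumPy, with random arrays standing in for real SIFT/ORB descriptors and k-means codebooks; the paper's exact fusion scheme may differ:

```python
import numpy as np

def encode_bovw(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword and
    return an L1-normalised histogram of codeword counts."""
    # pairwise squared distances between descriptors and codewords: (n, k)
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def fused_representation(sift_desc, orb_desc, sift_codebook, orb_codebook):
    """Concatenate the per-descriptor-type BoVW histograms (a simple
    late-fusion scheme; the paper's SO-BoVW construction may differ)."""
    return np.concatenate([encode_bovw(sift_desc, sift_codebook),
                           encode_bovw(orb_desc, orb_codebook)])

rng = np.random.default_rng(0)
sift_desc = rng.normal(size=(50, 128))   # stand-ins for 128-D SIFT descriptors
orb_desc = rng.integers(0, 2, size=(40, 256)).astype(float)  # ORB bit strings
sift_cb = rng.normal(size=(10, 128))     # toy codebooks; real ones come
orb_cb = rng.integers(0, 2, size=(10, 256)).astype(float)    # from k-means
vec = fused_representation(sift_desc, orb_desc, sift_cb, orb_cb)
print(vec.shape)  # (20,)
```

Each per-type histogram sums to one, so the fused vector keeps the two modalities on a comparable scale before classification.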

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Bahar Hatipoglu Yilmaz ◽  
Cemal Kose

Emotion is one of the most complex and difficult expressions to predict. Many recognition systems based on classification methods have been applied to different emotion recognition problems. In this paper, we propose a multimodal fusion method between electroencephalography (EEG) and electrooculography (EOG) signals for emotion recognition. Before the feature extraction stage, we apply angle-amplitude transformations to the EEG–EOG signals; these transformations take arbitrary time-domain signals and convert them into two-dimensional images called Angle-Amplitude Graphs (AAGs). We then extract image-based features with the scale-invariant feature transform (SIFT), fuse the features originating from EEG and EOG, and finally classify them with support vector machines. To verify the validity of the proposed method, we performed experiments on the multimodal DEAP dataset, a benchmark widely used for emotion analysis with physiological signals, applying the proposed procedure to the arousal-valence dimensions. After fusion, we achieved 91.53% accuracy for the arousal space and 90.31% for the valence space. The results show that combining AAG image features of the EEG–EOG signals in the baseline angle-amplitude transformation approach enhances classification performance on the DEAP dataset.
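The abstract does not specify the exact angle-amplitude mapping, so the sketch below assumes one plausible reading: each sample is plotted in polar coordinates (angle from its position in the sequence, radius from its normalised amplitude) and rasterised into a small image. All names and the mapping itself are illustrative, not the paper's definition:

```python
import numpy as np

def angle_amplitude_graph(signal, size=32):
    """Rasterise a 1-D signal into a 2-D 'angle-amplitude' image.
    ASSUMPTION: angle = sample position mapped onto [0, 2*pi),
    amplitude = min-max normalised sample value; the paper's AAG
    transform may be defined differently."""
    sig = np.asarray(signal, float)
    amp = (sig - sig.min()) / (np.ptp(sig) + 1e-12)       # radius in [0, 1]
    ang = np.linspace(0.0, 2 * np.pi, len(sig), endpoint=False)
    # polar -> Cartesian pixel coordinates inside a size x size grid
    x = ((amp * np.cos(ang) + 1) / 2 * (size - 1)).round().astype(int)
    y = ((amp * np.sin(ang) + 1) / 2 * (size - 1)).round().astype(int)
    img = np.zeros((size, size))
    np.add.at(img, (y, x), 1.0)                           # accumulate hits
    return img

t = np.linspace(0, 1, 256)
aag = angle_amplitude_graph(np.sin(2 * np.pi * 5 * t))
print(aag.shape)  # (32, 32)
```

Once the signal is an image, standard image descriptors such as SIFT can be applied to it, which is the point of the transformation.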


Author(s):  
R. Ponnusamy ◽  
S. Sathiamoorthy ◽  
R. Visalakshi

Digital images captured by Wireless Capsule Endoscopy (WCE) of a patient's gastrointestinal (GI) tract are used to detect abnormalities. The large volume of WCE images means that reviewing one patient's GI tract for illnesses can take about two hours, which is highly time consuming and considerably increases healthcare costs. To address this problem, we propose a Visual Bag of Features (VBOF) method that incorporates the Scale-Invariant Feature Transform (SIFT), the Center-Symmetric Local Binary Pattern (CS-LBP) and the Auto Color Correlogram (ACC). This combination of features captures the interest-point, texture and color information in an image. The features computed for each image form a high-dimensional descriptor. The proposed feature descriptors are clustered by k-means into visual words, and a Support Vector Machine (SVM) is used to automatically classify multiple disease abnormalities in the GI tract. Finally, a post-processing scheme is applied to the classification results to validate the performance of multi-abnormality frame detection.
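Of the three descriptors, CS-LBP has a particularly compact definition: each interior pixel is coded by comparing the four centre-symmetric neighbour pairs of its 8-neighbourhood, giving 16 possible codes. A NumPy sketch of that texture channel (the full VBOF method also adds SIFT and ACC features):

```python
import numpy as np

def cs_lbp(img, threshold=0.0):
    """Center-Symmetric LBP: for each interior pixel, compare the four
    centre-symmetric neighbour pairs of the 8-neighbourhood; each
    comparison contributes one bit, giving codes in [0, 16)."""
    img = np.asarray(img, float)
    # the four centre-symmetric pairs (N vs S, NE vs SW, E vs W, SE vs NW)
    pairs = [
        (img[:-2, 1:-1], img[2:, 1:-1]),
        (img[:-2, 2:],   img[2:, :-2]),
        (img[1:-1, 2:],  img[1:-1, :-2]),
        (img[2:, 2:],    img[:-2, :-2]),
    ]
    code = np.zeros((img.shape[0] - 2, img.shape[1] - 2), dtype=int)
    for bit, (a, b) in enumerate(pairs):
        code |= ((a - b) > threshold).astype(int) << bit
    return code

def cs_lbp_histogram(img):
    """16-bin L1-normalised CS-LBP texture histogram."""
    codes = cs_lbp(img)
    hist = np.bincount(codes.ravel(), minlength=16).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(1)
h = cs_lbp_histogram(rng.random((64, 64)))
print(h.shape)  # (16,)
```

With only 16 bins (versus 256 for plain LBP), CS-LBP histograms stay compact, which matters when they are concatenated with SIFT and ACC features into one descriptor.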


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Huadong Sun ◽  
Xu Zhang ◽  
Xiaowei Han ◽  
Xuesong Jin ◽  
Zhijie Zhao

With the increasing scale of e-commerce, the complexity of image content makes commodity image classification face great challenges. Image feature extraction often determines the quality of the final classification results. At present, image feature extraction mainly covers underlying visual features and intermediate semantic features. The intermediate semantics of an image act as a bridge between its underlying features and its advanced semantics, which can narrow the semantic gap to a certain extent and is strongly robust. As a typical intermediate semantic representation method, the bag-of-visual-words (BoVW) model has received extensive attention in image classification. However, the traditional BoVW model loses the location information of local features, and its local feature descriptors mainly capture the texture and shape information of local regions while lacking any expression of color information. Therefore, this paper presents an improved bag-of-visual-words model with three improvements: (1) multiscale local region extraction; (2) local feature description by speeded-up robust features (SURF) and a color vector angle histogram (CVAH); and (3) a diagonal concentric rectangular pattern. Experimental results show that the three improvements to the BoVW model are complementary; compared with the traditional BoVW and the BoVW adopting SURF + SPM, the classification accuracy of the improved BoVW increases by 3.60% and 2.33%, respectively.
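Improvement (2) adds colour information via colour vector angles. One common formulation measures the angle between the RGB vectors of adjacent pixels; the sketch below bins those angles into a histogram (the paper's CVAH may bin and normalise differently):

```python
import numpy as np

def color_vector_angle_hist(img, bins=16):
    """Histogram of angles between the RGB vectors of horizontally
    adjacent pixels -- a sketch of the colour-vector-angle idea, not
    the paper's exact CVAH descriptor."""
    img = np.asarray(img, float)
    a = img[:, :-1, :].reshape(-1, 3)          # left pixel of each pair
    b = img[:, 1:, :].reshape(-1, 3)           # right pixel of each pair
    dot = (a * b).sum(1)
    norm = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    ang = np.arccos(np.clip(dot / norm, -1.0, 1.0))   # radians in [0, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, np.pi))
    return hist / hist.sum()

rng = np.random.default_rng(2)
cvah = color_vector_angle_hist(rng.random((32, 32, 3)))
print(cvah.shape)  # (16,)
```

The angle between colour vectors is insensitive to uniform intensity scaling, which is why it complements the intensity-based SURF description.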


2019 ◽  
Vol 1 (3) ◽  
pp. 871-882 ◽  
Author(s):  
Mario Manzo ◽  
Simone Pellino

In recent years, researchers have worked to understand image content in computer vision. In particular, the bag of visual words (BoVW) model, which describes images in terms of a frequency histogram of visual words, is the most widely adopted paradigm. Its main drawback is the lack of information about the locations of features and the relationships between them. For this purpose, we propose a new paradigm called bag of ARSRG (attributed relational SIFT (scale-invariant feature transform) regions graph) words (BoAW). A digital image is described as a vector in terms of a frequency histogram of graphs: through a sequence of steps, images are mapped into a vector space via a graph transformation. BoAW is evaluated in an image classification context on standard datasets, and its effectiveness is demonstrated through experimental comparison with well-known competitors.
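The first step of such a pipeline is turning a set of local feature regions into a graph. A simplified stand-in (spatial-proximity adjacency only; the paper's ARSRG also attaches SIFT descriptors as node attributes and uses richer relations):

```python
import numpy as np

def region_graph(keypoints, radius):
    """Build an adjacency matrix over local-feature regions: two
    regions are linked when their keypoints lie within `radius` of
    each other. A simplified stand-in for an attributed relational
    graph over SIFT regions."""
    pts = np.asarray(keypoints, float)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    adj = (d <= radius) & (d > 0)              # no self-loops
    return adj.astype(int)

pts = [(0, 0), (3, 0), (10, 0)]
A = region_graph(pts, radius=5.0)
print(A)
```

In the full BoAW scheme, such graphs (rather than raw descriptors) are quantised into a codebook, so the final histogram counts recurring local structures instead of isolated features.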


2020 ◽  
Author(s):  
Harith Al-Sahaf ◽  
A Song ◽  
K Neshatian ◽  
Mengjie Zhang

Image classification is a complex but important task, especially in areas of machine vision and image analysis such as remote sensing and face recognition. One challenge in image classification is finding an optimal set of features for a particular task, because the choice of features has a direct impact on classification performance. However, the goodness of a feature is highly problem-dependent, and domain knowledge is often required. To address these issues, we introduce a Genetic Programming (GP) based image classification method, Two-Tier GP, which operates directly on raw pixels rather than on features. The first tier in a classifier automatically defines features from the raw image input, while the second tier makes the decision. Compared with conventional feature-based image classification methods, Two-Tier GP achieved better accuracies on a range of tasks. Furthermore, by using the features defined by the first tier of these Two-Tier GP classifiers, conventional classification methods obtained higher accuracies than when classifying on manually designed features. Analysis of evolved Two-Tier image classifiers shows that genuine features are captured in the programs, and the mechanism by which high accuracy is achieved can be revealed. The Two-Tier GP method has clear advantages in image classification, such as high accuracy, good interpretability and the removal of the explicit feature extraction process. © 2012 IEEE.
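The two-tier structure can be illustrated with hand-written stand-ins for what GP would evolve: a feature-construction tier over raw pixels feeding a decision tier. A toy sketch (the regions, operators and threshold below are illustrative, not the evolved programs):

```python
import numpy as np

def tier_one_features(img):
    """Feature-construction tier: statistics over fixed pixel regions.
    An evolved program would choose the regions and operators itself;
    these are hand-picked stand-ins."""
    top, bottom = img[: img.shape[0] // 2], img[img.shape[0] // 2 :]
    return np.array([top.mean(), bottom.mean(), img.std()])

def tier_two_decide(features):
    """Decision tier: a simple comparator over tier-one outputs
    (an evolved decision tree would replace this)."""
    return int(features[0] - features[1] > 0.1)

img = np.zeros((8, 8)); img[:4] = 1.0   # toy image with a bright top half
print(tier_two_decide(tier_one_features(img)))  # 1
```

The key point is that the tier boundary is explicit: tier-one outputs can be handed to any conventional classifier, which is how the paper's cross-over experiments work.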


2019 ◽  
Vol 8 (2) ◽  
pp. 6053-6057

Telugu is one of the most widely spoken Indian languages in the world. Since it has an old heritage, Telugu literature and newspaper publications can be scanned to identify individual words. Identification of Telugu word images poses serious problems owing to the script's complex structure and large set of individual characters. This paper develops a novel methodology for this task using SIFT (Scale-Invariant Feature Transform) features of Telugu words and classifying these features with a BoVW (bag of visual words) model. The features are clustered with k-means to create a dictionary; these visual words form a codebook for the word images, and classification is performed with an SVM (Support Vector Machine).
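The dictionary-building step described here is ordinary k-means over local descriptors. A minimal Lloyd's-algorithm sketch with random stand-ins for the SIFT descriptors (a real system would use a library implementation and far more data):

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Minimal Lloyd's k-means: cluster local descriptors into a
    k-word visual dictionary by alternating nearest-centre assignment
    and centre re-estimation."""
    rng = np.random.default_rng(seed)
    centres = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign every descriptor to its nearest centre
        d2 = ((descriptors[:, None] - centres[None]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # move each centre to the mean of its members
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centres[j] = members.mean(0)
    return centres

rng = np.random.default_rng(3)
desc = rng.normal(size=(200, 128))   # stand-ins for 128-D SIFT descriptors
codebook = build_codebook(desc, k=8)
print(codebook.shape)  # (8, 128)
```

Each word image is then encoded as a histogram of its descriptors' nearest codewords, and those histograms are what the SVM is trained on.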


2011 ◽  
Author(s):  
Jie Feng ◽  
L. C. Jiao ◽  
Xiangrong Zhang ◽  
Ruican Niu

2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Mengxi Xu ◽  
Yingshu Lu ◽  
Xiaobin Wu

Conventional image classification models commonly adopt a single feature vector to represent image content. However, a single-feature system can hardly extract all of the information contained in images, and traditional encoding methods lose much feature information. To solve this problem, this paper proposes a feature fusion-based image classification model. The model combines the principal component analysis (PCA) algorithm, processed scale-invariant feature transform (P-SIFT) features and color naming (CN) features to generate mutually independent image representation factors. At the encoding stage of the scale-invariant feature transform (SIFT) feature, the bag-of-visual-words (BoVW) model is used for feature reconstruction. To introduce spatial information into the extracted features, a rotation-invariant spatial pyramid mapping method is used for P-SIFT and CN feature division and representation. At the feature fusion stage, we adopt the support vector machine with two kernels (SVM-2K) algorithm, which divides the training process into two stages and learns from the corresponding kernel matrices to improve classification performance. Experiments show that the proposed method effectively improves the accuracy of image description and the precision of image classification.
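The fusion stage can be approximated, at its simplest, by reducing each feature view with PCA and concatenating the results; the paper's SVM-2K learns the combination instead of concatenating. A sketch with random stand-ins for the P-SIFT and CN feature matrices:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components,
    computed via SVD of the mean-centred data."""
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def fuse(shape_feats, color_feats, n_components=8):
    """Decorrelate each feature view with PCA, then concatenate.
    NOTE: plain concatenation is only the simplest baseline; the
    paper learns the fusion with SVM-2K's two-kernel training."""
    return np.hstack([pca_reduce(shape_feats, n_components),
                      pca_reduce(color_feats, n_components)])

rng = np.random.default_rng(4)
# rows = images; columns = toy shape-feature and colour-feature dims
fused = fuse(rng.normal(size=(30, 128)), rng.normal(size=(30, 11)))
print(fused.shape)  # (30, 16)
```

Reducing each view before fusion keeps the two feature families from swamping each other when their raw dimensionalities differ, which is the motivation for the "mutually independent representation factors" above.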

