Sparse Based Image Classification With Bag-of-Visual-Words Representations

Author(s):  
Yuanyuan Zuo ◽  
Bo Zhang

The sparse-representation-based classification algorithm has been used for human face recognition, but the image databases in that work are restricted to frontal faces with only slight illumination and expression changes. This paper applies the algorithm to generic image classification, where images show a certain degree of intra-class variation and background clutter. Experiments are conducted with the sparse-representation-based algorithm and Support Vector Machine (SVM) classifiers on 25 object categories selected from the Caltech101 dataset. The results show that, without time-consuming parameter optimization, the sparse-representation-based algorithm achieves performance comparable to SVM. The experiments also demonstrate that, with bag-of-visual-words representations, the algorithm is robust to a certain degree of background clutter and intra-class variation. The sparse-representation-based algorithm can therefore also be applied to generic image classification tasks when an appropriate image feature is used.
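
The classification idea in the abstract above can be illustrated with a minimal sketch. It assumes the common formulation of sparse-representation-based classification: a query feature vector is coded as a sparse combination of all training vectors, and the predicted class is the one whose training vectors reconstruct the query with the smallest residual. The solver (ISTA), the penalty `lam`, the iteration count, and all function names are illustrative assumptions, not details taken from the paper.

```python
import math

def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal step for the L1 penalty)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(v * v for v in vec))
    return [v / n for v in vec] if n > 0 else list(vec)

def reconstruct(atoms, coeffs):
    """Return sum_j coeffs[j] * atoms[j]."""
    dim = len(atoms[0])
    return [sum(c * a[i] for a, c in zip(atoms, coeffs)) for i in range(dim)]

def sparse_code(atoms, y, lam=0.01, n_iter=500):
    """Approximately minimise 0.5*||y - sum_j x_j a_j||^2 + lam*||x||_1 (ISTA)."""
    # 1 / (squared Frobenius norm) never exceeds 1 / Lipschitz constant,
    # so it is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for a in atoms for v in a)
    x = [0.0] * len(atoms)
    for _ in range(n_iter):
        recon = reconstruct(atoms, x)
        resid = [r - t for r, t in zip(recon, y)]
        grad = [sum(ai * ri for ai, ri in zip(a, resid)) for a in atoms]
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, grad)]
    return x

def src_predict(train_vectors, train_labels, query, lam=0.01):
    """Predict the label whose training vectors best reconstruct the query."""
    atoms = [normalize(v) for v in train_vectors]
    y = normalize(query)
    x = sparse_code(atoms, y, lam)
    best_label, best_resid = None, float("inf")
    for label in sorted(set(train_labels)):
        # Keep only the coefficients that belong to this class.
        masked = [c if l == label else 0.0 for c, l in zip(x, train_labels)]
        recon = reconstruct(atoms, masked)
        resid = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, recon)))
        if resid < best_resid:
            best_label, best_resid = label, resid
    return best_label

# Toy 4-bin "visual word" histograms (hypothetical data).
train = [[5, 1, 0, 0], [4, 2, 0, 0], [0, 0, 3, 4], [0, 1, 4, 5]]
labels = ["a", "a", "b", "b"]
print(src_predict(train, labels, [6, 1, 0, 1]))
```

Unlike an SVM, this sketch has no training stage and only one penalty parameter, which is in the spirit of the abstract's remark about avoiding parameter optimization; real feature vectors would be bag-of-visual-words histograms rather than the toy lists used here.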


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Huadong Sun ◽  
Xu Zhang ◽  
Xiaowei Han ◽  
Xuesong Jin ◽  
Zhijie Zhao

With the increasing scale of e-commerce, the complexity of image content makes commodity image classification very challenging. Image feature extraction often determines the quality of the final classification results. At present, image feature extraction mainly covers low-level visual features and intermediate semantic features. The intermediate semantics of an image act as a bridge between its low-level features and its high-level semantics, which can narrow the semantic gap to a certain extent and is strongly robust. As a typical intermediate semantic representation, the bag-of-visual-words (BoVW) model has received extensive attention in image classification. However, the traditional BoVW model loses the location information of local features, and its local feature descriptors mainly capture the texture and shape of local regions but do not express color information. Therefore, this paper presents an improved bag-of-visual-words model with three improvements: (1) multiscale local region extraction; (2) local feature description by speeded up robust features (SURF) and the color vector angle histogram (CVAH); and (3) a diagonal concentric rectangular pattern. Experimental results show that the three improvements to the BoVW model are complementary and that, compared with the traditional BoVW and the BoVW adopting SURF + SPM, the improved BoVW increases classification accuracy by 3.60% and 2.33%, respectively.
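
The loss of location information mentioned above can be seen in a small sketch. It compares a plain BoVW histogram with a histogram pooled over a regular grid of image cells. The regular grid is a generic stand-in chosen for illustration; it is not the diagonal concentric rectangular pattern proposed in the paper, and the function names and toy keypoints are assumptions.

```python
def global_histogram(points, vocab_size):
    """Count visual words over the whole image, ignoring where they occur."""
    hist = [0] * vocab_size
    for _x, _y, word in points:
        hist[word] += 1
    return hist

def grid_histogram(points, vocab_size, width, height, cells=2):
    """Count visual words separately in each cell of a cells x cells grid."""
    hist = [0] * (cells * cells * vocab_size)
    for x, y, word in points:
        # Clamp so that points on the right or bottom edge fall in the last cell.
        col = min(int(x * cells / width), cells - 1)
        row = min(int(y * cells / height), cells - 1)
        cell = row * cells + col
        hist[cell * vocab_size + word] += 1
    return hist

# Each keypoint is (x, y, visual_word_index); both toy images are 100 x 100.
image_a = [(10, 10, 0), (90, 90, 1)]
image_b = [(90, 90, 0), (10, 10, 1)]

# Same words, swapped positions: the global histograms are identical.
print(global_histogram(image_a, 2) == global_histogram(image_b, 2))
# The grid-pooled histograms differ, so the layout is preserved.
print(grid_histogram(image_a, 2, 100, 100) == grid_histogram(image_b, 2, 100, 100))
```

The price of keeping the layout is a longer feature vector: the grid version has `cells * cells` times as many bins as the global histogram.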


2014 ◽  
Vol 678 ◽  
pp. 189-192
Author(s):  
Jun Li ◽  
Yuan Jiang Liao ◽  
Hong Mei Zhang

We propose a pedestrian detection approach based on bag-of-visual-words and an SVM. Image feature extraction and representation are extremely challenging tasks in pedestrian detection and can affect detection performance. In this paper, the visual vocabulary is built by clustering the SIFT features of images into visual words. Classification is performed with a support vector machine (SVM), because SVMs offer good non-linear function learning and solid generalization capability. Numerical experiments on the INRIATREC pedestrian data sets and on action movies demonstrate that our method shows better performance.


2011 ◽  
Author(s):  
Jie Feng ◽  
L. C. Jiao ◽  
Xiangrong Zhang ◽  
Ruican Niu

2017 ◽  
Vol 31 (2) ◽  
pp. 310-319
Author(s):  
Anton Ustyuzhanin ◽  
Karl-Heinz Dammer ◽  
Antje Giebel ◽  
Cornelia Weltzien ◽  
Michael Schirrmann

Common ragweed is a plant species that causes allergic and asthmatic symptoms in humans. To control its propagation, an early identification system is needed. However, because it looks similar to mugwort, proper differentiation between these two weed species is important. Therefore, we propose a method to discriminate common ragweed and mugwort leaves based on digital images using bag of visual words (BoVW). BoVW is an object-based image classification approach that has gained acceptance in many areas of science. We compared speeded-up robust features (SURF) and grid sampling for keypoint selection. The image vocabulary was built using K-means clustering. The image classifier was trained using support vector machines. To check the robustness of the classifier, specific model runs were conducted with and without damaged leaves in the training dataset. The results showed that the BoVW model allows discrimination between common ragweed and mugwort leaves with high accuracy. Based on SURF keypoints with 50% of 788 images in total as training data, we achieved 100% correct recognition of the two plant species. Grid sampling resulted in slightly lower recognition accuracy (98 to 99%). In addition, the classification based on SURF was up to 31 times faster.
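
The vocabulary-building step described above can be sketched as follows. This is a minimal pure-Python illustration, assuming plain Lloyd-style K-means and toy 2-D points in place of real SURF descriptors; the function names, the seed, and the iteration count are hypothetical, and the support vector machine stage is omitted.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centers, point):
    """Index of the visual word (cluster center) closest to a descriptor."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], point))

def kmeans(points, k, n_iter=20, seed=0):
    """Build a k-word vocabulary with Lloyd's algorithm."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centers, p)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(c) / len(members) for c in zip(*members)]
    return centers

def bovw_histogram(centers, descriptors):
    """Normalized count of how often each visual word occurs in one image."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(centers, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy descriptors pooled from all training images (hypothetical data).
pooled = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
vocabulary = kmeans(pooled, k=2)
print(bovw_histogram(vocabulary, [(0.5, 0.5), (10.5, 10.5), (9.5, 10.0)]))
```

Each image is then represented by one such histogram, and those fixed-length vectors are what a classifier such as an SVM would be trained on.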


2021 ◽  
Vol 24 (2) ◽  
pp. 78-86
Author(s):  
Zainab N. Sultani ◽  
Ban N. Dhannoon

Image classification is acknowledged as one of the most critical and challenging tasks in computer vision. The bag of visual words (BoVW) model has proven to be very efficient for image classification tasks since it can effectively represent distinctive image features in vector space. In this paper, BoVW using Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) descriptors is adapted for image classification. We propose a novel image classification system using local feature information obtained from both SIFT and ORB descriptors. As a result, the constructed SO-BoVW model presents highly discriminative features, enhancing the classification performance. Experiments on Caltech-101 and a flowers dataset demonstrate the effectiveness of the proposed method.


Author(s):  
M. Xue ◽  
B. Wei ◽  
L. Yang

Abstract. The SegNet model is an improved version of the Fully Convolutional Network (FCN). Its encoder, i.e. the image feature extractor, is still a convolutional neural network (CNN). Most traditional CNN training uses the error back-propagation (BP) algorithm, which converges slowly and easily falls into local optima. Taking SegNet as the research object, this paper proposes a method that uses a genetic algorithm (GA) to extract part of the weights and select features of the SegNet model, in order to alleviate SegNet's tendency to fall into local optima. During training, the weights of the convolution layers that SegNet uses to extract features are optimized through the selection, crossover and mutation operations of the genetic algorithm, which yields the improved SegNet semantic model (GA-SegNet). To verify the image classification performance of the proposed GA-SegNet model, experiments are run on the same high-resolution remote sensing image data, and the model is compared with maximum likelihood (ML), support vector machine (SVM), a traditional CNN and the SegNet semantic model without the GA improvement. The experimental results show that the proposed GA-SegNet model achieves the best classification accuracy, and that the GA overcomes, to a certain extent, the premature convergence of BP stochastic gradient descent and improves the classification performance of the SegNet semantic model.
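
The selection, crossover and mutation steps named in the abstract above can be sketched on a toy problem. The example below evolves a short weight vector toward a known target; in the paper the individuals would be convolution-layer weights and the fitness would come from the network, neither of which is specified here. Tournament selection, single-point crossover, Gaussian mutation, elitism, and every parameter value are illustrative assumptions.

```python
import random

def fitness(weights, target):
    """Toy fitness, higher is better: negative squared error to a target."""
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

def tournament(population, scores, rng, size=3):
    """Selection: pick the fittest of a few randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: scores[i])]

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(weights, rng, rate=0.1, scale=0.1):
    """Mutation: add small Gaussian noise to a random subset of weights."""
    return [w + rng.gauss(0.0, scale) if rng.random() < rate else w
            for w in weights]

def evolve(target, pop_size=30, generations=60, seed=0):
    """Run the GA and return the best weight vector found."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in target] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind, target) for ind in population]
        elite = population[max(range(pop_size), key=lambda i: scores[i])]
        children = [elite]  # elitism (an assumption): keep the best unchanged
        while len(children) < pop_size:
            a = tournament(population, scores, rng)
            b = tournament(population, scores, rng)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return max(population, key=lambda ind: fitness(ind, target))

target = [0.5, -0.2, 0.8, 0.1]  # hypothetical "ideal" weights
best = evolve(target)
print(fitness(best, target))
```

Because the search works on a whole population and does not follow gradients, it is less tied to a single starting point than gradient descent, which is the motivation the abstract gives for combining a GA with BP training.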

