Robotic hand grasping of objects classified by using support vector machine and bag of visual words

Author(s):  
Mehmet Celalettin Ergene ◽  
Akif Durdu


JURTEKSI ◽
2019 ◽  
Vol 5 (2) ◽  
pp. 153-160
Author(s):  
Mahardika Abdi Prawira Tanjung

Abstract: The human eye can distinguish objects in digital images, but computers lack the ability to do so directly. The bag of visual words method was created to address this. Bag of visual words is a method for representing digital images based on local features; it describes how an image's characteristics can be extracted so that a computer can distinguish objects in digital images. The test results show that bag of visual words is still not optimal for classifying digital image categories, especially the chair category, for which the best accuracy achieved is only 75%. To improve the classification performance of bag of visual words, especially for the chair category, an approach for determining a good number of clusters K when clustering the visual word patterns could be added.

Keywords: Bag of Visual Words, Classification, Digital Image, Speeded-Up Robust Features, Support Vector Machine
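The pipeline this abstract describes — cluster local descriptors into K visual words, then encode each image as a histogram over those words — can be sketched in plain Python. The 2-D "descriptors" below are toy stand-ins for real 64-D SURF vectors, and the whole sketch is illustrative, not the paper's implementation:

```python
import random
from math import dist

# Toy stand-ins for local feature descriptors (real SURF descriptors
# would be 64-D; 2-D points keep the sketch readable).
random.seed(0)
descriptors = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(30)] + \
              [(random.gauss(5, 1), random.gauss(5, 1)) for _ in range(30)]

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means: the centroid list is the 'visual vocabulary'."""
    centroids = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centroids[i]))].append(p)
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def bovw_histogram(image_descriptors, vocabulary):
    """Quantise each descriptor to its nearest visual word and count."""
    hist = [0] * len(vocabulary)
    for d in image_descriptors:
        hist[min(range(len(vocabulary)), key=lambda i: dist(d, vocabulary[i]))] += 1
    # L1-normalise so images with different descriptor counts are comparable
    total = sum(hist) or 1
    return [h / total for h in hist]

vocab = kmeans(descriptors, k=2)
hist = bovw_histogram(descriptors[:30], vocab)  # descriptors of one "image"
print(hist)
```

The normalised histograms would then be fed to an SVM; the abstract's suggestion of tuning K corresponds to the `k` parameter of `kmeans` above.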


2017 ◽  
Vol 31 (2) ◽  
pp. 310-319 ◽  
Author(s):  
Anton Ustyuzhanin ◽  
Karl-Heinz Dammer ◽  
Antje Giebel ◽  
Cornelia Weltzien ◽  
Michael Schirrmann

Common ragweed is a plant species causing allergic and asthmatic symptoms in humans. To control its propagation, an early identification system is needed. However, due to its similar appearance to mugwort, proper differentiation between these two weed species is important. Therefore, we propose a method to discriminate common ragweed and mugwort leaves in digital images using bag of visual words (BoVW). BoVW is an object-based image classification approach that has gained acceptance in many areas of science. We compared speeded-up robust features (SURF) and grid sampling for keypoint selection. The image vocabulary was built using K-means clustering. The image classifier was trained using support vector machines. To check the robustness of the classifier, specific model runs were conducted with and without damaged leaves in the training dataset. The results showed that the BoVW model allows the discrimination between common ragweed and mugwort leaves with high accuracy. Based on SURF keypoints, with 50% of the 788 images in total as training data, we achieved 100% correct recognition of the two plant species. Grid sampling resulted in slightly lower recognition accuracy (98 to 99%). In addition, classification based on SURF was up to 31 times faster.
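The two keypoint-selection strategies the study compares can be illustrated with minimal stand-ins. Grid sampling simply places keypoints on a regular lattice, while a SURF-style detector keeps only locations where a response measure is locally maximal; the response map and threshold below are toy assumptions, not the actual SURF detector:

```python
def grid_keypoints(width, height, step):
    """Dense grid sampling: one keypoint every `step` pixels, no detector."""
    return [(x, y) for y in range(step // 2, height, step)
                   for x in range(step // 2, width, step)]

def local_maxima(response, threshold):
    """Interest-point style selection: keep pixels whose response exceeds
    the threshold and dominates their 8-neighbourhood (a stand-in for the
    Hessian-based response SURF uses)."""
    h, w = len(response), len(response[0])
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = response[y][x]
            if v > threshold and all(
                v >= response[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
            ):
                points.append((x, y))
    return points

print(len(grid_keypoints(64, 48, 8)))  # 48 grid keypoints (8 cols x 6 rows)
```

Grid sampling guarantees coverage even on texture-poor leaf regions, while detector-based selection concentrates descriptors on distinctive structure, which is consistent with the speed and accuracy trade-off the study reports.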


Technologies ◽  
2019 ◽  
Vol 7 (1) ◽  
pp. 20 ◽  
Author(s):  
Evaggelos Spyrou ◽  
Rozalia Nikopoulou ◽  
Ioannis Vernikos ◽  
Phivos Mylonas

It is noteworthy that monitoring and understanding a human’s emotional state plays a key role in current and forthcoming computational technologies. At the same time, this monitoring and analysis should be as unobtrusive as possible, since the digital world has been smoothly adopted into everyday life activities. In this framework, and within the domain of assessing humans’ affective state during their educational training, the most popular approach is to use sensory equipment that allows observation without any kind of direct contact. Thus, in this work, we focus on human emotion recognition from audio stimuli (i.e., human speech) using a novel approach based on a computer vision inspired methodology, namely the bag-of-visual-words method, applied to spectrograms of audio segments. The spectrogram is treated as a visual representation of the audio segment and may be analyzed by exploiting well-known traditional computer vision techniques, such as construction of a visual vocabulary, extraction of speeded-up robust features (SURF), quantization into a set of visual words, and image histogram construction. As a last step, support vector machine (SVM) classifiers are trained on this information. Finally, to further generalize the proposed approach, we utilize publicly available datasets in several human languages to perform cross-language experiments, on both acted and real-life recordings.
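The first step of this approach — turning an audio segment into a spectrogram "image" that the BoVW pipeline can consume — can be sketched with a naive stdlib DFT. The window length, hop size, and the synthetic test tone are illustrative choices, not the paper's parameters:

```python
import math, cmath

def spectrogram(signal, win=64, hop=32):
    """Magnitude spectrogram: DFT of overlapping Hann-windowed frames.
    The resulting 2-D array (time x frequency) is what gets treated as
    an image for visual feature extraction."""
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = [signal[start + n] * (0.5 - 0.5 * math.cos(2 * math.pi * n / win))
                 for n in range(win)]
        # Naive DFT, keeping the non-redundant half of the spectrum
        spectrum = [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / win)
                            for n in range(win)))
                    for k in range(win // 2)]
        frames.append(spectrum)
    return frames

# A 440 Hz test tone sampled at 8 kHz stands in for a speech segment
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(512)]
spec = spectrogram(tone)
print(len(spec), len(spec[0]))
```

From here the pipeline proceeds exactly as for ordinary photographs: SURF descriptors are extracted from the spectrogram, quantized against a visual vocabulary, and histogrammed for the SVM.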


2018 ◽  
Vol 10 (10) ◽  
pp. 1530 ◽  
Author(s):  
Michael Pflanz ◽  
Henning Nordmeyer ◽  
Michael Schirrmann

Weed detection from aerial images is a great challenge for generating field maps for site-specific plant protection. The requirements might be met with low-altitude flights of unmanned aerial vehicles (UAV), which provide ground resolutions adequate for differentiating even single weeds accurately. The following study proposed and tested an image classifier based on a Bag of Visual Words (BoVW) framework for mapping weed species, using a small unmanned aircraft system (UAS) with a commercial camera on board at low flying altitudes. The image classifier was trained with support vector machines after building a visual dictionary of local features from many collected UAS images. A window-based processing of the models was used for mapping the weed occurrences in the UAS imagery. The UAS flight campaign was carried out over a weed-infested wheat field, and images were acquired at flight altitudes between 1 and 6 m. From the UAS images, 25,452 weed plants were annotated at species level, along with wheat and soil as background classes, for training and validation of the models. The results showed that the BoVW model allowed the discrimination of single plants with high accuracy for Matricaria recutita L. (88.60%), Papaver rhoeas L. (89.08%), Viola arvensis M. (87.93%), and winter wheat (94.09%) within the generated maps. Regarding site-specific weed control, the classified UAS images would enable the selection of the right herbicide based on the distribution of the predicted weed species.
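The window-based mapping step — sliding a classification window across each UAS image and recording a prediction per position — can be sketched as follows. The tiny scene and the majority-vote stand-in for the trained BoVW + SVM classifier are purely illustrative:

```python
def classify_windows(image, win, stride, classify):
    """Slide a window over the image and record the predicted class per
    position, producing a coarse label map of the scene."""
    h, w = len(image), len(image[0])
    label_map = []
    for y in range(0, h - win + 1, stride):
        row = []
        for x in range(0, w - win + 1, stride):
            patch = [r[x:x + win] for r in image[y:y + win]]
            row.append(classify(patch))
        label_map.append(row)
    return label_map

# Toy scene: 0 = soil, 1 = "weed" pixels in the lower-right quadrant
scene = [[1 if (x >= 4 and y >= 4) else 0 for x in range(8)] for y in range(8)]

# Hypothetical classifier: a majority vote over patch pixels stands in
# for the real BoVW histogram + SVM prediction on each window
majority = lambda patch: int(sum(map(sum, patch)) > sum(map(len, patch)) / 2)

print(classify_windows(scene, 4, 4, majority))  # [[0, 0], [0, 1]]
```

In the study, each window's prediction is a weed species (or wheat/soil background), so the label map directly becomes the species distribution map used for herbicide selection.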


Author(s):  
Yuanyuan Zuo ◽  
Bo Zhang

The sparse representation based classification algorithm has been used to solve the problem of human face recognition, but the image databases used were restricted to frontal faces with only slight illumination and expression changes. This paper applies the sparse representation based algorithm to the problem of generic image classification, with a certain degree of intra-class variation and background clutter. Experiments are conducted with the sparse representation based algorithm and Support Vector Machine (SVM) classifiers on 25 object categories selected from the Caltech101 dataset. Experimental results show that, without time-consuming parameter optimization, the sparse representation based algorithm achieves performance comparable to SVM. The experiments also demonstrate that the algorithm is robust to a certain degree of background clutter and intra-class variation with bag-of-visual-words representations. The sparse representation based algorithm can thus be applied to generic image classification tasks when an appropriate image feature is used.
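The core idea of sparse representation based classification — explain the test sample as a sparse combination of training samples, then pick the class that contributes most — can be sketched with a greedy matching-pursuit stand-in for the usual l1 solver. The 3-D "features" and class names below are toy assumptions, not the paper's data:

```python
def norm(v): return sum(x * x for x in v) ** 0.5
def dot(a, b): return sum(x * y for x, y in zip(a, b))

def src_classify(test, train, labels, n_atoms=2):
    """Greedy matching pursuit as a stand-in for l1-minimisation: repeatedly
    pick the training sample (atom) that best explains the residual, then
    classify by which class accumulated the most coefficient weight."""
    atoms = [[x / (norm(v) or 1) for x in v] for v in train]
    residual = list(test)
    weight = {}
    for _ in range(n_atoms):
        i = max(range(len(atoms)), key=lambda j: abs(dot(residual, atoms[j])))
        c = dot(residual, atoms[i])
        residual = [r - c * a for r, a in zip(residual, atoms[i])]
        weight[labels[i]] = weight.get(labels[i], 0.0) + abs(c)
    return max(weight, key=weight.get)

# Hypothetical BoVW-style feature vectors for two object categories
train = [(1, 0, 0), (0.9, 0.1, 0), (0, 1, 0), (0, 0.9, 0.2)]
labels = ["face", "face", "car", "car"]
print(src_classify((0.95, 0.05, 0.0), train, labels))
```

Because the dictionary is just the training set itself, there is no per-class model to tune, which is the "no parameter optimization" property the abstract contrasts with SVM training.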


Sensors ◽  
2019 ◽  
Vol 19 (12) ◽  
pp. 2790 ◽  
Author(s):  
Saima Nazir ◽  
Muhammad Haroon Yousaf ◽  
Jean-Christophe Nebel ◽  
Sergio A. Velastin

Human action recognition (HAR) has emerged as a core research domain for video understanding and analysis, thus attracting many researchers. Although significant results have been achieved in simple scenarios, HAR is still a challenging task due to issues associated with view independence, occlusion and inter-class variation observed in realistic scenarios. In previous research efforts, the classical bag of visual words approach along with its variations has been widely used. In this paper, we propose a Dynamic Spatio-Temporal Bag of Expressions (D-STBoE) model for human action recognition without compromising the strengths of the classical bag of visual words approach. Expressions are formed based on the density of a spatio-temporal cube of a visual word. To handle inter-class variation, we use class-specific visual word representation for visual expression generation. In contrast to the Bag of Expressions (BoE) model, the formation of visual expressions is based on the density of spatio-temporal cubes built around each visual word, as constructing neighborhoods with a fixed number of neighbors could include non-relevant information, making a visual expression less discriminative in scenarios with occlusion and changing viewpoints. Thus, the proposed approach makes the model more robust to the occlusion and changing-viewpoint challenges present in realistic scenarios. Furthermore, we train a multi-class Support Vector Machine (SVM) for classifying bags of expressions into action classes. Comprehensive experiments on four publicly available datasets: KTH, UCF Sports, UCF11 and UCF50 show that the proposed model outperforms existing state-of-the-art human action recognition methods in terms of accuracy, achieving 99.21%, 98.60%, 96.94%, and 94.10%, respectively.
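The density idea behind expressions — counting a visual word's occurrences inside a spatio-temporal cube rather than taking a fixed number of nearest neighbours — reduces to a simple neighbourhood count. The hit coordinates and radius below are illustrative, not the paper's settings:

```python
def cube_density(occurrences, center, radius):
    """Density of a visual word inside a spatio-temporal cube: the number
    of occurrences (x, y, t) within `radius` of `center` on every axis."""
    cx, cy, ct = center
    return sum(1 for x, y, t in occurrences
               if abs(x - cx) <= radius
               and abs(y - cy) <= radius
               and abs(t - ct) <= radius)

# Hypothetical detected (x, y, frame) positions of one class-specific
# visual word across a video clip
word_hits = [(10, 10, 1), (11, 12, 2), (30, 5, 2), (12, 9, 3), (50, 50, 9)]

# A fixed-k neighbourhood around (10, 10, 1) would be forced to pull in
# distant, possibly irrelevant hits; the cube keeps only nearby ones
print(cube_density(word_hits, center=(10, 10, 1), radius=3))
```

Unlike a k-nearest-neighbour construction, a sparse region simply yields a low count instead of dragging in far-away, non-relevant occurrences, which is the robustness argument the abstract makes for occlusion and viewpoint change.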

