Image-Searching for Office Equipment Using Bag-of-Keypoints and AdaBoost

2011 ◽  
Vol 23 (6) ◽  
pp. 1080-1090 ◽  
Author(s):  
Seiji Aoyagi ◽  
Atsushi Kohama ◽  
Yuki Inaura ◽  
Masato Suzuki ◽  
...  

For an indoor mobile robot’s Simultaneous Localization And Mapping (SLAM), a method of processing only one monocular image (640×480 pixels) of the environment is proposed. This method imitates a human’s ability to grasp at a glance the overall situation of a room, i.e., its layout and any objects or obstacles in it. Specific object recognition of a desk viewed from several camera angles is dealt with as one example. The proposed method has the following steps. 1) The bag-of-keypoints method is applied to the image to detect the existence of the object in the input image. 2) If the existence of the object is verified, the angle of the object is further detected using the bag-of-keypoints method. 3) The candidates for the projection from template image to input image are obtained using Scale Invariant Feature Transform (SIFT) or edge information. Whether or not the projected area correctly corresponds to the object is checked using the AdaBoost classifier, based on various image features such as Haar-like features. Through these steps, the desk is eventually extracted with angle information if it exists in the image.
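The abstract names Haar-like features as one input to the AdaBoost verification step but does not describe how they are computed. As a minimal sketch of the general technique (not the authors' implementation), a two-rectangle Haar-like feature can be evaluated in constant time from an integral image. The function names and the toy 4×4 image below are hypothetical and chosen only for illustration.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half of a window (w is assumed to be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy image: bright left half, dark right half (a vertical edge).
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)
print(feature)
```

A boosted classifier such as AdaBoost would typically threshold many such feature values, each acting as a weak learner; that training step is omitted here.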

Data ◽  
2018 ◽  
Vol 3 (4) ◽  
pp. 52 ◽  
Author(s):  
Oleksii Gorokhovatskyi ◽  
Volodymyr Gorokhovatskyi ◽  
Olena Peredrii

In this paper, we investigate the properties of structural image recognition methods in the cluster space of characteristic features. Recognition based on key point descriptors such as SIFT (Scale-Invariant Feature Transform), SURF (Speeded Up Robust Features), ORB (Oriented FAST and Rotated BRIEF), etc., often relies on searching for corresponding descriptor values between an input image and all etalon images, which requires many operations and much time. Recognition using previously quantized (clustered) sets of descriptor features is described. Clustering is performed across the complete set of etalon image descriptors and followed by screening, which allows each etalon image to be represented in vector form as a distribution of clusters. With such representations, the number of computation and comparison procedures, which are the core of the recognition process, might be reduced tens of times. In return, the preprocessing stage takes additional time for clustering. The implementation of the proposed approach was tested on the Leeds Butterfly dataset. The dependence of recognition performance and processing time on the number of clusters was investigated. It was shown that recognition may be performed up to nine times faster, with only a moderate decrease in recognition quality, compared to searching for correspondences between all existing descriptors in the etalon images and the input image without quantization.
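The quantization idea described above can be illustrated with a small sketch: each descriptor is assigned to its nearest cluster center, and each image is then represented by a normalized histogram of cluster assignments. This is an assumption-laden illustration, not the authors' code. The cluster centers are taken as given (in practice they would come from a clustering step such as k-means over the etalon descriptors), the L1 histogram distance is a choice made here, and the 2-D "descriptors" stand in for real SIFT, SURF, or ORB vectors.

```python
import math


def nearest_cluster(descriptor, centers):
    """Index of the cluster center closest to the descriptor (Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for i, center in enumerate(centers):
        d = math.dist(descriptor, center)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


def cluster_histogram(descriptors, centers):
    """Normalized distribution of an image's descriptors over the clusters."""
    counts = [0] * len(centers)
    for descriptor in descriptors:
        counts[nearest_cluster(descriptor, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(centers)


def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(query_descriptors, etalon_histograms, centers):
    """Return the name of the etalon whose histogram is closest to the query."""
    q = cluster_histogram(query_descriptors, centers)
    return min(etalon_histograms,
               key=lambda name: l1_distance(q, etalon_histograms[name]))


# Hypothetical data: two cluster centers and two etalon images.
centers = [(0.0, 0.0), (10.0, 10.0)]
etalons = {
    "A": cluster_histogram([(0, 1), (1, 0), (9, 9)], centers),
    "B": cluster_histogram([(10, 9), (9, 10), (1, 1)], centers),
}
query = [(0.5, 0.5), (0.2, 0.1), (0.1, 0.3), (9.5, 9.5)]
label = classify(query, etalons, centers)
print(label)
```

The speed-up comes from comparing one short histogram per etalon image rather than matching every query descriptor against every etalon descriptor; the cost of computing the cluster centers is paid once during preprocessing.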


2015 ◽  
Vol 4 (3) ◽  
pp. 70-89
Author(s):  
Ramesh Chand Pandey ◽  
Sanjay Kumar Singh ◽  
K K Shukla

Copy-Move is one of the most common techniques for digital image tampering or forgery. Copy-Move in an image might be done to duplicate something or to hide an undesirable region. In cases where these images are used for important purposes, such as evidence in a court of law, it is important to verify their authenticity. In this paper the authors propose a novel method for single-region Copy-Move Forgery Detection (CMFD) using Speeded-Up Robust Features (SURF), Histogram of Oriented Gradients (HOG), Scale Invariant Feature Transform (SIFT), and hybrid features such as SURF-HOG and SIFT-HOG. SIFT and SURF image features are robust to transformations such as rotation, scaling, and translation, so they help detect Copy-Move regions more accurately than other image features. Further, the authors detect multiple-region Copy-Move forgery using SURF and SIFT image features. Experimental results demonstrate commendable performance of the proposed methods.
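Keypoint-based copy-move detection generally works by matching an image's descriptors against each other rather than against a second image. The sketch below shows that general idea only; it is not the method proposed in the paper. It assumes keypoints are given as hypothetical `(x, y, descriptor)` tuples, uses a nearest-neighbor ratio test, and requires matched points to be spatially separated. The `ratio` and `min_shift` values are illustrative assumptions.

```python
import math


def find_copy_move_pairs(keypoints, ratio=0.6, min_shift=10.0):
    """Return index pairs of keypoints that look like copies of each other.

    keypoints: list of (x, y, descriptor) tuples from a single image.
    """
    pairs = set()
    for i, (xi, yi, di) in enumerate(keypoints):
        # Rank every other keypoint by descriptor distance to keypoint i.
        ranked = sorted(
            (math.dist(di, dj), j)
            for j, (_, _, dj) in enumerate(keypoints)
            if j != i
        )
        if len(ranked) < 2:
            continue
        (best, j), (second, _) = ranked[0], ranked[1]
        xj, yj, _ = keypoints[j]
        # A copied region must sit at a different image location.
        far_apart = math.hypot(xi - xj, yi - yj) >= min_shift
        # Ratio test: the best match must clearly beat the runner-up.
        if far_apart and best < ratio * second:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


# Hypothetical keypoints: indices 2 and 3 are near-copies of 0 and 1,
# shifted 50 pixels to the right; index 4 is unrelated.
keypoints = [
    (10, 10, (1.0, 0.0, 0.0)),
    (12, 40, (0.0, 1.0, 0.0)),
    (60, 10, (1.0, 0.01, 0.0)),
    (62, 40, (0.0, 1.0, 0.01)),
    (30, 80, (0.3, 0.3, 0.9)),
]
pairs = find_copy_move_pairs(keypoints)
print(pairs)
```

Real detectors usually add a geometric consistency check (for example, requiring many pairs to share the same displacement or affine transform) before declaring a region forged.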


2021 ◽  
Vol 24 (2) ◽  
pp. 78-86
Author(s):  
Zainab N. Sultani ◽  
Ban N. Dhannoon ◽  

Image classification is acknowledged as one of the most critical and challenging tasks in computer vision. The bag of visual words (BoVW) model has proven to be very efficient for image classification tasks since it can effectively represent distinctive image features in vector space. In this paper, BoVW models using Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) descriptors are adapted for image classification. We propose a novel image classification system using local feature information obtained from both SIFT and ORB local feature descriptors. As a result, the constructed SO-BoVW model presents highly discriminative features, enhancing the classification performance. Experiments on the Caltech-101 and flowers datasets demonstrate the effectiveness of the proposed method.


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4922
Author(s):  
Like Cao ◽  
Jie Ling ◽  
Xiaohui Xiao

Noise appears in images captured by real cameras. This paper studies the influence of noise on monocular feature-based visual Simultaneous Localization and Mapping (SLAM). First, an open-source synthetic dataset with different noise levels is introduced. Then the images in the dataset are denoised using the Fast and Flexible Denoising convolutional neural Network (FFDNet). The matching performances of Scale Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF) and Oriented FAST and Rotated BRIEF (ORB), which are commonly used in feature-based SLAM, are compared. The results show that ORB has a higher correct matching rate than SIFT and SURF, and that the denoised images have a higher correct matching rate than the noisy images. Next, the Absolute Trajectory Error (ATE) of noisy and denoised sequences is evaluated on ORB-SLAM2, and the results show that the denoised sequences perform better than the noisy sequences at any noise level. Finally, the completely clean sequence in the dataset and the sequences in the KITTI dataset are denoised and compared with the original sequences through comprehensive experiments. For the clean sequence, the Root-Mean-Square Error (RMSE) of ATE after denoising decreased by 16.75%; for the KITTI sequences, 7 out of 10 have a lower RMSE than the original sequences. The results show that denoised images can achieve higher accuracy in monocular feature-based visual SLAM under certain conditions.
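The RMSE of the Absolute Trajectory Error reported above is, in its usual form, the root of the mean squared distance between corresponding estimated and ground-truth positions. The sketch below computes that quantity under a strong assumption: the two trajectories are already time-associated and aligned (for monocular SLAM this normally includes scale), a step that is omitted here. The function names and the toy trajectories are hypothetical and are not data from the paper.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of position errors between two aligned, equal-length trajectories."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_change_percent(before, after):
    """Signed percentage change from `before` to `after` (negative = decrease)."""
    return 100.0 * (after - before) / before


# Toy 3-D trajectories (metres); values are illustrative only.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.0), (2.0, 0.0, 0.4)]

rmse = ate_rmse(estimated, ground_truth)
print(round(rmse, 4))
```

Comparing the RMSE of a denoised sequence with that of the original sequence via `relative_change_percent` gives the kind of percentage figure quoted in the abstract.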


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Bahar Hatipoglu Yilmaz ◽  
Cemal Kose

Emotion is one of the most complex and difficult expressions to predict. Nowadays, many recognition systems that use classification methods focus on different types of emotion recognition problems. In this paper, we propose a multimodal fusion method between electroencephalography (EEG) and electrooculography (EOG) signals for emotion recognition. Before the feature extraction stage, we applied different angle-amplitude transformations to the EEG–EOG signals. These transformations take arbitrary time domain signals and convert them into two-dimensional images called Angle-Amplitude Graphs (AAG). Then, we extracted image-based features using a scale invariant feature transform method, fused the features originating from the EEG and EOG signals, and lastly classified them with support vector machines. To verify the validity of the proposed methods, we performed experiments on the multimodal DEAP dataset, a benchmark dataset widely used for emotion analysis with physiological signals. In the experiments, we applied the proposed emotion recognition procedures to the arousal-valence dimensions. We achieved 91.53% accuracy for the arousal space and 90.31% for the valence space after fusion. Experimental results showed that combining the AAG image features of the EEG–EOG signals in the baseline angle-amplitude transformation approaches enhanced the classification performance on the DEAP dataset.


Author(s):  
R. Ponnusamy ◽  
S. Sathiamoorthy ◽  
R. Visalakshi

Digital images of the patient's gastrointestinal (GI) tract captured by Wireless Capsule Endoscopy (WCE) are used to detect abnormalities. Because of the large amount of information in WCE images, reviewing them for GI tract illnesses can take 2 hours per patient. This is highly time consuming and increases healthcare costs considerably. To overcome this problem, we propose a novel Visual Bag of Features (VBOF) method that incorporates Scale Invariant Feature Transform (SIFT), Center-Symmetric Local Binary Pattern (CS-LBP) and Auto Color Correlogram (ACC). This combination of features is able to capture the interest point, texture and color information in an image. Features for each image are calculated to create a descriptor with a large dimension. The proposed feature descriptors are clustered by K-means into visual words, and the Support Vector Machine (SVM) method is used to automatically classify multiple disease abnormalities from the GI tract. Finally, a post-processing scheme is applied to the final classification results, i.e., to validate the performance of multi-abnormality frame detection.


Robotica ◽  
2013 ◽  
Vol 32 (4) ◽  
pp. 533-549 ◽  
Author(s):  
Yin-Tien Wang ◽  
Guan-Yu Lin

SUMMARY: A robot mapping procedure using a modified speeded-up robust feature (SURF) is proposed for building persistent maps with visual landmarks in robot simultaneous localization and mapping (SLAM). SURFs are scale-invariant features that automatically recover the scale and orientation of image features in different scenes. However, the SURF method was not originally designed for applications in dynamic environments, where the repeatability of the detected SURFs is reduced by dynamic effects. This study investigated and modified SURF algorithms to improve robustness in representing visual landmarks in robot SLAM systems. The proposed modifications include the orientation representation of features, the vector dimension of the feature description, and the number of detected features in an image. The concept of sparse representation is also used to describe the environmental map and to reduce the computational complexity when using an extended Kalman filter (EKF) for state estimation. Effective procedures of data association and map management for SURFs in SLAM are also designed to improve accuracy in robot state estimation. Experiments were performed on an actual system with binocular vision sensors to validate the feasibility and effectiveness of the proposed algorithms. The experimental examples include the evaluation of state estimation using EKF SLAM and the implementation of indoor SLAM. In the experiments, the performance of the modified SURF algorithms was compared with that of the original SURF algorithms. The experimental results confirm that the modified SURF provides better repeatability and better robustness for representing the landmarks in visual SLAM systems.


Author(s):  
Shiraz Ahmad ◽  
Zhe-Ming Lu

Many proposed digital image watermarking techniques are sensitive to geometric attacks, such as rotation, scaling, translation, or their composites. Geometric distortions, even slight ones, can damage the watermark and/or disable the capability of the watermark detector to reliably perform its function. In this chapter, the authors exploit invariant image features to design a geometric-distortion-invariant watermarking system, and present two watermarking techniques. The first technique utilizes the bounding box scale-invariant feature transform and discrete orthogonal Hahn moments to embed the watermark into selected image patches, and the second technique uses only the Hahn moments to globally embed the watermark into the whole image. The first technique is non-blind and uses the original image during detection. While exhibiting excellent resistance against different geometric distortions, this technique also has fairly good resistance to cropping-like attacks. However, it exhibits a reduced data payload. The second technique is designed to be blind, and the watermark is blindly extracted using independent component analysis. For this technique an improved data payload is achieved, but with some compromise on resistance against cropping-like attacks. The implementations are supported with thorough discussions, and the experimental results demonstrate the effectiveness of the proposed schemes against several kinds of geometric attacks.


2013 ◽  
Vol 284-287 ◽  
pp. 3310-3314
Author(s):  
Jenq Haur Wang ◽  
Jhih Siang Syu ◽  
Chuan Ming Liu ◽  
Yen Lin Chen

In existing image search systems, image queries can be used to find similar images through content-based image retrieval (CBIR). In order to obtain more related images, users often need to provide descriptions of the image as keywords for the search engine to extract more relevant information. But it is difficult to find appropriate keywords and text descriptions from the image content, and searching for relevant information from search engines takes a lot of time. In this paper, we propose a CBIR system which effectively finds similar images by comparing image contents and the image annotation embedded in the image. First, we use discrete wavelet transform and two-dimensional code to embed the relevant text information or tags into the image. Then, we extract color ratios and Scale-Invariant Feature Transform (SIFT) descriptors as the image features for similarity matching. The experimental results showed that our proposed approach can accurately find similar images, and extract image-related textual information to provide useful tags for users.

