A SIFT Feature-Based Template Matching Method for Detecting and Counting Objects in Life Space

Autonomous robots have reached an advanced stage in various fields and are expected to work autonomously in nursing care and medical care settings in the near future. In this paper, we focus on the task of counting objects in images. Because the number of objects is not a mere physical quantity, conventional physical sensors have difficulty measuring it, and intelligent sensing with higher-order recognition is required to accomplish the counting task. People often need to count objects in various situations. With several objects, the number can be recognized at a glance; with a dozen or so, counting can become troublesome. A simple and easy way to enumerate objects automatically has therefore been sought. In this study, we propose a method to recognize the number of objects from an image. In general, the target object to be counted varies according to the user's request. To accommodate varied requests, the region of the image belonging to the desired object is selected as a template. The main process of the proposed method is to search for and count regions that resemble the template. To achieve robustness against spatial transformations such as translation, rotation, and scaling, the scale-invariant feature transform (SIFT) is employed as the feature. To show its effectiveness, the proposed method is applied to a few images containing everyday objects, e.g., binders and cans.
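The search-and-count step described in this abstract could be sketched roughly as follows. This is not the authors' implementation: it assumes descriptors have already been extracted (short tuples stand in for 128-dimensional SIFT vectors), uses a nearest-neighbour ratio test to decide whether a scene feature resembles the template, and groups accepted features with a simple greedy spatial clustering. The function names and the `radius`, `min_matches`, and `ratio` parameters are hypothetical.

```python
import math

def ratio_test_match(desc, template_descs, ratio=0.8):
    """Return the index of the best-matching template descriptor,
    or None when the best match is not clearly better than the second best."""
    dists = sorted((math.dist(desc, t), i) for i, t in enumerate(template_descs))
    if len(dists) < 2:
        return None  # the ratio test needs at least two candidates
    (d1, i1), (d2, _) = dists[0], dists[1]
    return i1 if d1 < ratio * d2 else None

def count_objects(scene_features, template_descs,
                  radius=50.0, min_matches=2, ratio=0.8):
    """Count template-like regions in a scene.

    scene_features: list of (x, y, descriptor) tuples (assumed precomputed).
    Accepted keypoints are greedily grouped by distance to a cluster centroid;
    each cluster with at least min_matches keypoints counts as one object.
    """
    matched = [(x, y) for x, y, d in scene_features
               if ratio_test_match(d, template_descs, ratio) is not None]
    clusters = []
    for p in matched:
        for c in clusters:
            centroid = (sum(q[0] for q in c) / len(c),
                        sum(q[1] for q in c) / len(c))
            if math.dist(p, centroid) <= radius:
                c.append(p)
                break
        else:
            clusters.append([p])
    return sum(1 for c in clusters if len(c) >= min_matches)

# Toy data: two objects plus one ambiguous clutter feature.
template = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
scene = [
    (10, 10, (0.98, 0.02, 0, 0)), (14, 12, (0.01, 0.99, 0, 0)),
    (200, 50, (1, 0, 0.05, 0)), (205, 55, (0, 0, 1, 0.02)),
    (400, 400, (0.5, 0.5, 0.5, 0.5)),
]
object_count = count_objects(scene, template)
```

Greedy clustering is only adequate when objects are well separated; touching or overlapping objects would need a geometric verification step that this sketch omits.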


Comparison of Contour Based and Feature Based Tracking Methods for Control of Microbiorobots

Volume 12: Micro and Nano Systems, Parts A and B ◽

10.1115/imece2009-10564 ◽

2009 ◽

Author(s):

Dal Hyung Kim ◽

Edward Steager ◽

Min Jun Kim

Keyword(s):

Small Error ◽

Small Scale ◽

Scale Invariant ◽

Sift Algorithm ◽

Tracking Method ◽

Miniature Robots ◽

A Cell ◽

Feature Based ◽

Scale Invariant Feature ◽

Exact Positions

Miniature robots must be precisely controlled because of their small workspace and size. A small control error could lead to the failure of tasks such as assembly. Tracking is one of the most important techniques, because control of a small-scale robot is hard to accomplish without information about the object's motion. In this paper, we compare feature-based and region-based tracking methods using a microbiorobot. Invariant features can be extracted using the Scale Invariant Feature Transform (SIFT) algorithm because a microbiorobot, unlike a cell, is a rigid body. We show that the feature-based tracking method tracks the positions of objects more exactly than the region-based tracking method when objects are in close contact or overlap. The feature-based tracking method also allows objects to be tracked even when part of an object disappears or the illumination changes.


Automatic Reel Editing in Chip on Film Quality Control With Computer Vision

International Journal of Systems and Service-Oriented Engineering ◽

10.4018/ijssoe.2021010101 ◽

2021 ◽

Vol 11 (1) ◽

pp. 1-14

Author(s):

Shing Hwang Doong

Keyword(s):

Quality Control ◽

Integrated Circuits ◽

Object Detection ◽

Object Tracking ◽

Template Matching ◽

Scale Invariant ◽

Detection Techniques ◽

Novel Method ◽

Scale Invariant Feature ◽

Chip On Film

Chip on film (COF) is a special packaging technology that packs integrated circuits in a flexible carrier tape. Chips packed with COF are primarily used in the display industry. Reel editing is a critical step in COF quality control that removes sections of congregating NG (not good) chips from a reel. Today, COF manufacturers hire workers to count consecutive NG chips in a rolling reel with the naked eye. When the count is greater than a preset number, the corresponding section is removed. A novel method using object detection and object tracking is proposed to solve this problem. Object detection techniques including a convolutional neural network (CNN), template matching (TM), and the scale invariant feature transform (SIFT) were used to detect NG marks, and object tracking was used to track them with IDs so that congregating NG chips could be counted reliably. In experiments using simulation videos similar to worksite scenes, both the CNN and TM detectors solved the reel editing problem, while the SIFT detectors failed. Furthermore, TM outperformed CNN by yielding a real-time solution.
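The counting rule in this abstract (remove a section when the number of consecutive NG chips exceeds a preset number) could be illustrated as below. This sketch assumes detection and tracking have already produced one NG flag per chip in reel order; the function name `sections_to_remove` and its parameters are hypothetical and not taken from the paper.

```python
def sections_to_remove(chip_is_ng, max_consecutive_ng):
    """Return (start, end) index pairs, inclusive, of NG runs that are
    longer than max_consecutive_ng.

    chip_is_ng: list of booleans, one per chip in reel order
                (assumed to come from a detector plus tracker).
    """
    sections = []
    run_start = None
    for i, ng in enumerate(chip_is_ng):
        if ng and run_start is None:
            run_start = i                      # a new NG run begins
        elif not ng and run_start is not None:
            if i - run_start > max_consecutive_ng:
                sections.append((run_start, i - 1))
            run_start = None                   # the run has ended
    # Handle a run that reaches the end of the reel.
    if run_start is not None and len(chip_is_ng) - run_start > max_consecutive_ng:
        sections.append((run_start, len(chip_is_ng) - 1))
    return sections

reel = [False, True, True, False, True, True, True, True, False, True]
flagged = sections_to_remove(reel, max_consecutive_ng=2)
```

With a threshold of 2, only the run of four NG chips is flagged; the runs of two and one are left in place.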


Study on the Influence of Image Noise on Monocular Feature-Based Visual SLAM Based on FFDNet

Sensors ◽

10.3390/s20174922 ◽

2020 ◽

Vol 20 (17) ◽

pp. 4922

Author(s):

Like Cao ◽

Jie Ling ◽

Xiaohui Xiao

Keyword(s):

Visual Slam ◽

Scale Invariant ◽

Original Sequence ◽

Speeded Up Robust Features ◽

Localization And Mapping ◽

Feature Based ◽

Scale Invariant Feature ◽

Influence Of Noise ◽

Denoised Image ◽

Better Than

Noise appears in images captured by real cameras. This paper studies the influence of noise on monocular feature-based visual Simultaneous Localization and Mapping (SLAM). First, an open-source synthetic dataset with different noise levels is introduced. Then the images in the dataset are denoised using the Fast and Flexible Denoising convolutional neural Network (FFDNet). The matching performances of Scale Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF), and Oriented FAST and Rotated BRIEF (ORB), which are commonly used in feature-based SLAM, are compared. The results show that ORB has a higher correct matching rate than SIFT and SURF, and that denoised images have a higher correct matching rate than noisy images. Next, the Absolute Trajectory Error (ATE) of the noisy and denoised sequences is evaluated on ORB-SLAM2, and the results show that the denoised sequences perform better than the noisy sequences at every noise level. Finally, the completely clean sequence in the dataset and the sequences in the KITTI dataset are denoised and compared with the original sequences through comprehensive experiments. For the clean sequence, the Root-Mean-Square Error (RMSE) of ATE decreased by 16.75% after denoising; for the KITTI sequences, 7 out of 10 sequences have a lower RMSE than the original sequences. The results show that denoised images can achieve higher accuracy in monocular feature-based visual SLAM under certain conditions.
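The RMSE of ATE reported in this abstract could be computed along the following lines. The sketch assumes the estimated and ground-truth trajectories are already time-associated and aligned in a common frame (standard ATE evaluation performs a rigid or similarity alignment first, which is omitted here). The function names and the toy trajectories are illustrative, not data from the paper.

```python
import math

def ate_rmse(estimated, ground_truth):
    """Root-mean-square of the per-pose translational errors.

    Both arguments are equal-length lists of (x, y, z) positions that are
    assumed to be associated by timestamp and already aligned.
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))

def relative_decrease(before, after):
    """Percentage by which an error value dropped from before to after."""
    return 100.0 * (before - after) / before

# Toy trajectories: a straight line with a constant lateral offset.
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
noisy = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
denoised = [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0), (2.0, 0.5, 0.0)]

rmse_noisy = ate_rmse(noisy, truth)
rmse_denoised = ate_rmse(denoised, truth)
improvement = relative_decrease(rmse_noisy, rmse_denoised)
```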


Scale-invariant optical flow in tracking using a pan-tilt-zoom camera

Robotica ◽

10.1017/s0263574714002665 ◽

2014 ◽

Vol 34 (9) ◽

pp. 1923-1947 ◽

Cited By ~ 2

Author(s):

Salam Dhou ◽

Yuichi Motai

Keyword(s):

Optical Flow ◽

Focal Length ◽

Tracking Error ◽

Feature Points ◽

Scale Invariant ◽

Tracking Method ◽

Camera Control ◽

Speed Up ◽

Feature Based ◽

Scale Invariant Feature

SUMMARY: An efficient method for tracking a target using a single Pan-Tilt-Zoom (PTZ) camera is proposed. The proposed Scale-Invariant Optical Flow (SIOF) method estimates the motion of the target and rotates the camera accordingly to keep the target at the center of the image. SIOF also estimates the scale of the target and changes the focal length correspondingly to adjust the Field of View (FoV) and keep the target at the same apparent size in all captured frames. SIOF is a feature-based tracking method. Feature points are extracted and tracked using Optical Flow (OF) and the Scale-Invariant Feature Transform (SIFT). They are combined in groups and used to achieve robust tracking. The feature points in these groups are used within a twist model to recover the 3D free motion of the target. The merits of the proposed method are (i) an efficient scale-invariant tracking method that tracks the target and keeps it in the FoV of the camera at the same size, and (ii) tracking with prediction and correction to speed up the PTZ control and achieve smooth camera control. Experiments were performed on online video streams and validated the efficiency of the proposed SIOF method in comparison with OF, SIFT, and other tracking methods. The proposed SIOF has around 36% less average tracking error and around 70% less tracking overshoot than OF.
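The idea of estimating the target's scale from tracked feature points and adjusting the focal length to hold its apparent size constant could be illustrated as follows. This is a simplified stand-in, not the SIOF twist-model formulation: it assumes matched 2D feature points in a reference and a current frame, takes the median ratio of pairwise distances as the scale change, and assumes a pinhole camera in which apparent size is proportional to focal length. All names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def estimate_scale(ref_points, cur_points):
    """Median ratio of pairwise point distances, current over reference.

    ref_points[i] and cur_points[i] are assumed to be the same feature
    observed in the reference frame and in the current frame.
    """
    ratios = []
    for i, j in combinations(range(len(ref_points)), 2):
        d_ref = math.dist(ref_points[i], ref_points[j])
        d_cur = math.dist(cur_points[i], cur_points[j])
        if d_ref > 0:
            ratios.append(d_cur / d_ref)
    if not ratios:
        raise ValueError("need at least two distinct matched points")
    return median(ratios)

def corrected_focal_length(focal_length, scale):
    """Focal length that restores the reference apparent size, assuming
    image size scales linearly with focal length (pinhole model)."""
    return focal_length / scale

# The target appears twice as large in the current frame.
reference = [(0, 0), (10, 0), (0, 10)]
current = [(5, 5), (25, 5), (5, 25)]
scale = estimate_scale(reference, current)
new_focal = corrected_focal_length(50.0, scale)
```

The median is used so that a few mismatched feature pairs do not dominate the estimate.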


Classification of wood knots using artificial neural networks with texture and local feature-based image descriptors

Holzforschung ◽

10.1515/hf-2021-0051 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Sung-Wook Hwang ◽

Taekyeong Lee ◽

Hyunbin Kim ◽

Hyunwoo Chung ◽

Jong Gyu Choi ◽

...

Keyword(s):

Local Feature ◽

Scale Invariant Feature Transform ◽

Scale Invariant ◽

Invariant Feature ◽

Feature Based ◽

Feature Transform ◽

Artificial Neural ◽

Occurrence Matrix ◽

Scale Invariant Feature

Abstract: This paper describes feature-based techniques for wood knot classification. For automated classification of macroscopic wood knot images, models were established using artificial neural networks with texture and local feature descriptors, and the performances of the feature extraction algorithms were compared. Classification models trained with the texture descriptors (gray-level co-occurrence matrix and local binary pattern) achieved better performance than those trained with the local feature descriptors (scale-invariant feature transform and dense scale-invariant feature transform). Hence, it was confirmed that wood knot classification is better suited to a texture classification approach than to one based on morphological classification. The gray-level co-occurrence matrix produced the highest F1 score despite representing images with relatively low-dimensional feature vectors. The scale-invariant feature transform algorithm could not detect a sufficient number of features in the knot images; hence, the histogram of oriented gradients and dense scale-invariant feature transform algorithms, which describe the entire image, were better for wood knot classification. The artificial neural network model provided better classification performance than the support vector machine and k-nearest neighbor models, which suggests that a nonlinear classification model is suitable for wood knot classification.
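For readers unfamiliar with the gray-level co-occurrence matrix (GLCM) named in this abstract, a minimal sketch of how one is built and reduced to a single texture feature is given below. It assumes an image already quantized to a small number of gray levels, a single pixel offset, and a non-symmetric normalized matrix; the paper's actual quantization, offsets, and feature set are not stated in the abstract, so these choices are illustrative.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    image:  list of rows of integer gray levels in range(levels).
    dx, dy: offset from each pixel to its neighbour (default: right neighbour).
    Entry [i][j] is the fraction of pixel pairs with levels (i, j).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """GLCM contrast: sum of (i - j)^2 weighted by co-occurrence probability."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))

tiny_image = [[0, 0, 1],
              [1, 1, 0]]
p = glcm(tiny_image, levels=2)
texture_contrast = contrast(p)
```

A feature vector for a classifier would typically concatenate several such statistics over multiple offsets, which keeps the dimensionality low compared with histograms of local descriptors.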


Classified Scale-Invariant Feature Transform Feature Based Elastic Image Registration for 2-DE Gels

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2015.1469 ◽

2015 ◽

Vol 5 (4) ◽

pp. 855-861 ◽

Cited By ~ 1

Author(s):

Qiaofeng Ou ◽

Bangshu Xiong ◽

Haodong Zhang ◽

Yong Yang ◽

Huisheng Zhang

Keyword(s):

Image Registration ◽

Scale Invariant Feature Transform ◽

Scale Invariant ◽

Invariant Feature ◽

Feature Based ◽

Elastic Image Registration ◽

Feature Transform ◽

Scale Invariant Feature


Object Detection and Recognition Using Template Matching with SIFT Features Assisted by Invisible Floor Marks

Journal of Robotics and Mechatronics ◽

10.20965/jrm.2009.p0689 ◽

2009 ◽

Vol 21 (6) ◽

pp. 689-697 ◽

Cited By ~ 5

Author(s):

Seiji Aoyagi ◽

Nobuhiko Hattori ◽

Atsushi Kohama ◽

Sho Komai ◽

...

Keyword(s):

Template Matching ◽

Spatial Relationship ◽

Scale Invariant ◽

Shape Information ◽

Solid Models ◽

Indoor Mobile Robot ◽

Monocular Image ◽

Sift Features ◽

3D Solid ◽

Scale Invariant Feature

For simultaneous localization and mapping (SLAM) of an indoor mobile robot, a method for processing a monocular image of the entire environmental view is proposed. To ensure that an object can be searched for, invisible floor marks are proposed as a modification of the environment; they are useful for narrowing the search area in an image. Specifically, our approach involves: 1) narrowing the search area using invisible floor marks, 2) extracting features based on the scale-invariant feature transform (SIFT), 3) using template matching with SIFT features assisted by partial templates and the spatial relationship to the floor, and 4) verifying object recognition with an AdaBoost classifier using Haar-like features based on object shape information. The robot is localized relative to the floor using the floor marks; then objects in a cluttered image are extracted and recognized, and 3D solid models of them are mapped onto the floor to build a highly structured 3D map. Recognition was over 80% successful for objects including tables and chairs, and took several tens of seconds per 640 × 480 pixel image.
