Regularizing Deep Networks with Label Geometry for Accurate Object Localization on Small Training Datasets

Author(s): Xiaolian Wang, Xiyuan Hu, Chen Chen, Silong Peng

Author(s): Sandipan Choudhuri, Nibaran Das, Ritesh Sarkhel, Mita Nasipuri

Object localization is one of the fundamental tasks of computer vision. It plays a central role in object detection, which begins with a recognition step that determines whether single or multiple instances of objects of interest are present in a given image, and then determines the precise location of each instance. This paper presents an overview of several popular approaches to the object localization problem: an efficient branch-and-bound strategy for sub-window search, an approach based on super-pixel neighborhood information, a boosted local structured Histogram of Oriented Gradients-Local Binary Patterns (HOG-LBP) strategy, multiple-instance learning for weakly supervised object localization, object localization using deep networks, and image-tag-based object localization. The performance of these approaches is compared on the basis of their results on the PASCAL VOC 2007 dataset.
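Since the approaches above are compared on PASCAL VOC 2007, it may help to recall the standard localization criterion on that benchmark: a predicted box is counted as correct when its intersection-over-union (IoU) with the ground-truth box exceeds 0.5. A minimal sketch of that metric (the function name and the example boxes are illustrative, not from the paper):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    # intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# two 50x50 boxes shifted by 10 px in each direction: IoU = 1600/3400 ≈ 0.47,
# so this prediction would NOT count as a correct localization under the 0.5 rule
print(iou((10, 10, 60, 60), (20, 20, 70, 70)))
```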


2020
Author(s): Gopi Krishna Erabati

The technology in the current research scenario is marching towards automation for higher productivity, with accurate and precise product development. Vision and robotics are domains that work to create autonomous systems and are the key technologies in the quest for mass productivity. Automation in an industry can be achieved by detecting interactive objects and estimating their pose in order to manipulate them. Object localization, i.e. pose, which includes the position and orientation of an object, therefore has profound significance. The applications of object pose estimation range from industrial automation to the entertainment industry, and from health care to surveillance. Pose estimation of objects is significant in many cases, for example so that robots can manipulate objects, or for accurate rendering in Augmented Reality (AR), among others. This thesis addresses the problem of object pose estimation using 3D data of the scene acquired from 3D sensors (e.g. Kinect and Orbbec Astra Pro, among others). The 3D data has the advantage of being independent of object texture and invariant to illumination. The proposal is divided into two phases: an offline phase, in which the 3D model template of the object (for estimation of pose) is built using the Iterative Closest Point (ICP) algorithm, and an online phase, in which the pose of the object is estimated by aligning the scene to the model using ICP, provided with an initial alignment from 3D descriptors (such as Fast Point Feature Histograms (FPFH)). The approach we develop is to be integrated on two different platforms: 1) the humanoid robot `Pyrene', which has an Orbbec Astra Pro 3D sensor for data acquisition, and 2) an Unmanned Aerial Vehicle (UAV) with an Intel RealSense Euclid on it. Datasets of objects (an electric drill, a brick, a small cylinder, a cake box) are acquired using Microsoft Kinect, Orbbec Astra Pro and Intel RealSense Euclid sensors to test the performance of this technique.
The objects used to test this approach are the ones used by the robot. The technique is tested in two scenarios: first, when the object is on the table, and second, when the object is held in a person's hand. The range of the objects from the sensor is 0.6 to 1.6 m. The technique can handle occlusions of the object by the hand (when the object is held), since ICP can work even when only part of the object is visible in the scene.
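The point-to-point ICP alignment at the core of the online phase can be sketched as follows. This is a minimal NumPy illustration with brute-force nearest-neighbour correspondences and a Kabsch (SVD) update, not the thesis implementation (which additionally seeds ICP with an FPFH-based initial alignment); the synthetic cloud and pose are invented for the demonstration:

```python
import numpy as np

def best_fit_transform(A, B):
    """Least-squares rigid transform (R, t) mapping point set A onto B (Kabsch/SVD)."""
    cA, cB = A.mean(axis=0), B.mean(axis=0)
    H = (A - cA).T @ (B - cB)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = cB - R @ cA
    return R, t

def icp(source, target, iterations=50):
    """Basic point-to-point ICP: match each source point to its nearest target point,
    then update the pose with the best rigid fit; repeat until the budget is spent."""
    src = source.copy()
    for _ in range(iterations):
        # brute-force nearest neighbours, O(n*m); fine for a small sketch
        d = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        matched = target[d.argmin(axis=1)]
        R, t = best_fit_transform(src, matched)
        src = src @ R.T + t
    # accumulated transform from the original source to its aligned position
    return best_fit_transform(source, src)

# synthetic check: rotate/translate a cloud by a known pose, then recover it
rng = np.random.default_rng(0)
cloud = rng.uniform(-1, 1, size=(200, 3))
angle = np.deg2rad(8)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.05, -0.02, 0.03])
scene = cloud @ R_true.T + t_true
R_est, t_est = icp(cloud, scene)
```

Because the synthetic scene is a noise-free rigid copy of the model, ICP converges to the exact pose here; with real sensor data and partial views, a coarse initial alignment (e.g. via FPFH descriptors, as in the thesis) is what keeps the nearest-neighbour matching from locking onto a wrong local minimum.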



