Scale-Invariant Vote-Based 3D Recognition and Registration from Point Clouds

Author(s):  
Minh-Tri Pham ◽  
Oliver J. Woodford ◽  
Frank Perbet ◽  
Atsuto Maki ◽  
Riccardo Gherardi ◽  
...  


Author(s):  
Xin Zhao ◽  
Zhe Liu ◽  
Ruolan Hu ◽  
Kaiqi Huang

3D object detection plays an important role in a large number of real-world applications. It requires estimating the locations and orientations of 3D objects in real scenes. In this paper, we present a new network architecture that uses front-view images and frustum point clouds to generate 3D detection results. On the one hand, a PointSIFT module is used to improve the performance of 3D segmentation: it captures information from different orientations in space and is robust to shapes at different scales. On the other hand, our network extracts useful features and suppresses less informative ones with a SENet module, which reweights channel features and estimates the 3D bounding boxes more effectively. Our method is evaluated on both the KITTI dataset for outdoor scenes and the SUN-RGBD dataset for indoor scenes. The experimental results show that our method outperforms state-of-the-art methods, especially when the point clouds are highly sparse.
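As a rough illustration of the channel-reweighting step mentioned above, the sketch below implements an SE-style block applied to point-wise features; the class name, reduction ratio, and tensor layout are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """SE-style channel reweighting for point-wise features (hypothetical sketch)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, num_points) point-wise feature map
        squeeze = x.mean(dim=2)            # "squeeze": global average over points
        weights = self.fc(squeeze)         # "excitation": per-channel gates in (0, 1)
        return x * weights.unsqueeze(-1)   # reweight channels, emphasizing informative ones

# Example: reweight 128-channel features of 1024 points for a batch of 2 scenes
feats = torch.randn(2, 128, 1024)
print(SEBlock(128)(feats).shape)  # torch.Size([2, 128, 1024])
```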


Sensors ◽  
2019 ◽  
Vol 19 (3) ◽  
pp. 634 ◽  
Author(s):  
John Noonan ◽  
Hector Rotstein ◽  
Amir Geva ◽  
Ehud Rivlin

This paper presents a global monocular indoor positioning system for a robotic vehicle starting from a known pose. The proposed system does not depend on a dense 3D map, does not require prior environment exploration or installation, and does not rely on the scene remaining the same, photometrically or geometrically. The approach provides global positioning from sparse knowledge of the building floorplan by resolving the unknown scale through wall–plane association. The presented Wall Plane Fusion algorithm finds correspondences between walls of the floorplan and planar structures present in the 3D point cloud. To extract planes from point clouds that contain scale ambiguity, the Scale Invariant Planar RANSAC (SIPR) algorithm was developed. The best wall–plane correspondence is used as an external constraint in a custom Bundle Adjustment optimization, which refines the motion estimation and enforces a globally consistent scale. Only a single wall needs to be in view. The feasibility of the algorithms is tested with synthetic and real-world data; extensive testing is performed in an indoor simulation environment using the Unreal Engine and Microsoft AirSim. The system performs consistently across all three types of data, and the tests presented in this paper show that the standard deviation of the error did not exceed 6 cm.
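As a rough sketch of scale-invariant plane extraction, the snippet below fits a plane with RANSAC using an inlier threshold expressed relative to the cloud's own spread rather than in metric units; the function name, threshold, and iteration count are illustrative assumptions, not the published SIPR algorithm.

```python
import numpy as np

def sipr_like_plane_fit(points, iters=500, rel_tol=0.01, seed=None):
    """RANSAC plane fit with an inlier threshold relative to the cloud's spread,
    so the result does not depend on the unknown global scale (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    spread = np.median(np.linalg.norm(points - points.mean(axis=0), axis=1))
    tol = rel_tol * spread                      # scale-relative, not metric, threshold
    best_plane, best_inliers = None, None
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-12:
            continue                            # skip degenerate (collinear) samples
        n /= np.linalg.norm(n)
        d = -n @ p0
        inliers = np.abs(points @ n + d) < tol  # point-to-plane distances
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_plane, best_inliers = (n, d), inliers
    return best_plane, best_inliers
```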


2020 ◽  
Vol 9 (4) ◽  
pp. 255
Author(s):  
Hua Liu ◽  
Xiaoming Zhang ◽  
Yuancheng Xu ◽  
Xiaoyong Chen

The degree of automation and the efficiency are among the most important factors influencing the usability of terrestrial light detection and ranging (LiDAR) scanning (TLS) registration algorithms. This paper proposes a fully automated and highly efficient coarse registration method with four degrees of freedom (4DOF), based on Ortho Projected Feature Images (OPFI), for TLS point clouds acquired with leveled or inclination-compensated LiDAR scanners. The proposed 4DOF registration algorithm decomposes the parameter estimation into two parts: (1) estimation of the horizontal translation vector and the azimuth angle; and (2) estimation of the vertical translation. The horizontal translation vector and the azimuth angle are estimated by ortho-projecting the TLS point clouds into feature images and registering these feature images using Scale Invariant Feature Transform (SIFT) keypoints and descriptors. The vertical translation is estimated from the height differences between source and target points in the overlapping regions after horizontal alignment. Three real TLS datasets captured by the Riegl VZ-400 and the Trimble SX10, as well as one simulated dataset, were used to validate the proposed method, which was compared with four state-of-the-art 4DOF registration methods. The experimental results show that: (1) the accuracy of the proposed coarse registration method ranges from 0.02 m to 0.07 m horizontally and from 0.01 m to 0.02 m in elevation, which is centimeter-level and sufficient for fine registration; and (2) as many as 120 million points can be registered in less than 50 s, which is much faster than the compared methods.
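The two-part 4DOF estimate can be pictured with the sketch below, under simplifying assumptions: given already-matched horizontal coordinates (for example, SIFT matches back-projected to the two point clouds), it solves for azimuth and horizontal translation with a closed-form 2D rigid fit and takes the vertical offset as the median height difference; the function name and the plain SVD-based fit are assumptions, not the paper's exact procedure.

```python
import numpy as np

def estimate_4dof(src_xy, dst_xy, src_z, dst_z):
    """Closed-form 2D rigid fit (azimuth + horizontal translation) from matched
    horizontal coordinates, plus a median height difference for the vertical shift."""
    src_c, dst_c = src_xy.mean(axis=0), dst_xy.mean(axis=0)
    H = (src_xy - src_c).T @ (dst_xy - dst_c)   # 2x2 cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t_xy = dst_c - R @ src_c
    azimuth = np.arctan2(R[1, 0], R[0, 0])      # rotation about the vertical axis
    t_z = np.median(dst_z - src_z)              # vertical offset from overlapping points
    return azimuth, t_xy, t_z
```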


2021 ◽  
Vol 13 (6) ◽  
pp. 1201
Author(s):  
Wei Hua ◽  
Miaole Hou ◽  
Yunfei Qiao ◽  
Xuesheng Zhao ◽  
Shishuo Xu ◽  
...  

Grottoes, with their caves and statues, are an important part of immovable heritage. Statues in a particular grotto setting are often similar in geometric form and artistic style, and identifying the similarity between these statues can provide important references for value recognition, condition assessment, repair, and the virtual restoration of statues. Traditionally, such reference information has mainly depended on expert empirical judgment, which is highly subjective, lacks quantitative analysis, and cannot provide effective scientific support for the virtual restoration of grotto statues. This paper presents a similarity-index-based approach for identifying similarities between grotto statues by studying 11 small Buddhist statues carved in the 18th cave of the Yungang Grottoes, located in Datong, China. The similarity index is determined from hash values calculated with the pHash method on orthophoto images of the Buddhist statues in order to identify similar statues. Similar feature points between the identified statues are then matched using the Scale Invariant Feature Transform (SIFT) operator to support the repair and reconstruction of damaged statues. The experimental results show that the variation of the similarity index values agrees with a visual inspection of the statues' appearance in the orthophotos. An additional analysis of three-dimensional (3D) point clouds also confirms that the similarity-index-based approach is accurate for the initial screening of similar grotto statues.
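A minimal sketch of the two image-level steps, assuming the orthophotos are available as image files: a pHash-based similarity score via the imagehash package and SIFT matching with Lowe's ratio test via OpenCV. The normalization of the similarity score and the ratio threshold are illustrative choices, not necessarily the authors' definitions.

```python
import cv2
import imagehash
from PIL import Image

def similarity_index(path_a, path_b, hash_size=8):
    """Normalized pHash similarity: 1.0 for identical hashes, 0.0 for all bits different."""
    ha = imagehash.phash(Image.open(path_a), hash_size=hash_size)
    hb = imagehash.phash(Image.open(path_b), hash_size=hash_size)
    return 1.0 - (ha - hb) / float(hash_size * hash_size)  # (ha - hb) is the Hamming distance

def match_sift(path_a, path_b, ratio=0.75):
    """SIFT keypoint matching with Lowe's ratio test between two orthophotos."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    return kp_a, kp_b, good
```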


Author(s):  
S. Urban ◽  
M. Weinmann

The automatic and accurate registration of terrestrial laser scanning (TLS) data is a topic of great interest in the domains of city modeling, construction surveying and cultural heritage. While many of the most recent approaches focus on keypoint-based point cloud registration relying on forward-projected 2D keypoints detected in panoramic intensity images, little attention has been paid to the selection of appropriate keypoint detector-descriptor combinations. Instead, keypoints are commonly detected and described with well-known methods such as the Scale Invariant Feature Transform (SIFT) or Speeded-Up Robust Features (SURF). In this paper, we present a framework for evaluating the influence of different keypoint detector-descriptor combinations on the results of point cloud registration. For this purpose, we involve five different approaches for extracting local features from the panoramic intensity images and exploit the range information of putative feature correspondences to define bearing vectors which, in turn, may be used to transfer the task of point cloud registration from object space to observation space. With an extensive evaluation of our framework on a standard benchmark TLS dataset, we clearly demonstrate that replacing the SIFT and SURF detectors and descriptors with more recent approaches significantly improves point cloud registration in terms of accuracy, efficiency and robustness.
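As an illustration of how a 2D keypoint and its range measurement yield a bearing vector, the sketch below assumes an equirectangular parameterization of the panoramic intensity image; the parameterization and function names are assumptions, not the framework's actual implementation.

```python
import numpy as np

def bearing_vector(u, v, width, height):
    """Unit bearing vector for pixel (u, v) in an equirectangular panoramic image."""
    azimuth = 2.0 * np.pi * (u / width) - np.pi       # longitude in [-pi, pi)
    elevation = np.pi / 2.0 - np.pi * (v / height)    # latitude in [-pi/2, pi/2]
    return np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])

def keypoint_to_3d(u, v, width, height, range_m):
    """Scale the bearing vector by the measured range to recover the 3D point."""
    return range_m * bearing_vector(u, v, width, height)
```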


Author(s):  
R. A. Persad ◽  
C. Armenakis

The co-registration of 3D point clouds has received considerable attention from various communities, particularly those in photogrammetry, computer graphics and computer vision. Although significant progress has been made, challenges such as coarse alignment of multi-sensory data with different point densities and minimal overlap still exist. There is a need to address such data integration issues, particularly with the advent of new data collection platforms such as unmanned aerial vehicles (UAVs). In this study, we propose an approach to align 3D point clouds derived photogrammetrically from approximately vertical UAV images with point clouds measured by terrestrial laser scanners (TLS). The method begins by automatically extracting 3D surface keypoints from both point cloud datasets. Afterwards, regions of interest around each keypoint are established in order to build scale-invariant descriptors for them. We use the popular SURF descriptor for matching the keypoints. In our experiments, we report the accuracies of the automatically derived transformation parameters in comparison to manually derived reference parameters.
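The matching stage can be pictured with the generic ratio-test sketch below, which assumes the scale-invariant (SURF-style) descriptors around the 3D surface keypoints have already been computed and are given as plain arrays; it is only an analogue of the matching step, not the authors' implementation.

```python
import numpy as np

def ratio_test_match(desc_src, desc_dst, ratio=0.8):
    """Nearest-neighbour descriptor matching with Lowe's ratio test.
    desc_src: (N, D) descriptors around source keypoints; desc_dst: (M, D)."""
    matches = []
    for i, d in enumerate(desc_src):
        dists = np.linalg.norm(desc_dst - d, axis=1)   # L2 distance to every candidate
        j1, j2 = np.argsort(dists)[:2]                 # best and second-best neighbours
        if dists[j1] < ratio * dists[j2]:              # keep only unambiguous matches
            matches.append((i, j1))
    return matches
```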


2015 ◽  
Vol 75 (10) ◽  
Author(s):  
Mohd Azwan Abbas ◽  
Halim Setan ◽  
Zulkepli Majid ◽  
Albert K. Chong ◽  
Lau Chong Luh ◽  
...  

Currently, coarse registration methods for scanners require heavy operator intervention either before or after the scanning process. Automatic registration methods exist, but they are applicable only to a limited class of objects (e.g. straight lines and flat surfaces). This study is devoted to the search for a computationally feasible automatic coarse registration method with a broad range of applicability. Nowadays, most laser scanner systems are supplied with a camera, so that the scanned data can also be photographed. The proposed approach exploits invariant features detected in the images to support point cloud registration. Three types of detectors are included: (1) the scale invariant feature transform (SIFT), (2) Harris affine, and (3) maximally stable extremal regions (MSER). All detected features are transformed into the laser scanner coordinate system, and their performance is measured by the number of corresponding points. Experiments on several objects with different observation techniques were performed to evaluate the capability of the proposed approach and the performance of the selected detectors.
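Two of the three detectors are available directly in OpenCV, as sketched below (Harris affine is not in core OpenCV and would need an external implementation); the function is a hypothetical illustration of the detection step only, not the study's pipeline, which additionally maps the features into the scanner coordinate system.

```python
import cv2

def count_keypoints(image_path):
    """Detect keypoints/regions with SIFT and MSER on a co-registered photograph."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift_keypoints = cv2.SIFT_create().detect(img, None)
    mser_regions, _ = cv2.MSER_create().detectRegions(img)
    return {"SIFT": len(sift_keypoints), "MSER": len(mser_regions)}
```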


Author(s):  
K. M. Vestena ◽  
D. R. Dos Santos ◽  
E. M. Oilveira Jr. ◽  
N. L. Pavan ◽  
K. Khoshelham

Existing 3D indoor mapping methods for RGB-D data are predominantly point-based and feature-based. In most cases, the iterative closest point (ICP) algorithm and its variants are used for the pairwise registration process. Considering that ICP requires a relatively accurate initial transformation and high overlap, a weighted closed-form solution for RGB-D data registration is proposed. In this solution, the 3D points are weighted and normalized based on their theoretical random errors, and dual-number quaternions are used to represent the 3D rigid-body motion. Dual-number quaternions provide a closed-form solution by minimizing a cost function. The most important advantage of the closed-form solution is that it provides the optimal transformation in one step: it does not need good initial estimates and it considerably decreases the demand for computational resources compared with iterative methods. Our method first exploits the RGB information. We employ the scale invariant feature transform (SIFT) to detect, describe and match features; it detects and describes local features that are invariant to scaling and rotation. To detect and filter outliers, we use the random sample consensus (RANSAC) algorithm jointly with a measure of statistical dispersion, the interquartile range (IQR). Then, a new RGB-D loop-closure solution is implemented based on the volumetric information between pairs of point clouds and the dispersion of the random errors; loop closure consists of recognizing when the sensor revisits a region. Finally, a globally consistent map is created by minimizing the registration errors via graph-based optimization. The effectiveness of the proposed method is demonstrated on a Kinect dataset. The experimental results show that the proposed method can properly map an indoor environment with an absolute accuracy of around 1.5% of the trajectory length.
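The outlier-filtering and closed-form ideas can be sketched as below: an IQR fence on correspondence residuals and a weighted closed-form rigid-body fit. Note that the fit uses an SVD (Kabsch-style) formulation as a stand-in for the dual-number-quaternion solution described above; the function names and the fence factor are illustrative assumptions.

```python
import numpy as np

def iqr_inliers(residuals, k=1.5):
    """Keep correspondences whose residual falls inside the interquartile fence."""
    q1, q3 = np.percentile(residuals, [25, 75])
    iqr = q3 - q1
    return (residuals >= q1 - k * iqr) & (residuals <= q3 + k * iqr)

def weighted_rigid_fit(src, dst, w):
    """Weighted closed-form rigid-body fit (SVD/Kabsch form), standing in for the
    dual-number-quaternion closed-form solution described above."""
    w = w / w.sum()
    src_c = (w[:, None] * src).sum(axis=0)     # weighted centroids
    dst_c = (w[:, None] * dst).sum(axis=0)
    H = (src - src_c).T @ (w[:, None] * (dst - dst_c))
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # reflection guard
    R = Vt.T @ S @ U.T
    t = dst_c - R @ src_c
    return R, t
```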



