Scale-Invariant Vote-Based 3D Recognition and Registration from Point Clouds

Author(s):  
Minh-Tri Pham ◽  
Oliver J. Woodford ◽  
Frank Perbet ◽  
Atsuto Maki ◽  
Riccardo Gherardi ◽  
...  


Author(s):  
Xin Zhao ◽  
Zhe Liu ◽  
Ruolan Hu ◽  
Kaiqi Huang

3D object detection plays an important role in a large number of real-world applications. It requires estimating the locations and orientations of 3D objects in real scenes. In this paper, we present a new network architecture that uses front-view images and frustum point clouds to generate 3D detection results. On the one hand, a PointSIFT module is used to improve the performance of 3D segmentation: it captures information from different orientations in space and is robust to shapes at different scales. On the other hand, our network extracts useful features and suppresses less informative ones with a SENet module, which reweights channel features and estimates the 3D bounding boxes more effectively. Our method is evaluated on both the KITTI dataset for outdoor scenes and the SUN-RGBD dataset for indoor scenes. The experimental results show that our method outperforms state-of-the-art methods, especially when the point clouds are highly sparse.
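As a rough illustration of the channel-reweighting step mentioned above, the sketch below implements an SE-style block applied to point-wise features; the class name, reduction ratio, and tensor layout are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """SE-style channel reweighting for point-wise features (hypothetical sketch)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, num_points) point-wise feature map
        squeeze = x.mean(dim=2)            # "squeeze": global average over points
        weights = self.fc(squeeze)         # "excitation": per-channel gates in (0, 1)
        return x * weights.unsqueeze(-1)   # reweight channels, emphasizing informative ones

# Example: reweight 128-channel features of 1024 points for a batch of 2 scenes
feats = torch.randn(2, 128, 1024)
print(SEBlock(128)(feats).shape)  # torch.Size([2, 128, 1024])
```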


Sensors ◽  
2019 ◽  
Vol 19 (3) ◽  
pp. 634 ◽  
Author(s):  
John Noonan ◽  
Hector Rotstein ◽  
Amir Geva ◽  
Ehud Rivlin

This paper presents a global monocular indoor positioning system for a robotic vehicle starting from a known pose. The proposed system does not depend on a dense 3D map, does not require prior environment exploration or installation, and does not rely on the scene remaining the same, photometrically or geometrically. The approach provides global positioning from sparse knowledge of the building floorplan by resolving the unknown scale through wall–plane association. The presented Wall Plane Fusion algorithm finds correspondences between walls of the floorplan and planar structures present in the 3D point cloud. To extract planes from point clouds that contain scale ambiguity, the Scale Invariant Planar RANSAC (SIPR) algorithm was developed. The best wall–plane correspondence is used as an external constraint in a custom Bundle Adjustment optimization, which refines the motion estimation and enforces a globally consistent scale. Only a single wall needs to be in view. The feasibility of the algorithms is tested with synthetic and real-world data; extensive testing is performed in an indoor simulation environment using the Unreal Engine and Microsoft AirSim. The system performs consistently across all three types of data, and the tests presented in this paper show that the standard deviation of the error did not exceed 6 cm.
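As a rough sketch of scale-invariant plane extraction, the snippet below fits a plane with RANSAC using an inlier threshold expressed relative to the cloud's own spread rather than in metric units; the function name, threshold, and iteration count are illustrative assumptions, not the published SIPR algorithm.

```python
import numpy as np

def sipr_like_plane_fit(points, iters=500, rel_tol=0.01, seed=None):
    """RANSAC plane fit with an inlier threshold relative to the cloud's spread,
    so the result does not depend on the unknown global scale (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    spread = np.median(np.linalg.norm(points - points.mean(axis=0), axis=1))
    tol = rel_tol * spread                      # scale-relative, not metric, threshold
    best_plane, best_inliers = None, None
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-12:
            continue                            # skip degenerate (collinear) samples
        n /= np.linalg.norm(n)
        d = -n @ p0
        inliers = np.abs(points @ n + d) < tol  # point-to-plane distances
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_plane, best_inliers = (n, d), inliers
    return best_plane, best_inliers
```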


2020 ◽  
Vol 9 (4) ◽  
pp. 255
Author(s):  
Hua Liu ◽  
Xiaoming Zhang ◽  
Yuancheng Xu ◽  
Xiaoyong Chen

The degree of automation and the efficiency are among the most important factors influencing the usability of terrestrial light detection and ranging (LiDAR) scanning (TLS) registration algorithms. This paper proposes a fully automated and highly efficient coarse registration method with four degrees of freedom (4DOF), based on Ortho Projected Feature Images (OPFI), for TLS point clouds acquired with leveled or inclination-compensated LiDAR scanners. The proposed 4DOF registration algorithm decomposes the parameter estimation into two parts: (1) estimation of the horizontal translation vector and the azimuth angle; and (2) estimation of the vertical translation. The horizontal translation vector and the azimuth angle are estimated by ortho-projecting the TLS point clouds into feature images and registering these feature images using Scale Invariant Feature Transform (SIFT) keypoints and descriptors. The vertical translation is estimated from the height differences between source and target points in the overlapping regions after horizontal alignment. Three real TLS datasets captured by the Riegl VZ-400 and the Trimble SX10, as well as one simulated dataset, were used to validate the proposed method, which was compared with four state-of-the-art 4DOF registration methods. The experimental results show that: (1) the accuracy of the proposed coarse registration method ranges from 0.02 m to 0.07 m horizontally and from 0.01 m to 0.02 m in elevation, which is centimeter-level and sufficient for fine registration; and (2) as many as 120 million points can be registered in less than 50 s, which is much faster than the compared methods.
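The two-part 4DOF estimate can be pictured with the sketch below, under simplifying assumptions: given already-matched horizontal coordinates (for example, SIFT matches back-projected to the two point clouds), it solves for azimuth and horizontal translation with a closed-form 2D rigid fit and takes the vertical offset as the median height difference; the function name and the plain SVD-based fit are assumptions, not the paper's exact procedure.

```python
import numpy as np

def estimate_4dof(src_xy, dst_xy, src_z, dst_z):
    """Closed-form 2D rigid fit (azimuth + horizontal translation) from matched
    horizontal coordinates, plus a median height difference for the vertical shift."""
    src_c, dst_c = src_xy.mean(axis=0), dst_xy.mean(axis=0)
    H = (src_xy - src_c).T @ (dst_xy - dst_c)   # 2x2 cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t_xy = dst_c - R @ src_c
    azimuth = np.arctan2(R[1, 0], R[0, 0])      # rotation about the vertical axis
    t_z = np.median(dst_z - src_z)              # vertical offset from overlapping points
    return azimuth, t_xy, t_z
```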


2021 ◽  
Vol 13 (6) ◽  
pp. 1201
Author(s):  
Wei Hua ◽  
Miaole Hou ◽  
Yunfei Qiao ◽  
Xuesheng Zhao ◽  
Shishuo Xu ◽  
...  

Grottoes, with their caves and statues, are an important part of immovable heritage. Statues in a particular grotto setting are often similar in geometric form and artistic style, and identifying the similarity between these statues can provide important references for value recognition, condition assessment, repair, and the virtual restoration of statues. Traditionally, such reference information has mainly depended on expert empirical judgment, which is highly subjective, lacks quantitative analysis, and cannot provide effective scientific support for the virtual restoration of grotto statues. This paper presents a similarity-index-based approach for identifying similarities between grotto statues by studying 11 small Buddhist statues carved in the 18th cave of the Yungang Grottoes, located in Datong, China. The similarity index is determined from hash values calculated with the pHash method on orthophoto images of the Buddhist statues in order to identify similar statues. Similar feature points between the identified statues are then matched using the Scale Invariant Feature Transform (SIFT) operator to support the repair and reconstruction of damaged statues. The experimental results show that the variation of the similarity index values agrees with a visual inspection of the statues' appearance in the orthophotos. An additional analysis of three-dimensional (3D) point clouds also confirms that the similarity-index-based approach is accurate for the initial screening of similar grotto statues.
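A minimal sketch of the two image-level steps, assuming the orthophotos are available as image files: a pHash-based similarity score via the imagehash package and SIFT matching with Lowe's ratio test via OpenCV. The normalization of the similarity score and the ratio threshold are illustrative choices, not necessarily the authors' definitions.

```python
import cv2
import imagehash
from PIL import Image

def similarity_index(path_a, path_b, hash_size=8):
    """Normalized pHash similarity: 1.0 for identical hashes, 0.0 for all bits different."""
    ha = imagehash.phash(Image.open(path_a), hash_size=hash_size)
    hb = imagehash.phash(Image.open(path_b), hash_size=hash_size)
    return 1.0 - (ha - hb) / float(hash_size * hash_size)  # (ha - hb) is the Hamming distance

def match_sift(path_a, path_b, ratio=0.75):
    """SIFT keypoint matching with Lowe's ratio test between two orthophotos."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    return kp_a, kp_b, good
```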


Author(s):  
S. Urban ◽  
M. Weinmann

The automatic and accurate registration of terrestrial laser scanning (TLS) data is a topic of great interest in the domains of city modeling, construction surveying and cultural heritage. While many of the most recent approaches focus on keypoint-based point cloud registration relying on forward-projected 2D keypoints detected in panoramic intensity images, little attention has been paid to the selection of appropriate keypoint detector-descriptor combinations. Instead, keypoints are commonly detected and described with well-known methods such as the Scale Invariant Feature Transform (SIFT) or Speeded-Up Robust Features (SURF). In this paper, we present a framework for evaluating the influence of different keypoint detector-descriptor combinations on the results of point cloud registration. For this purpose, we involve five different approaches for extracting local features from the panoramic intensity images and exploit the range information of putative feature correspondences to define bearing vectors which, in turn, may be used to transfer the task of point cloud registration from object space to observation space. With an extensive evaluation of our framework on a standard benchmark TLS dataset, we clearly demonstrate that replacing the SIFT and SURF detectors and descriptors with more recent approaches significantly improves point cloud registration in terms of accuracy, efficiency and robustness.
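As an illustration of how a 2D keypoint and its range measurement yield a bearing vector, the sketch below assumes an equirectangular parameterization of the panoramic intensity image; the parameterization and function names are assumptions, not the framework's actual implementation.

```python
import numpy as np

def bearing_vector(u, v, width, height):
    """Unit bearing vector for pixel (u, v) in an equirectangular panoramic image."""
    azimuth = 2.0 * np.pi * (u / width) - np.pi       # longitude in [-pi, pi)
    elevation = np.pi / 2.0 - np.pi * (v / height)    # latitude in [-pi/2, pi/2]
    return np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])

def keypoint_to_3d(u, v, width, height, range_m):
    """Scale the bearing vector by the measured range to recover the 3D point."""
    return range_m * bearing_vector(u, v, width, height)
```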


Author(s):  
R. A. Persad ◽  
C. Armenakis

The co-registration of 3D point clouds has received considerable attention from various communities, particularly those in photogrammetry, computer graphics and computer vision. Although significant progress has been made, challenges such as coarse alignment of multi-sensory data with different point densities and minimal overlap still exist. There is a need to address such data integration issues, particularly with the advent of new data collection platforms such as unmanned aerial vehicles (UAVs). In this study, we propose an approach to align 3D point clouds derived photogrammetrically from approximately vertical UAV images with point clouds measured by terrestrial laser scanners (TLS). The method begins by automatically extracting 3D surface keypoints from both point cloud datasets. Afterwards, regions of interest around each keypoint are established in order to build scale-invariant descriptors for them. We use the popular SURF descriptor for matching the keypoints. In our experiments, we report the accuracies of the automatically derived transformation parameters in comparison to manually derived reference parameters.
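The matching stage can be pictured with the generic ratio-test sketch below, which assumes the scale-invariant (SURF-style) descriptors around the 3D surface keypoints have already been computed and are given as plain arrays; it is only an analogue of the matching step, not the authors' implementation.

```python
import numpy as np

def ratio_test_match(desc_src, desc_dst, ratio=0.8):
    """Nearest-neighbour descriptor matching with Lowe's ratio test.
    desc_src: (N, D) descriptors around source keypoints; desc_dst: (M, D)."""
    matches = []
    for i, d in enumerate(desc_src):
        dists = np.linalg.norm(desc_dst - d, axis=1)   # L2 distance to every candidate
        j1, j2 = np.argsort(dists)[:2]                 # best and second-best neighbours
        if dists[j1] < ratio * dists[j2]:              # keep only unambiguous matches
            matches.append((i, j1))
    return matches
```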


2015 ◽  
Vol 75 (10) ◽  
Author(s):  
Mohd Azwan Abbas ◽  
Halim Setan ◽  
Zulkepli Majid ◽  
Albert K. Chong ◽  
Lau Chong Luh ◽  
...  

Currently, coarse registration methods for scanners require heavy operator intervention either before or after the scanning process. Automatic registration methods exist, but they are applicable only to a limited class of objects (e.g. straight lines and flat surfaces). This study is devoted to the search for a computationally feasible automatic coarse registration method with a broad range of applicability. Nowadays, most laser scanner systems are supplied with a camera, so that the scanned data can also be photographed. The proposed approach exploits invariant features detected in the images to support point cloud registration. Three types of detectors are included: (1) the scale invariant feature transform (SIFT), (2) Harris affine, and (3) maximally stable extremal regions (MSER). All detected features are transformed into the laser scanner coordinate system, and their performance is measured by the number of corresponding points. Experiments on several objects with different observation techniques were performed to evaluate the capability of the proposed approach and the performance of the selected detectors.
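Two of the three detectors are available directly in OpenCV, as sketched below (Harris affine is not in core OpenCV and would need an external implementation); the function is a hypothetical illustration of the detection step only, not the study's pipeline, which additionally maps the features into the scanner coordinate system.

```python
import cv2

def count_keypoints(image_path):
    """Detect keypoints/regions with SIFT and MSER on a co-registered photograph."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift_keypoints = cv2.SIFT_create().detect(img, None)
    mser_regions, _ = cv2.MSER_create().detectRegions(img)
    return {"SIFT": len(sift_keypoints), "MSER": len(mser_regions)}
```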


Author(s):  
K. M. Vestena ◽  
D. R. Dos Santos ◽  
E. M. Oilveira Jr. ◽  
N. L. Pavan ◽  
K. Khoshelham

Existing 3D indoor mapping methods for RGB-D data are predominantly point-based and feature-based. In most cases, the iterative closest point (ICP) algorithm and its variants are used for the pairwise registration process. Considering that ICP requires a relatively accurate initial transformation and high overlap, a weighted closed-form solution for RGB-D data registration is proposed. In this solution, the 3D points are weighted and normalized based on their theoretical random errors, and dual-number quaternions are used to represent the 3D rigid-body motion. Dual-number quaternions provide a closed-form solution by minimizing a cost function. The most important advantage of the closed-form solution is that it provides the optimal transformation in one step: it does not need good initial estimates and it considerably decreases the demand for computational resources compared with iterative methods. Our method first exploits the RGB information. We employ the scale invariant feature transform (SIFT) to detect, describe and match features; it detects and describes local features that are invariant to scaling and rotation. To detect and filter outliers, we use the random sample consensus (RANSAC) algorithm jointly with a measure of statistical dispersion, the interquartile range (IQR). Then, a new RGB-D loop-closure solution is implemented based on the volumetric information between pairs of point clouds and the dispersion of the random errors; loop closure consists of recognizing when the sensor revisits a region. Finally, a globally consistent map is created by minimizing the registration errors via graph-based optimization. The effectiveness of the proposed method is demonstrated on a Kinect dataset. The experimental results show that the proposed method can properly map an indoor environment with an absolute accuracy of around 1.5% of the trajectory length.
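The outlier-filtering and closed-form ideas can be sketched as below: an IQR fence on correspondence residuals and a weighted closed-form rigid-body fit. Note that the fit uses an SVD (Kabsch-style) formulation as a stand-in for the dual-number-quaternion solution described above; the function names and the fence factor are illustrative assumptions.

```python
import numpy as np

def iqr_inliers(residuals, k=1.5):
    """Keep correspondences whose residual falls inside the interquartile fence."""
    q1, q3 = np.percentile(residuals, [25, 75])
    iqr = q3 - q1
    return (residuals >= q1 - k * iqr) & (residuals <= q3 + k * iqr)

def weighted_rigid_fit(src, dst, w):
    """Weighted closed-form rigid-body fit (SVD/Kabsch form), standing in for the
    dual-number-quaternion closed-form solution described above."""
    w = w / w.sum()
    src_c = (w[:, None] * src).sum(axis=0)     # weighted centroids
    dst_c = (w[:, None] * dst).sum(axis=0)
    H = (src - src_c).T @ (w[:, None] * (dst - dst_c))
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # reflection guard
    R = Vt.T @ S @ U.T
    t = dst_c - R @ src_c
    return R, t
```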



