Comparing Seven Methodologies for Rigid Alignment of Point Clouds with Focus on Frame-to-Frame Registration in Depth Sequences

2018 · Vol 9 (2) · pp. 1
Author(s): Fernando Akio Yamada, Gilson Antonio Giraldi, Marcelo Bernardes Vieira, Liliane Rodrigues Almeida, Antonio Lopes Apolinário Jr.

Pairwise rigid registration aims to find the rigid transformation that best registers two surfaces represented by point clouds. This work presents a comparison of seven algorithms with different strategies for tackling rigid registration tasks. We focus on the frame-to-frame problem, in which the point clouds are extracted from a video sequence with depth information, generating partially overlapping 3D data. We use both point clouds and RGB-D video streams in the experimental results. The former are considered under different viewpoints, with the addition of a case study simulating missing data. Since the ground-truth rotation is provided, we discuss four different metrics to measure the rotation error in this case. Among the seven considered techniques, Sparse ICP and Sparse ICP-CTSF outperform the other five in the point cloud registration experiments without incomplete data. However, the evaluation with missing data indicates that these methods are sensitive to this problem and favors ICP-CTSF in such situations. In the tests with video sequences, the depth information is first segmented to obtain the target region. Next, the registration algorithms are applied and the average root mean squared error, rotation error, and translation error are computed. In addition, we analyze the robustness of the algorithms against spatial and temporal sampling rates. We conclude from the experiments using depth video sequences that ICP-CTSF is the best technique for frame-to-frame registration.
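For illustration, one widely used rotation-error measure (not necessarily one of the four metrics discussed in the paper) is the geodesic angle between the ground-truth and the estimated rotation. A minimal NumPy sketch, assuming both inputs are 3x3 orthonormal rotation matrices:

```python
import numpy as np

def rotation_angle_error(R_gt: np.ndarray, R_est: np.ndarray) -> float:
    """Geodesic distance (radians) between two rotation matrices."""
    R_delta = R_gt.T @ R_est                      # relative rotation
    cos_theta = (np.trace(R_delta) - 1.0) / 2.0   # from trace(R) = 1 + 2*cos(theta)
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```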

2019 · Vol 11 (12) · pp. 1471
Author(s): Grazia Tucci, Antonio Gebbia, Alessandro Conti, Lidia Fiorini, Claudio Lubello

The monitoring and metric assessment of piles of natural or man-made materials plays a fundamental role in the production and management processes of multiple activities. Monitoring techniques have evolved alongside advances in measurement and data-processing methods, from classic topographic surveying to global navigation satellite system (GNSS) technologies and on to current survey systems such as laser scanning and close-range photogrammetry. Latest-generation 3D data-management software allows the production of increasingly accurate high-resolution 3D models. This study shows the results of a test for monitoring and computing stockpile volumes of separately collected waste material entering the recycling chain, performed by means of an unmanned aerial vehicle (UAV) photogrammetric survey and the generation of 3D models from point clouds. The test was carried out with two UAV flight sessions, with vertical and oblique camera configurations, using a terrestrial laser scanner both for measuring the ground control points and as ground truth for testing the two survey configurations. The volumes were computed using two software packages, and comparisons were made with reference both to the different survey configurations and to the computation software.
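To illustrate the underlying computation only (a sketch, not the software packages used in the paper), a stockpile volume can be estimated by rasterizing the point cloud into a height grid and integrating the heights above an assumed flat base plane:

```python
import numpy as np

def stockpile_volume(points: np.ndarray, cell: float = 0.1, base_z: float = 0.0) -> float:
    """points: (N, 3) array of x, y, z in metres; returns volume in cubic metres."""
    x, y, z = points.T
    ix = np.floor((x - x.min()) / cell).astype(int)
    iy = np.floor((y - y.min()) / cell).astype(int)
    grid = np.full((ix.max() + 1, iy.max() + 1), -np.inf)
    np.maximum.at(grid, (ix, iy), z)             # keep the highest point per cell
    heights = np.clip(grid - base_z, 0.0, None)  # empty (-inf) cells clip to zero
    return float(heights.sum() * cell * cell)    # sum of column volumes
```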


Author(s): A. Torresani, F. Remondino

Abstract. In recent years we have been witnessing increasing quality (and quantity) of video streams and a growing capability of SLAM-based methods to derive 3D data from video. Video sequences can be easily acquired by non-expert surveyors and possibly used for 3D documentation purposes. The aim of the paper is to evaluate the possibility of performing 3D reconstructions of heritage scenarios using videos ("videogrammetry"), e.g. acquired with smartphones. Video frames are extracted from the sequence using a fixed-time interval and two advanced methods. Frames are then processed applying automated image orientation / Structure from Motion (SfM) and dense image matching / Multi-View Stereo (MVS) methods. The obtained dense 3D point clouds are then visually validated as well as compared with photogrammetric ground truth acquired with a reflex camera, or evaluated by analysing the noise of the 3D data on flat surfaces.
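A minimal sketch of the fixed-time-interval frame extraction step using OpenCV (the two advanced extraction methods from the paper are not reproduced here):

```python
import cv2

def extract_frames(video_path: str, interval_s: float = 1.0) -> list:
    """Grab one frame every `interval_s` seconds from a video file."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0       # fall back if FPS is unreported
    step = max(1, int(round(fps * interval_s)))
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames
```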


Author(s): E. Pellis, A. Masiero, G. Tucci, M. Betti, P. Grussenmeyer

Abstract. Creating three-dimensional as-built models from point clouds is still a challenging task in the Cultural Heritage environment. Nowadays, performing such a task typically requires the quite time-consuming manual intervention of an expert operator, in particular to deal with the complexities and peculiarities of heritage buildings. Motivated by these considerations, the development of automatic or semi-automatic tools to ease the completion of this task has recently become a very hot topic in the research community. Among the tools that can be considered to this aim, deep learning methods for the semantic segmentation and classification of 2D and 3D data seem to be among the most promising approaches. Indeed, such methods have already been successfully applied in several applications enabling scene understanding and comprehension, and, in particular, easing the process of geometrical and informative model creation. Nevertheless, their use in the specific case of heritage buildings is still quite limited, and the results published so far are not completely satisfactory. The limited availability of dedicated benchmarks for the considered task in the heritage context may also be one of the factors behind the unsatisfactory results in the literature.

Hence, this paper aims at partially reducing the issues related to the limited availability of benchmarks in the heritage context by presenting a new dataset for semantic segmentation of heritage buildings. The dataset is composed of both images and point clouds of the considered buildings, in order to enable the implementation, validation and comparison of both point-based and multiview-based semantic segmentation approaches. Ground-truth segmentation is provided for both the images and the point clouds of each building, according to the class definitions used in the ARCHdataset, thus also potentially enabling integration with, and comparison against, results obtained on that dataset.
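For illustration only, a minimal loader sketch for such a paired benchmark; the file layout (clouds/ and labels/ holding per-building NumPy arrays) is hypothetical, not the dataset's actual structure:

```python
from pathlib import Path
import numpy as np

class HeritageSegmentationDataset:
    """Pairs each building's point cloud with its per-point class labels."""

    def __init__(self, root: str):
        self.root = Path(root)
        self.names = sorted(p.stem for p in (self.root / "clouds").glob("*.npy"))

    def __len__(self) -> int:
        return len(self.names)

    def __getitem__(self, i: int):
        name = self.names[i]
        cloud = np.load(self.root / "clouds" / f"{name}.npy")   # (N, 3) xyz coordinates
        labels = np.load(self.root / "labels" / f"{name}.npy")  # (N,) class ids
        return cloud, labels
```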


2021 · Vol 18 (1) · pp. 172988142199332
Author(s): Xintao Ding, Boquan Li, Jinbao Wang

Indoor object detection is a demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detectors and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties in Faster R-CNN to improve detection performance. In detail, we first use mesh grids that are the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we then use 2D geometric constraints to refine the RPN-RoIs, in which the 2D constraint of each class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes on the training set. Comparison experiments are implemented on two indoor datasets, SUN2012 and NYUv2. Since depth information is available in NYUv2, we incorporate depth constraints in GP-Faster and propose a 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster improve mean average precision.
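A hedged sketch of the 2D geometric constraint idea (the exact refinement in GP-Faster is not reproduced here): per class, build the convex hull of the (width, height) pairs of ground-truth training boxes, then keep only proposals whose (width, height) falls inside that hull:

```python
import numpy as np
from scipy.spatial import Delaunay

def build_wh_hull(gt_boxes: np.ndarray) -> Delaunay:
    """gt_boxes: (N, 4) as (x1, y1, x2, y2); triangulates the (w, h) point set."""
    wh = np.stack([gt_boxes[:, 2] - gt_boxes[:, 0],
                   gt_boxes[:, 3] - gt_boxes[:, 1]], axis=1)
    return Delaunay(wh)  # enables a fast point-in-hull test via simplex lookup

def filter_proposals(proposals: np.ndarray, hull: Delaunay) -> np.ndarray:
    """Keep proposals whose (w, h) lies inside the class's convex hull."""
    wh = np.stack([proposals[:, 2] - proposals[:, 0],
                   proposals[:, 3] - proposals[:, 1]], axis=1)
    return proposals[hull.find_simplex(wh) >= 0]  # -1 means outside the hull
```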


Drones · 2021 · Vol 5 (2) · pp. 37
Author(s): Bingsheng Wei, Martin Barczyk

We consider the problem of vision-based detection and ranging of a target UAV using the video feed from a monocular camera onboard a pursuer UAV. Our previously published work in this area employed a cascade classifier algorithm to locate the target UAV, which was found to perform poorly in complex background scenes. We thus study the replacement of the cascade classifier with newer machine learning-based object detection algorithms. Five candidate algorithms are implemented and quantitatively tested in terms of their efficiency (measured as frames-per-second processing rate), accuracy (measured as the root mean squared error between ground-truth and detected location), and consistency (measured as mean average precision) in a variety of flight patterns, backgrounds, and test conditions. Assigning relative weights of 20%, 40% and 40% to these three criteria, we find that when flying over a white background, the top three performers are YOLO v2 (76.73 out of 100), Faster RCNN v2 (63.65 out of 100), and Tiny YOLO (59.50 out of 100), while over a realistic background, the top three performers are Faster RCNN v2 (54.35 out of 100), SSD MobileNet v1 (51.68 out of 100), and SSD Inception v2 (50.72 out of 100), leading us to select Faster RCNN v2 as the recommended solution. We then provide a roadmap for further work on integrating the object detector into our vision-based UAV tracking system.
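For illustration, the weighted scoring reduces to a convex combination once each criterion is mapped so that higher is better and scaled to [0, 100]; that normalization step is an assumption here, not the paper's exact procedure:

```python
def composite_score(fps_norm: float, acc_norm: float, map_norm: float) -> float:
    """Combine three criteria, each pre-scaled to [0, 100], with weights 0.2/0.4/0.4."""
    return 0.2 * fps_norm + 0.4 * acc_norm + 0.4 * map_norm

# Example: a detector scoring 90 on efficiency, 70 on accuracy, and 72 on
# consistency gets 0.2*90 + 0.4*70 + 0.4*72 = 74.8 out of 100.
```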


Author(s): E. Grilli, E. M. Farella, A. Torresani, F. Remondino

Abstract. In recent years, the application of artificial intelligence (machine learning and deep learning methods) for the classification of 3D point clouds has become an important task in modern 3D documentation and modelling applications. The identification of proper geometric and radiometric features is fundamental to classifying 2D/3D data correctly. While many studies have been conducted in the geospatial field, the cultural heritage sector is still partly unexplored. In this paper we analyse the efficacy of geometric covariance features as a support for the classification of Cultural Heritage point clouds. To analyse the impact of the different features calculated on spherical neighbourhoods at various radius sizes, we present results obtained on four different heritage case studies using different feature configurations.
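A minimal sketch of standard eigenvalue-based covariance features (linearity, planarity, sphericity) computed on spherical neighbourhoods of radius r; the paper's full feature configurations are not reproduced here:

```python
import numpy as np
from scipy.spatial import cKDTree

def covariance_features(points: np.ndarray, radius: float) -> np.ndarray:
    """points: (N, 3); returns (N, 3) features: linearity, planarity, sphericity."""
    tree = cKDTree(points)
    feats = np.zeros((len(points), 3))
    for i, nbr_idx in enumerate(tree.query_ball_point(points, r=radius)):
        nbrs = points[nbr_idx]
        if len(nbrs) < 3:
            continue  # too few neighbours for a stable covariance
        l1, l2, l3 = np.sort(np.linalg.eigvalsh(np.cov(nbrs.T)))[::-1]  # l1 >= l2 >= l3
        if l1 > 0:
            feats[i] = [(l1 - l2) / l1,   # linearity
                        (l2 - l3) / l1,   # planarity
                        l3 / l1]          # sphericity
    return feats
```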


2010 · Vol 1 (4) · pp. 17-45
Author(s): Antons Rebguns, Diana F. Spears, Richard Anderson-Sprecher, Aleksey Kletsov

This paper presents a novel theoretical framework for swarms of agents. Before deploying a swarm for a task, it is advantageous to predict whether a desired percentage of the swarm will succeed. The authors present a framework that uses a small group of expendable "scout" agents to predict the success probability of the entire swarm, thereby preventing many agent losses. The prediction applies one of two formulas: the standard Bernoulli trials formula or the new Bayesian formula. For experimental evaluation, the framework is applied to simulated agents navigating around obstacles to reach a goal location. Extensive experimental results compare the mean squared error of the predictions of both formulas with ground truth, under varying circumstances. The results indicate the accuracy and robustness of the Bayesian approach. The framework also yields an intriguing result, namely, that both formulas usually predict better in the presence of (Lennard-Jones) inter-agent forces than when their independence assumptions hold.
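For illustration, the two prediction styles can be sketched as follows. The Beta-Bernoulli posterior mean below is a stand-in assumption, not necessarily the paper's exact Bayesian formula:

```python
def bernoulli_estimate(k: int, n: int) -> float:
    """Maximum-likelihood success probability from k successes in n scout trials."""
    return k / n

def bayesian_estimate(k: int, n: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior mean under a Beta(alpha, beta) prior; uniform prior by default."""
    return (k + alpha) / (n + alpha + beta)

# Example: 3 of 5 scouts succeed -> ML estimate 0.60, Bayesian estimate ~0.57;
# the prior keeps small-sample predictions away from the extremes 0 and 1.
```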


Sensors · 2019 · Vol 19 (3) · pp. 563
Author(s): J. Osuna-Coutiño, Jose Martinez-Carranza

High-Level Structure (HLS) extraction in a set of images consists of recognizing 3D elements that carry useful information for the user or application. There are several approaches to HLS extraction. However, most of them are based on processing two or more images captured from different camera views, or on processing 3D data in the form of point clouds extracted from the camera images. In contrast, motivated by the extensive work on the problem of single-image depth estimation, where parallax constraints are not required, we propose a novel methodology for HLS extraction from a single image, with promising results. Our method has four steps. First, we use a CNN to predict the depth of a single image. Second, we propose a region-wise analysis to refine the depth estimates. Third, we introduce a graph analysis to segment the depth into semantic orientations, aiming at identifying potential HLS. Finally, the depth sections are provided to a new CNN architecture that predicts HLS in the shape of cubes and rectangular parallelepipeds.
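A hedged skeleton of the four-step data flow; the two CNNs and both intermediate functions are placeholders rather than the paper's actual models, and only the ordering of the steps is taken from the abstract:

```python
import numpy as np
from scipy.ndimage import median_filter

def refine_depth_regionwise(depth: np.ndarray) -> np.ndarray:
    """Placeholder for step 2: here, simple median smoothing of the depth map."""
    return median_filter(depth, size=5)

def segment_by_orientation(depth: np.ndarray) -> list:
    """Placeholder for step 3: here, a single section covering the whole map."""
    return [depth]

def extract_hls(image: np.ndarray, depth_cnn, hls_cnn) -> list:
    depth = depth_cnn(image)                   # step 1: monocular depth prediction
    depth = refine_depth_regionwise(depth)     # step 2: region-wise refinement
    sections = segment_by_orientation(depth)   # step 3: split into semantic orientations
    return [hls_cnn(s) for s in sections]      # step 4: predict cuboid-shaped HLS
```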

