AUTOMATED MOSAICKING OF MULTIPLE 3D POINT CLOUDS GENERATED FROM A DEPTH CAMERA

Author(s):  
H. Kim ◽  
W. Yoon ◽  
T. Kim

In this paper, we propose a method for automated mosaicking of multiple 3D point clouds generated from a depth camera. A depth camera generates depth data using the ToF (Time of Flight) method and intensity data from the intensity of the returned signal. The depth camera used in this paper was an SR4000 from MESA Imaging. This camera generates a depth map and an intensity map of 176 x 144 pixels. The generated depth map stores physical depth data with millimeter precision, while the generated intensity map contains texture data with considerable noise. We used the intensity (texture) maps for extracting tiepoints and the depth maps for assigning z coordinates to tiepoints and for point cloud mosaicking. The proposed mosaicking method consists of four steps. In the first step, we acquired multiple 3D point clouds by rotating the depth camera and capturing data at each rotation. In the second step, we estimated 3D-3D transformation relationships between subsequent point clouds. For this, 2D tiepoints were extracted automatically from the corresponding two intensity maps and converted into 3D tiepoints using the depth maps. We used a 3D similarity transformation model for estimating the 3D-3D transformation relationships. In the third step, we converted the local 3D-3D transformations into global transformations for all point clouds with respect to a reference one. In the last step, the extent of the single depth map mosaic was calculated and depth values per mosaic pixel were determined by a ray tracing method. For the experiments, 8 depth maps and intensity maps were used. After the four steps, an output mosaicked depth map of 454 x 144 pixels was generated. The proposed method is expected to be useful for developing an effective 3D indoor mapping method in the future.
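
To illustrate the second step, the 3D similarity transformation between two sets of matched 3D tiepoints can be estimated in closed form; the sketch below uses the Umeyama/Horn SVD solution and is a minimal illustration under that assumption, not the authors' implementation.

```python
import numpy as np

def estimate_similarity_transform(src, dst):
    """Estimate a 3D similarity transform (scale s, rotation R, translation t)
    so that dst ~= s * R @ src + t, from matched 3D tiepoints.
    src, dst: (N, 3) arrays of corresponding points (N >= 3)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean

    # Cross-covariance and SVD (Umeyama / Horn closed-form solution)
    H = src_c.T @ dst_c
    U, S, Vt = np.linalg.svd(H)
    D = np.eye(3)
    if np.linalg.det(Vt.T @ U.T) < 0:   # avoid reflections
        D[2, 2] = -1.0
    R = Vt.T @ D @ U.T
    s = np.trace(np.diag(S) @ D) / np.sum(src_c ** 2)
    t = dst_mean - s * R @ src_mean
    return s, R, t

# Example: transform point cloud i into the frame of point cloud i-1
# using 3D tiepoints derived from intensity-map matches and depth values.
# s, R, t = estimate_similarity_transform(tiepoints_i, tiepoints_prev)
# cloud_in_prev_frame = s * (R @ cloud_i.T).T + t
```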


Author(s):  
Guoqiang Chen ◽  
Zhuangzhuang Mao ◽  
Huailong Yi ◽  
Xiaofeng Li ◽  
Bingxin Bai ◽  
...  

Object detection is a crucial task in autonomous driving. This paper presents an effective algorithm for pedestrian detection on panoramic depth maps transformed from 3D-LiDAR data. Firstly, the 3D point clouds are transformed into panoramic depth maps, and the panoramic depth maps are enhanced. Secondly, the ground points of the 3D point clouds are removed; the remaining points are clustered, filtered and projected onto the previously generated panoramic depth maps to obtain new panoramic depth maps. Finally, the new panoramic depth maps are joined to generate depth maps of different sizes, which are used as input to the improved PVANET for pedestrian detection. The 2D panoramic depth map used by the proposed algorithm is transformed from the 3D point cloud, effectively covering the full panorama of the sensor, and is therefore well suited to environment perception for autonomous driving. Compared with detection algorithms based on RGB images, the proposed algorithm is not affected by illumination and maintains its normal average precision for pedestrian detection at night. To increase the robustness of detecting small objects such as pedestrians, the network structure of the original PVANET is modified in this paper. A new dataset is built by processing the 3D-LiDAR data, and the model trained on this dataset performs well. The experimental results show that the proposed algorithm achieves high accuracy and robustness in pedestrian detection under different illumination conditions. Furthermore, when trained on the new dataset, the model exhibits average precision improvements of 2.8–5.1% over the original PVANET, making it more suitable for autonomous driving applications.
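
The first stage, turning a 3D-LiDAR point cloud into a panoramic depth map, amounts to a spherical projection; a minimal sketch follows, where the image size and vertical field of view are assumed values typical of a 64-beam sensor, not the paper's exact settings.

```python
import numpy as np

def lidar_to_panoramic_depth(points, h=64, w=1024,
                             fov_up_deg=2.0, fov_down_deg=-24.8):
    """Project an (N, 3) LiDAR point cloud into an h x w panoramic depth map
    via spherical projection. The resolution and FOV here are illustrative
    placeholder values, not the settings used in the paper."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)

    yaw = np.arctan2(y, x)                           # horizontal angle in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(depth, 1e-6))   # vertical angle

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Normalize angles to pixel coordinates
    u = 0.5 * (1.0 - yaw / np.pi) * w                # column index
    v = (fov_up - pitch) / fov * h                   # row index
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    # Keep the closest return per pixel by filling far-to-near
    depth_map = np.zeros((h, w), dtype=np.float32)
    order = np.argsort(depth)[::-1]
    depth_map[v[order], u[order]] = depth[order]
    return depth_map
```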


Author(s):  
Caleb Beckwith ◽  
Shaojin Zhang ◽  
Sven K. Esche ◽  
Zhou Zhang

Abstract The Bachelor of Technology (B.Tech) in robotics is a skill-oriented degree, and its students are often not well prepared in theoretical knowledge and have limited opportunities to work with cutting-edge technologies. To address these two difficulties, several challenging projects are designed as undergraduate projects for the B.Tech in robotics. Among them, a SLAM robot spider was implemented. The project employs robotic vision, PID control, dynamics, kinematics, and additive manufacturing. The robot's structure is fabricated through additive manufacturing. The skeleton is composed of three main parts: six legs, a torso, and a head. Each leg has three joints driven by servo motors. The torso is used to mount the sensors, control modules, communication modules, and power source. An NVIDIA Jetson Nano is used to control the motors, manage the communication interfaces, and process the sensing data. An Intel RealSense depth camera and an Intel RealSense tracking camera are used to fulfill the SLAM task: the depth camera acquires depth data to generate 3D point clouds, while the tracking camera serves as an auxiliary reference to help steer the robot and locate its position. In addition, an iPad tablet provides a manual control option and renders the scene in real time.
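
A rough sketch of how the depth stream could be turned into a 3D point cloud with the pyrealsense2 library is given below; the stream resolution and frame rate are assumptions for illustration, not the project's actual configuration.

```python
import numpy as np
import pyrealsense2 as rs

# Configure and start the RealSense depth stream
# (640 x 480 @ 30 fps is an assumed configuration, not the project's setting).
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

pc = rs.pointcloud()
try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()

    # Deproject the depth frame into a point cloud (N x 3 vertices, in meters)
    points = pc.calculate(depth_frame)
    vertices = np.asanyarray(points.get_vertices()).view(np.float32).reshape(-1, 3)
    print("point cloud size:", vertices.shape)
finally:
    pipeline.stop()
```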


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 201
Author(s):  
Michael Bekele Maru ◽  
Donghwan Lee ◽  
Kassahun Demissie Tola ◽  
Seunghee Park

Modeling a structure in the virtual world using three-dimensional (3D) information enhances our understanding of how the structure reacts to any disturbance, while also aiding its visualization. Generally, 3D point clouds are used for determining structural behavioral changes. Light detection and ranging (LiDAR) is one of the crucial ways by which a 3D point cloud dataset can be generated. Additionally, 3D cameras are commonly used to develop point clouds containing many points on the external surface of an object. The main objective of this study was to compare the performance of two optical sensors, namely a depth camera (DC) and a terrestrial laser scanner (TLS), in estimating structural deflection. We also applied bilateral filtering, a technique commonly used in image processing, to the point cloud data to enhance their accuracy and increase the application prospects of these sensors in structural health monitoring. The results from these sensors were validated by comparing them with the outputs of a linear variable differential transformer (LVDT) sensor mounted on the beam during an indoor experiment. The results showed that the datasets obtained from both sensors were acceptable for nominal deflections of 3 mm and above because the error range was less than ±10%. However, the results obtained from the TLS were better than those obtained from the DC.
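
As a hedged illustration of the filtering step, a bilateral filter can be applied directly to the depth map before it is converted into a point cloud; the function name and parameter values below are placeholders, not the tuned settings used in the study.

```python
import numpy as np
import cv2

def denoise_depth_bilateral(depth_mm, d=9, sigma_color=25.0, sigma_space=9.0):
    """Edge-preserving smoothing of a depth map (in millimeters) with a
    bilateral filter. Parameters are illustrative placeholders."""
    depth_f32 = depth_mm.astype(np.float32)
    # Zero-valued pixels usually mark missing returns; keep them out of the result.
    invalid = depth_f32 <= 0
    filtered = cv2.bilateralFilter(depth_f32, d, sigma_color, sigma_space)
    filtered[invalid] = 0.0
    return filtered
```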


Author(s):  
W. Nguatem ◽  
M. Drauschke ◽  
H. Mayer

We present a workflow for the automatic generation of building models with levels of detail (LOD) 1 to 3 according to the CityGML standard (Gröger et al., 2012). We start by orienting unsorted image sets (Mayer et al., 2012), compute depth maps using semi-global matching (SGM) (Hirschmüller, 2008), and fuse these depth maps to reconstruct dense 3D point clouds (Kuhn et al., 2014). Based on planes segmented from these point clouds, we have developed a stochastic method for roof model selection (Nguatem et al., 2013) and window model selection (Nguatem et al., 2014). We demonstrate our workflow up to the export into CityGML.
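
The plane segmentation that the roof and window model selection builds on can be approximated with standard RANSAC plane fitting; the sketch below uses Open3D with assumed thresholds and is not the authors' stochastic method.

```python
import open3d as o3d

def segment_planes(pcd, max_planes=5, distance_threshold=0.05,
                   ransac_n=3, num_iterations=1000):
    """Iteratively extract dominant planes from a point cloud with RANSAC.
    Thresholds are illustrative placeholders."""
    planes = []
    rest = pcd
    for _ in range(max_planes):
        if len(rest.points) < ransac_n:
            break
        model, inliers = rest.segment_plane(distance_threshold,
                                            ransac_n, num_iterations)
        if len(inliers) < 100:   # stop when remaining planes are too small
            break
        planes.append((model, rest.select_by_index(inliers)))
        rest = rest.select_by_index(inliers, invert=True)
    return planes, rest

# Usage (assumed file name):
# pcd = o3d.io.read_point_cloud("fused_cloud.ply")
# planes, remainder = segment_planes(pcd)
```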


Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2940 ◽  
Author(s):  
Kamil Sidor ◽  
Marian Wysocki

In this paper we propose a way of using depth maps transformed into 3D point clouds to classify human activities. The activities are described as time sequences of feature vectors based on the Viewpoint Feature Histogram (VFH) descriptor computed using the Point Cloud Library. Recognition is performed by two types of classifiers: (i) a k-nearest neighbors (k-NN) classifier with a Dynamic Time Warping (DTW) measure, and (ii) bidirectional long short-term memory (BiLSTM) deep learning networks. We discuss reducing the classification time of the k-NN classifier by introducing a two-tier model, and improving the BiLSTM-based classification via transfer learning and combining multiple networks by a fuzzy integral. Our classification results obtained on two representative datasets, the University of Texas at Dallas Multimodal Human Action Dataset and the MSR Action 3D Dataset, are comparable to or better than the current state of the art.
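
The k-NN branch of the recognition pipeline can be sketched as a DTW-based nearest-neighbor classifier over sequences of per-frame descriptors; this is a generic illustration, not the authors' two-tier implementation.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Dynamic Time Warping distance between two sequences of feature vectors
    (e.g., per-frame VFH descriptors), each of shape (T, D)."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def knn_dtw_classify(query_seq, train_seqs, train_labels, k=1):
    """Classify a sequence by majority vote among its k DTW-nearest neighbors."""
    dists = np.array([dtw_distance(query_seq, s) for s in train_seqs])
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(np.array(train_labels)[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```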


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 944
Author(s):  
Stefano Pini ◽  
Guido Borghi ◽  
Roberto Vezzani ◽  
Davide Maltoni ◽  
Rita Cucchiara

Nowadays, we are witnessing the wide diffusion of active depth sensors. However, the generalization capabilities and performance of deep face recognition approaches based on depth data are hindered by the different sensor technologies and by the currently available depth-based datasets, which are limited in size and acquired with the same device. In this paper, we present an analysis of the use of depth maps, as obtained by active depth sensors, and deep neural architectures for the face recognition task. We compare different depth data representations (depth and normal images, voxels, point clouds), deep models (two-dimensional and three-dimensional Convolutional Neural Networks, PointNet-based networks), and pre-processing and normalization techniques in order to determine the configuration that maximizes the recognition accuracy and generalizes best to unseen data and novel acquisition settings. Extensive intra- and cross-dataset experiments, performed on four public databases, suggest that representations and methods based on normal images and point clouds perform and generalize better than other 2D and 3D alternatives. Moreover, we propose a novel challenging dataset, namely MultiSFace, in order to specifically analyze the influence of the depth map quality and the acquisition distance on the face recognition accuracy.
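
One of the compared representations, the normal image, can be derived from a depth map by differentiating the back-projected surface; the sketch below is a simple approximation with placeholder camera intrinsics, not necessarily the pre-processing used in the paper.

```python
import numpy as np

def depth_to_normal_image(depth, fx, fy):
    """Convert a depth map (meters) into a surface-normal image using finite
    differences. fx, fy are focal lengths in pixels (placeholder intrinsics)."""
    dz_dv, dz_du = np.gradient(depth)        # per-pixel depth derivatives (row, col)
    dx = depth / fx                          # approximate metric width of one pixel
    dy = depth / fy                          # approximate metric height of one pixel

    # Normal is proportional to (-dz/dx, -dz/dy, 1)
    normals = np.dstack((-dz_du / np.maximum(dx, 1e-9),
                         -dz_dv / np.maximum(dy, 1e-9),
                         np.ones_like(depth)))
    norm = np.linalg.norm(normals, axis=2, keepdims=True)
    normals = normals / np.maximum(norm, 1e-9)

    # Map from [-1, 1] to [0, 255] for visualization or CNN input
    return ((normals + 1.0) * 127.5).astype(np.uint8)
```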


Author(s):  
S. Song ◽  
R. Qin

Abstract. Image-based 3D modelling is rather mature nowadays for well-acquired images processed through a standard photogrammetric pipeline, while fusing 3D datasets generated from images with different views for surface reconstruction remains a challenge. Meshing algorithms for image-based 3D datasets require visibility information for surfaces, and such information can be difficult to obtain for 3D point clouds generated from images with different views, sources, resolutions and uncertainties. In this paper, we propose a novel multi-source mesh reconstruction and texture mapping pipeline optimized to address this challenge. Our key contributions are: 1) we extended a state-of-the-art image-based surface reconstruction method by incorporating geometric information produced by satellite images to create a wide-area surface model; 2) we extended a texture mapping method to accommodate images acquired from different sensors, i.e. side-view perspective images and satellite images. Experiments show that our method creates a conforming surface model from these two sources, as well as consistent and well-balanced textures from images with drastically different radiometry (satellite images vs. street-view level images). We compared our proposed pipeline with a typical fusion pipeline, Poisson reconstruction, and the results show that our pipeline has distinctive advantages.
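
The Poisson baseline used for comparison can be approximated with off-the-shelf tooling; the sketch below runs Open3D's Poisson reconstruction with an assumed input file name and octree depth, not the exact configuration from the paper.

```python
import open3d as o3d

def poisson_baseline_mesh(pcd_path="fused_cloud.ply", depth=9):
    """Run a Poisson-reconstruction baseline on a fused point cloud.
    The file name and octree depth are placeholder choices."""
    pcd = o3d.io.read_point_cloud(pcd_path)

    # Poisson reconstruction needs consistently oriented normals
    pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.5, max_nn=30))
    pcd.orient_normals_consistent_tangent_plane(30)

    mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=depth)
    return mesh
```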


2009 ◽  
Vol 89 (2-3) ◽  
pp. 152-176 ◽  
Author(s):  
Nick Pears ◽  
Tom Heseltine ◽  
Marcelo Romero
