SIDELOADING – INGESTION OF LARGE POINT CLOUDS INTO THE APACHE SPARK BIG DATA ENGINE

Author(s):  
J. Boehm ◽  
K. Liu ◽  
C. Alis

In the geospatial domain we have now reached the point where the data volumes we handle have clearly grown beyond the capacity of most desktop computers. This is particularly true in the area of point cloud processing. It is therefore natural and attractive to explore established big data frameworks for big geospatial data. The very first hurdle is the import of geospatial data into big data frameworks, commonly referred to as data ingestion. Geospatial data is typically encoded in specialised binary file formats, which are not natively supported by existing big data frameworks. Instead, such file formats are supported by software libraries that are restricted to single-CPU execution. We present an approach that allows the use of existing point cloud file format libraries on the Apache Spark big data framework. We demonstrate the ingestion of large volumes of point cloud data into a compute cluster. The approach uses a map function to distribute the data ingestion across the nodes of a cluster. We test the capabilities of the proposed method to load billions of points into a commodity hardware compute cluster, and we discuss the implications for scalability and performance. The performance is benchmarked against an existing native Apache Spark data import implementation.
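A minimal PySpark sketch of this sideloading pattern, assuming the laspy library for reading LAS files and an illustrative input directory (neither is confirmed by the abstract), might look as follows:

# Minimal PySpark sketch of "sideloading": each worker reads whole LAS files
# with an ordinary single-CPU point cloud library (here laspy, as an example).
# Paths, column choices and cluster settings are illustrative assumptions,
# not the authors' exact implementation.
import glob
import numpy as np
import laspy
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("las-sideloading").getOrCreate()
sc = spark.sparkContext

def read_las(path):
    """Read one LAS file on a worker and return its points as (x, y, z) tuples."""
    las = laspy.read(path)
    xyz = np.vstack((las.x, las.y, las.z)).T
    return [tuple(row) for row in xyz]

# Distribute the list of file names; each worker ingests its share of files.
files = glob.glob("/data/pointclouds/*.las")   # hypothetical input directory
points = sc.parallelize(files, max(len(files), 1)).flatMap(read_las)

print(points.count())   # total number of ingested points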


Author(s):  
C. Alis ◽  
J. Boehm ◽  
K. Liu

As laser scanning technology improves and costs come down, the amount of point cloud data being generated can be prohibitively difficult and expensive to process on a single machine. This data explosion is not limited to point cloud data. Voluminous amounts of high-dimensional, quickly accumulating data, collectively known as Big Data, such as those generated by social media, Internet of Things devices and commercial transactions, are becoming more prevalent as well. New computing paradigms and frameworks are being developed to efficiently handle the processing of Big Data, many of which utilize a compute cluster composed of several commodity-grade machines to process chunks of data in parallel.

A central concept in many of these frameworks is data locality. By its nature, Big Data is large enough that the entire dataset would not fit in the memory or on the hard drives of a single node, hence replicating the entire dataset to each worker node is impractical. The data must instead be partitioned across worker nodes in a manner that minimises data transfer across the network. This is a challenge for point cloud data because it can be partitioned in many different ways, some of which require considerable data transfer.

We propose a partitioning based on the Z-order, which is a form of locality-sensitive hashing. The Z-order or Morton code is computed by dividing each dimension to form a grid and then interleaving the binary representations of the dimensions. For example, the Z-order code for the grid square with coordinates (x = 1 = 01 in binary, y = 3 = 11 in binary) is 1011 in binary, i.e. 11. The number of points in each partition is controlled by the number of bits per dimension: the more bits, the fewer the points. The number of bits per dimension also controls the level of detail, with more bits yielding a finer partitioning. We present this partitioning method by implementing it on Apache Spark and investigating how different parameters affect the accuracy and running time of the k nearest neighbour algorithm for a hemispherical and a triangular-wave point cloud.
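As an illustration of the interleaving step, the following Python sketch computes a 2D Morton key and reproduces the worked example above; the grid parameters and the PySpark wiring shown in the comments are assumptions, not the paper's exact implementation:

# Illustrative sketch of Z-order (Morton) keys used as partition keys.
def morton_key(x, y, x_min, y_min, cell, bits):
    """Interleave the bits of the grid indices of (x, y) into one Z-order key."""
    ix = min(int((x - x_min) / cell), (1 << bits) - 1)
    iy = min(int((y - y_min) / cell), (1 << bits) - 1)
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)        # x bit -> even position
        key |= ((iy >> b) & 1) << (2 * b + 1)    # y bit -> odd position
    return key

# Reproduces the example from the text: x = 1 (01), y = 3 (11) -> 1011 = 11.
assert morton_key(1, 3, 0, 0, 1, 2) == 11

# With PySpark, such a key could drive the partitioning, e.g.:
# keyed = points.map(lambda p: (morton_key(p[0], p[1], x0, y0, cell, bits), p))
# partitioned = keyed.partitionBy(num_partitions)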


Author(s):  
Jiayong Yu ◽  
Longchen Ma ◽  
Maoyi Tian ◽
Xiushan Lu

The unmanned aerial vehicle (UAV)-mounted mobile LiDAR system (ULS) is widely used in geomatics owing to its efficient data acquisition and convenient operation. However, because of the limited carrying capacity of a UAV, the sensors integrated in the ULS must be small and lightweight, which results in a decrease in the density of the collected scanning points. This affects registration between image data and point cloud data. To address this issue, the authors propose a method for registering and fusing ULS sequence images and laser point clouds, wherein the problem of registering point cloud data and image data is converted into a problem of matching feature points between two images. First, a point cloud is selected to produce an intensity image. Subsequently, the corresponding feature points of the intensity image and the optical image are matched, and the exterior orientation parameters are solved using the collinearity equations based on image position and orientation. Finally, the sequence images are fused with the laser point cloud, based on the Global Navigation Satellite System (GNSS) time index of the optical image, to generate a true color point cloud. The experimental results show the higher registration accuracy and fusion speed of the proposed method, thereby demonstrating its accuracy and effectiveness.
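A rough, hypothetical Python/OpenCV sketch of the matching-and-orientation idea: features are matched between the intensity image and the optical image, and a PnP solve stands in for the collinearity-based adjustment; the pixel-to-3D lookup table and the camera intrinsics are assumed inputs, not part of the published method.

import cv2
import numpy as np

def estimate_exterior_orientation(intensity_img, optical_img, pixel_to_xyz, K):
    """intensity_img / optical_img: 8-bit grayscale images.
    pixel_to_xyz: dict mapping an intensity-image pixel (u, v) -> 3D point (X, Y, Z).
    K: 3x3 intrinsic matrix of the optical camera."""
    orb = cv2.ORB_create(4000)
    k1, d1 = orb.detectAndCompute(intensity_img, None)
    k2, d2 = orb.detectAndCompute(optical_img, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:200]

    obj_pts, img_pts = [], []
    for m in matches:
        u, v = map(int, k1[m.queryIdx].pt)
        if (u, v) in pixel_to_xyz:                 # 3D coordinate of the LiDAR pixel
            obj_pts.append(pixel_to_xyz[(u, v)])
            img_pts.append(k2[m.trainIdx].pt)      # its match in the optical image

    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.array(obj_pts, dtype=np.float32),
        np.array(img_pts, dtype=np.float32),
        K, None)
    return rvec, tvec    # exterior orientation (rotation, translation)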


2021 ◽  
Vol 13 (11) ◽  
pp. 2195
Author(s):  
Shiming Li ◽  
Xuming Ge ◽  
Shengfu Li ◽  
Bo Xu ◽  
Zhendong Wang

Today, mobile laser scanning (MLS) and oblique photogrammetry are two standard urban remote sensing acquisition methods, and the cross-source point-cloud data obtained using these methods have significant differences and complementarity. Accurate co-registration can make up for the limitations of a single data source, but many existing registration methods face critical challenges. Therefore, in this paper, we propose a systematic incremental registration method that can successfully register MLS and photogrammetric point clouds in the presence of large amounts of missing data, large variations in point density, and scale differences. The robustness of this method is due to its elimination of noise in the extracted linear features and its 2D incremental registration strategy. Our work makes three main contributions: (1) an end-to-end automatic cross-source point-cloud registration method; (2) an effective way to extract linear features and restore the scale; and (3) an incremental registration strategy that simplifies the complex registration process. The experimental results show that this method can successfully achieve cross-source data registration, while other methods have difficulty obtaining satisfactory registration results efficiently. Moreover, this method can be extended to more point-cloud sources.
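Purely as a generic illustration of the scale-restoration sub-step mentioned above (not the paper's algorithm), the ratio of lengths of matched linear features could be used as follows; the function and variable names are hypothetical:

import numpy as np

def estimate_scale(mls_segments, photo_segments):
    """Each argument: list of (p_start, p_end) 3D endpoints of matched line
    segments, given in the same order so index i corresponds to index i."""
    ratios = []
    for (a0, a1), (b0, b1) in zip(mls_segments, photo_segments):
        len_mls = np.linalg.norm(np.asarray(a1) - np.asarray(a0))
        len_photo = np.linalg.norm(np.asarray(b1) - np.asarray(b0))
        if len_photo > 1e-9:
            ratios.append(len_mls / len_photo)
    return float(np.median(ratios))   # median is robust to outlier matches

# scale = estimate_scale(mls_lines, photo_lines)
# photogrammetric_points *= scale    # bring the photo cloud to the MLS scale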


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 884
Author(s):  
Chia-Ming Tsai ◽  
Yi-Horng Lai ◽  
Yung-Da Sun ◽  
Yu-Jen Chung ◽  
Jau-Woei Perng

Numerous sensors can obtain images or point cloud data on land; underwater, however, the rapid attenuation of electromagnetic signals and the lack of light restrict sensing capabilities. This study expands the use of two- and three-dimensional detection technologies to the underwater detection of abandoned tires. A three-dimensional acoustic sensor, the BV5000, is used in this study to collect underwater point cloud data. Several pre-processing steps are proposed to remove noise and the seabed from the raw data. The point clouds are then processed to obtain two data types: a 2D image and a 3D point cloud. Deep learning methods operating in the corresponding number of dimensions are used to train the models. In the two-dimensional method, the point cloud is converted into a bird's-eye-view image, and the Faster R-CNN and YOLOv3 network architectures are used to detect tires. In the three-dimensional method, the point cloud associated with a tire is cut out of the raw data and used as training data; the PointNet and PointConv network architectures are then used for tire classification. The results show that both approaches provide good accuracy.
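A minimal sketch of the 2D route, assuming a simple height-based rasterisation into a bird's-eye-view image (the grid size and the choice of a single max-height channel are illustrative, not the study's settings):

import numpy as np

def point_cloud_to_bev(points, cell_size=0.05):
    """points: (N, 3) array of x, y, z after noise and seabed removal.
    Returns an 8-bit, single-channel bird's-eye-view height image."""
    xy_min = points[:, :2].min(axis=0)
    size = np.ceil((points[:, :2].max(axis=0) - xy_min) / cell_size).astype(int) + 1
    z_min, z_max = points[:, 2].min(), points[:, 2].max()
    bev = np.full((size[1], size[0]), z_min, dtype=np.float32)  # rows = y, cols = x

    cells = ((points[:, :2] - xy_min) / cell_size).astype(int)
    for (c, r), z in zip(cells, points[:, 2]):
        bev[r, c] = max(bev[r, c], z)                # keep the highest return per cell

    bev = (bev - z_min) / max(z_max - z_min, 1e-9)   # normalise heights to 0..1
    return (bev * 255).astype(np.uint8)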


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 201
Author(s):  
Michael Bekele Maru ◽  
Donghwan Lee ◽  
Kassahun Demissie Tola ◽  
Seunghee Park

Modeling a structure in the virtual world using three-dimensional (3D) information enhances our understanding of how the structure reacts to any disturbance, while also aiding its visualization. Generally, 3D point clouds are used for determining structural behavioral changes. Light detection and ranging (LiDAR) is one of the principal ways by which a 3D point cloud dataset can be generated. Additionally, 3D cameras are commonly used to develop a point cloud containing many points on the external surface of an object. The main objective of this study was to compare the performance of two optical sensors, namely a depth camera (DC) and a terrestrial laser scanner (TLS), in estimating structural deflection. We also applied bilateral filtering, a technique commonly used in image processing, to the point cloud data to enhance its accuracy and increase the application prospects of these sensors in structural health monitoring. The results from these sensors were validated by comparing them with the outputs from a linear variable differential transformer sensor, which was mounted on the beam during an indoor experiment. The results showed that the datasets obtained from both sensors were acceptable for nominal deflections of 3 mm and above because the error range was less than ±10%. However, the results obtained from the TLS were better than those obtained from the DC.
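As a sketch of the filtering idea, an edge-preserving bilateral filter can be applied to the depth camera's depth image before it is converted to a point cloud; the parameter values below are illustrative assumptions, not those used in the study:

import cv2
import numpy as np

def filter_depth(depth_mm):
    """depth_mm: (H, W) float32 depth image in millimetres from the depth camera."""
    smoothed = cv2.bilateralFilter(depth_mm.astype(np.float32),
                                   d=9,            # neighbourhood diameter (pixels)
                                   sigmaColor=25,  # depth-difference weight (mm)
                                   sigmaSpace=9)   # spatial weight (pixels)
    return smoothed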


Aerospace ◽  
2018 ◽  
Vol 5 (3) ◽  
pp. 94
Author(s):  
Hriday Bavle ◽  
Jose Sanchez-Lopez ◽  
Paloma Puente ◽  
Alejandro Rodriguez-Ramos ◽  
Carlos Sampedro ◽  
...  

This paper presents a fast and robust approach for estimating the flight altitude of multirotor Unmanned Aerial Vehicles (UAVs) using 3D point cloud sensors in cluttered, unstructured, and dynamic indoor environments. The objective is to present a flight altitude estimation algorithm that replaces conventional sensors such as laser altimeters, barometers, or accelerometers, which have several limitations when used individually. Our proposed algorithm includes two stages: in the first stage, a fast clustering of the measured 3D point cloud data is performed, along with the segmentation of the clustered data into horizontal planes. In the second stage, these segmented horizontal planes are mapped based on their vertical distance with respect to the point cloud sensor frame of reference, in order to provide a robust flight altitude estimate even in the presence of several static as well as dynamic ground obstacles. We validate our approach using the IROS 2011 Kinect dataset available in the literature, estimating the altitude of the RGB-D camera using the provided 3D point clouds. We further validate our approach using a point cloud sensor on board a UAV in several autonomous real flights, closing its altitude control loop with the flight altitude estimated by our proposed method in the presence of several different static as well as dynamic ground obstacles. In addition, the implementation of our approach has been integrated into our open-source software framework for aerial robotics, called Aerostack.
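A simplified, hypothetical illustration of the two-stage idea, grouping points of a downward-looking cloud into horizontal slabs and taking the farthest dense slab below the sensor as the ground (this is a coarse stand-in, not the authors' clustering and plane-segmentation pipeline):

import numpy as np

def estimate_altitude(points, bin_size=0.05, min_points=200):
    """points: (N, 3) array in the sensor frame, with z pointing down to the ground."""
    z = points[:, 2]
    bins = np.arange(z.min(), z.max() + bin_size, bin_size)
    counts, edges = np.histogram(z, bins=bins)

    plane_bins = np.where(counts >= min_points)[0]   # dense slabs = candidate planes
    if plane_bins.size == 0:
        return None
    ground_bin = plane_bins[-1]                      # farthest dense slab below the sensor
    in_ground = (z >= edges[ground_bin]) & (z < edges[ground_bin + 1])
    return float(np.mean(z[in_ground]))              # altitude above the ground plane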


Author(s):  
Franco Spettu ◽  
Simone Teruggi ◽  
Francesco Canali ◽  
Cristiana Achille ◽  
Francesco Fassi

Cultural Heritage (CH) 3D digitisation is receiving increasing attention and importance. Advanced survey techniques provide as output a 3D point cloud that wholly and accurately describes even the most complex architectural geometry with an a priori established accuracy. These 3D point models are generally used as the basis for producing 2D technical drawings and advanced 3D representations. During the last 12 years, the 3DSurveyGroup (3DSG, Politecnico di Milano) conducted a comprehensive, multi-technique survey, obtaining the full point cloud of Milan Cathedral, from which the 2D technical drawings and the 3D model of the Main Spire were produced; these are used by the Veneranda Fabbrica del Duomo di Milano (VF) to plan its periodic maintenance and inspection activities on the Cathedral. Using the survey product directly to plan VF activities would avoid a long, uneconomical and manual process of extracting 2D and 3D technical elaborations. To do so, the unstructured point cloud data must be enriched with semantics, providing a hierarchical structure that can communicate with a powerful, flexible information system able to effectively manage both point clouds and 3D geometries as hybrid models. For this purpose, the point cloud was segmented using a machine-learning algorithm with a multi-level multi-resolution (MLMR) approach in order to obtain a manageable, reliable and repeatable dataset. This reverse-engineering process made it possible to identify the main architectural elements directly on the point cloud; these are then re-organised in a logical structure inserted into the information system built inside the 3DExperience environment, developed by Dassault Systèmes.
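A schematic sketch in the spirit of a multi-level classification pass as described above; the feature sets, class names and choice of classifier are illustrative assumptions, not the authors' pipeline:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def classify_level(train_features, train_labels, query_features):
    """Train a classifier for one level of the hierarchy and label the query points."""
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(train_features, train_labels)
    return clf.predict(query_features)

# Level 1 (coarse resolution): macro-elements such as spire / roof / facade ...
# macro = classify_level(coarse_train_feats, coarse_train_labels, cloud_coarse_feats)
# Level 2 (finer resolution, run separately inside each macro-element):
# for element in np.unique(macro):
#     mask = macro == element
#     detail[mask] = classify_level(fine_train_feats[element], fine_train_labels[element],
#                                   cloud_fine_feats[mask])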

