PEDESTRIAN DETECTION AND TRACKING IN SPARSE MLS POINT CLOUDS USING A NEURAL NETWORK AND VOTING-BASED APPROACH

Author(s):  
B. Borgmann ◽  
M. Hebel ◽  
M. Arens ◽  
U. Stilla

Abstract. This paper presents and extends an approach for the detection of pedestrians in unstructured point clouds resulting from single MLS (mobile laser scanning) scans. The approach is based on a neural network and a subsequent voting process. The neural network processes point clouds subdivided into local point neighborhoods. The member points of these neighborhoods are processed directly by the network, so no conversion into a structured representation of the data is needed. To improve the results, the network also uses meta information about the neighborhoods themselves, such as their distance to the ground plane. It decides whether the neighborhood is part of an object of interest and estimates the center of that object. This information is then used in a voting process. Searching for maxima in the voting space discriminates between actual objects and incorrectly classified neighborhoods. Since a single labeled object can be subdivided into multiple local neighborhoods, we are able to train the neural network with comparatively small amounts of labeled data. Considerations are made to deal with the varying and sparse point density that is typical for single MLS scans. We supplement the detection with a 3D tracking step which, although straightforward, allows us to deal with objects that are occluded for short periods of time and thereby improves the quality of the results. Overall, our approach performs reasonably well for the detection and tracking of pedestrians in single MLS scans as long as the local point density is not too low. Given the LiDAR sensor we used, this is the case up to distances of 22 m.
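The voting step described in this abstract can be illustrated with a minimal sketch. It assumes that every neighborhood accepted by the classifier contributes one estimated object center, that votes are accumulated in a regular 3D grid, and that a cell counts as a maximum when it holds at least `min_votes` votes and no adjacent cell holds more. The grid, the function name `detect_by_voting`, and the parameters `cell_size` and `min_votes` are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict
from math import floor


def detect_by_voting(votes, cell_size=0.5, min_votes=3):
    """Accumulate object-center votes in a 3D grid and return local maxima.

    votes: iterable of (x, y, z) center estimates, one per neighborhood
           that the classifier accepted (hypothetical input format).
    Returns a list of (center, vote_count) tuples.
    """
    grid = defaultdict(list)
    for v in votes:
        key = tuple(floor(c / cell_size) for c in v)
        grid[key].append(v)

    detections = []
    for key, members in grid.items():
        count = len(members)
        if count < min_votes:
            continue  # too few votes: likely misclassified neighborhoods
        # Keep the cell only if no adjacent cell holds more votes.
        neighbors = [
            (key[0] + dx, key[1] + dy, key[2] + dz)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for dz in (-1, 0, 1)
            if (dx, dy, dz) != (0, 0, 0)
        ]
        if any(len(grid.get(n, ())) > count for n in neighbors):
            continue
        # Report the mean of the votes in the winning cell as the object center.
        center = tuple(sum(m[i] for m in members) / count for i in range(3))
        detections.append((center, count))
    return detections
```

Isolated votes from misclassified neighborhoods rarely reach `min_votes`, which is how a voting scheme of this kind tolerates some classification errors. Two adjacent cells with equal counts are both reported in this sketch; a practical implementation would merge them.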

Author(s):  
B. Borgmann ◽  
M. Hebel ◽  
M. Arens ◽  
U. Stilla

Abstract. This paper presents an approach which uses a PointNet-like neural network to detect objects of certain types in MLS point clouds. In our case, it is used for the detection of pedestrians, but the approach can easily be adapted to other object classes. In the first step, we process local point neighborhoods with the neural network to determine a descriptive feature. This feature is then processed further to generate the two outputs of the network. The first output classifies the neighborhood and determines whether it is part of an object of interest. If this is the case, the second output determines where the neighborhood is located in relation to the object center. This regression output allows us to use a voting process for the actual object detection. This processing step is inspired by approaches based on implicit shape models (ISM). It can cope with a certain number of incorrectly classified neighborhoods, since it combines the results of multiple neighborhoods for the detection of an object. A benefit of our approach compared to other machine learning methods is its low demand for training data. In our experiments, we achieved a promising detection performance even with fewer than 1000 training examples.
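The two-output structure described in this abstract can be sketched as a single forward pass in NumPy. This is an untrained toy with one shared per-point layer, a max-pooling step that produces the descriptive feature, and two linear heads. The layer sizes, the random weights, the centering of the neighborhood, and the names `init_params` and `forward` are assumptions made for illustration and do not reproduce the authors' architecture.

```python
import numpy as np


def init_params(feat_dim=32, seed=0):
    """Random weights for illustration; a real network would be trained."""
    rng = np.random.default_rng(seed)
    return {
        "w_point": rng.normal(size=(3, feat_dim)) * 0.1,  # shared per-point layer
        "w_cls": rng.normal(size=(feat_dim, 1)) * 0.1,    # classification head
        "w_reg": rng.normal(size=(feat_dim, 3)) * 0.1,    # offset regression head
    }


def forward(points, params):
    """Process one local neighborhood given as an (N, 3) array.

    Returns (probability of belonging to an object of interest,
             estimated offset from the neighborhood to the object center).
    """
    # Assumption: coordinates are taken relative to the neighborhood centroid.
    centered = points - points.mean(axis=0)
    # The same weights are applied to every point, followed by a ReLU.
    per_point = np.maximum(centered @ params["w_point"], 0.0)
    # Max pooling over points gives an order-independent descriptive feature.
    feature = per_point.max(axis=0)
    logit = feature @ params["w_cls"]
    prob = 1.0 / (1.0 + np.exp(-logit[0]))
    offset = feature @ params["w_reg"]
    return prob, offset
```

Because the pooling step ignores point order, shuffling the points of a neighborhood leaves both outputs unchanged, which is the property that lets a network of this kind consume unstructured points directly.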


Author(s):  
Z. Lari ◽  
K. Al-Durgham ◽  
A. Habib

Terrestrial laser scanning (TLS) systems have been established as a leading tool for the acquisition of high-density three-dimensional point clouds from physical objects. The point clouds collected by these systems can be utilized for a wide spectrum of object extraction, modelling, and monitoring applications. Pole-like features are among the most important objects that can be extracted from TLS data, especially data acquired in urban areas and industrial sites. However, these features cannot be completely extracted and modelled using a single TLS scan, due to significant local point density variations and occlusions caused by other objects. Therefore, multiple TLS scans from different perspectives should be integrated through a registration procedure to provide complete coverage of the pole-like features in a scene. To date, different segmentation approaches have been proposed for the extraction of pole-like features from either single or multiple-registered TLS scans. These approaches do not consider the internal characteristics of a TLS point cloud (local point density variations and noise level in the data) and usually suffer from computational inefficiency. To overcome these problems, two recently developed PCA-based approaches for the segmentation of pole-like features, one operating in the parameter domain and one in the spatial domain, are introduced in this paper. Moreover, the performance of the proposed segmentation approaches in extracting pole-like features from single or multiple-registered TLS scans is investigated. The alignment of the utilized TLS scans is implemented using an Iterative Closest Projected Point (ICPP) registration procedure. A qualitative and quantitative evaluation of the pole-like features extracted from single and multiple-registered TLS scans, using both of the proposed segmentation approaches, is conducted to verify that multiple-registered TLS scans yield more complete pole-like features.
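As a rough illustration of how PCA can flag pole-like neighborhoods, the sketch below computes the eigenvalues of a local covariance matrix and derives a linearity score from them. The score (largest minus middle eigenvalue, divided by the largest) is a commonly used eigenvalue feature and is an assumption here; the abstract does not state which PCA-derived quantities the parameter-domain and spatial-domain approaches actually use. The function name `pca_linearity` is hypothetical.

```python
import numpy as np


def pca_linearity(neighborhood):
    """Return (linearity, principal axis) of a local point neighborhood.

    neighborhood: sequence of (x, y, z) points.
    A linearity near 1 suggests a linear (pole-like) structure,
    a value near 0 suggests a planar or scattered one.
    """
    pts = np.asarray(neighborhood, dtype=float)
    cov = np.cov(pts.T)                      # 3x3 covariance of the coordinates
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    l2, l1 = eigvals[1], eigvals[2]          # middle and largest eigenvalue
    linearity = (l1 - l2) / l1 if l1 > 0 else 0.0
    axis = eigvecs[:, 2]                     # direction of largest variance
    return linearity, axis
```

For a vertical pole, the returned axis should be close to the z direction, so a check on both the score and the axis orientation could separate poles from other linear features such as cables.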


2021 ◽  
Vol 13 (18) ◽  
pp. 3736
Author(s):  
Sung-Hwan Park ◽  
Hyung-Sup Jung ◽  
Sunmin Lee ◽  
Eun-Sook Kim

The role of forests is growing in importance because rapid land use changes worldwide have implications for ecosystems and the carbon cycle. Therefore, it is necessary to obtain accurate information about forests and to build forest inventories. However, it is difficult to assess the internal structure of a forest through 2D remote sensing techniques and fieldwork. In this study, we therefore propose a method for estimating the vertical structure of forests based on full-waveform light detection and ranging (FW LiDAR) data. After pre-processing the LiDAR point clouds, voxel-based tree point density maps were generated by counting the canopy height points in each voxel grid, using the raster digital terrain model (DTM) and the canopy height points. We applied an unsupervised classification algorithm to the voxel-based tree point density maps and identified seven classes of forest vertical types by profile pattern analysis. Validation against 11 field investigation sites gave a classification accuracy of 72.73%, which was additionally confirmed through comparative analysis with aerial images. Based on this pre-classification reference map, which is assumed to be ground truth, a deep neural network (DNN) model was then applied to perform the final classification. The accuracy assessment showed an accuracy of 92.72%. These results demonstrate the potential of vertical structure estimation for extensive forests using FW LiDAR data and show that the distinction between one-storied and two-storied forests can be clearly represented. The proposed method is expected to contribute to efficient and effective forest management based on the accurate information it provides.
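The voxel counting step can be sketched as follows, assuming the point cloud is a list of (x, y, z) tuples and the DTM is available as a lookup function that returns the terrain elevation at a horizontal position. The cell size, the layer thickness, and the name `voxel_density_profile` are illustrative assumptions; the abstract does not give the voxel dimensions used in the study.

```python
from collections import Counter
from math import floor


def voxel_density_profile(points, ground_height, cell=10.0, layer=1.0):
    """Count canopy height points per voxel.

    points:        iterable of (x, y, z) LiDAR returns.
    ground_height: function (x, y) -> terrain elevation, standing in for the DTM.
    cell:          horizontal voxel size, in the units of x and y.
    layer:         vertical voxel size, in the units of z.
    Returns a Counter keyed by (column, row, height_layer).
    """
    counts = Counter()
    for x, y, z in points:
        height = z - ground_height(x, y)  # height above terrain
        if height < 0:
            continue  # below the terrain model: treat as noise
        key = (floor(x / cell), floor(y / cell), floor(height / layer))
        counts[key] += 1
    return counts
```

Reading the counts of one (column, row) cell from the lowest to the highest layer gives a vertical density profile, which is the kind of pattern an unsupervised classifier could group into vertical structure types.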


Author(s):  
A. V. Vo ◽  
C. N. Lokugam Hewage ◽  
N. A. Le Khac ◽  
M. Bertolotto ◽  
D. Laefer

Abstract. Point density is an important property that dictates the usability of a point cloud data set. This paper introduces an efficient, scalable, parallel algorithm for computing the local point density index, a sophisticated point cloud density metric. Computing the local point density index is non-trivial because it requires a neighbour search for every individual point in a potentially large input point cloud. Most existing algorithms and software are incapable of computing point density at scale. Therefore, the algorithm introduced in this paper aims to provide the computational efficiency and scalability needed to consider this factor in large, modern point clouds such as those collected in national or regional scans. The proposed algorithm is composed of two stages. In stage 1, a point-level, parallel processing step is performed to partition an unstructured input point cloud into partially overlapping, buffered tiles. A buffer is provided around each tile so that the data partitioning does not introduce spatial discontinuity into the final results. In stage 2, the buffered tiles are distributed to different processors for computing the local point density index in parallel. That tile-level parallel processing step is performed using a conventional algorithm with an R-tree data structure. While straightforward, the proposed algorithm is efficient and particularly suitable for processing large point clouds. Experiments conducted using a 1.4 billion point data set acquired over part of Dublin, Ireland demonstrated an efficiency factor of up to 14.8/16. More specifically, the computational time fell by a factor of 14.8 when the number of processes (i.e. executors) increased by a factor of 16. Computing the local point density index for the 1.4 billion point data set took just over 5 minutes with 16 executors and 8 cores per executor. This is a nearly 70-fold reduction compared with the 6 hours required without parallelism.
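The two stages can be sketched sequentially in plain Python. Several simplifications are assumptions made for illustration: the density measure is the number of neighbours within a fixed radius (a stand-in for the local point density index, whose definition the abstract does not give), the neighbour search is brute force rather than R-tree based, tiles are processed in a loop instead of being distributed to executors, and points are assumed to be unique tuples. All function and parameter names are hypothetical.

```python
from collections import defaultdict
from math import dist, floor


def buffered_tiles(points, tile, buffer):
    """Stage 1: assign every point to its own tile as a core point and to
    every tile whose buffered extent contains it."""
    tiles = defaultdict(lambda: {"core": [], "all": []})
    for p in points:
        tiles[(floor(p[0] / tile), floor(p[1] / tile))]["core"].append(p)
        x_range = range(floor((p[0] - buffer) / tile), floor((p[0] + buffer) / tile) + 1)
        y_range = range(floor((p[1] - buffer) / tile), floor((p[1] + buffer) / tile) + 1)
        for jx in x_range:
            for jy in y_range:
                tiles[(jx, jy)]["all"].append(p)
    return tiles


def tile_density(tile_data, radius):
    """Stage 2 for one tile: neighbour count within `radius` for each core
    point, searching the core points plus the buffer."""
    return {
        p: sum(1 for q in tile_data["all"] if dist(p, q) <= radius) - 1  # exclude p itself
        for p in tile_data["core"]
    }


def local_density(points, tile=10.0, buffer=1.0, radius=1.0):
    """Run both stages; the buffer must be at least as wide as the radius."""
    result = {}
    for tile_data in buffered_tiles(points, tile, buffer).values():
        result.update(tile_density(tile_data, radius))  # independent per tile
    return result
```

Because each tile carries its own buffer, the per-tile results match what a single global search would return, so the loop in `local_density` could be replaced by any parallel map without changing the output.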


Author(s):  
Qin Qin ◽  
Josef Vychodil ◽  
◽  

This paper proposes a new CNN-based (convolutional neural network) method for detecting multiple local pedestrian features, which provides a reliable basis for multi-feature fusion in pedestrian detection. The pedestrian within the detection window is segmented according to the standard pedestrian detection ratio, and the sample labels guide the CNN in learning local characteristics. After supervised learning, the network yields fused local features with a stronger ability to describe pedestrians. A large number of experiments were performed. The results show that the local features learned by the neural network perform better than most pedestrian features and combined features.


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8382
Author(s):  
Hongjae Lee ◽  
Jiyoung Jung

Urban scene modeling is a challenging but essential task for various applications, such as 3D map generation, city digitization, and AR/VR/metaverse applications. To model man-made structures, such as roads and buildings, which are the major components in general urban scenes, we present a clustering-based plane segmentation neural network using 3D point clouds, called hybrid K-means plane segmentation (HKPS). The proposed method segments unorganized 3D point clouds into planes by training the neural network to estimate the appropriate number of planes in the point cloud based on hybrid K-means clustering. We consider both the Euclidean distance and cosine distance to cluster nearby points in the same direction for better plane segmentation results. Our network does not require any labeled information for training. We evaluated the proposed method using the Virtual KITTI dataset and showed that our method outperforms conventional methods in plane segmentation. Our code is publicly available.
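The idea of combining Euclidean and cosine distance in K-means can be illustrated with the plain-Python sketch below. It assumes that each point comes with a direction vector (for instance an estimated surface normal), that the cosine distance is applied to those vectors, and that the number of clusters is fixed by the caller. In HKPS a neural network estimates the number of planes, which this sketch does not attempt. The weighting scheme and all names are hypothetical.

```python
from math import dist, sqrt


def cosine_distance(u, v):
    """1 - cos(angle between u and v); returns 1.0 if either vector is zero."""
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    if norm == 0:
        return 1.0
    return 1.0 - sum(a * b for a, b in zip(u, v)) / norm


def hybrid_distance(p, n, center, center_dir, weight):
    """Euclidean distance between positions plus weighted cosine distance
    between direction vectors."""
    return dist(p, center) + weight * cosine_distance(n, center_dir)


def hybrid_kmeans(points, directions, init_centers, init_dirs, weight=1.0, iters=10):
    """K-means with the hybrid distance; returns one cluster label per point."""
    centers, dirs = list(init_centers), list(init_dirs)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center under the hybrid distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: hybrid_distance(p, n, centers[k], dirs[k], weight))
            for p, n in zip(points, directions)
        ]
        # Update step: mean position and mean direction of each cluster.
        for k in range(len(centers)):
            idx = [i for i, lab in enumerate(labels) if lab == k]
            if not idx:
                continue  # keep the previous center for an empty cluster
            centers[k] = tuple(sum(points[i][d] for i in idx) / len(idx) for d in range(3))
            dirs[k] = tuple(sum(directions[i][d] for i in idx) / len(idx) for d in range(3))
    return labels
```

With a sufficiently large `weight`, points on a floor and on an adjoining wall fall into separate clusters even where the two planes meet, because their directions differ although their positions are close.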


Author(s):  
F. Pirotti ◽  
F. Tonion

Abstract. This investigation compares two machine learning (ML) models for semantic classification of an aerial laser scanner point cloud. One model is Random Forest (RF); the other is a multi-layer neural network built with TensorFlow (TF). Accuracy results were compared over a growing set of training data, using stratified independent sampling over classes from 5% to 50% of the total dataset. Results show that RF had an average F1 = 0.823 for the 9 classes considered, whereas TF had an average F1 = 0.450. F1 values were higher for RF than for TF because a suitable composition of the hidden layers of the neural network in TF is difficult to determine; this can likely be improved to reach higher accuracy values. Further study in this direction is planned.
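For readers unfamiliar with the metric, the sketch below computes per-class F1 scores and their mean from lists of true and predicted labels. It assumes that the reported "average F1" is the unweighted mean over classes, which the abstract does not confirm. The function names and the class labels in the usage note are hypothetical.

```python
def per_class_f1(y_true, y_pred):
    """Return {class: F1} with F1 = 2*TP / (2*TP + FP + FN) for each class."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def average_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

For example, with true labels `["ground", "ground", "tree", "tree", "roof"]` and predictions `["ground", "tree", "tree", "tree", "roof"]`, the per-class scores are 2/3 for ground, 4/5 for tree, and 1 for roof.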

