Automatic 3D Landmark Extraction System Based on an Encoder–Decoder Using Fusion of Vision and LiDAR

To provide a realistic environment for remote sensing applications, point clouds are used to realize a three-dimensional (3D) digital world for the user. Motion recognition of objects, e.g., humans, is required to provide realistic experiences in the 3D digital world. To recognize a user’s motions, 3D landmarks are provided by analyzing a 3D point cloud collected through a light detection and ranging (LiDAR) system or a red green blue (RGB) image collected visually. However, manual supervision is required to extract 3D landmarks as to whether they originate from the RGB image or the 3D point cloud. Thus, there is a need for a method for extracting 3D landmarks without manual supervision. Herein, an RGB image and a 3D point cloud are used to extract 3D landmarks. The 3D point cloud is utilized as the relative distance between a LiDAR and a user. Because it cannot contain all information the user’s entire body due to disparities, it cannot generate a dense depth image that provides the boundary of user’s body. Therefore, up-sampling is performed to increase the density of the depth image generated based on the 3D point cloud; the density depends on the 3D point cloud. This paper proposes a system for extracting 3D landmarks using 3D point clouds and RGB images without manual supervision. A depth image provides the boundary of a user’s motion and is generated by using 3D point cloud and RGB image collected by a LiDAR and an RGB camera, respectively. To extract 3D landmarks automatically, an encoder–decoder model is trained with the generated depth images, and the RGB images and 3D landmarks are extracted from these images with the trained encoder model. The method of extracting 3D landmarks using RGB depth (RGBD) images was verified experimentally, and 3D landmarks were extracted to evaluate the user’s motions with RGBD images. In this manner, landmarks could be extracted according to the user’s motions, rather than by extracting them using the RGB images. The depth images generated by the proposed method were 1.832 times denser than the up-sampling-based depth images generated with bilateral filtering.

Download Full-text

Locating Mechanical Switches Using RGB-D Sensor Mounted on a Disaster Response Robot

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.6.iriacv-016 ◽

2020 ◽

Vol 2020 (6) ◽

pp. 16-1-16-7

Author(s):

Takuya Kanda ◽

Kazuya Miyakawa ◽

Jeonghwang Hayashi ◽

Jun Ohya ◽

Hiroyuki Ogata ◽

...

Keyword(s):

Disaster Response ◽

Point Cloud ◽

Real Space ◽

Depth Image ◽

3D Point Cloud ◽

Point Cloud Data ◽

Cloud Data ◽

Depth Images ◽

Bounding Box ◽

Rgb Image

To achieve one of the tasks required for disaster response robots, this paper proposes a method for locating 3D structured switches’ points to be pressed by the robot in disaster sites using RGBD images acquired by Kinect sensor attached to our disaster response robot. Our method consists of the following five steps: 1)Obtain RGB and depth images using an RGB-D sensor. 2) Detect the bounding box of switch area from the RGB image using YOLOv3. 3)Generate 3D point cloud data of the target switch by combining the bounding box and the depth image.4)Detect the center position of the switch button from the RGB image in the bounding box using Convolutional Neural Network (CNN). 5)Estimate the center of the button’s face in real space from the detection result in step 4) and the 3D point cloud data generated in step3) In the experiment, the proposed method is applied to two types of 3D structured switch boxes to evaluate the effectiveness. The results show that our proposed method can locate the switch button accurately enough for the robot operation.

Download Full-text

DETERMINATION OF PARKING SPACE AND ITS CONCURRENT USAGE OVER TIME USING SEMANTICALLY SEGMENTED MOBILE MAPPING DATA

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2021-185-2021 ◽

2021 ◽

Vol XLIII-B2-2021 ◽

pp. 185-192

Author(s):

A. Leichter ◽

U. Feuerhake ◽

M. Sester

Keyword(s):

Public Space ◽

Point Cloud ◽

Point Clouds ◽

Parking Space ◽

3D Point Cloud ◽

Cloud Data ◽

3D Point Clouds ◽

3D Data ◽

Rgb Images ◽

Bounding Boxes

Abstract. Public space is a scarce good in cities. There are many concurrent usages, which makes an adequate allocation of space both difficult and highly attractive. A lot of space is allocated by parking cars – even if the parking spaces are not occupied by cars all the time. In this work, we analyze space demand and usage by parking cars, in order to evaluate, when this space could be used for other purposes. The analysis is based on 3D point clouds acquired at several times during a day. We propose a processing pipeline to extract car bounding boxes from a given 3D point cloud. For the car extraction we utilize a label transfer technique for transfers from semantically segmented 2D RGB images to 3D point cloud data. This semantically segmented 3D data allows us to identify car instances. Subsequently, we aggregate and analyze information about parking cars. We present an exemplary analysis of the urban area where we extracted 15.000 cars at five different points in time. Based on this aggregated we present analytical results for time dependent parking behavior, parking space availability and utilization.

Download Full-text

Food Volume Estimation Based on Deep Learning View Synthesis from a Single Depth Map

Nutrients ◽

10.3390/nu10122005 ◽

2018 ◽

Vol 10 (12) ◽

pp. 2005 ◽

Cited By ~ 12

Author(s):

Frank Lo ◽

Yingnan Sun ◽

Jianing Qiu ◽

Benny Lo

Keyword(s):

Neural Network ◽

Deep Learning ◽

Point Cloud ◽

Volume Estimation ◽

Assessment System ◽

View Synthesis ◽

Depth Image ◽

3D Point Cloud ◽

Viewing Angle ◽

3D Point Clouds

An objective dietary assessment system can help users to understand their dietary behavior and enable targeted interventions to address underlying health problems. To accurately quantify dietary intake, measurement of the portion size or food volume is required. For volume estimation, previous research studies mostly focused on using model-based or stereo-based approaches which rely on manual intervention or require users to capture multiple frames from different viewing angles which can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate the volume from a single depth image. A distinct neural network is designed to use a depth image from one viewing angle to predict another depth image captured from the corresponding opposite viewing angle. The whole 3D point cloud map is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database with depth images of food object items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated by the synthesized 3D point cloud with the ground truth volume of the object items.

Download Full-text

Fast and Robust Flight Altitude Estimation of Multirotor UAVs in Dynamic Unstructured Environments Using 3D Point Cloud Sensors

Aerospace ◽

10.3390/aerospace5030094 ◽

2018 ◽

Vol 5 (3) ◽

pp. 94 ◽

Cited By ~ 5

Author(s):

Hriday Bavle ◽

Jose Sanchez-Lopez ◽

Paloma Puente ◽

Alejandro Rodriguez-Ramos ◽

Carlos Sampedro ◽

...

Keyword(s):

Point Cloud ◽

Point Clouds ◽

Estimation Algorithm ◽

Indoor Environments ◽

3D Point Cloud ◽

Flight Altitude ◽

Cloud Data ◽

3D Point Clouds ◽

Laser Altimeters ◽

Altitude Estimation

This paper presents a fast and robust approach for estimating the flight altitude of multirotor Unmanned Aerial Vehicles (UAVs) using 3D point cloud sensors in cluttered, unstructured, and dynamic indoor environments. The objective is to present a flight altitude estimation algorithm, replacing the conventional sensors such as laser altimeters, barometers, or accelerometers, which have several limitations when used individually. Our proposed algorithm includes two stages: in the first stage, a fast clustering of the measured 3D point cloud data is performed, along with the segmentation of the clustered data into horizontal planes. In the second stage, these segmented horizontal planes are mapped based on the vertical distance with respect to the point cloud sensor frame of reference, in order to provide a robust flight altitude estimation even in presence of several static as well as dynamic ground obstacles. We validate our approach using the IROS 2011 Kinect dataset available in the literature, estimating the altitude of the RGB-D camera using the provided 3D point clouds. We further validate our approach using a point cloud sensor on board a UAV, by means of several autonomous real flights, closing its altitude control loop using the flight altitude estimated by our proposed method, in presence of several different static as well as dynamic ground obstacles. In addition, the implementation of our approach has been integrated in our open-source software framework for aerial robotics called Aerostack.

Download Full-text

Multi-view attention-convolution pooling network for 3D point cloud classification

Applied Intelligence ◽

10.1007/s10489-021-02840-2 ◽

2021 ◽

Author(s):

Wenju Wang ◽

Tao Wang ◽

Yu Cai

Keyword(s):

Point Cloud ◽

Point Clouds ◽

Information Loss ◽

Feature Representation ◽

3D Point Cloud ◽

Current Output ◽

3D Point Clouds ◽

Cloud Classification ◽

Feature Information ◽

Point Cloud Classification

AbstractClassifying 3D point clouds is an important and challenging task in computer vision. Currently, classification methods using multiple views lose characteristic or detail information during the representation or processing of views. For this reason, we propose a multi-view attention-convolution pooling network framework for 3D point cloud classification tasks. This framework uses Res2Net to extract the features from multiple 2D views. Our attention-convolution pooling method finds more useful information in the input data related to the current output, effectively solving the problem of feature information loss caused by feature representation and the detail information loss during dimensionality reduction. Finally, we obtain the probability distribution of the model to be classified using a full connection layer and the softmax function. The experimental results show that our framework achieves higher classification accuracy and better performance than other contemporary methods using the ModelNet40 dataset.

Download Full-text

IMAGE TO POINT CLOUD TRANSLATION USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORK FOR AIRBORNE LIDAR DATA

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2021-169-2021 ◽

2021 ◽

Vol V-2-2021 ◽

pp. 169-174

Author(s):

T. Shinohara ◽

H. Xiu ◽

M. Matsuoka

Keyword(s):

Point Cloud ◽

Large Scale ◽

Point Clouds ◽

Airborne Lidar ◽

Real Point ◽

3D Point Cloud ◽

Generative Adversarial Network ◽

Adversarial Network ◽

3D Point Clouds ◽

Latent Features

Abstract. This study introduces a novel image to a 3D point-cloud translation method with a conditional generative adversarial network that creates a large-scale 3D point cloud. This can generate supervised point clouds observed via airborne LiDAR from aerial images. The network is composed of an encoder to produce latent features of input images, generator to translate latent features to fake point clouds, and discriminator to classify false or real point clouds. The encoder is a pre-trained ResNet; to overcome the difficulty of generating 3D point clouds in an outdoor scene, we use a FoldingNet with features from ResNet. After a fixed number of iterations, our generator can produce fake point clouds that correspond to the input image. Experimental results show that our network can learn and generate certain point clouds using the data from the 2018 IEEE GRSS Data Fusion Contest.

Download Full-text

An Improved Method for Restoring the Shape of 3D Point Cloud Surfaces

International Journal of Synthetic Emotions ◽

10.4018/ijse.2018070103 ◽

2018 ◽

Vol 9 (2) ◽

pp. 37-53

Author(s):

Sinh Van Nguyen ◽

Ha Manh Tran ◽

Minh Khai Tran

Keyword(s):

Geometric Modeling ◽

Point Cloud ◽

Point Clouds ◽

Tangent Plane ◽

Surface Patch ◽

3D Point Cloud ◽

Improved Method ◽

Initial Shape ◽

Cloud Data ◽

3D Point Clouds

Building 3D objects or reconstructing their surfaces from 3D point cloud data are researched activities in the field of geometric modeling and computer graphics. In the recent years, they are also studied and used in some fields such as: graph models and simulation; image processing or restoration of digital heritages. This article presents an improved method for restoring the shape of 3D point cloud surfaces. The method is a combination of creating a Bezier surface patch and computing tangent plane of 3D points to fill holes on a surface of 3D point clouds. This method is described as follows: at first, a boundary for each hole on the surface is identified. The holes are then filled by computing Bezier curves of surface patches to find missing points. After that, the holes are refined based on two steps (rough and elaborate) to adjust the inserted points and preserve the local curvature of the holes. The contribution of the proposed method has been shown in processing time and the novelty of combined computation in this method has preserved the initial shape of the surface

Download Full-text

3D Shape Completion with Multi-View Consistent Inference

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6734 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10997-11004 ◽

Cited By ~ 5

Author(s):

Tao Hu ◽

Zhizhong Han ◽

Matthias Zwicker

Keyword(s):

Point Clouds ◽

Shape Descriptor ◽

Depth Image ◽

Regularization Term ◽

3D Shape ◽

Depth Images ◽

3D Point Clouds ◽

Geometric Consistency ◽

Shape Completion ◽

Data Term

3D shape completion is important to enable machines to perceive the complete geometry of objects from partial observations. To address this problem, view-based methods have been presented. These methods represent shapes as multiple depth images, which can be back-projected to yield corresponding 3D point clouds, and they perform shape completion by learning to complete each depth image using neural networks. While view-based methods lead to state-of-the-art results, they currently do not enforce geometric consistency among the completed views during the inference stage. To resolve this issue, we propose a multi-view consistent inference technique for 3D shape completion, which we express as an energy minimization problem including a data term and a regularization term. We formulate the regularization term as a consistency loss that encourages geometric consistency among multiple views, while the data term guarantees that the optimized views do not drift away too much from a learned shape descriptor. Experimental results demonstrate that our method completes shapes more accurately than previous techniques.

Download Full-text

Point Siamese Network for Person Tracking Using 3D Point Clouds

Sensors ◽

10.3390/s20010143 ◽

2019 ◽

Vol 20 (1) ◽

pp. 143

Author(s):

Yubo Cui ◽

Zheng Fang ◽

Sifan Zhou

Keyword(s):

Feature Extraction ◽

Point Cloud ◽

Unscented Kalman Filter ◽

Point Clouds ◽

Target Person ◽

Bayesian Filtering ◽

3D Point Cloud ◽

Person Tracking ◽

Siamese Network ◽

3D Point Clouds

Person tracking is an important issue in both computer vision and robotics. However, most existing person tracking methods using 3D point cloud are based on the Bayesian Filtering framework which are not robust in challenging scenes. In contrast with the filtering methods, in this paper, we propose a neural network to cope with person tracking using only 3D point cloud, named Point Siamese Network (PSN). PSN consists of two input branches named template and search, respectively. After finding the target person (by reading the label or using a detector), we get the inputs of the two branches and create feature spaces for them using feature extraction network. Meanwhile, a similarity map based on the feature space is proposed between them. We can obtain the target person from the map. Furthermore, we add an attention module to the template branch to guide feature extraction. To evaluate the performance of the proposed method, we compare it with the Unscented Kalman Filter (UKF) on 3 custom labeled challenging scenes and the KITTI dataset. The experimental results show that the proposed method performs better than UKF in robustness and accuracy and has a real-time speed. In addition, we publicly release our collected dataset and the labeled sequences to the research community.

Download Full-text

6-DOF Localization for a Mobile Robot Using Outdoor 3D Point Clouds

Journal of Robotics and Mechatronics ◽

10.20965/jrm.2010.p0158 ◽

2010 ◽

Vol 22 (2) ◽

pp. 158-166 ◽

Cited By ~ 7

Author(s):

Taro Suzuki ◽

◽

Yoshiharu Amano ◽

Takumi Hashizume

Keyword(s):

Mobile Robot ◽

Point Cloud ◽

Field Experiments ◽

Point Clouds ◽

Measurement Model ◽

3D Point Cloud ◽

Point Cloud Data ◽

Cloud Data ◽

Continuous Contact ◽

3D Point Clouds

This paper describes outdoor localization for a mobile robot using a laser scanner and three-dimensional (3D) point cloud data. A Mobile Mapping System (MMS) measures outdoor 3D point clouds easily and precisely. The full six-dimensional state of a mobile robot is estimated combining dead reckoning and 3D point cloud data. Two-dimensional (2D) position and orientation are extended to 3D using 3D point clouds assuming that the mobile robot remains in continuous contact with the road surface. Our approach applies a particle filter to correct position error in the laser measurement model in 3D point cloud space. Field experiments were conducted to evaluate the accuracy of our proposal. As the result of the experiment, it was confirmed that a localization precision of 0.2 m (RMS) is possible using our proposal.

Download Full-text