Object-Oriented 3D Semantic Mapping Based on Instance Segmentation

Author(s):  
Jinxin Chi ◽  
Hao Wu ◽  
Guohui Tian

Service robots gain both geometric and semantic information about the environment with the help of semantic mapping, enabling more intelligent services. However, most semantic mapping studies thus far either require a priori knowledge of 3D object models or produce maps with only a few object categories, neglecting separate individual objects. In view of these problems, an object-oriented 3D semantic mapping method is proposed by combining state-of-the-art deep-learning-based instance segmentation with a visual simultaneous localization and mapping (SLAM) algorithm, which helps robots not only gain navigation-oriented geometric information about the surrounding environment, but also obtain attribute and location information about individual objects. Meanwhile, an object recognition and target association algorithm for continuous image frames is proposed in combination with visual SLAM: it uses visual consistency between image frames to improve object matching and recognition across continuous frames and thus raise object recognition accuracy. Finally, a 3D semantic mapping system is implemented based on the Mask R-CNN and ORB-SLAM2 frameworks. A simulation experiment on the ICL-NUIM dataset shows that the system can generally recognize all the types of objects in the scene and generate fine point cloud models of these objects, which verifies the effectiveness of our algorithm.
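As a rough illustration of the per-object mapping step described above, the sketch below (hypothetical function names; a pinhole camera model is assumed, not the paper's exact pipeline) back-projects the depth pixels selected by one instance mask into a camera-frame point cloud, which is the raw material for a per-object point cloud model:

```python
import numpy as np

def backproject_mask(depth, mask, fx, fy, cx, cy):
    """Back-project the depth pixels selected by an instance mask
    into a 3D point cloud in the camera frame (pinhole model)."""
    v, u = np.nonzero(mask)           # pixel coordinates inside the mask
    z = depth[v, u]
    valid = z > 0                     # discard missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# toy example: a 4x4 depth image with a 2x2 object mask
depth = np.full((4, 4), 2.0)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
cloud = backproject_mask(depth, mask, fx=1.0, fy=1.0, cx=2.0, cy=2.0)
print(cloud.shape)  # (4, 3): one 3D point per masked pixel
```

In a full system, each such cloud would be transformed by the SLAM-estimated camera pose and accumulated per object label across frames.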

2021 ◽  
Vol 13 (10) ◽  
pp. 1923
Author(s):  
Ali Hosseininaveh ◽  
Fabio Remondino

Imaging network design is a crucial step in most image-based 3D reconstruction applications based on Structure from Motion (SfM) and multi-view stereo (MVS) methods. This paper proposes a novel photogrammetric algorithm for imaging network design for building 3D reconstruction purposes. The proposed methodology consists of two main steps: (i) the generation of candidate viewpoints and (ii) the clustering and selection of vantage viewpoints. The first step includes identifying initial candidate viewpoints, selecting the candidate viewpoints in the optimum range, and defining viewpoint directions. In the second step, four approaches—named façade pointing, centre pointing, hybrid, and combined centre & façade pointing—are proposed. The entire methodology is implemented and evaluated in both simulation and real-world experiments. In the simulation experiment, a building and its environment are computer-generated in the ROS (Robot Operating System) Gazebo environment, and a map is created with the Gmapping Simultaneous Localization and Mapping (SLAM) algorithm running on a simulated Unmanned Ground Vehicle (UGV). In the real-world experiment, the proposed methodology is evaluated for all four approaches on a real building, against two common baselines called the continuous image capturing and continuous image capturing & clustering and selection approaches. The results of both evaluations reveal that the combined centre & façade pointing approach is more efficient than all other approaches in terms of both accuracy and completeness criteria.
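The candidate-viewpoint generation step can be pictured with a minimal sketch (hypothetical names; a circular acquisition ring at the optimum range is assumed, which is a simplification of the paper's method): viewpoints are placed on a circle around the building and given centre-pointing view directions.

```python
import numpy as np

def ring_viewpoints(center, radius, n):
    """Generate n candidate viewpoints on a circle of the given radius
    around a building centre, each with a unit centre-pointing direction."""
    angles = np.linspace(0, 2 * np.pi, n, endpoint=False)
    positions = center + radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    directions = center - positions
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return positions, directions

pos, dirs = ring_viewpoints(np.array([0.0, 0.0]), radius=10.0, n=8)
print(pos.shape, dirs.shape)  # (8, 2) (8, 2)
```

A façade-pointing variant would instead aim each viewpoint at the nearest point on the building outline rather than at its centre.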


2013 ◽  
Vol 133 (1) ◽  
pp. 18-27 ◽  
Author(s):  
Hisato Fukuda ◽  
Satoshi Mori ◽  
Katsutoshi Sakata ◽  
Yoshinori Kobayashi ◽  
Yoshinori Kuno

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Nick Le Large ◽  
Frank Bieder ◽  
Martin Lauer

For the application of an automated, driverless race car, we aim to assure high map and localization quality for successful driving on previously unknown, narrow race tracks. To achieve this goal, it is essential to choose an algorithm that fulfills the requirements in terms of accuracy, computational resources and run time. We propose both a filter-based and a smoothing-based Simultaneous Localization and Mapping (SLAM) algorithm and evaluate them using real-world data collected by a Formula Student Driverless race car. The accuracy is measured by comparing the SLAM-generated map to a ground-truth map acquired with high-precision Differential GPS (DGPS) measurements. The evaluation shows that both algorithms meet the required time constraints thanks to a parallelized architecture, with GraphSLAM consuming computational resources considerably faster than Extended Kalman Filter (EKF) SLAM. However, the analysis of the maps generated by the algorithms shows that GraphSLAM outperforms EKF SLAM in terms of accuracy.
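One simple way to score a SLAM-generated landmark map against a DGPS ground-truth map, in the spirit of the comparison above (the paper's exact metric is not specified here; this is a nearest-neighbour RMSE sketch with hypothetical names):

```python
import numpy as np

def map_rmse(estimated, ground_truth):
    """Map accuracy as the RMSE of each estimated landmark's distance
    to its nearest ground-truth landmark (2D Euclidean)."""
    d = np.linalg.norm(estimated[:, None, :] - ground_truth[None, :, :], axis=2)
    nearest = d.min(axis=1)                 # best match per estimated landmark
    return float(np.sqrt(np.mean(nearest ** 2)))

# toy track: three cones, each estimated with a 0.1 m error
gt = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
est = gt + np.array([[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0]])
print(round(map_rmse(est, gt), 3))  # 0.1
```

Nearest-neighbour matching is asymmetric (it ignores ground-truth landmarks the SLAM map missed), so completeness would need a separate count of unmatched ground-truth landmarks.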


2021 ◽  
Vol 13 (12) ◽  
pp. 2351
Author(s):  
Alessandro Torresani ◽  
Fabio Menna ◽  
Roberto Battisti ◽  
Fabio Remondino

Mobile and handheld mapping systems are becoming widely used nowadays as fast and cost-effective data acquisition systems for 3D reconstruction purposes. While most research and commercial systems are based on active sensors, solutions employing only cameras and photogrammetry are attracting more and more interest due to their significantly lower cost, size and power consumption. In this work we propose an ARM-based, low-cost and lightweight stereo vision mobile mapping system based on a Visual Simultaneous Localization And Mapping (V-SLAM) algorithm. The prototype system, named GuPho (Guided Photogrammetric System), also integrates an in-house guidance system which enables optimized image acquisition, robust management of the cameras, and feedback on positioning and acquisition speed. The presented results show the effectiveness of the developed prototype in mapping large scenarios, enabling motion blur prevention and robust camera exposure control, and achieving accurate 3D results.
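The acquisition-speed feedback mentioned above can be grounded in a back-of-the-envelope relation (a generic photogrammetric rule of thumb, not GuPho's actual implementation; function and parameter names are hypothetical): motion blur in pixels is roughly speed × exposure time ÷ ground sampling distance, so the admissible speed follows directly.

```python
def max_speed(exposure_s, gsd_m, max_blur_px=1.0):
    """Maximum platform speed (m/s) that keeps motion blur below
    max_blur_px pixels, given exposure time (s) and ground sampling
    distance (m/pixel). blur_px = speed * exposure / gsd."""
    return max_blur_px * gsd_m / exposure_s

# e.g. 1 ms exposure and 2 mm/pixel GSD allow about 2 m/s for 1 px blur
print(max_speed(0.001, 0.002))  # 2.0
```

A guidance system can invert the same relation online: when the operator moves faster than this bound, either warn them or shorten the exposure (raising gain to compensate).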


2017 ◽  
Vol 91 (2) ◽  
pp. 279-289 ◽  
Author(s):  
Matheus dos Santos ◽  
Paulo Drews ◽  
Pedro Núñez ◽  
Silvia Botelho

Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5570
Author(s):  
Yiming Ding ◽  
Zhi Xiong ◽  
Wanling Li ◽  
Zhiguo Cao ◽  
Zhengchun Wang

The combination of biomechanics and inertial pedestrian navigation research provides a very promising approach for pedestrian positioning in environments where the Global Positioning System (GPS) signal is unavailable. However, in practical applications such as fire rescue and indoor security, inertial sensor-based pedestrian navigation systems face various challenges, especially step length estimation errors and heading drift during running and sprinting. In this paper, a trinal-node simultaneous localization and occupancy grid mapping method is proposed, based on two thigh-worn inertial measurement units (IMUs) and one waist-worn IMU. Specifically, gait detection and segmentation are realized by zero-crossing detection of the difference of the thigh pitch angles. A piecewise function between the step length and the probability distribution of the waist horizontal acceleration is established to achieve accurate step length estimation both in regular walking and in drastic motions. In addition, a simultaneous localization and mapping method based on occupancy grids, which uses the historic trajectory to improve the pedestrian's pose estimate, is introduced. The experiments show that the proposed trinal-node pedestrian inertial odometer can identify and segment each gait cycle in walking, running, and sprinting. The average step length estimation error is no more than 3.58% of the total travel distance at motion speeds from 1.23 m/s to 3.92 m/s. In combination with the proposed occupancy-grid-based simultaneous localization and mapping method, the localization error is less than 5 m in a single-story building of 2643.2 m2.
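The gait segmentation idea above reduces to a short computation (a sketch with hypothetical names and synthetic signals, not the paper's code): subtract the two thigh pitch angles and mark the indices where the difference changes sign.

```python
import numpy as np

def segment_gait(left_pitch, right_pitch):
    """Split a stride sequence into gait cycles at the zero crossings
    of the difference between the two thigh pitch angles."""
    diff = np.asarray(left_pitch) - np.asarray(right_pitch)
    sign = np.sign(diff)
    # indices where the sign of the difference flips
    crossings = np.nonzero(np.diff(sign) != 0)[0] + 1
    return crossings

# synthetic thighs swinging half a cycle out of phase
t = np.linspace(0.0, 2 * np.pi, 200)
left = np.sin(t + 0.3)
right = np.sin(t + 0.3 + np.pi)
boundaries = segment_gait(left, right)
print(len(boundaries))  # 2 zero crossings over one full swing period
```

On real IMU data the pitch difference is noisy, so a practical detector would low-pass filter the signal or require a minimum interval between successive crossings.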


Author(s):  
Z. Wang ◽  
J. Li ◽  
A. Wang ◽  
J. Wang

In recent years, several V-SLAM (Visual Simultaneous Localization and Mapping) approaches have appeared, showing impressive reconstructions of the world. However, these maps are built with far more than the required information, a limitation that comes from processing every key-frame in full. In this paper we present, for the first time, a mapping method for visual SLAM based on a look-up table (LUT) that improves mapping effectiveness. As the method extracts features in each cell of a grid-divided image, it obtains a camera pose that is more representative of the whole key-frame. The tracking direction of a key-frame is obtained by counting the parallax directions of its feature points. The LUT stores, for each tracking direction, the grid cells needed for mapping, which reduces redundant information in the key-frame and makes mapping more efficient. The results show that a better map with less noise is built in less than one-third of the time. We believe that the LUT's capacity to build maps efficiently makes it a good choice for the community to investigate in scene reconstruction problems.
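My reading of the LUT idea above can be sketched as follows (an interpretation, not the authors' actual table: names, the 4x4 grid, and the four cardinal directions are all assumptions): for each tracking direction, the table keeps only the grid cells on the side the camera moves towards, since those cells carry the newly revealed scene content.

```python
def cells_for_direction(direction, grid=4):
    """Hypothetical LUT: map a cardinal tracking direction to the
    (row, col) indices of the grid cells kept for mapping."""
    lut = {
        "right": [(r, grid - 1) for r in range(grid)],  # rightmost column
        "left":  [(r, 0) for r in range(grid)],         # leftmost column
        "up":    [(0, c) for c in range(grid)],         # top row
        "down":  [(grid - 1, c) for c in range(grid)],  # bottom row
    }
    return lut[direction]

print(cells_for_direction("right"))  # rightmost column of a 4x4 cell grid
```

Keeping one column (or row) of a 4x4 grid per key-frame retains a quarter of the cells, which is consistent with the reported reduction in mapping time.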

