JD-SLAM: Joint camera pose estimation and moving object segmentation for simultaneous localization and mapping in dynamic scenes

2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199444
Author(s):  
Yujia Zhai ◽  
Baoli Lu ◽  
Weijun Li ◽  
Jian Xu ◽  
Shuangyi Ma

As a fundamental assumption in simultaneous localization and mapping, the static-scene hypothesis can hardly be fulfilled in indoor or outdoor navigation and localization applications. Recent work on simultaneous localization and mapping in dynamic scenes commonly uses a heavy pixel-level segmentation network to distinguish dynamic objects, which incurs a large computational cost and limits the real-time performance of the system. This restricts the application of simultaneous localization and mapping on mobile terminals. In this article, we present a lightweight system for monocular simultaneous localization and mapping in dynamic scenes, which can run in real time on a central processing unit (CPU) and generate a semantic probability map. The pixel-wise semantic segmentation network is replaced with a lightweight object detection network combined with three-dimensional segmentation based on motion clustering. A framework integrated with an improved weighted random sample consensus solver is proposed to jointly solve the camera pose and perform three-dimensional object segmentation, which enables high accuracy and efficiency. In addition, prior information from the generated map and the object detection results is introduced for better estimation. Experiments on a public data set and in the real world demonstrate that our method achieves a substantial improvement in both accuracy and speed over state-of-the-art methods.
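The idea of a weighted random sample consensus solver can be illustrated with a toy example. The sketch below is not the JD-SLAM solver: it fits a 2D line instead of a camera pose, and the point set, the weights, and the function names are all hypothetical. It only shows how prior weights (for instance, lower weights for points inside a detected object box) can bias which minimal samples are drawn.

```python
import random


def fit_line(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # degenerate sample: both points coincide
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def weighted_ransac_line(points, weights, threshold=0.1, iterations=200, seed=0):
    """Toy weighted RANSAC: minimal samples are drawn in proportion to prior weights."""
    rng = random.Random(seed)
    indices = list(range(len(points)))
    best_model, best_inliers = None, []
    for _ in range(iterations):
        i, j = rng.choices(indices, weights=weights, k=2)
        if i == j:
            continue  # the same point was drawn twice, so no line is defined
        model = fit_line(points[i], points[j])
        if model is None:
            continue
        a, b, c = model
        # Inliers are points whose distance to the candidate line is below the threshold.
        inliers = [k for k, (x, y) in enumerate(points)
                   if abs(a * x + b * y + c) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Hypothetical data: ten "static" points on y = x and three "moving" outliers.
static = [(float(k), float(k)) for k in range(10)]
moving = [(2.0, 7.0), (5.0, 0.5), (8.0, 3.0)]
points = static + moving
# Assumed priors: points believed to lie on a moving object get a low weight.
weights = [1.0] * len(static) + [0.1] * len(moving)

model, inliers = weighted_ransac_line(points, weights)
```

Here `inliers` holds the indices of the points consistent with the best line. In a joint pose-and-segmentation setting, the points rejected by the dominant model would be the candidates for motion clustering.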

2019 ◽  
Vol 9 (16) ◽  
pp. 3264 ◽  
Author(s):  
Xujie Kang ◽  
Jing Li ◽  
Xiangtao Fan ◽  
Wenhui Wan

In recent years, low-cost and lightweight RGB and depth (RGB-D) sensors, such as Microsoft Kinect, have made available rich image and depth data, making them very popular in the field of simultaneous localization and mapping (SLAM), which has been increasingly used in robotics, self-driving vehicles, and augmented reality. RGB-D SLAM constructs 3D environmental models of natural landscapes while simultaneously estimating camera poses. However, under highly variable illumination and motion blur, long-distance tracking can result in large cumulative errors and scale shifts. To address this problem in practical applications, we propose a novel multithreaded RGB-D SLAM framework that incorporates a highly accurate prior terrestrial Light Detection and Ranging (LiDAR) point cloud, which can mitigate cumulative errors and improve the system's robustness in large-scale and challenging scenarios. First, we employed deep learning to achieve automatic system initialization and motion recovery when tracking is lost. Next, we used the terrestrial LiDAR point cloud to obtain prior data of the landscape, applied the point-to-surface iterative closest point (ICP) algorithm to realize accurate camera pose control from the previously obtained LiDAR point cloud data, and finally expanded its control range in the local map construction. Furthermore, an innovative double-window segment-based map optimization method is proposed to ensure consistency, better real-time performance, and high accuracy of map construction. The proposed method was tested for long-distance tracking and closed-loop performance in two different large indoor scenarios. The experimental results indicated that the standard deviation of the 3D map construction is 10 cm over a mapping distance of 100 m, compared with the LiDAR ground truth. Further, the relative cumulative error of the camera in closed-loop experiments is 0.09%, well below that of the typical SLAM algorithm (3.4%). The proposed method was therefore demonstrated to be more robust than the ORB-SLAM2 algorithm in complex indoor environments.
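The point-to-surface ICP step described above minimizes distances between camera points and a prior surface. A minimal sketch of such a residual follows, assuming a point-to-plane formulation with known matches, known surface normals, and a translation-only correction; the abstract does not give the authors' exact formulation, and every name and value here is hypothetical.

```python
def point_to_plane_residuals(source, target, normals, translation):
    """Signed distances n . (p + t - q) for matched points (translation only)."""
    tx, ty, tz = translation
    residuals = []
    for (px, py, pz), (qx, qy, qz), (nx, ny, nz) in zip(source, target, normals):
        residuals.append(
            nx * (px + tx - qx) + ny * (py + ty - qy) + nz * (pz + tz - qz)
        )
    return residuals


def point_to_plane_cost(source, target, normals, translation):
    """Sum of squared point-to-plane residuals, the quantity an ICP step reduces."""
    return sum(r * r for r in
               point_to_plane_residuals(source, target, normals, translation))


# Hypothetical prior surface: a floor at z = 0 with upward normals.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
normals = [(0.0, 0.0, 1.0)] * 3
# Hypothetical camera points that drifted 0.2 m above the floor.
source = [(0.0, 0.0, 0.2), (1.0, 0.0, 0.2), (0.0, 1.0, 0.2)]

cost_before = point_to_plane_cost(source, target, normals, (0.0, 0.0, 0.0))
cost_after = point_to_plane_cost(source, target, normals, (0.0, 0.0, -0.2))
# Sliding along the surface is not penalized by this residual.
cost_slide = point_to_plane_cost(source, target, normals, (0.5, 0.0, -0.2))
```

Only the component of the error along the surface normal contributes to the cost, which is why a dense, accurate prior surface can constrain drift in the direction where it matters.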


Author(s):  
Fred Daneshgaran ◽  
Antonio Marangi ◽  
Nicola Bruno ◽  
Fausto Lizzio ◽  
Marina Mondin ◽  
...  

This paper presents the results of the development, design, and implementation of a visual simultaneous localization and mapping (SLAM) system for autonomous real-time localization with application to underground transportation infrastructure (UTI) such as tunnels. Localization is achieved in the absence of any global positioning system (GPS) or auxiliary system. The indoor localization system is a necessary element of a fully autonomous platform for the detection of cracks and other anomalies on the interior surfaces of tunnels and other UTI. It can be used for tagging high-resolution sensor data obtained with previously developed low-cost prototype data acquisition platforms. Visual-based SLAM has been used as the core element in an architecture employing a commercial off-the-shelf (COTS) ZED stereo camera from Stereolabs. To achieve real-time operation, an NVIDIA Jetson TX2 massively parallel graphics processing unit (GPU) was used as the core computational engine with two different software libraries. We achieved localization at 5 frames per second (FPS) using the ORBSLAM2 open-source software library, while the much lighter, but proprietary, ZED SDK delivered nearly 60 FPS. To assess the accuracy of the relative localization system, we conducted several tests at 30 FPS and reported the resulting error variances, which were found to be consistently very small. Finally, we conducted several tests in a tunnel in the Los Angeles County area and confirmed the applicability of the method for monitoring UTI.


2018 ◽  
Vol 8 (12) ◽  
pp. 2432 ◽  
Author(s):  
Jingchuan Wang ◽  
Ming Zhao ◽  
Weidong Chen

In large-scale and sparse scenes, such as farmland, orchards, mines, and substations, 3D simultaneous localization and mapping is challenging: it must maintain reliable data association with scarce environmental information and reduce the computational complexity of global optimization over large-scale scenes. To solve these problems, a real-time incremental simultaneous localization and mapping algorithm called MIM_SLAM is proposed in this paper. The algorithm is applied on mobile robots to build a map on a non-flat road with a 3D LiDAR sensor. MIM_SLAM's main contributions are that multi-level ICP (Iterative Closest Point) matching is used to solve the data association problem, a Fisher information matrix is used to describe the uncertainty of the estimated pose, and the poses are optimized by an incremental optimization method, which greatly reduces the computational cost. A highly consistent map is then established. The proposed algorithm has been evaluated in real indoor and outdoor scenes, in two substations, and on the KITTI benchmark dataset, all of which are sparse and large-scale. Results show that the proposed algorithm achieves high mapping accuracy and meets real-time requirements.


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2106
Author(s):  
Ahmed Afifi ◽  
Chisato Takada ◽  
Yuichiro Yoshimura ◽  
Toshiya Nakaguchi

Minimally invasive surgery is widely used because of its tremendous benefits to the patient. However, surgeons face some challenges in this type of surgery, the most important of which is the narrow field of view. Therefore, we propose an approach to expand the field of view for minimally invasive surgery to enhance surgeons' experience. It combines multiple views in real time to produce a dynamic expanded view. The proposed approach extends the monocular Oriented FAST and Rotated BRIEF Simultaneous Localization And Mapping (ORB-SLAM) system to work with a multi-camera setup. ORB-SLAM's three parallel threads, namely tracking, mapping, and loop closing, are run for each camera, and new threads are added to calculate the relative pose of the cameras and to construct the expanded view. A new algorithm for estimating the optimal inter-camera correspondence matrix from a set of corresponding 3D map points is presented. This optimal transformation is then used to produce the final view. The proposed approach was evaluated using both human models and in vivo data. The evaluation results of the proposed correspondence matrix estimation algorithm demonstrate its ability to reduce the error and to produce an accurate transformation. The results also show that the proposed approach can produce an expanded view when other approaches fail. In this work, a real-time dynamic field-of-view expansion approach that can work in all situations regardless of image overlap is proposed. It outperforms the previous approaches and can run at 21 fps.


Designs ◽  
2021 ◽  
Vol 5 (1) ◽  
pp. 15
Author(s):  
Andreas Thoma ◽  
Abhijith Moni ◽  
Sridhar Ravi

Digital Image Correlation (DIC) is a powerful tool used to evaluate displacements and deformations in a non-intrusive manner. By comparing two images, one from the undeformed reference state of the sample and the other from the deformed target state, the relative displacement between the two states is determined. DIC is well known and often used for post-processing analysis of in-plane displacements and deformation of the specimen. Increasing the analysis speed to enable real-time DIC analysis would be beneficial and would expand the scope of this method. Here we tested several combinations of the most common DIC methods together with different parallelization approaches in MATLAB and evaluated their performance to determine whether real-time analysis is possible with these methods. The effects of computing with different hardware settings were also analyzed and discussed. We found that implementation problems can reduce the efficiency of a theoretically superior algorithm, such that it becomes slower in practice than a sub-optimal algorithm. The Newton–Raphson algorithm in combination with a modified particle swarm algorithm in parallel image computation was found to be most effective. This is contrary to theory, which suggests that the inverse-compositional Gauss–Newton algorithm is superior. As expected, the brute force search algorithm is the least efficient method. We also found that the correct choice of parallelization tasks is critical in attaining improvements in computing speed. A poorly chosen parallelization approach with high parallel overhead leads to inferior performance. Finally, irrespective of the computing mode, the correct choice of combinations of integer-pixel and sub-pixel search algorithms is critical for efficient analysis. Real-time analysis using DIC will be difficult on computers with standard computing capabilities, even if parallelization is implemented, so the suggested solution is to use graphics processing unit (GPU) acceleration.
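To make the cost of the brute force search concrete, here is a minimal sketch of an integer-pixel search in Python (the study itself used MATLAB). It assumes a sum-of-squared-differences criterion and a synthetic image pattern; the criterion, the pattern, and the function names are illustrative assumptions, not the implementation evaluated in the paper.

```python
def ssd(reference, target, top, left, dy, dx, size):
    """Sum of squared differences between a reference subset and a shifted target subset."""
    total = 0.0
    for r in range(size):
        for c in range(size):
            diff = reference[top + r][left + c] - target[top + dy + r][left + dx + c]
            total += diff * diff
    return total


def brute_force_shift(reference, target, top, left, size, max_shift):
    """Try every integer shift in [-max_shift, max_shift]^2 and keep the best match."""
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ssd(reference, target, top, left, dy, dx, size)
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


def pattern(r, c):
    """Hypothetical speckle-like intensity pattern."""
    return float((3 * r * r + 5 * c * c + 7 * r * c) % 23)


# Reference image, and a target image displaced by 2 px down and 1 px right.
reference = [[pattern(r, c) for c in range(16)] for r in range(16)]
target = [[pattern(r - 2, c - 1) for c in range(16)] for r in range(16)]

dy, dx = brute_force_shift(reference, target, top=5, left=5, size=5, max_shift=3)
```

Each subset requires `(2 * max_shift + 1) ** 2 * size ** 2` intensity comparisons, so the cost grows quadratically with the search radius. A sub-pixel method would then refine the integer estimate returned here.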


2021 ◽  
Vol 87 (5) ◽  
pp. 363-373
Author(s):  
Long Chen ◽  
Bo Wu ◽  
Yao Zhao ◽  
Yuan Li

Real-time acquisition and analysis of three-dimensional (3D) human body kinematics are essential in many applications. In this paper, we present a real-time photogrammetric system consisting of a stereo pair of red-green-blue (RGB) cameras. The system incorporates a multi-threaded and graphics processing unit (GPU)-accelerated solution for real-time extraction of 3D human kinematics. A deep learning approach is adopted to automatically extract two-dimensional (2D) human body features, which are then converted to 3D features based on photogrammetric processing, including dense image matching and triangulation. The multi-threading scheme and GPU-acceleration enable real-time acquisition and monitoring of 3D human body kinematics. Experimental analysis verified that the system processing rate reached ∼18 frames per second. The effective detection distance reached 15 m, with a geometric accuracy of better than 1% of the distance within a range of 12 m. The real-time measurement accuracy for human body kinematics ranged from 0.8% to 7.5%. The results suggest that the proposed system is capable of real-time acquisition and monitoring of 3D human kinematics with favorable performance, showing great potential for various applications.
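The conversion of matched 2D features to 3D by triangulation can be sketched for the simplest case. The example below assumes a rectified stereo pair and a pinhole camera model, which the abstract does not state; the camera parameters and the function name are hypothetical.

```python
def triangulate_rectified(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Recover (x, y, z) in metres from one feature matched in a rectified stereo pair.

    Assumes identical pinhole cameras with focal length focal_px (pixels),
    principal point (cx, cy), and a horizontal baseline of baseline_m metres.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth from similar triangles
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z


# Hypothetical body keypoint seen at column 700 (left) and column 650 (right), row 400.
x, y, z = triangulate_rectified(
    u_left=700.0, u_right=650.0, v=400.0,
    focal_px=1000.0, baseline_m=0.5, cx=640.0, cy=360.0,
)
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels produces a depth error that grows with distance, which is consistent with accuracy being reported as a percentage of the distance.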


2011 ◽  
Vol 110-116 ◽  
pp. 2740-2745
Author(s):  
Kirana Kumara P. ◽  
Ashitava Ghosal

Real-time simulation of deformable solids is essential for some applications such as biological organ simulations for surgical simulators. In this work, deformable solids are approximated as linear elastic, and an easy and straightforward numerical technique, the Finite Point Method (FPM), is used to model three-dimensional linear elastostatics. A Graphics Processing Unit (GPU) is used to accelerate computations. Results show that the Finite Point Method, together with a GPU, can compute three-dimensional linear elastostatic responses of solids at rates suitable for real-time graphics, for solids represented by a reasonable number of points.

