Feature-Based Real-Time Visual SLAM Using Kinect

2014 ◽  
Vol 989-994 ◽  
pp. 2651-2654
Author(s):  
Yan Song ◽  
Bo He

In this paper, a novel feature-based real-time visual Simultaneous Localization and Mapping (SLAM) system is proposed. The system generates colored 3D reconstruction models and an estimated 3D trajectory using a Kinect-style camera. The Microsoft Kinect, a low-priced 3D camera, is the only sensor used in our experiments. Kinect-style sensors provide RGB-D (red-green-blue-depth) data, which contain a 2D image together with per-pixel depth information. ORB (Oriented FAST and Rotated BRIEF) is used to extract image features in order to speed up the whole system. Our system can be used to generate detailed 3D reconstruction models; furthermore, an estimated 3D trajectory of the sensor is given in this paper. The experimental results demonstrate that our system performs robustly and effectively in both building detailed 3D models and estimating the camera trajectory.
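The speed benefit of ORB comes from its binary descriptors, which are compared with a Hamming distance (XOR plus popcount) rather than a Euclidean one. A minimal numpy sketch of brute-force binary-descriptor matching, not the authors' implementation, with an illustrative distance gate:

```python
import numpy as np

def hamming_match(desc_a, desc_b, max_dist=64):
    """Brute-force Hamming matching of binary descriptors (e.g. 256-bit ORB
    descriptors stored as 32 uint8 bytes each). Returns, for each descriptor
    in desc_a, the (index_in_b, distance) of its nearest neighbour in desc_b,
    or (-1, -1) when no candidate is within max_dist bits."""
    matches = []
    for d in desc_a:
        # XOR then popcount gives the Hamming distance to every row of desc_b
        dists = np.unpackbits(np.bitwise_xor(desc_b, d), axis=1).sum(axis=1)
        j = int(np.argmin(dists))
        matches.append((j, int(dists[j])) if dists[j] <= max_dist else (-1, -1))
    return matches
```

In practice this inner loop is what makes binary features attractive for real-time SLAM: the distance computation is a handful of bitwise instructions per descriptor pair.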

Sensors ◽  
2019 ◽  
Vol 19 (10) ◽  
pp. 2251 ◽  
Author(s):  
Zeyong Shan ◽  
Ruijian Li ◽  
Sören Schwertfeger

Using camera sensors for ground robot Simultaneous Localization and Mapping (SLAM) has many benefits over laser-based approaches, such as lower cost and higher robustness. RGB-D sensors promise the best of both worlds: dense data from cameras together with depth information. This paper proposes to fuse RGB-D and IMU data in a visual SLAM system, called VINS-RGBD, that is built upon the open-source VINS-Mono software. The paper analyses the VINS approach and highlights its observability problems. We then extend the VINS-Mono system to make use of the depth data both during the initialization process and during the VIO (Visual Inertial Odometry) phase. Furthermore, we integrate a mapping system based on subsampled depth data and octree filtering to achieve real-time mapping, including loop closing. We provide the software as well as datasets for evaluation. Our extensive experiments are performed with hand-held, wheeled and tracked robots in different environments. We show that ORB-SLAM2 fails for our application and that our VINS-RGBD approach is superior to VINS-Mono.
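The reason depth data helps during initialization is that an RGB-D pixel can be back-projected to a metric 3D point directly, without the scale ambiguity of a monocular setup. A minimal pinhole-model sketch (the intrinsics used in the test are illustrative, not from the paper):

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) with a metric depth reading
    becomes a 3D point in the camera frame. fx, fy are focal lengths in
    pixels; (cx, cy) is the principal point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])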
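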


Sensors ◽  
2019 ◽  
Vol 19 (17) ◽  
pp. 3699 ◽  
Author(s):  
Masoud S. Bahraini ◽  
Ahmad B. Rad ◽  
Mohammad Bozorg

The important problem of Simultaneous Localization and Mapping (SLAM) in dynamic environments is less studied than its counterpart in static settings. In this paper, we present a solution to the feature-based SLAM problem in dynamic environments. We propose an algorithm that integrates SLAM with multi-target tracking (SLAMMTT) using a robust feature-tracking algorithm for dynamic environments. A novel implementation of the RANdom SAmple Consensus (RANSAC) method, referred to as multilevel-RANSAC (ML-RANSAC), within the Extended Kalman Filter (EKF) framework is applied for multi-target tracking (MTT). We also apply machine learning to detect features from the input data and to distinguish moving from stationary objects. The data streams from LIDAR and vision sensors are fused in real time to detect objects and obtain depth information. A practical experiment is designed to verify the performance of the algorithm in a dynamic environment. The unique feature of this algorithm is its ability to maintain tracking of features even when the observations are intermittent, a situation in which many reported algorithms fail. Experimental validation indicates that the algorithm produces consistent estimates in a fast and robust manner, suggesting its feasibility for real-time applications.
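The multilevel-RANSAC variant is not specified in the abstract, but it builds on the vanilla RANSAC loop: repeatedly fit a model to a minimal random sample and keep the hypothesis with the most inliers. A self-contained numpy sketch for 2D line fitting, with illustrative iteration count and inlier tolerance:

```python
import numpy as np

def ransac_line(points, n_iters=200, tol=0.05, seed=0):
    """Vanilla RANSAC for a 2D line: fit a line to a random pair of points,
    count how many points fall within tol of it, and keep the best model's
    inlier mask. ML-RANSAC extends this idea across multiple tracked targets."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        norm = np.linalg.norm(d)
        if norm < 1e-9:          # degenerate sample: identical points
            continue
        n = np.array([-d[1], d[0]]) / norm   # unit normal of the line
        dist = np.abs((points - p) @ n)      # point-to-line distances
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```

In the tracking setting, the inlier/outlier split plays the role of separating stationary structure from points on moving objects.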


Author(s):  
Nadia Baha ◽  
Eden Beloudah ◽  
Mehdi Ousmer

Falls are a major health problem among older people who live alone in their homes. In the past few years, several studies have been proposed to address this problem, especially those that exploit video surveillance. In this paper, in order to allow older adults to safely continue living in home environments, the authors propose a method that combines two different configurations of the Microsoft Kinect: the first is based on the person's depth information and velocity (ceiling-mounted Kinect), while the second is based on the variation of bounding-box parameters and their velocity (frontal Kinect). Experiments are conducted on real datasets, and a comparative evaluation of the obtained results against state-of-the-art methods is presented. The results show that the authors' method is able to accurately detect several types of falls in real time, while achieving a significant reduction in false alarms and improving detection rates.
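The ceiling-mounted configuration can be pictured as thresholding a vertical velocity derived from successive depth readings: when a person falls, the depth from the overhead sensor to the body grows abruptly. A toy sketch (the velocity threshold is illustrative, not the paper's tuned value):

```python
def fall_from_ceiling_depth(depth_series, dt, v_thresh=1.0):
    """Ceiling-mounted sketch: depth_series holds per-frame distances (m)
    from the overhead sensor to the tracked person, dt is the frame period.
    Flags a fall when the downward velocity exceeds v_thresh (m/s)."""
    for prev, cur in zip(depth_series, depth_series[1:]):
        if (cur - prev) / dt > v_thresh:
            return True
    return False
```

A real system would additionally smooth the depth track and require the posture to persist near the floor, which is where the reduction in false alarms comes from.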


2020 ◽  
Vol 29 (16) ◽  
pp. 2050266
Author(s):  
Adnan Ramakić ◽  
Diego Sušanj ◽  
Kristijan Lenac ◽  
Zlatko Bundalo

Each person describes unique patterns during gait cycles, and this information can be extracted from a live video stream and used for subject identification. In recent years, there has been a profusion of sensors that, in addition to RGB video images, also provide depth data in real time. In this paper, a method is proposed that enhances appearance-based gait recognition by also integrating features extracted from depth data. Two approaches are presented that integrate simple depth features in a way suitable for real-time processing. Unlike previously presented works, which usually use short-range sensors such as the Microsoft Kinect, here a long-range stereo camera in an outdoor environment is used. The experimental results for the proposed approaches show improved recognition rates compared to existing popular gait recognition methods.
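One simple way depth can be folded into an appearance-based descriptor, by analogy with the Gait Energy Image commonly used in such methods, is to average aligned per-frame depth silhouettes over a gait cycle. This is an illustrative sketch, not the paper's exact features:

```python
import numpy as np

def depth_energy_image(depth_frames):
    """Average a list of aligned, equally sized per-frame depth silhouettes
    (2D arrays) over one gait cycle into a single depth-based template."""
    stack = np.stack(depth_frames).astype(float)
    return stack.mean(axis=0)
```

The resulting template can then be compared between subjects with any standard distance or fed to a classifier alongside the RGB-based features.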


Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3559 ◽  
Author(s):  
Runzhi Wang ◽  
Kaichang Di ◽  
Wenhui Wan ◽  
Yongkang Wang

In the study of indoor simultaneous localization and mapping (SLAM) problems using a stereo camera, two primary feature types, points and line segments, have been widely used to calculate the pose of the camera. However, many feature-based SLAM systems are not robust when the camera moves sharply or turns too quickly. In this paper, an improved indoor visual SLAM method is proposed that better exploits the advantages of point and line segment features to achieve robust results in difficult environments. First, point and line segment features are automatically extracted and matched to build two kinds of projection models. Subsequently, for the optimization problem of line segment features, we add minimization of an angle observation in addition to the traditional re-projection error of the endpoints. Finally, our motion estimation model, which is adaptive to the motion state of the camera, is applied to build a new combined Hessian matrix and gradient vector for iterated pose estimation. Our proposal has been tested on the EuRoC MAV datasets and on sequence images captured with our stereo camera. The experimental results demonstrate the effectiveness of our improved point-line-feature-based visual SLAM method in improving localization accuracy when the camera moves with rapid rotation or violent fluctuation.
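The combined residual described above, endpoint re-projection error plus an angle term between the observed and projected segment directions, can be sketched in a few lines of numpy (the exact weighting and parameterization of the paper are not reproduced):

```python
import numpy as np

def line_errors(obs_p, obs_q, proj_p, proj_q):
    """Error terms for one line-segment observation in 2D pixel coordinates:
    obs_p/obs_q are the detected endpoints, proj_p/proj_q the projected ones.
    Returns (endpoint re-projection error, angle between the two segments)."""
    reproj = np.linalg.norm(obs_p - proj_p) + np.linalg.norm(obs_q - proj_q)
    d_obs = obs_q - obs_p
    d_proj = proj_q - proj_p
    cos_a = np.dot(d_obs, d_proj) / (np.linalg.norm(d_obs) * np.linalg.norm(d_proj))
    angle = np.arccos(np.clip(cos_a, -1.0, 1.0))
    return reproj, angle
```

The angle term is what penalizes a projected segment that overlaps the observation but points the wrong way, a case the endpoint error alone weights poorly.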


2013 ◽  
Vol 765-767 ◽  
pp. 2826-2829 ◽  
Author(s):  
Song Lin ◽  
Rui Min Hu ◽  
Yu Lian Xiao ◽  
Li Yu Gong

In this paper, we propose a novel real-time 3D hand gesture recognition algorithm based on depth information. We segment the hand region from the depth image and convert it to a point cloud. Then, 3D moment invariant features are computed on the point cloud. Finally, a support vector machine (SVM) is employed to classify the shape of the hand into different categories. We collect a benchmark dataset using the Microsoft Kinect for Xbox and test the proposed algorithm on it. Experimental results demonstrate the robustness of our proposed algorithm.
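The building blocks of 3D moment invariants are central moments of the point cloud, which are invariant to translation by construction; the rotation-invariant combinations are formed from them. A minimal sketch of the second-order central moments (the full invariant combinations are omitted for brevity):

```python
import numpy as np

def central_moments_3d(cloud):
    """Second-order central moments of an (N, 3) point cloud: translate to
    the centroid, then accumulate the 3x3 matrix M[i, j] = sum_k c[k,i]*c[k,j].
    Translation of the cloud leaves the result unchanged."""
    c = cloud - cloud.mean(axis=0)
    return c.T @ c
```

Feeding such shape descriptors, rather than raw pixels, to the SVM is what keeps the classification step cheap enough for real time.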


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4922
Author(s):  
Like Cao ◽  
Jie Ling ◽  
Xiaohui Xiao

Noise appears in images captured by real cameras. This paper studies the influence of noise on monocular feature-based visual Simultaneous Localization and Mapping (SLAM). First, an open-source synthetic dataset with different noise levels is introduced. Then the images in the dataset are denoised using the Fast and Flexible Denoising convolutional neural Network (FFDNet), and the matching performances of Scale Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF) and Oriented FAST and Rotated BRIEF (ORB), which are commonly used in feature-based SLAM, are compared. The results show that ORB has a higher correct matching rate than SIFT and SURF, and that denoised images have a higher correct matching rate than noisy ones. Next, the Absolute Trajectory Error (ATE) of the noisy and denoised sequences is evaluated on ORB-SLAM2, and the results show that the denoised sequences perform better than the noisy sequences at every noise level. Finally, the completely clean sequence in the dataset and the sequences in the KITTI dataset are denoised and compared with the original sequences through comprehensive experiments. For the clean sequence, the Root-Mean-Square Error (RMSE) of the ATE after denoising decreased by 16.75%; for the KITTI sequences, 7 out of 10 denoised sequences have a lower RMSE than the original sequences. The results show that denoised images can achieve higher accuracy in monocular feature-based visual SLAM under certain conditions.
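The ATE RMSE metric used above is straightforward to state: per-timestamp Euclidean position error between the estimated and ground-truth trajectories, then the root mean square. A minimal sketch (a full evaluation would first time-associate the trajectories and align them with an SE(3) or Sim(3) fit, which is omitted here):

```python
import numpy as np

def ate_rmse(est, gt):
    """Root-mean-square Absolute Trajectory Error between an estimated and a
    ground-truth trajectory, both (N, 3) position arrays assumed to be
    already time-associated and aligned."""
    err = np.linalg.norm(est - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```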


2020 ◽  
Vol 9 (4) ◽  
pp. 202
Author(s):  
Junhao Cheng ◽  
Zhi Wang ◽  
Hongyan Zhou ◽  
Li Li ◽  
Jian Yao

Most Simultaneous Localization and Mapping (SLAM) methods assume that environments are static. Such a strong assumption limits the application of most visual SLAM systems, as dynamic objects cause many wrong data associations during the SLAM process. To address this problem, a novel visual SLAM method called DM-SLAM, which follows the pipeline of feature-based methods, is proposed in this paper. DM-SLAM combines an instance segmentation network with optical flow information to improve localization accuracy in dynamic environments, and it supports monocular, stereo and RGB-D sensors. It consists of four modules: semantic segmentation, ego-motion estimation, dynamic point detection and a feature-based SLAM framework. The semantic segmentation module obtains pixel-wise segmentation results for potentially dynamic objects, and the ego-motion estimation module calculates the initial pose. In the third module, two different strategies are presented to detect dynamic feature points for the RGB-D/stereo and monocular cases. In the first case, the feature points with depth information are reprojected to the current frame, and the reprojection offset vectors are used to distinguish dynamic points. In the other case, we utilize the epipolar constraint to accomplish this task. The remaining static feature points are then fed into the fourth module. The experimental results on the public TUM and KITTI datasets demonstrate that DM-SLAM outperforms standard visual SLAM baselines in terms of accuracy in highly dynamic environments.
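The monocular-case test rests on a standard fact: for a static scene point, a match x2 in the second image must lie on the epipolar line F·x1 of its counterpart x1, so a large point-to-line distance flags a dynamic-point candidate. A minimal numpy sketch (the fundamental matrix in the test is contrived for illustration):

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance from point x2 (second image, pixel coords) to the epipolar
    line F @ [x1; 1] induced by its match x1 (first image). Static points
    should score near zero; large values suggest a moving point or a
    mismatch."""
    l = F @ np.append(x1, 1.0)          # epipolar line coefficients (a, b, c)
    x2h = np.append(x2, 1.0)
    return abs(l @ x2h) / np.hypot(l[0], l[1])
```

In a full system this distance would be thresholded relative to the feature's detection scale, since localization error grows at coarser pyramid levels.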


Sensors ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 161 ◽  
Author(s):  
Junqiao Zhao ◽  
Yewei Huang ◽  
Xudong He ◽  
Shaoming Zhang ◽  
Chen Ye ◽  
...  

Autonomous parking in an indoor parking lot without human intervention is one of the most demanded and challenging tasks of autonomous driving systems. The key to this task is precise real-time indoor localization. However, state-of-the-art low-level visual feature-based simultaneous localization and mapping (VSLAM) systems suffer in monotonous or texture-less scenes and under poor illumination or dynamic conditions. Additionally, low-level feature-based mapping results are hard for human beings to use directly. In this paper, we propose a semantic landmark-based robust VSLAM for real-time localization of autonomous vehicles in indoor parking lots. The parking slots are extracted as meaningful landmarks and enriched with confidence levels. We then propose a robust optimization framework to solve the aliasing problem of semantic landmarks by dynamically eliminating suboptimal constraints in the pose graph and correcting erroneous parking slot associations. As a result, a semantic map of the parking lot, which can be used by both autonomous driving systems and human beings, is established automatically and robustly. We evaluated the real-time localization performance using multiple autonomous vehicles, and a track-tracing repeatability of 0.3 m was achieved at an autonomous driving speed of 10 km/h.
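The abstract does not spell out how suboptimal constraints are eliminated, but one common pattern it alludes to is gating: after an optimization pass, pose-graph edges whose residual exceeds a chi-square threshold are dropped as likely mis-associations (e.g. two look-alike parking slots matched to one landmark). A hypothetical sketch of that gating step, not the paper's framework:

```python
def prune_constraints(edges, residuals, chi2_thresh=5.99):
    """Keep only pose-graph edges whose chi-square residual passes the gate
    (5.99 is the 95% quantile for 2 degrees of freedom, used here purely as
    an illustrative default). edges and residuals are parallel sequences."""
    return [e for e, r in zip(edges, residuals) if r <= chi2_thresh]
```

Iterating optimize-then-prune lets the graph shed aliased landmark associations while retaining consistent ones.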


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4604
Author(s):  
Fei Zhou ◽  
Limin Zhang ◽  
Chaolong Deng ◽  
Xinyue Fan

Traditional visual simultaneous localization and mapping (SLAM) systems rely on point features to estimate camera trajectories. However, feature-based systems are usually not robust in complex environments with weak textures or obvious brightness changes. To solve this problem, we exploit more of the environment's structural information by introducing line segment features, and we design a monocular visual SLAM system that combines points and line segments to effectively make up for the shortcomings of positioning based only on point features. First, an ORB algorithm based on a locally adaptive threshold is proposed. Subsequently, we not only optimize the extracted line features but also add a screening step before the traditional descriptor matching to combine the point feature matching results with the line feature matching. Finally, a weighting idea is introduced: when constructing the optimization cost function, weights are allocated according to the richness and dispersion of the features. Our evaluation on publicly available datasets demonstrates that the improved point-line feature method is competitive with the state-of-the-art methods. In addition, the trajectory graph shows significantly reduced drift and loss, which demonstrates that our system increases the robustness of SLAM.
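The weighting idea, allocating influence between the point and line terms of the cost by feature richness and dispersion, can be pictured as a simple normalized score. This is an illustrative sketch; the paper's exact weighting scheme is not reproduced:

```python
def feature_weights(n_points, n_lines, spread_points, spread_lines):
    """Toy weight allocation for a combined point-line cost: score each
    feature type by count times image-plane dispersion, then normalize so
    the two weights sum to one. Falls back to equal weights when no
    features of either type are available."""
    s_p = n_points * spread_points
    s_l = n_lines * spread_lines
    total = s_p + s_l
    if total == 0:
        return 0.5, 0.5
    return s_p / total, s_l / total
```

The intent is that a frame rich in well-spread points leans on the point term, while a weakly textured corridor with strong edges leans on the line term.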

