3D Semantic VSLAM of Indoor Environment Based on Mask Scoring RCNN

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Chongben Tao ◽  
Yufeng Jin ◽  
Feng Cao ◽  
Zufeng Zhang ◽  
Chunguang Li ◽  
...  

Existing Visual SLAM (VSLAM) algorithms suffer from low localization accuracy and low label classification accuracy when constructing semantic maps of indoor environments with sparse feature points. This paper proposes a 3D semantic VSLAM algorithm called BMASK-RCNN, based on Mask Scoring RCNN. Firstly, image feature points are extracted with the Binary Robust Invariant Scalable Keypoints (BRISK) algorithm. Secondly, map points of the reference key frame are projected into the current frame for feature matching and pose estimation, and an inverse depth filter estimates the scene depth of each created key frame to obtain camera pose changes. To achieve object detection and semantic segmentation of both static and dynamic objects in indoor environments, and to construct a dense 3D semantic map with the VSLAM algorithm, a Mask Scoring RCNN with a partially adjusted structure is used, trained by transfer learning on the TUM RGB-D SLAM dataset. The semantic information of independent targets in the scene, including their categories, not only supports high localization accuracy but also enables a probabilistic update of the semantic estimates by marking movable objects, thereby reducing the impact of moving objects on real-time mapping. Simulations and real-world experiments comparing the method with three other algorithms show that the proposed algorithm is more robust and that the semantic information used in 3D semantic mapping can be obtained accurately.
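The key-frame projection step mentioned above can be illustrated with a minimal pinhole-camera sketch. This is not the paper's code: the intrinsics (fx, fy, cx, cy) and the pose (R, t) are hypothetical placeholders, and real systems would follow the projection with descriptor matching.

```python
# Sketch: project a map point from the reference key frame into the
# current frame with a pinhole camera model (illustrative values only).

def project_point(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates.

    point_w : 3D point in the world frame
    R, t    : rotation (3x3 nested list) and translation of the current camera
    """
    # Transform into the camera frame: p_c = R * p_w + t
    x, y, z = (
        sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)
    )
    if z <= 0:
        return None  # behind the camera, cannot be matched
    # Pinhole projection onto the image plane
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)

# With an identity pose, a point 2 m straight ahead lands at the
# principal point of the image.
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
uv = project_point((0.0, 0.0, 2.0), I, (0.0, 0.0, 0.0), 525.0, 525.0, 319.5, 239.5)
```

In a full system, each projected point would be matched against features detected near (u, v) in the current frame before pose optimization.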

Author(s):  
C. Li ◽  
Z. Kang ◽  
J. Yang ◽  
F. Li ◽  
Y. Wang

Abstract. Visual Simultaneous Localization and Mapping (SLAM) systems have been widely investigated because traditional positioning technologies, such as the Global Navigation Satellite System (GNSS), cannot operate in restricted environments. However, traditional SLAM methods, which are mostly based on point-feature tracking, usually fail in harsh environments. Previous work has shown that insufficient feature points caused by missing texture, feature mismatches caused by overly fast camera movement, and abrupt illumination changes will eventually cause state estimation to fail. Meanwhile, pedestrians are unavoidable, introducing false feature associations and violating the strict SLAM assumption that the unknown environment is static. To cope with the challenges these factors pose in complex indoor environments, this paper proposes a semantic-assisted Visual-Inertial Odometry (VIO) system for low-textured scenes and highly dynamic environments. A trained U-Net detects moving objects, and all feature points inside dynamic object regions are eliminated, preventing moving objects from participating in pose estimation and improving robustness in dynamic environments. Finally, inertial measurement unit (IMU) constraints are added for low-textured environments. To evaluate the proposed method, experiments were conducted on the public EuRoC and TUM datasets; the results demonstrate that our approach is robust in complex indoor environments.
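The dynamic-feature culling step described above can be sketched in a few lines. This is an illustration under our own assumptions, not the authors' implementation: the segmentation mask is taken to be a binary grid in which 1 marks pixels the U-Net classified as a moving object.

```python
# Sketch: discard feature points that fall inside a dynamic-object mask
# before they can contaminate pose estimation.

def cull_dynamic_points(points, mask):
    """Keep only feature points whose pixel lies outside the dynamic mask.

    points : list of (u, v) pixel coordinates
    mask   : 2D list, 1 where a moving object (e.g. a pedestrian) was detected
    """
    static = []
    for u, v in points:
        if mask[v][u] == 0:  # pixel is outside every dynamic region
            static.append((u, v))
    return static

# Toy 4x4 mask with a dynamic object occupying the top-left 2x2 block.
mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
kept = cull_dynamic_points([(0, 0), (3, 3), (2, 1)], mask)
```

Only the surviving static points would then be passed to the VIO front end.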


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Xiong Zhao ◽  
Tao Zuo ◽  
Xinyu Hu

Most current visual Simultaneous Localization and Mapping (SLAM) algorithms are designed under the assumption of a static environment, and their robustness and accuracy degrade in dynamic environments: moving objects in the scene cause feature mismatches during pose estimation, which in turn affects positioning and mapping accuracy. Meanwhile, three-dimensional semantic maps play a key role in mobile robot navigation, path planning, and other tasks. In this paper, we present OFM-SLAM (Optical Flow combining Mask-RCNN SLAM), a novel visual SLAM system for semantic mapping in dynamic indoor environments. Firstly, we use the Mask-RCNN network to detect potentially moving objects and generate masks for dynamic objects. Secondly, an optical flow method is adopted to detect dynamic feature points. We then combine the optical flow method and Mask-RCNN to cull all dynamic points, so the SLAM system can track without them. Finally, the semantic labels obtained from Mask-RCNN are mapped onto the point cloud to generate a three-dimensional semantic map that contains only the static parts of the scene and their semantic information. We evaluate our system on the public TUM datasets. The experimental results demonstrate that our system is more effective in dynamic scenarios: OFM-SLAM estimates the camera pose more accurately and achieves more precise localization in highly dynamic environments.
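One common way to realize the optical-flow check described above (shown here as an assumption, not the authors' exact criterion) is to treat points whose flow vector deviates strongly from the dominant, camera-induced flow as dynamic:

```python
# Sketch: flag feature points whose optical-flow displacement is an
# outlier relative to the component-wise median flow of all points.

def detect_dynamic_points(flows, threshold=2.0):
    """Return indices of points whose flow deviates from the median flow.

    flows : list of (du, dv) per-point displacement vectors between frames.
    Points moving only with the camera share a similar flow; large
    deviations suggest independently moving objects.
    """
    dus = sorted(du for du, _ in flows)
    dvs = sorted(dv for _, dv in flows)
    mu, mv = dus[len(dus) // 2], dvs[len(dvs) // 2]  # component-wise median
    dynamic = []
    for i, (du, dv) in enumerate(flows):
        if ((du - mu) ** 2 + (dv - mv) ** 2) ** 0.5 > threshold:
            dynamic.append(i)
    return dynamic

# Three points share the camera-induced flow; the fourth moves on its own.
outliers = detect_dynamic_points([(1, 0), (1, 0), (1.1, 0), (6, 4)])
```

In OFM-SLAM such flow-based outliers are combined with the Mask-RCNN masks, so that points are culled when either cue fires.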


Robotica ◽  
2019 ◽  
Vol 38 (2) ◽  
pp. 256-270 ◽  
Author(s):  
Jiyu Cheng ◽  
Yuxiang Sun ◽  
Max Q.-H. Meng

Summary
Visual simultaneous localization and mapping (visual SLAM) has developed considerably in recent decades. To facilitate tasks such as path planning and exploration, traditional visual SLAM systems usually provide mobile robots with a geometric map, which overlooks semantic information. To address this problem, and inspired by the recent success of deep neural networks, we combine one with a visual SLAM system to conduct semantic mapping. Both geometric and semantic information are projected into 3D space to generate a 3D semantic map. We also use an optical-flow-based method to handle moving objects so that our method works robustly in dynamic environments. We performed experiments on the public TUM dataset and our own recorded office dataset. Experimental results demonstrate the feasibility and impressive performance of the proposed method.
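The projection of semantic labels into 3D space can be sketched as a per-pixel back-projection through the depth map. The intrinsics and the "chair" label below are illustrative placeholders, not values from the paper.

```python
# Sketch: lift a labelled pixel into a 3D semantic map point in the
# camera frame using the inverse pinhole model.

def back_project(u, v, depth, fx, fy, cx, cy, label):
    """Turn a pixel (u, v) with a depth reading into a labelled 3D point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth, label)

# A pixel 100 px right of the principal point, 2 m away, labelled "chair".
pt = back_project(419.5, 239.5, 2.0, 500.0, 500.0, 319.5, 239.5, "chair")
```

Repeating this for every labelled pixel of every key frame, then transforming the points into the world frame with the estimated camera pose, yields the 3D semantic map.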


Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4290 ◽  
Author(s):  
Elena Luna ◽  
Juan San Miguel ◽  
Diego Ortego ◽  
José Martínez

During the last few years, abandoned object detection has emerged as a hot topic in the video-surveillance community. As a consequence, a myriad of systems has been proposed for automatic monitoring of public and private places, addressing the several challenges that affect detection performance. Due to the complexity of these systems, researchers often address the different analysis stages independently, such as foreground segmentation, stationary object detection, and abandonment validation. Despite the improvements achieved at each stage, the advances are rarely applied to the full pipeline, and therefore the impact of each stage's improvement on overall system performance has not been studied. In this paper, we formalize the framework employed by systems for abandoned object detection and provide an extensive review of state-of-the-art approaches for each stage. We also build a multi-configuration system that allows selecting among a range of alternatives for each stage, with the objective of determining the combination achieving the best performance. This multi-configuration system is made available online to the research community. We perform an extensive evaluation on a heterogeneous dataset gathered from existing data. Such a dataset covers multiple, varied scenarios while presenting challenges such as illumination changes, shadows, and a high density of moving objects, unlike the existing literature, which focuses on a few sequences. The experimental results identify the most effective configurations and highlight design choices favoring robustness to errors. Moreover, we validate the optimal configuration on additional datasets not previously considered. We conclude by discussing open research challenges arising from the experimental comparison.
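A common building block of the stationary-object-detection stage (sketched here under our own assumptions, not as any specific system from the review) accumulates per-pixel evidence that a foreground pixel has stopped moving:

```python
# Sketch: mark pixels that have stayed foreground for several consecutive
# frames as stationary -- the usual cue for a candidate abandoned object.

def update_stationarity(counter, foreground, persist_frames=3):
    """Update per-pixel persistence counters and return a stationary mask.

    counter    : 2D list of consecutive-foreground frame counts (mutated)
    foreground : 2D list, 1 where the background subtractor fired this frame
    """
    h, w = len(counter), len(counter[0])
    stationary = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Reset the counter as soon as the pixel returns to background.
            counter[y][x] = counter[y][x] + 1 if foreground[y][x] else 0
            if counter[y][x] >= persist_frames:
                stationary[y][x] = 1
    return stationary

# Feed the same 2x2 foreground mask for three frames: the persistent
# pixel at (0, 0) becomes stationary, the others do not.
counter = [[0, 0], [0, 0]]
fg = [[1, 0], [0, 0]]
for _ in range(3):
    stationary = update_stationarity(counter, fg)
```

The abandonment-validation stage would then decide whether each stationary blob is a left-behind object rather than, say, a parked person.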


2021 ◽  
Vol 11 (2) ◽  
pp. 645
Author(s):  
Xujie Kang ◽  
Jing Li ◽  
Xiangtao Fan ◽  
Hongdeng Jian ◽  
Chen Xu

Visual simultaneous localization and mapping (SLAM) is challenging in dynamic environments, as moving objects can impair camera pose tracking and mapping. This paper introduces a method for robust, dense, object-level SLAM in dynamic environments that takes a live stream of RGB-D frames as input, detects moving objects, and segments the scene into separate objects while simultaneously tracking and reconstructing their 3D structures. The approach provides a new method of dynamic object detection that integrates prior knowledge from a pre-built object model database, object-oriented 3D tracking relative to the camera pose, and the association between instance segmentation results on the current frame and the object database to find dynamic objects in the current frame. By leveraging the static 3D model for frame-to-model alignment, together with dynamic object culling, camera motion estimation reduces overall drift. From the estimated camera poses and instance segmentation results, an object-level semantic map representation is constructed for the world map. Experimental results on the TUM RGB-D dataset, comparing the proposed method with related state-of-the-art approaches, demonstrate that our method achieves similar performance in static scenes and improved accuracy and robustness in dynamic scenes.
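The association between per-frame instance segmentation results and the object database can be sketched with a simple intersection-over-union (IoU) matcher. This is an illustrative stand-in, not the paper's method: real systems typically match masks or 3D models, while the sketch uses 2D bounding boxes.

```python
# Sketch: greedily match detections on the current frame to known
# database objects by bounding-box IoU; unmatched detections are new
# (and possibly moving) objects.

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def associate(detections, database, min_iou=0.5):
    """Map each detection index to the best database object index."""
    matches = {}
    for i, det in enumerate(detections):
        best, best_iou = None, min_iou
        for j, obj in enumerate(database):
            score = iou(det, obj)
            if score > best_iou:
                best, best_iou = j, score
        matches[i] = best  # None means no known object overlaps enough
    return matches

# One detection re-finds a database object; the other has no match.
matches = associate([(0, 0, 2, 2), (10, 10, 12, 12)], [(0, 0, 2, 2)])
```

Detections left unmatched, or matched objects whose pose disagrees with the camera-relative tracking, are the candidates flagged as dynamic.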


Robotica ◽  
2014 ◽  
Vol 34 (4) ◽  
pp. 837-858 ◽  
Author(s):  
F. Herranz ◽  
A. Llamazares ◽  
E. Molinos ◽  
M. Ocaña ◽  
M. A. Sotelo

SUMMARY
Localization and mapping in indoor environments, such as airports and hospitals, are key tasks for almost every robotic platform. Some researchers suggest using Range-Only (RO) sensors based on WiFi (Wireless Fidelity) technology with SLAM (Simultaneous Localization And Mapping) techniques to solve both problems. The current state of the art in RO SLAM focuses mainly on filtering approaches, while the study of smoothing approaches with RO sensors remains quite incomplete. This paper presents a comparison between filtering algorithms, such as the EKF and FastSLAM, and a smoothing algorithm, SAM (Smoothing And Mapping). Experimental results were obtained in indoor environments using WiFi sensors and demonstrate the feasibility of the smoothing approach with WiFi sensors indoors.
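To make the filtering side concrete, here is a minimal single-step EKF update with a range-only measurement, under our own assumptions (2D position state, one beacon at a known position, scalar range noise); the compared systems estimate much richer states, including the beacon positions themselves.

```python
import math

# Sketch: one EKF measurement update for a range-only (e.g. WiFi) reading.

def ekf_range_update(x, P, beacon, z, r_var):
    """Update a 2D position estimate with a single range measurement.

    x      : [px, py] estimated robot position
    P      : 2x2 covariance matrix (nested list)
    beacon : known beacon position (bx, by)
    z      : measured range to the beacon
    r_var  : measurement noise variance
    """
    dx, dy = x[0] - beacon[0], x[1] - beacon[1]
    pred = math.hypot(dx, dy)          # h(x): predicted range
    H = [dx / pred, dy / pred]         # Jacobian of h w.r.t. position
    # Innovation covariance S = H P H^T + R
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var
    K = [PHt[0] / S, PHt[1] / S]       # Kalman gain
    innov = z - pred
    return [x[0] + K[0] * innov, x[1] + K[1] * innov]

# Robot believed at (1, 0), beacon at (3, 0), measured range 1.5 m:
# the update pulls the estimate toward the beacon along the x-axis.
x_new = ekf_range_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], (3.0, 0.0), 1.5, 0.25)
```

A smoother such as SAM instead keeps the whole trajectory in the estimation problem and re-linearizes past measurements, which is the main difference the paper evaluates.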


Information ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 92
Author(s):  
Xiaoning Han ◽  
Shuailong Li ◽  
Xiaohui Wang ◽  
Weijia Zhou

Sensing and mapping its surroundings is an essential requirement for a mobile robot. Geometric maps endow robots with the capacity for basic tasks, e.g., navigation. To co-exist with human beings in indoor scenes, the need to attach semantic information to a geometric map, yielding a so-called semantic map, has been recognized over the last two decades. A semantic map can help robots behave according to human rules, plan and perform advanced tasks, and communicate with humans at the conceptual level. This survey reviews methods for semantic mapping in indoor scenes. To begin with, we answer the question of what a semantic map for mobile robots is by examining its definitions. After that, we review works on each of the three modules of semantic mapping, i.e., spatial mapping, acquisition of semantic information, and map representation. Finally, although great progress has been made, there is still a long way to go before semantic maps can support advanced robotic tasks, so challenges and potential future directions are discussed before the conclusion.


Author(s):  
Laurentiu Predescu ◽  
Daniel Dunea

Optical monitors have proven their versatility in studies of air quality in workplaces and indoor environments. The current study performed a screening of the indoor environment for various fractions of particulate matter (PM) and the specific thermal microclimate in a classroom occupied by students in March 2019 (before the COVID-19 pandemic) and in March 2021 (during the pandemic) at the Valahia University Campus, Targoviste, Romania. The objectives were to assess the potential exposure of students and academic personnel to PM and to observe the performance of various sensors and monitors (particle counter, PM monitors, and indoor microclimate sensors). PM1 ranged between 29 and 41 μg/m³ and PM10 ranged between 30 and 42 μg/m³. The particles belonged mostly to the fine and submicrometric fractions, in thermally acceptable environments according to the PPD and PMV indices. The particle counter predominantly recorded the 0.3, 0.5, and 1.0 μm categories. The average acute dose rate was estimated at 6.58 × 10⁻⁴ mg/kg-day (CV = 14.3%) for the 20–40-year age range. Wearing masks may influence the indoor microclimate and PM levels, but additional experiments should be performed at a finer scale.
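An inhalation dose rate of this kind is conventionally computed as ADD = C × IR × (ET/24) / BW. The sketch below uses entirely hypothetical inputs (concentration, inhalation rate, exposure hours, body weight) for illustration and is not intended to reproduce the study's 6.58 × 10⁻⁴ mg/kg-day estimate, which depends on the study's own exposure factors.

```python
# Sketch: average daily inhalation dose, ADD = C * IR * (ET/24) / BW.

def acute_dose_rate(conc_mg_m3, inhalation_m3_day, exposure_h, body_weight_kg):
    """Return the average daily dose in mg/kg-day.

    conc_mg_m3       : PM concentration in air (mg/m^3)
    inhalation_m3_day: daily inhalation rate (m^3/day)
    exposure_h       : hours per day spent in the exposed environment
    body_weight_kg   : receptor body weight (kg)
    """
    return conc_mg_m3 * inhalation_m3_day * (exposure_h / 24.0) / body_weight_kg

# Hypothetical adult: 35 μg/m^3 (= 0.035 mg/m^3), 16 m^3/day inhalation,
# 4 classroom hours per day, 70 kg body weight.
add = acute_dose_rate(0.035, 16.0, 4.0, 70.0)
```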


2020 ◽  
pp. 1420326X2097546
Author(s):  
Richard A Sharpe ◽  
Andrew J Williams ◽  
Ben Simpson ◽  
Gemma Finnegan ◽  
Tim Jones

Fuel poverty affects around 34% of European homes, representing a considerable burden on society and healthcare systems. This pilot study assesses the impact of installing a first-time central heating system, intended to reduce fuel poverty, on household satisfaction with indoor temperatures and environment, ability to pay bills, and mental well-being. In Cornwall, 183 households received the intervention and a further 374 were placed on a waiting-list control. Post-intervention postal questionnaires and follow-up phone calls (n = 557) collected data on household demographics, resident satisfaction with the indoor environment, finances, and mental well-being (using the Short Warwick-Edinburgh Mental Wellbeing Scale). We compared responses between the waiting-list control and the intervention group to assess the effectiveness of the intervention. A total of 31% of participants responded: 83 from the waiting-list control and 71 from the intervention group. The intervention group reported improvements in the indoor environment, finances, and mental well-being. However, these benefits were not expressed by all participants, which may result from diverse resident behaviours, lifestyles, and housing characteristics. Future policies need to consider whole-house approaches alongside resident training and other behaviour-change techniques that can account for complex interactions between behaviours and the built environment.


2021 ◽  
Vol 15 (03) ◽  
pp. 337-357
Author(s):  
Alexander Julian Golkowski ◽  
Marcus Handte ◽  
Peter Roch ◽  
Pedro J. Marrón

For many application areas, such as autonomous navigation, the ability to accurately perceive the environment is essential. For this purpose, a wide variety of well-researched sensor systems is available for detecting obstacles or navigation targets. Stereo cameras have emerged as a very versatile sensing technology in this regard due to their low hardware cost and high fidelity, and much work has been done to integrate them into mobile robots. However, the existing literature focuses on presenting the concepts and algorithms used to implement the desired robot functions on top of a given camera setup; the rationale for and impact of choosing that camera setup are usually neither discussed nor described. Thus, when designing the stereo camera system for a mobile robot, there is little general guidance beyond isolated setups that worked for a specific robot. To close this gap, this paper studies the impact of the physical setup of a stereo camera system in indoor environments. We present the results of an experimental analysis in which we use a fixed software setup to estimate the distance to an object while systematically changing the camera setup. We vary the three main parameters of the physical camera setup, namely the angle and distance between the cameras and the field of view, as well as a softer parameter, the resolution. Based on the results, we derive several guidelines on how to choose these parameters for a given application.
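The standard relationship underlying such distance estimation, for an ideal rectified stereo pair, is Z = f·B/d, and its first-order error model shows directly why the baseline (the distance between the cameras) matters. The focal length, baseline, and disparity values below are illustrative, not the paper's configuration.

```python
# Sketch: ideal rectified stereo depth and its sensitivity to disparity error.

def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Depth from disparity for a rectified stereo pair: Z = f * B / d."""
    return f_px * baseline_m / disparity_px

def depth_error(f_px, baseline_m, z_m, disp_error_px=1.0):
    """First-order depth error for a disparity error: dZ ~= Z^2 / (f * B) * dd."""
    return z_m ** 2 / (f_px * baseline_m) * disp_error_px

# Hypothetical setup: 700 px focal length, 12 cm baseline, 42 px disparity.
z = depth_from_disparity(700.0, 0.12, 42.0)
err = depth_error(700.0, 0.12, z)
```

Because the error grows with Z² and shrinks with the baseline B, widening the camera separation improves long-range accuracy, which is one of the trade-offs the paper's experimental guidelines address.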

