EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications

2021 ◽  
Vol 10 (11) ◽  
pp. 772
Author(s):  
Giulia Marchesi ◽  
Christian Eichhorn ◽  
David A. Plecher ◽  
Yuta Itoh ◽  
Gudrun Klinker

Augmented Reality (AR) has increasingly benefited from the use of Simultaneous Localization and Mapping (SLAM) systems. This technology has enabled developers to create markerless AR applications, which, however, lack semantic understanding of their environment. Including this information would empower AR applications to react to their surroundings more realistically. To gain semantic knowledge, focus has in recent years shifted toward fusing SLAM systems with neural networks, giving birth to the field of Semantic SLAM. Building on existing research, this paper aims to create a SLAM system that generates a 3D map using ORB-SLAM2 and enriches it with semantic knowledge derived from the Fast-SCNN network. The key novelty of our approach is a new method for improving the predictions of neural networks, employed to balance the loss of accuracy introduced by efficient real-time models. Exploiting sensor information provided by a smartphone, GPS coordinates are used to query the OpenStreetMap database. The returned information reveals which classes are currently absent from the environment, so that they can be removed from the network's prediction with the goal of improving its accuracy. We achieved 87.40% pixel accuracy with Fast-SCNN on our custom version of COCO-Stuff and showed an improvement by involving GPS data on our self-made smartphone dataset, reaching 90.24% pixel accuracy. With smartphone use in mind, the implementation aims for a trade-off between accuracy and efficiency; to this end, the system was carefully designed with a strong focus on lightweight neural networks. This enabled the creation of an above-real-time Semantic SLAM system that we call EnvSLAM (Environment SLAM).
Our extensive evaluation reveals the efficiency of the system and its above-real-time operability (48.1 frames per second at an input image resolution of 640 × 360 pixels). Moreover, the GPS integration yields an effective improvement in the network's prediction accuracy.
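The GPS-based class filtering described in the abstract can be sketched as follows: classes that OpenStreetMap reports as absent near the current position are suppressed in the per-pixel class scores before the final label assignment. This is a minimal illustrative sketch, not the paper's implementation; the class list and the `absent` set are assumptions for the example.

```python
import numpy as np

# Illustrative class list (assumed, not the paper's label set).
CLASSES = ["road", "building", "tree", "water", "sky"]

def filter_prediction(scores, absent):
    """scores: (H, W, C) per-pixel class scores; absent: set of class names
    reported absent by OpenStreetMap. Returns the filtered label map."""
    masked = scores.copy()
    for name in absent:
        # An absent class can never win the per-pixel argmax.
        masked[..., CLASSES.index(name)] = -np.inf
    return masked.argmax(axis=-1)

# Example: a pixel where "water" narrowly beats "road" is corrected once
# the map data indicates no water body is nearby.
scores = np.zeros((1, 1, 5))
scores[0, 0] = [0.30, 0.10, 0.05, 0.35, 0.20]   # raw argmax would be "water"
labels = filter_prediction(scores, absent={"water"})
print(CLASSES[labels[0, 0]])  # → road
```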

Robotica ◽  
2007 ◽  
Vol 25 (2) ◽  
pp. 175-187 ◽  
Author(s):  
Staffan Ekvall ◽  
Danica Kragic ◽  
Patric Jensfelt

SUMMARY: The problem studied in this paper is a mobile robot that autonomously navigates in a domestic environment, builds a map as it moves along, and localizes its position in it. In addition, the robot detects predefined objects, estimates their positions in the environment, and integrates this with the localization module to automatically place the objects in the generated map. Thus, we demonstrate one possible strategy for integrating spatial and semantic knowledge in a service-robot scenario, where a simultaneous localization and mapping (SLAM) system and an object detection/recognition system work in synergy to provide a richer representation of the environment than would be possible with either method alone. Most SLAM systems build maps that are only used for localizing the robot. Such maps are typically based on grids or on features such as points and lines. The novelty is the augmentation of this process with an object-recognition system that detects objects in the environment and places them in the map generated by the SLAM system. The metric map is also split into topological entities corresponding to rooms. In this way, the user can command the robot to retrieve a certain object from a certain room. We present map-building results and an extensive evaluation of the object detection algorithm in an indoor setting.
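The integration step described above (placing a detected object into the global map using the robot's estimated pose) reduces, in the planar case, to a rotation and translation of the object's robot-relative offset. The following is a minimal sketch under that 2D assumption; names and values are illustrative, not from the paper.

```python
import math

def object_to_map(robot_pose, obj_in_robot):
    """robot_pose: (x, y, theta) from SLAM localization;
    obj_in_robot: (dx, dy) object offset in the robot frame.
    Returns the object position in map coordinates."""
    x, y, theta = robot_pose
    dx, dy = obj_in_robot
    # Rotate the relative offset by the robot heading, then translate.
    mx = x + dx * math.cos(theta) - dy * math.sin(theta)
    my = y + dx * math.sin(theta) + dy * math.cos(theta)
    return mx, my

# A robot at (2, 1) facing +90° sees an object 1 m straight ahead:
print(object_to_map((2.0, 1.0, math.pi / 2), (1.0, 0.0)))  # ≈ (2.0, 2.0)
```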


Pedestrians in the path of a vehicle are at risk of being hit, which can cause severe injury to both pedestrians and vehicle occupants. Hence, real-time pedestrian detection was performed on a set of recorded videos, with the system detecting the pedestrians in the given input videos. In this survey, a real-time scheme based on Aggregated Channel Features (ACF) running on the CPU was proposed. The proposed technique requires neither resizing the input image nor changing the video quality. We also use SVM with HOG features and SVM with Haar features to detect pedestrians. In addition, Convolutional Neural Networks (CNNs) were trained on pedestrian image datasets and later tested on a held-out set of pedestrian images. The analyses demonstrated that the proposed technique can detect pedestrians in video with acceptable error rates and high prediction accuracy. It can therefore be applied both to real-time video streams and to the prediction of pedestrians in prerecorded videos.
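The HOG + SVM idea mentioned above can be sketched in a few lines: a window is reduced to a gradient-orientation histogram and scored by a linear SVM. This is a toy illustration of the feature/classifier pairing, not the survey's pipeline; the SVM weights here are assumed for the example.

```python
import numpy as np

def hog_feature(window, bins=9):
    """Single-cell HOG-like descriptor: an L2-normalised histogram of
    unsigned gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(window.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)

def svm_score(feature, w, b):
    """Linear SVM decision value: positive → classified as pedestrian."""
    return float(feature @ w + b)

# Vertical stripes produce horizontal gradients (orientation ≈ 0), typical
# of the strong vertical edges of an upright pedestrian silhouette.
window = np.tile([0, 0, 255, 255], (16, 4))          # 16×16 test window
w = np.zeros(9); w[0] = 1.0                          # toy weights: favour bin 0
print(svm_score(hog_feature(window), w, b=-0.5) > 0)  # → True
```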


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4755
Author(s):  
Huai-Mu Wang ◽  
Huei-Yung Lin ◽  
Chin-Chen Chang

In this paper, we present a real-time object detection and depth estimation approach based on deep convolutional neural networks (CNNs). We improve object detection by incorporating transfer connection blocks (TCBs), in particular to detect small objects in real time. For depth estimation, we introduce binocular vision to the monocular-based disparity estimation network, and the epipolar constraint is used to improve prediction accuracy. Finally, we integrate the two-dimensional (2D) location of the detected object with the depth information to achieve real-time detection and depth estimation. The results demonstrate that the proposed approach achieves better results than conventional methods.
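Behind the disparity-to-depth step lies the standard stereo relation Z = f·B/d for focal length f (in pixels), baseline B, and disparity d (in pixels). The sketch below uses illustrative values, not figures from the paper.

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Standard rectified-stereo depth: Z = f * B / d.
    f_px: focal length in pixels; baseline_m: camera baseline in metres;
    disparity_px: horizontal disparity in pixels (must be positive)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return f_px * baseline_m / disparity_px

# Illustrative numbers: f = 700 px, B = 0.12 m, d = 8.4 px → ~10 m depth.
print(depth_from_disparity(700.0, 0.12, 8.4))
```

Note the inverse relationship: small (distant) disparities carry large depth uncertainty, which is one reason the epipolar constraint helps the disparity network.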


Author(s):  
Muhammad Hanif Ahmad Nizar ◽  
Chow Khuen Chan ◽  
Azira Khalil ◽  
Ahmad Khairuddin Mohamed Yusof ◽  
Khin Wee Lai

Background: Valvular heart disease is a serious disease leading to mortality and increasing medical care costs. The aortic valve is the valve most commonly affected by this disease. Doctors rely on echocardiograms for diagnosing and evaluating valvular heart disease. However, echocardiogram images are of poorer quality than Computerized Tomography and Magnetic Resonance Imaging scans. This study proposes the development of Convolutional Neural Networks (CNNs) that can function optimally during a live echocardiographic examination for detection of the aortic valve. An automated detection system in an echocardiogram will improve the accuracy of medical diagnosis and can provide further medical analysis from the resulting detections. Methods: Two detection architectures, Single Shot Multibox Detector (SSD) and Faster Region-based Convolutional Neural Network (R-CNN), with various feature extractors, were trained on echocardiography images from 33 patients. Thereafter, the models were tested on 10 echocardiography videos. Results: Faster R-CNN Inception v2 showed the highest accuracy (98.6%), followed closely by SSD MobileNet v2. In terms of speed, SSD MobileNet v2 suffered a 46.81% loss in frames per second (fps) during real-time detection but still performed better than the other neural network models. Additionally, SSD MobileNet v2 used the least Graphics Processing Unit (GPU) resources, while Central Processing Unit (CPU) usage was relatively similar across all models. Conclusion: Our findings provide a foundation for implementing a convolutional detection system in echocardiography for medical purposes.


Author(s):  
Dimitrios Boursinos ◽  
Xenofon Koutsoukos

Abstract: Machine learning components such as deep neural networks are used extensively in cyber-physical systems (CPS). However, such components may introduce new types of hazards that can have disastrous consequences and must be addressed to engineer trustworthy systems. Although deep neural networks offer advanced capabilities, they must be complemented by engineering methods and practices that allow effective integration in CPS. In this paper, we propose an approach for assurance monitoring of learning-enabled CPS based on the conformal prediction framework. To allow real-time assurance monitoring, the approach employs distance learning to transform high-dimensional inputs into lower-dimensional embedding representations. By leveraging conformal prediction, the approach provides well-calibrated confidence and ensures a bounded small error rate while limiting the number of inputs for which an accurate prediction cannot be made. We demonstrate the approach on three datasets: a mobile robot following a wall, speaker recognition, and traffic sign recognition. The experimental results show that the error rates are well calibrated and the number of alarms is very small. Furthermore, the method is computationally efficient and allows real-time assurance monitoring of CPS.
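The conformal prediction machinery described above can be sketched as follows: each candidate label receives a p-value by ranking its nonconformity score against a calibration set, and labels with p-value above the significance level form the prediction set; a non-singleton or empty set signals an alarm. This is a minimal generic sketch of (inductive) conformal prediction, not the paper's exact monitor; the scores and threshold are toy assumptions.

```python
import numpy as np

def p_value(score, calib_scores):
    """Conformal p-value: fraction of calibration nonconformity scores at
    least as large as the candidate's (with the +1 smoothing term)."""
    n = len(calib_scores)
    return (np.sum(calib_scores >= score) + 1) / (n + 1)

def prediction_set(cand_scores, calib_scores, eps):
    """Labels whose p-value exceeds the significance level eps."""
    return {y for y, s in cand_scores.items() if p_value(s, calib_scores) > eps}

# Toy calibration scores and per-label nonconformity scores
# (lower score = more conforming with the training data).
calib = np.array([0.1, 0.2, 0.3, 0.4, 0.9])
cands = {"stop": 0.15, "yield": 0.85}
preds = prediction_set(cands, calib, eps=0.4)
print(preds)  # → {'stop'}: a singleton set, so a confident prediction, no alarm
```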

