Development of environment design support mixed reality system capable of environment estimation using deep learning

Impact ◽  
2020 ◽  
Vol 2020 (2) ◽  
pp. 9-11
Author(s):  
Tomohiro Fukuda

Mixed reality (MR) is rapidly becoming a vital tool, not just in gaming but also in education, medicine, construction and environmental management. The term refers to systems in which computer-generated content is superimposed over objects in a real-world environment across one or more sensory modalities. Beyond its familiar use in computer games, MR has applications in military and aviation training, as well as tourism, healthcare and more. It also holds potential for architecture and design, where proposed buildings can be superimposed on existing locations to render 3D representations of plans. However, one major challenge that remains in MR development is real-time occlusion: hiding 3D virtual objects behind real ones. Dr Tomohiro Fukuda, based at the Division of Sustainable Energy and Environmental Engineering, Graduate School of Engineering at Osaka University in Japan, is an expert in this field. Researchers led by Dr Fukuda are tackling the occlusion problem by developing an MR system that achieves real-time occlusion with deep learning, using semantic segmentation to drive an outdoor landscape design simulation. This methodology can automatically estimate the visual environment before and after construction projects.
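
To make the occlusion idea concrete, the sketch below composites a rendered virtual layer into a camera frame while hiding it behind pixels that a segmentation network labels as an occluder. It is a minimal illustration only, assuming an off-the-shelf DeepLabV3 model and treating the 'person' class as the occluder; the system described above uses its own network and outdoor scene classes.

# Sketch: occlusion-aware compositing driven by semantic segmentation.
# Model choice (DeepLabV3) and occluder class are illustrative assumptions.
import numpy as np
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def composite(camera_rgb, virtual_rgba, occluder_classes=(15,)):
    """Overlay a rendered virtual layer, hiding it behind real occluders.

    camera_rgb   : HxWx3 uint8 camera frame
    virtual_rgba : HxWx4 uint8 render of the virtual content (alpha > 0 where drawn)
    occluder_classes : label ids treated as 'in front of' the virtual content
    """
    with torch.no_grad():
        logits = model(preprocess(camera_rgb).unsqueeze(0))["out"][0]
    labels = logits.argmax(0).numpy()              # per-pixel class map
    occluded = np.isin(labels, occluder_classes)   # real object in front
    alpha = (virtual_rgba[..., 3:] / 255.0) * ~occluded[..., None]
    return (camera_rgb * (1 - alpha) + virtual_rgba[..., :3] * alpha).astype(np.uint8)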

2007 ◽  
Vol 16 (06) ◽  
pp. 981-999 ◽  
Author(s):  
GEORGIOS N. YANNAKAKIS ◽  
JOHN HALLAM

This paper presents quantitative measurements/metrics of qualitative entertainment features within computer game environments and proposes artificial intelligence (AI) techniques for optimizing entertainment in such interactive systems. A human-verified metric of interest (i.e. player entertainment in real-time) for predator/prey games and a neuro-evolution on-line learning (i.e. during play) approach have already been reported in the literature to serve this purpose. In this paper, an alternative quantitative approach to entertainment modeling based on psychological studies in the field of computer games is introduced and a comparative study of the two approaches is presented. Feedforward neural networks (NNs) and fuzzy-NNs are used to model player satisfaction (interest) in real-time and investigate quantitatively how the qualitative factors of challenge and curiosity contribute to human entertainment. We demonstrate that appropriate non-extreme levels of challenge and curiosity generate high values of entertainment and we project the extensibility of the approach to other genres of digital entertainment (e.g. mixed-reality interactive playgrounds).
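
As a concrete illustration of the modeling approach described above, the sketch below defines a small feedforward network that maps challenge and curiosity features to a predicted interest value. The layer sizes, feature normalization, and single-hidden-layer design are assumptions for illustration, not the authors' exact architecture.

# Sketch: a feedforward model of player interest from challenge/curiosity.
import torch
import torch.nn as nn

class InterestModel(nn.Module):
    def __init__(self, n_features=2, hidden=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.Sigmoid(),
            nn.Linear(hidden, 1), nn.Sigmoid(),   # interest in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

# x = [challenge, curiosity], each assumed normalized to [0, 1]
model = InterestModel()
interest = model(torch.tensor([[0.6, 0.5]]))   # moderate challenge and curiosity

In practice such a model would be fit against human-verified interest judgments, consistent with the human-verified metric the paper builds on.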


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8072
Author(s):  
Yu-Bang Chang ◽  
Chieh Tsai ◽  
Chang-Hong Lin ◽  
Poki Chen

As autonomous driving techniques become increasingly valued and widespread, real-time semantic segmentation has become a popular and challenging problem in deep learning and computer vision in recent years. However, to deploy a deep learning model on the edge devices that accompany sensors on vehicles, the network structure must offer the best trade-off between accuracy and inference time. In previous works, some methods sacrificed accuracy to obtain a faster inference time, while others aimed to find the best accuracy achievable under real-time constraints. Nevertheless, the accuracies of previous real-time semantic segmentation methods still lag far behind those of general semantic segmentation methods. We therefore propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieve 78.6% mIoU at 39.4 FPS on 1024 × 2048 inputs in a Cityscapes test submission.
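
The abstract does not give the architecture in detail, but a dual-encoder design with self-attention fusion might be sketched roughly as follows; the branch depths, channel widths, and attention placement here are illustrative assumptions, not the authors' design.

# Schematic sketch: dual-encoder segmentation fused by self-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(cin, cout, stride=1):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride, 1, bias=False),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class DualEncoderSeg(nn.Module):
    def __init__(self, n_classes=19):               # 19 = Cityscapes classes
        super().__init__()
        # Detail branch: shallow, high resolution (overall stride 8)
        self.detail = nn.Sequential(
            conv_bn_relu(3, 64, 2), conv_bn_relu(64, 64, 2), conv_bn_relu(64, 128, 2))
        # Context branch: deeper, low resolution (overall stride 32)
        self.context = nn.Sequential(
            conv_bn_relu(3, 32, 2), conv_bn_relu(32, 64, 2), conv_bn_relu(64, 128, 2),
            conv_bn_relu(128, 128, 2), conv_bn_relu(128, 128, 2))
        self.attn = nn.MultiheadAttention(128, num_heads=4, batch_first=True)
        self.head = nn.Conv2d(256, n_classes, 1)

    def forward(self, x):
        d = self.detail(x)                          # B x 128 x H/8 x W/8
        c = self.context(x)                         # B x 128 x H/32 x W/32
        b, ch, h, w = c.shape
        seq = c.flatten(2).transpose(1, 2)          # B x (h*w) x 128
        seq, _ = self.attn(seq, seq, seq)           # self-attention over context
        c = seq.transpose(1, 2).reshape(b, ch, h, w)
        c = F.interpolate(c, size=d.shape[2:], mode="bilinear", align_corners=False)
        logits = self.head(torch.cat([d, c], 1))    # fuse detail + context
        return F.interpolate(logits, size=x.shape[2:], mode="bilinear", align_corners=False)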


2020 ◽  
Vol 58 (12) ◽  
pp. 3049-3061
Author(s):  
Christoph Hoog Antink ◽  
Joana Carlos Mesquita Ferreira ◽  
Michael Paul ◽  
Simon Lyra ◽  
Konrad Heimann ◽  
...  

Photoplethysmography imaging (PPGI) for non-contact monitoring of preterm infants in the neonatal intensive care unit (NICU) is a promising technology, as it could reduce medical adhesive-related skin injuries and associated complications. For practical implementations of PPGI, a region of interest has to be detected automatically in real time. As neonates' body proportions differ significantly from those of adults, existing approaches cannot be used in a straightforward way, and color-based skin detection requires RGB data, thus prohibiting the use of less-intrusive near-infrared (NIR) acquisition. In this paper, we present a deep learning-based method for segmentation of neonatal video data. We augmented an existing encoder-decoder semantic segmentation method with a modified version of the ResNet-50 encoder. This reduced the computational time by a factor of 7.5, so that 30 frames per second can be processed at 960 × 576 pixels. The method was developed and optimized on publicly available databases with segmentation data from adults. For evaluation, a comprehensive dataset consisting of RGB and NIR video recordings from 29 neonates with various skin tones, recorded in two NICUs in Germany and India, was used. From all recordings, 643 frames were manually segmented. After pre-training the model on the public adult data, parts of the neonatal data were used for additional learning, and left-out neonates were used for cross-validated evaluation. On the RGB data, the head is segmented well (82% intersection over union, 88% accuracy), and performance is comparable with that achieved on large, public, non-neonatal datasets. Performance on the NIR data, however, was inferior. By employing data augmentation to generate additional virtual NIR data for training, results could be improved, and the head could be segmented with 62% intersection over union and 65% accuracy. The method is in theory capable of performing segmentation in real time, and it may thus provide a useful tool for future PPGI applications.
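
The abstract does not specify how the virtual NIR training data were generated; one plausible sketch of such an augmentation, assuming a simple weighted channel mix with a tone curve and simulated sensor noise, is shown below. All parameter values are illustrative.

# Sketch: deriving single-channel 'NIR-like' training frames from RGB video.
import numpy as np

def rgb_to_virtual_nir(rgb, weights=(0.6, 0.3, 0.1), gamma=0.9, noise_std=0.02):
    """rgb: HxWx3 float array in [0, 1] -> HxW pseudo-NIR frame in [0, 1]."""
    nir = rgb @ np.asarray(weights)                   # weighted channel mix
    nir = np.clip(nir, 0, 1) ** gamma                 # mild tone-curve adjustment
    nir += np.random.normal(0, noise_std, nir.shape)  # sensor-noise simulation
    return np.clip(nir, 0, 1)

# The existing segmentation masks stay valid, since the geometry is unchanged.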


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Ran Jin ◽  
Xiaozhen Han ◽  
Tongrui Yu

Image semantic segmentation is a technology that has been playing a crucial part in intelligent driving, medical image analysis, video surveillance, and AR. However, since scenes require more semantics to be inferred from video and audio clips, and requirements for real-time performance have become stricter, neither the single-label classification method commonly used before nor regular manual labeling can meet these demands. Given the excellent performance of deep learning algorithms in extensive applications, image semantic segmentation algorithms based on deep learning frameworks have come under the spotlight of development. This paper attempts to improve ESPNet (Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation) based on the multilabel classification method, through the following steps. First, the standard convolution in the convolution layer is replaced so as to enlarge the receptive field of the deep convolutional neural network, to the extent that every pixel in the covered area contributes to the final feature response. Second, the ASPP (Atrous Spatial Pyramid Pooling) module is improved on the basis of atrous convolution, and DB-ASPP (Dilated Batch Normalization ASPP) is proposed as a way of reducing the gridding artifacts caused by multilayer atrous convolution, acquiring multiscale information, and integrating feature information across the image set. Finally, the proposed model and regular models are subjected to extensive tests and comparisons on multiple datasets. Results show that the proposed model achieves good segmentation accuracy, the smallest network size at 0.3 M parameters, and the fastest segmentation speed at 25 FPS.
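
For reference, the sketch below implements the standard ASPP block that DB-ASPP builds on: parallel atrous convolutions at several dilation rates whose outputs are concatenated and fused. The dilation rates and channel widths are common defaults, not the paper's values, and the DB-ASPP modifications themselves are not reproduced.

# Sketch: a standard ASPP block with parallel atrous (dilated) convolutions.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, cin, cout=256, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(cin, cout, 3 if r > 1 else 1,
                          padding=r if r > 1 else 0, dilation=r, bias=False),
                nn.BatchNorm2d(cout), nn.ReLU(inplace=True))
            for r in rates])
        self.project = nn.Sequential(
            nn.Conv2d(cout * len(rates), cout, 1, bias=False),
            nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

    def forward(self, x):
        # Each branch sees a different receptive field; concatenation mixes
        # multiscale context without losing spatial resolution.
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

features = ASPP(cin=128)(torch.randn(1, 128, 64, 64))   # -> 1 x 256 x 64 x 64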


2020 ◽  
Vol 10 (2) ◽  
Author(s):  
Fazliaty Edora Fadzli ◽  
Ajune Wanis Ismail

Mixed reality (MR) is a technology that brings virtual elements into the real-world environment, merging an immersive virtual world with real-world space. MR has steadily improved as display technologies have advanced. In an MR collaborative interface, local and remote users work together on a shared task while sensing an immersive environment in the cooperative application. User telepresence is an immersive form of telepresence in which a reconstruction of a human appears in real life. To date, producing full telepresence of a life-size human body requires high internet transmission bandwidth. This paper therefore explores a robust real-time 3D reconstruction method for MR telepresence. It reviews previous work on full-body human reconstruction and existing research that has proposed reconstruction methods for telepresence. Beyond the reconstruction method itself, the paper also presents our recent findings on an MR framework for transporting a full-body human from a local to a remote location. MR telepresence is discussed, together with the robust 3D reconstruction method implemented with a user telepresence feature, in which the user experiences an accurate 3D representation of a remote person. The paper ends with a discussion of the results of MR telepresence with the robust 3D reconstruction method for executing user telepresence.
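
As a small illustration of the geometry underlying real-time 3D reconstruction for telepresence, the sketch below back-projects a single depth frame into a point cloud with a pinhole camera model. The intrinsic parameters are placeholders, and the paper's own reconstruction pipeline is not reproduced here.

# Sketch: back-projecting a depth frame into a 3D point cloud.
import numpy as np

def depth_to_points(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """depth: HxW array of metric depths -> Nx3 array of 3D points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx                 # pinhole camera back-projection
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]             # drop invalid (zero-depth) pixels

# Each frame's cloud can then be compressed and streamed to the remote site,
# which is where the bandwidth concern raised above comes in.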

