indoor scenes
Recently Published Documents

TOTAL DOCUMENTS: 307 (five years: 113)
H-INDEX: 25 (five years: 5)

2021 · Author(s): Tingfeng Ye, Juzhong Zhang, Yingcai Wan, Ze Cui, Hongbo Yang

In this paper, we extend RGB-D SLAM to address the problem that sparse map-building RGB-D SLAM cannot directly generate maps for indoor navigation, and propose a SLAM system for the fast generation of indoor planar maps. The system uses RGB-D images to generate pose information while converting the corresponding depth images into 2D planar laser scans, enabling the reconstruction of 2D grid navigation maps of indoor scenes under limited computational resources. This solves the problem that the sparse point cloud maps generated by RGB-D SLAM cannot be used directly for navigation. Meanwhile, the pose estimates provided by RGB-D SLAM and by scan matching are fused to obtain a more accurate and robust pose, which improves map-building accuracy. Furthermore, we demonstrate the proposed system on the ICL indoor dataset and evaluate the performance of different RGB-D SLAM systems. The method proposed in this paper can be generalized to other RGB-D SLAM algorithms, and map-building accuracy will improve further as RGB-D SLAM algorithms develop.
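The core conversion this abstract describes, turning a row of a depth image into a planar laser scan, can be sketched as below. The function name `depth_row_to_scan` and the pinhole parameters `fx`/`cx` are illustrative assumptions, and a real system would also filter invalid depth readings:

```python
import numpy as np

def depth_row_to_scan(depth_row, fx, cx):
    """Convert one row of a depth image (metres) into 2D laser bearings/ranges.

    depth_row: per-pixel depth along the optical axis.
    fx, cx: horizontal focal length and principal point of the camera
    (hypothetical values; take them from the real camera calibration).
    """
    u = np.arange(len(depth_row))
    angles = np.arctan2(u - cx, fx)      # bearing of each pixel column
    ranges = depth_row / np.cos(angles)  # planar range along that bearing
    return angles, ranges
```

The resulting (angle, range) pairs have the same form as a 2D LiDAR scan, so they can feed a standard occupancy-grid mapper.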


2021 · Vol 12 (9) · pp. 459-469
Author(s): D. D. Rukhovich

In this paper, we propose a novel method of joint 3D object detection and room layout estimation. The proposed method surpasses all existing methods of 3D object detection from monocular images on the indoor SUN RGB-D dataset. Moreover, the proposed method shows competitive results on the ScanNet dataset in multi-view mode. Both datasets were collected in various residential, administrative, educational and industrial spaces, and together they cover almost all possible use cases. Moreover, we are the first to formulate and solve the problem of multi-class 3D object detection from multi-view inputs in indoor scenes. The proposed method can be integrated into the control systems of mobile robots. The results of this study can be used to address navigation tasks, as well as path planning, capturing and manipulating scene objects, and semantic scene mapping.


2021 · Vol 13 (23) · pp. 4755
Author(s): Saishang Zhong, Mingqiang Guo, Ruina Lv, Jianguo Chen, Zhong Xie, et al.

Rigid registration of 3D indoor scenes is a fundamental yet vital task in various fields, including remote sensing (e.g., 3D reconstruction of indoor scenes), photogrammetric measurement, geometry modeling, etc. Nevertheless, state-of-the-art registration approaches still have defects when dealing with low-quality indoor scene point clouds derived from consumer-grade RGB-D sensors. The major challenge is accurately extracting correspondences between a pair of low-quality point clouds when they contain considerable noise, outliers, or weak texture features. To solve this problem, we present a point cloud registration framework that exploits RGB-D information. First, we propose a point normal filter that effectively removes noise while maintaining sharp geometric features and smooth transition regions. Second, we design a correspondence extraction scheme based on a novel descriptor encoding textural and geometric information, which can robustly establish dense correspondences between a pair of low-quality point clouds. Finally, we propose a point-to-plane registration technique with a nonconvex regularizer, which further diminishes the influence of false correspondences and produces an exact rigid transformation between a pair of point clouds. Extensive experiments demonstrate that our registration framework outperforms existing state-of-the-art techniques both visually and numerically, especially when dealing with low-quality indoor scenes.
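The point-to-plane formulation this framework builds on can be illustrated with a single linearised alignment step. This is a generic least-squares sketch under a small-angle assumption, not the paper's nonconvex-regularised solver, and `point_to_plane_step` is an assumed name:

```python
import numpy as np

def point_to_plane_step(src, dst, normals):
    """One linearised point-to-plane alignment step.

    src, dst: (N, 3) corresponding points; normals: (N, 3) unit normals at dst.
    Minimises sum_i ((R p_i + t - q_i) . n_i)^2 with R ~ I + [theta]_x,
    so each residual row is linear in (theta, t):
        (p - q).n + theta.(p x n) + t.n
    Returns the small rotation vector theta and translation t.
    """
    A = np.hstack([np.cross(src, normals), normals])  # (N, 6) Jacobian rows
    b = np.einsum('ij,ij->i', dst - src, normals)     # signed gap along normals
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]
```

In practice this step is iterated inside an ICP loop, re-estimating correspondences each round; the paper replaces the plain squared loss with a nonconvex robust regularizer to suppress false matches.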


2021 · Vol 40 (5) · pp. 1-18
Author(s): Julien Philip, Sébastien Morgenthaler, Michaël Gharbi, George Drettakis

We introduce a neural relighting algorithm for captured indoor scenes that allows interactive free-viewpoint navigation. Our method allows illumination to be changed synthetically while coherently rendering cast shadows and complex glossy materials. We start with multiple images of the scene and a three-dimensional mesh obtained by multi-view stereo (MVS) reconstruction. We assume that lighting is well explained as the sum of a view-independent diffuse component and a view-dependent glossy term concentrated around the mirror reflection direction. We design a convolutional network around input feature maps that facilitate learning of an implicit representation of scene materials and illumination, enabling both relighting and free-viewpoint navigation. We generate these input maps by exploiting the best elements of both image-based and physically based rendering. We sample the input views to estimate diffuse scene irradiance, and compute the new illumination caused by user-specified light sources using path tracing. To facilitate the network's understanding of materials and synthesize plausible glossy reflections, we reproject the views and compute mirror images. We train the network on a synthetic dataset where each scene is also reconstructed with MVS. We show results of our algorithm relighting real indoor scenes and performing free-viewpoint navigation with complex and realistic glossy reflections, which so far remained out of reach for view-synthesis techniques.


Author(s): Martin Weinmann, Sven Wursthorn, Michael Weinmann, Patrick Hübner

Abstract: The Microsoft HoloLens is a head-worn mobile augmented reality device. It allows a real-time 3D mapping of its direct environment and self-localisation within the acquired 3D data. Both aspects are essential for robustly augmenting the local environment around the user with virtual content and for the robust interaction of the user with virtual objects. Although not primarily designed as an indoor mapping device, the Microsoft HoloLens has high potential for efficient and comfortable mapping of both room-scale and building-scale indoor environments. In this paper, we provide a survey on the capabilities of the Microsoft HoloLens (Version 1) for the efficient 3D mapping and modelling of indoor scenes. More specifically, we focus on its capabilities regarding localisation (in terms of pose estimation) within indoor environments and the spatial mapping of indoor environments. While the Microsoft HoloLens certainly cannot compete with laser scanners in providing highly accurate 3D data, we demonstrate that the acquired data provide sufficient accuracy for a subsequent standard rule-based reconstruction of a semantically enriched and topologically correct model of an indoor scene. Furthermore, we provide a discussion of the robustness of standard handcrafted geometric features extracted from data acquired with the Microsoft HoloLens and typically used for subsequent learning-based semantic segmentation.


2021 · Author(s): Zhuohan Jiang, D. Merika W. Sanders, Rosemary Cowell

We collected visual and semantic similarity norms for a set of photographic images comprising 120 recognizable objects/animals and 120 indoor/outdoor scenes. Human observers rated the similarity of pairs of images within four stimulus categories (inanimate objects, animals, indoor scenes and outdoor scenes) via Amazon's Mechanical Turk. We performed multi-dimensional scaling (MDS) on the collected similarity ratings to visualize the perceived similarity within each image category, for both visual and semantic ratings. The MDS solutions revealed the expected similarity relationships between images within each category, along with intuitively sensible differences between visual and semantic similarity relationships for each category. Stress tests performed on the MDS solutions indicated that the MDS analyses captured meaningful levels of variance in the similarity data. These stimuli, associated norms and naming data are made publicly available, and should provide a useful resource for researchers of vision, memory and conceptual knowledge wishing to run experiments using well-parameterized stimulus sets.
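The MDS step can be illustrated with classical (Torgerson) scaling implemented from scratch; the authors' exact MDS variant (e.g. non-metric MDS on ratings) may differ, so treat this as a minimal sketch:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed an n x n dissimilarity matrix D
    into k dimensions so pairwise distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centred Gram matrix
    w, V = np.linalg.eigh(B)              # eigendecomposition (ascending)
    idx = np.argsort(w)[::-1][:k]         # keep the top-k eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))
```

For exactly Euclidean dissimilarities this recovers the original configuration up to rotation and reflection; for rating-based dissimilarities it yields the low-dimensional layout whose fit the stress statistic quantifies.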


Entropy · 2021 · Vol 23 (9) · pp. 1164
Author(s): Wen Liu, Xu Wang, Zhongliang Deng

With the rapid growth of demand for location services in indoor environments, fingerprint-based indoor positioning has attracted widespread attention due to its high precision. This paper proposes a double-layer dictionary learning algorithm based on channel state information (DDLC). The DDLC system includes two stages. In the offline training stage, a two-layer dictionary learning architecture is constructed to handle the complex conditions of indoor scenes. In the first layer, a separate sub-dictionary is learned for each region of the training data, and an incoherence-promoting term is added to emphasize the discrimination between sparse codes from different regions. The second-layer dictionary learning introduces a support-vector discriminative term for the fingerprint points inside each region, using a max-margin criterion to distinguish different fingerprint points. In the online positioning stage, we first determine the region of the test point based on the reconstruction error, and then use the support-vector discriminator to complete the fingerprint matching. In our experiments, we selected two representative indoor positioning environments and compared DDLC with several existing indoor positioning methods. The results show that DDLC effectively reduces positioning errors; moreover, because the dictionary itself is easy to maintain and update, its strong noise resistance makes it well suited to CSI-based indoor positioning.
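The region-determination step of the online stage, choosing the region whose dictionary best reconstructs the test fingerprint, can be sketched as follows. Plain least-squares coding stands in here for proper sparse coding, and the function name is illustrative:

```python
import numpy as np

def classify_by_reconstruction(x, dictionaries):
    """Return the index of the region whose dictionary reconstructs
    fingerprint x with the smallest residual.

    x: (d,) test fingerprint; dictionaries: list of (d, m) per-region
    dictionaries. A real DDLC system would use sparse codes (e.g. OMP)
    instead of the unconstrained least-squares code used here.
    """
    errors = []
    for D in dictionaries:
        code, *_ = np.linalg.lstsq(D, x, rcond=None)  # code under dictionary D
        errors.append(np.linalg.norm(D @ code - x))   # reconstruction residual
    return int(np.argmin(errors))
```

Once the region is fixed, the second-layer (max-margin) discriminator resolves the individual fingerprint point within it.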


2021 · Vol 12
Author(s): Maham Gardezi, King Hei Fung, Usman Mirza Baig, Mariam Ismail, Oren Kadosh, et al.

Here, we explore the question: What makes a photograph interesting? Answering this question deepens our understanding of human visual cognition and knowledge gained can be leveraged to reliably and widely disseminate information. Observers viewed images belonging to different categories, which covered a wide, representative spectrum of real-world scenes, in a self-paced manner and, at trial’s end, rated each image’s interestingness. Our studies revealed the following: landscapes were the most interesting of all categories tested, followed by scenes with people and cityscapes, followed still by aerial scenes, with indoor scenes of homes and offices being least interesting. Judgments of relative interestingness of pairs of images, setting a fixed viewing duration, or changing viewing history – all of the above manipulations failed to alter the hierarchy of image category interestingness, indicating that interestingness is an intrinsic property of an image unaffected by external manipulation or agent. Contrary to popular belief, low-level accounts based on computational image complexity, color, or viewing time failed to explain image interestingness: more interesting images were not viewed for longer and were not more complex or colorful. On the other hand, a single higher-order variable, namely image uprightness, significantly improved models of average interest. Observers’ eye movements partially predicted overall average interest: a regression model with number of fixations, mean fixation duration, and a custom measure of novel fixations explained >40% of variance. Our research revealed a clear category-based hierarchy of image interestingness, which appears to be a different dimension altogether from memorability or awe and is as yet unexplained by the dual appraisal hypothesis.
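The kind of regression behind the ">40% of variance" figure can be sketched as ordinary least squares plus R². The predictors named in the abstract (number of fixations, mean fixation duration, a novel-fixation measure) are the intended inputs, but the helper below is a generic, hypothetical illustration, not the authors' model:

```python
import numpy as np

def fit_r2(X, y):
    """Fit y ~ intercept + X @ beta by ordinary least squares and report R^2,
    the fraction of variance in y explained by the predictors.

    X: (n, p) predictor matrix; y: (n,) responses.
    """
    X1 = np.column_stack([np.ones(len(X)), X])     # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = (y - y.mean()) @ (y - y.mean())
    return beta, 1.0 - ss_res / ss_tot
```

An R² above 0.4 from such a fit corresponds to the ">40% of variance" claim in the abstract.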


2021 · Vol 18 (5) · pp. 172988142110476
Author(s): Jibo Wang, Chengpeng Li, Bangyu Li, Chenglin Pang, Zheng Fang

High-precision and robust localization is the key issue for long-term autonomous navigation of mobile robots in industrial scenes. In this article, we propose a high-precision and robust localization system based on laser and artificial landmarks. The proposed system is composed of three modules: a scoring-mechanism-based global localization module, a laser- and artificial-landmark-based localization module, and a relocalization trigger module. The global localization module processes the global map to obtain a map pyramid, thus improving global localization speed and accuracy when robots are powered on or kidnapped. The laser- and artificial-landmark-based localization module is employed to achieve robust localization in highly dynamic scenes and high-precision localization in target areas. The relocalization trigger module monitors the current localization quality in real time by matching the current laser scan against the global map, and feeds the result back to the global localization module to improve the robustness of the system. Experimental results show that our method achieves robust robot localization and real-time detection of the current localization quality in indoor scenes and industrial environments. In the target area, the position error is less than 0.004 m and the angle error is less than 0.01 rad.
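A simple proxy for the localization quality that a relocalization trigger module could monitor is the fraction of laser endpoints, projected through the current pose estimate, that land on occupied cells of the grid map. The helper below is an illustrative sketch under assumed names and a hypothetical 0.05 m resolution, not the authors' scoring mechanism:

```python
import numpy as np

def localization_score(grid, endpoints, resolution=0.05, origin=(0.0, 0.0)):
    """Fraction of laser endpoints that fall on occupied map cells.

    grid: 2D boolean occupancy array; endpoints: (N, 2) endpoint positions in
    world coordinates (already transformed by the current pose estimate).
    resolution and origin describe the grid's placement in the world.
    """
    ij = np.floor((endpoints - np.asarray(origin)) / resolution).astype(int)
    valid = ((ij >= 0) & (ij < np.array(grid.shape))).all(axis=1)  # in bounds
    hits = grid[ij[valid, 0], ij[valid, 1]]                        # occupied?
    return hits.sum() / max(len(endpoints), 1)
```

A score dropping below a threshold would indicate a degraded pose and could trigger global relocalization against the map pyramid.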

