indoor scenes
Recently Published Documents

TOTAL DOCUMENTS: 307 (five years: 113)
H-INDEX: 25 (five years: 5)

2021 · Author(s): Tingfeng Ye, Juzhong Zhang, Yingcai Wan, Ze Cui, Hongbo Yang

In this paper, we extend RGB-D SLAM to address the problem that sparse map-building RGB-D SLAM cannot directly generate maps for indoor navigation, and propose a SLAM system for the fast generation of indoor planar maps. The system uses RGB-D images to generate pose information while converting the corresponding depth images into 2D planar laser scans, enabling the reconstruction of 2D grid navigation maps of indoor scenes under limited computational resources. This solves the problem that the sparse point cloud maps generated by RGB-D SLAM cannot be used directly for navigation. Meanwhile, the pose estimates provided by RGB-D SLAM and by scan matching are fused to obtain a more accurate and robust pose, which improves map-building accuracy. Furthermore, we demonstrate the proposed system on the ICL indoor dataset and evaluate the performance of different RGB-D SLAM systems. The method proposed in this paper can be generalized to other RGB-D SLAM algorithms, and map-building accuracy will improve further as RGB-D SLAM algorithms develop.
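The core conversion this abstract describes, turning a row of a depth image into a planar laser scan, can be sketched as below. The function name `depth_row_to_scan` and the pinhole parameters `fx`/`cx` are illustrative assumptions, and a real system would also filter invalid depth readings:

```python
import numpy as np

def depth_row_to_scan(depth_row, fx, cx):
    """Convert one row of a depth image (metres) into 2D laser bearings/ranges.

    depth_row: per-pixel depth along the optical axis.
    fx, cx: horizontal focal length and principal point of the camera
    (hypothetical values; take them from the real camera calibration).
    """
    u = np.arange(len(depth_row))
    angles = np.arctan2(u - cx, fx)      # bearing of each pixel column
    ranges = depth_row / np.cos(angles)  # planar range along that bearing
    return angles, ranges
```

The resulting (angle, range) pairs have the same form as a 2D LiDAR scan, so they can feed a standard occupancy-grid mapper.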


2021 · Vol 12 (9) · pp. 459-469
Author(s): D. D. Rukhovich

In this paper, we propose a novel method of joint 3D object detection and room layout estimation. The proposed method surpasses all existing methods of 3D object detection from monocular images on the indoor SUN RGB-D dataset. Moreover, the proposed method shows competitive results on the ScanNet dataset in multi-view mode. Both datasets were collected in various residential, administrative, educational and industrial spaces, and together they cover almost all possible use cases. Moreover, we are the first to formulate and solve the problem of multi-class 3D object detection from multi-view inputs in indoor scenes. The proposed method can be integrated into the control systems of mobile robots. The results of this study can be used to address navigation tasks, as well as path planning, capturing and manipulating scene objects, and semantic scene mapping.


2021 · Vol 13 (23) · pp. 4755
Author(s): Saishang Zhong, Mingqiang Guo, Ruina Lv, Jianguo Chen, Zhong Xie, et al.

Rigid registration of 3D indoor scenes is a fundamental yet vital task in various fields, including remote sensing (e.g., 3D reconstruction of indoor scenes), photogrammetric measurement, geometry modeling, etc. Nevertheless, state-of-the-art registration approaches still have defects when dealing with low-quality indoor scene point clouds derived from consumer-grade RGB-D sensors. The major challenge is accurately extracting correspondences between a pair of low-quality point clouds when they contain considerable noise, outliers, or weak texture features. To solve this problem, we present a point cloud registration framework that exploits RGB-D information. First, we propose a point normal filter that effectively removes noise while maintaining sharp geometric features and smooth transition regions. Second, we design a correspondence extraction scheme based on a novel descriptor encoding textural and geometric information, which can robustly establish dense correspondences between a pair of low-quality point clouds. Finally, we propose a point-to-plane registration technique with a nonconvex regularizer, which further diminishes the influence of false correspondences and produces an exact rigid transformation between a pair of point clouds. Extensive experiments demonstrate that our registration framework outperforms existing state-of-the-art techniques both visually and numerically, especially when dealing with low-quality indoor scenes.
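The point-to-plane formulation this framework builds on can be illustrated with a single linearised alignment step. This is a generic least-squares sketch under a small-angle assumption, not the paper's nonconvex-regularised solver, and `point_to_plane_step` is an assumed name:

```python
import numpy as np

def point_to_plane_step(src, dst, normals):
    """One linearised point-to-plane alignment step.

    src, dst: (N, 3) corresponding points; normals: (N, 3) unit normals at dst.
    Minimises sum_i ((R p_i + t - q_i) . n_i)^2 with R ~ I + [theta]_x,
    so each residual row is linear in (theta, t):
        (p - q).n + theta.(p x n) + t.n
    Returns the small rotation vector theta and translation t.
    """
    A = np.hstack([np.cross(src, normals), normals])  # (N, 6) Jacobian rows
    b = np.einsum('ij,ij->i', dst - src, normals)     # signed gap along normals
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]
```

In practice this step is iterated inside an ICP loop, re-estimating correspondences each round; the paper replaces the plain squared loss with a nonconvex robust regularizer to suppress false matches.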


2021 · Vol 40 (5) · pp. 1-18
Author(s): Julien Philip, Sébastien Morgenthaler, Michaël Gharbi, George Drettakis

We introduce a neural relighting algorithm for captured indoor scenes that allows interactive free-viewpoint navigation. Our method allows illumination to be changed synthetically while coherently rendering cast shadows and complex glossy materials. We start with multiple images of the scene and a three-dimensional mesh obtained by multi-view stereo (MVS) reconstruction. We assume that lighting is well explained as the sum of a view-independent diffuse component and a view-dependent glossy term concentrated around the mirror reflection direction. We design a convolutional network around input feature maps that facilitate learning of an implicit representation of scene materials and illumination, enabling both relighting and free-viewpoint navigation. We generate these input maps by exploiting the best elements of both image-based and physically based rendering. We sample the input views to estimate diffuse scene irradiance, and compute the new illumination caused by user-specified light sources using path tracing. To facilitate the network's understanding of materials and synthesize plausible glossy reflections, we reproject the views and compute mirror images. We train the network on a synthetic dataset where each scene is also reconstructed with MVS. We show results of our algorithm relighting real indoor scenes and performing free-viewpoint navigation with complex and realistic glossy reflections, which so far remained out of reach for view-synthesis techniques.


Author(s): Martin Weinmann, Sven Wursthorn, Michael Weinmann, Patrick Hübner

Abstract: The Microsoft HoloLens is a head-worn mobile augmented reality device. It allows a real-time 3D mapping of its direct environment and self-localisation within the acquired 3D data. Both aspects are essential for robustly augmenting the local environment around the user with virtual content and for the robust interaction of the user with virtual objects. Although not primarily designed as an indoor mapping device, the Microsoft HoloLens has high potential for efficient and comfortable mapping of both room-scale and building-scale indoor environments. In this paper, we provide a survey on the capabilities of the Microsoft HoloLens (Version 1) for the efficient 3D mapping and modelling of indoor scenes. More specifically, we focus on its capabilities regarding localisation (in terms of pose estimation) within indoor environments and the spatial mapping of indoor environments. While the Microsoft HoloLens certainly cannot compete with laser scanners in providing highly accurate 3D data, we demonstrate that the acquired data provide sufficient accuracy for a subsequent standard rule-based reconstruction of a semantically enriched and topologically correct model of an indoor scene. Furthermore, we provide a discussion of the robustness of standard handcrafted geometric features extracted from data acquired with the Microsoft HoloLens and typically used for subsequent learning-based semantic segmentation.


2021 · Author(s): Zhuohan Jiang, D. Merika W. Sanders, Rosemary Cowell

We collected visual and semantic similarity norms for a set of photographic images comprising 120 recognizable objects/animals and 120 indoor/outdoor scenes. Human observers rated the similarity of pairs of images within four stimulus categories (inanimate objects, animals, indoor scenes and outdoor scenes) via Amazon's Mechanical Turk. We performed multi-dimensional scaling (MDS) on the collected similarity ratings to visualize the perceived similarity within each image category, for both visual and semantic ratings. The MDS solutions revealed the expected similarity relationships between images within each category, along with intuitively sensible differences between visual and semantic similarity relationships for each category. Stress tests performed on the MDS solutions indicated that the MDS analyses captured meaningful levels of variance in the similarity data. These stimuli, associated norms and naming data are made publicly available, and should provide a useful resource for researchers of vision, memory and conceptual knowledge wishing to run experiments using well-parameterized stimulus sets.
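The MDS step can be illustrated with classical (Torgerson) scaling implemented from scratch; the authors' exact MDS variant (e.g. non-metric MDS on ratings) may differ, so treat this as a minimal sketch:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed an n x n dissimilarity matrix D
    into k dimensions so pairwise distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centred Gram matrix
    w, V = np.linalg.eigh(B)              # eigendecomposition (ascending)
    idx = np.argsort(w)[::-1][:k]         # keep the top-k eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))
```

For exactly Euclidean dissimilarities this recovers the original configuration up to rotation and reflection; for rating-based dissimilarities it yields the low-dimensional layout whose fit the stress statistic quantifies.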


Entropy · 2021 · Vol 23 (9) · pp. 1164
Author(s): Wen Liu, Xu Wang, Zhongliang Deng

With the rapid growth of demand for location services in indoor environments, fingerprint-based indoor positioning has attracted widespread attention due to its high precision. This paper proposes a double-layer dictionary learning algorithm based on channel state information (DDLC). The DDLC system includes two stages. In the offline training stage, a two-layer dictionary learning architecture is constructed to handle the complex conditions of indoor scenes. In the first layer, a separate sub-dictionary is learned for each region of the training data, and an incoherence-promoting term is added to emphasize the discrimination between sparse codes from different regions. The second-layer dictionary learning introduces a support-vector discriminative term for the fingerprint points inside each region, using a max-margin criterion to distinguish different fingerprint points. In the online positioning stage, we first determine the region of the test point based on the reconstruction error, and then use the support-vector discriminator to complete the fingerprint matching. In our experiments, we selected two representative indoor positioning environments and compared DDLC with several existing indoor positioning methods. The results show that DDLC effectively reduces positioning errors; moreover, because the dictionary itself is easy to maintain and update, its strong noise resistance makes it well suited to CSI-based indoor positioning.
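The region-determination step of the online stage, choosing the region whose dictionary best reconstructs the test fingerprint, can be sketched as follows. Plain least-squares coding stands in here for proper sparse coding, and the function name is illustrative:

```python
import numpy as np

def classify_by_reconstruction(x, dictionaries):
    """Return the index of the region whose dictionary reconstructs
    fingerprint x with the smallest residual.

    x: (d,) test fingerprint; dictionaries: list of (d, m) per-region
    dictionaries. A real DDLC system would use sparse codes (e.g. OMP)
    instead of the unconstrained least-squares code used here.
    """
    errors = []
    for D in dictionaries:
        code, *_ = np.linalg.lstsq(D, x, rcond=None)  # code under dictionary D
        errors.append(np.linalg.norm(D @ code - x))   # reconstruction residual
    return int(np.argmin(errors))
```

Once the region is fixed, the second-layer (max-margin) discriminator resolves the individual fingerprint point within it.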


2021 · Vol 12
Author(s): Maham Gardezi, King Hei Fung, Usman Mirza Baig, Mariam Ismail, Oren Kadosh, et al.

Here, we explore the question: What makes a photograph interesting? Answering this question deepens our understanding of human visual cognition and knowledge gained can be leveraged to reliably and widely disseminate information. Observers viewed images belonging to different categories, which covered a wide, representative spectrum of real-world scenes, in a self-paced manner and, at trial’s end, rated each image’s interestingness. Our studies revealed the following: landscapes were the most interesting of all categories tested, followed by scenes with people and cityscapes, followed still by aerial scenes, with indoor scenes of homes and offices being least interesting. Judgments of relative interestingness of pairs of images, setting a fixed viewing duration, or changing viewing history – all of the above manipulations failed to alter the hierarchy of image category interestingness, indicating that interestingness is an intrinsic property of an image unaffected by external manipulation or agent. Contrary to popular belief, low-level accounts based on computational image complexity, color, or viewing time failed to explain image interestingness: more interesting images were not viewed for longer and were not more complex or colorful. On the other hand, a single higher-order variable, namely image uprightness, significantly improved models of average interest. Observers’ eye movements partially predicted overall average interest: a regression model with number of fixations, mean fixation duration, and a custom measure of novel fixations explained >40% of variance. Our research revealed a clear category-based hierarchy of image interestingness, which appears to be a different dimension altogether from memorability or awe and is as yet unexplained by the dual appraisal hypothesis.
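The kind of regression behind the ">40% of variance" figure can be sketched as ordinary least squares plus R². The predictors named in the abstract (number of fixations, mean fixation duration, a novel-fixation measure) are the intended inputs, but the helper below is a generic, hypothetical illustration, not the authors' model:

```python
import numpy as np

def fit_r2(X, y):
    """Fit y ~ intercept + X @ beta by ordinary least squares and report R^2,
    the fraction of variance in y explained by the predictors.

    X: (n, p) predictor matrix; y: (n,) responses.
    """
    X1 = np.column_stack([np.ones(len(X)), X])     # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = (y - y.mean()) @ (y - y.mean())
    return beta, 1.0 - ss_res / ss_tot
```

An R² above 0.4 from such a fit corresponds to the ">40% of variance" claim in the abstract.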


2021 · Vol 18 (5) · pp. 172988142110476
Author(s): Jibo Wang, Chengpeng Li, Bangyu Li, Chenglin Pang, Zheng Fang

High-precision and robust localization is the key issue for long-term autonomous navigation of mobile robots in industrial scenes. In this article, we propose a high-precision and robust localization system based on laser and artificial landmarks. The proposed system is composed of three modules: a scoring-mechanism-based global localization module, a laser- and artificial-landmark-based localization module, and a relocalization trigger module. The global localization module processes the global map to obtain a map pyramid, thus improving global localization speed and accuracy when robots are powered on or kidnapped. The laser- and artificial-landmark-based localization module is employed to achieve robust localization in highly dynamic scenes and high-precision localization in target areas. The relocalization trigger module monitors the current localization quality in real time by matching the current laser scan against the global map, and feeds the result back to the global localization module to improve the robustness of the system. Experimental results show that our method achieves robust robot localization and real-time detection of the current localization quality in indoor scenes and industrial environments. In the target area, the position error is less than 0.004 m and the angle error is less than 0.01 rad.
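A simple proxy for the localization quality that a relocalization trigger module could monitor is the fraction of laser endpoints, projected through the current pose estimate, that land on occupied cells of the grid map. The helper below is an illustrative sketch under assumed names and a hypothetical 0.05 m resolution, not the authors' scoring mechanism:

```python
import numpy as np

def localization_score(grid, endpoints, resolution=0.05, origin=(0.0, 0.0)):
    """Fraction of laser endpoints that fall on occupied map cells.

    grid: 2D boolean occupancy array; endpoints: (N, 2) endpoint positions in
    world coordinates (already transformed by the current pose estimate).
    resolution and origin describe the grid's placement in the world.
    """
    ij = np.floor((endpoints - np.asarray(origin)) / resolution).astype(int)
    valid = ((ij >= 0) & (ij < np.array(grid.shape))).all(axis=1)  # in bounds
    hits = grid[ij[valid, 0], ij[valid, 1]]                        # occupied?
    return hits.sum() / max(len(endpoints), 1)
```

A score dropping below a threshold would indicate a degraded pose and could trigger global relocalization against the map pyramid.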

