semantic scene
Recently Published Documents

TOTAL DOCUMENTS: 154 (five years: 67)
H-INDEX: 17 (five years: 4)

Author(s):  
Daniel Schoepflin ◽  
Karthik Iyer ◽  
Martin Gomse ◽  
Thorsten Schüppstuhl

Abstract Obtaining annotated data for proper training of AI image classifiers remains a challenge for successful deployment in industrial settings. As a promising alternative to handcrafted annotations, synthetic training data generation has grown in popularity. However, in most cases the pipelines used to generate this data are not universal and have to be redesigned for each new application domain, which requires a detailed formulation of the domain through a semantic scene grammar. We present such a grammar, based on domain knowledge, for the production-supplying transport of components in intralogistics settings. From a use-case analysis of production-supplying logistics we derive a scene grammar that can be used to formulate similar problem statements in this domain for the purpose of data generation. We demonstrate the use of this grammar to feed a scene generation pipeline and obtain training data for an AI-based image classifier.
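To make the idea of a scene grammar concrete, the sketch below encodes a handful of production rules as a Python dictionary and samples symbolic scene descriptions from them. The symbols, rules, and asset labels are invented for this illustration and are not the grammar from the paper; a real pipeline would hand each sampled description to a renderer that produces an annotated synthetic image.

```python
import random

# Illustrative semantic scene grammar for intralogistics data generation.
# All symbols and asset labels here are assumptions for demonstration.
GRAMMAR = {
    # non-terminal -> list of alternative right-hand sides
    "Scene":         [["TransportUnit", "Background"]],
    "TransportUnit": [["Carrier", "Load"]],
    "Carrier":       [["pallet"], ["small_load_carrier"], ["trolley"]],
    "Load":          [["Component"], ["Component", "Load"]],  # 1..n parts
    "Component":     [["gearbox"], ["housing"], ["shaft"]],
    "Background":    [["warehouse"], ["assembly_line"]],
}

def expand(symbol, rng):
    """Recursively expand a symbol into a flat list of terminal assets."""
    if symbol not in GRAMMAR:           # terminal: a concrete asset label
        return [symbol]
    rhs = rng.choice(GRAMMAR[symbol])   # sample one production rule
    out = []
    for s in rhs:
        out.extend(expand(s, rng))
    return out

rng = random.Random(42)
for _ in range(3):
    # each sample is a symbolic scene description that a rendering
    # pipeline could turn into an annotated synthetic training image
    print(expand("Scene", rng))
```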


2021 ◽  
Vol 12 (9) ◽  
pp. 459-469
Author(s):  
D. D. Rukhovich

In this paper, we propose a novel method for joint 3D object detection and room layout estimation. The proposed method surpasses all existing methods of 3D object detection from monocular images on the indoor SUN RGB-D dataset, and shows competitive results on the ScanNet dataset in multi-view mode. Both datasets were collected in residential, administrative, educational, and industrial spaces, and together they cover a wide range of indoor use cases. Moreover, we are the first to formulate and solve the problem of multi-class 3D object detection from multi-view inputs in indoor scenes. The proposed method can be integrated into the control systems of mobile robots, and the results of this study can be applied to navigation, path planning, grasping and manipulating scene objects, and semantic scene mapping.
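A minimal sketch of what a joint detection-plus-layout head could look like is given below. The module structure, tensor shapes, and box parameterisation are assumptions for illustration only; they do not reproduce the paper's architecture.

```python
import torch
import torch.nn as nn

class JointHead(nn.Module):
    """Hypothetical head predicting per-voxel 3D boxes and a room layout."""
    def __init__(self, feat_dim: int = 256, num_classes: int = 10):
        super().__init__()
        # per-voxel detection branch: class scores + 7-DoF box
        # (x, y, z, w, h, l, yaw) -- an assumed parameterisation
        self.det = nn.Conv3d(feat_dim, num_classes + 7, kernel_size=1)
        # global layout branch: one cuboid (center, size, yaw)
        self.layout = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(feat_dim, 7)
        )

    def forward(self, voxel_feats: torch.Tensor):
        return self.det(voxel_feats), self.layout(voxel_feats)

feats = torch.randn(1, 256, 40, 40, 16)  # fused voxel features (assumed)
boxes, layout = JointHead()(feats)
print(boxes.shape, layout.shape)  # (1, 17, 40, 40, 16) and (1, 7)
```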


2021 ◽  
Author(s):  
Muraleekrishna Gopinathan ◽  
Giang Truong ◽  
Jumana Abu-Khalaf

2021 ◽  
Author(s):  
Hansung Kim ◽  
Luca Remaggi ◽  
Aloisio Dourado ◽  
Teofilo de Campos ◽  
Philip J. B. Jackson ◽  
...  

Abstract As personalised immersive display systems have been intensely explored in virtual reality (VR), plausible 3D audio corresponding to the visual content is required to provide more realistic experiences to users. It is well known that spatial audio synchronised with visual information improves the sense of immersion, but limited research progress has been achieved in immersive audio-visual content production and reproduction. In this paper, we propose an end-to-end pipeline that simultaneously reconstructs the 3D geometry and the acoustic properties of an environment from a pair of omnidirectional panoramic images. A semantic scene reconstruction and completion method using a deep convolutional neural network is proposed to estimate the complete semantic scene geometry, in order to adapt spatial audio reproduction to the scene. Experiments provide objective and subjective evaluations of the proposed pipeline for plausible audio-visual VR reproduction of real scenes.
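As a rough illustration of how completed semantic geometry can drive audio reproduction, the sketch below uses Sabine's classical equation to turn semantically labelled surfaces into a reverberation time. The material table, absorption coefficients, and room dimensions are placeholder values under stated assumptions, not the paper's model or data.

```python
# Sabine's equation: RT60 = 0.161 * V / A, where A = sum(alpha_i * S_i).
# Absorption coefficients below are typical textbook figures used only
# as placeholders for semantic classes a scene-completion net might emit.
ABSORPTION = {
    "plaster_wall": 0.05,
    "carpet_floor": 0.30,
    "glass_window": 0.15,
    "fabric_sofa":  0.45,
}

def sabine_rt60(volume_m3: float, surfaces: dict) -> float:
    """Reverberation time from room volume and labelled surface areas."""
    absorption_area = sum(
        ABSORPTION[mat] * area for mat, area in surfaces.items()
    )
    return 0.161 * volume_m3 / absorption_area

# a hypothetical 5 m x 4 m x 2.5 m room recovered from the panoramas
surfaces = {
    "plaster_wall": 45.0,   # walls + ceiling, m^2
    "carpet_floor": 20.0,
    "glass_window": 4.0,
    "fabric_sofa":  3.0,
}
print(f"estimated RT60: {sabine_rt60(50.0, surfaces):.2f} s")  # ~0.79 s
```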


2021 ◽  
Vol 12 (7) ◽  
pp. 373-384
Author(s):  
D. D. Rukhovich

In this article, we introduce the task of multi-view RGB-based 3D object detection as an end-to-end optimization problem. In the multi-view formulation of 3D object detection, several images of a static scene are used to detect the objects in it. To address this problem, we propose a novel 3D object detection method named ImVoxelNet, based on a fully convolutional neural network. Unlike existing 3D object detection methods, ImVoxelNet works directly with 3D representations and does not mediate 3D object detection through 2D object detection. The proposed method accepts multi-view inputs, and the number of monocular images in each input can vary during training and inference; in fact, it may be unique for each input. Moreover, we propose to treat a single RGB image as a special case of a multi-view input, so the proposed method also accepts monocular inputs without modification. Through extensive evaluation, we demonstrate that the proposed method successfully handles a variety of outdoor scenes. Specifically, it achieves state-of-the-art results in car detection on the KITTI (monocular) and nuScenes (multi-view) benchmarks among all methods that accept RGB images. The proposed method operates in real time, which makes it possible to integrate it into the navigation systems of autonomous devices. The results of this study can be used to address tasks of navigation, path planning, and semantic scene mapping.
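The core multi-view idea, lifting 2D features from a variable number of views into a shared voxel grid so that a single image is simply the N = 1 case, can be sketched as below. The shapes and the precomputed voxel-to-pixel projection are assumptions for illustration, not ImVoxelNet's actual implementation.

```python
import torch

def fuse_views(view_feats: torch.Tensor, vox2pix: torch.Tensor) -> torch.Tensor:
    """
    view_feats: (N, C, H, W)  2D features from N views (N may vary)
    vox2pix:    (N, V, 2)     per view, pixel coords of V voxel centres
                              (assumed precomputed from camera poses)
    returns:    (C, V)        per-voxel features averaged over views
    """
    n, c, h, w = view_feats.shape
    v = vox2pix.shape[1]
    fused = view_feats.new_zeros(c, v)
    for i in range(n):                       # N is small, a loop is fine
        x = vox2pix[i, :, 0].clamp(0, w - 1)
        y = vox2pix[i, :, 1].clamp(0, h - 1)
        fused += view_feats[i, :, y, x]      # gather a (C, V) sample
    return fused / n                         # monocular input: N == 1

feats = torch.randn(3, 64, 60, 80)           # three views
vox2pix = torch.randint(0, 60, (3, 1000, 2)) # fake projections
print(fuse_views(feats, vox2pix).shape)      # torch.Size([64, 1000])
```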


2021 ◽  
Author(s):  
Hao Zou ◽  
Xuemeng Yang ◽  
Tianxin Huang ◽  
Chujuan Zhang ◽  
Yong Liu ◽  
...  

2021 ◽  
Author(s):  
Lixiang Chen ◽  
Radoslaw Martin Cichy ◽  
Daniel Kaiser

Abstract During natural vision, objects rarely appear in isolation but often within a semantically related scene context. Previous studies have reported that semantic consistency between objects and scenes facilitates object perception, and that scene-object consistency is reflected in changes in the N300 and N400 components of EEG recordings. Here, we investigate whether these N300/N400 differences are indicative of changes in the cortical representation of objects. In two experiments, we recorded EEG signals while participants viewed semantically consistent or inconsistent objects within a scene; in Experiment 1 these objects were task-irrelevant, while in Experiment 2 they were directly relevant for behavior. In both experiments, we found reliable and comparable N300/N400 differences between consistent and inconsistent scene-object combinations. To probe the quality of object representations, we performed multivariate classification analyses in which we decoded the category of the objects contained in the scene. In Experiment 1, in which the objects were not task-relevant, object category could be decoded from around 100 ms after object presentation, but no difference in decoding performance was found between consistent and inconsistent objects. By contrast, when the objects were task-relevant in Experiment 2, we found enhanced decoding of semantically consistent objects compared to semantically inconsistent ones. These results show that the N300/N400 differences related to scene-object consistency do not index changes in cortical object representations, but rather reflect a generic marker of semantic violations. Further, our findings suggest that facilitatory effects between objects and scenes are task-dependent rather than automatic.
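Time-resolved multivariate decoding of this kind typically trains a classifier on the EEG channel pattern at every time point. The sketch below shows the generic recipe with scikit-learn on random data; the shapes, classifier, and cross-validation scheme are illustrative assumptions, not the paper's exact analysis pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_channels, n_times = 200, 64, 120  # assumed epoch dimensions
X = rng.standard_normal((n_trials, n_channels, n_times))
y = rng.integers(0, 2, n_trials)              # two object categories

accuracy = np.empty(n_times)
for t in range(n_times):
    # decode object category from the spatial pattern at time point t
    clf = LogisticRegression(max_iter=1000)
    accuracy[t] = cross_val_score(clf, X[:, :, t], y, cv=5).mean()

# with real data, accuracy should rise above chance (~0.5) from
# roughly 100 ms after object onset, as the abstract reports;
# on this random data it stays at chance
print(accuracy.mean())
```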


Author(s):  
Jie Li ◽  
Laiyan Ding ◽  
Rui Huang

3D semantic scene completion and 2D semantic segmentation are two tightly correlated tasks, both essential for indoor scene understanding: they predict the same semantic classes using positively correlated high-level features. Current methods use 2D features extracted from early-fused RGB-D images for 2D segmentation to improve 3D scene completion. We argue that this sequential scheme does not ensure that the two tasks fully benefit from each other, and present an Iterative Mutual Enhancement Network (IMENet) that solves them jointly by interactively refining the two tasks at the late prediction stage. Specifically, two refinement modules are developed under a unified framework. The first is a 2D Deformable Context Pyramid (DCP) module, which receives the projection of the current 3D predictions to refine the 2D predictions. In turn, a 3D Deformable Depth Attention (DDA) module leverages the reprojected results of the 2D predictions to update the coarse 3D predictions. This iterative fusion operates on the stable high-level features of both tasks at a late stage. Extensive experiments on the NYU and NYUCAD datasets verify the effectiveness of the proposed iterative late-fusion scheme, and our approach outperforms the state of the art on both 3D semantic scene completion and 2D semantic segmentation.
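A toy sketch of the iterative late-fusion pattern is given below: a 2D branch and a 3D branch repeatedly refine each other's predictions over a few rounds. The modules and the random "projection" tensors stand in for the paper's DCP/DDA modules and geometric projection operators; everything here is a placeholder under stated assumptions.

```python
import torch
import torch.nn as nn

class Refine2D(nn.Module):
    """Stand-in for the DCP module: fuse projected 3D evidence into 2D."""
    def __init__(self, c: int):
        super().__init__()
        self.conv = nn.Conv2d(2 * c, c, 3, padding=1)

    def forward(self, pred2d, proj3d):
        return pred2d + self.conv(torch.cat([pred2d, proj3d], dim=1))

class Refine3D(nn.Module):
    """Stand-in for the DDA module: fuse reprojected 2D evidence into 3D."""
    def __init__(self, c: int):
        super().__init__()
        self.conv = nn.Conv3d(2 * c, c, 3, padding=1)

    def forward(self, pred3d, back2d):
        return pred3d + self.conv(torch.cat([pred3d, back2d], dim=1))

c = 12                                  # number of semantic classes (assumed)
r2d, r3d = Refine2D(c), Refine3D(c)
pred2d = torch.randn(1, c, 60, 80)      # coarse 2D segmentation logits
pred3d = torch.randn(1, c, 30, 30, 12)  # coarse 3D completion logits

for _ in range(3):                      # a few mutual-refinement rounds
    proj3d = torch.randn_like(pred2d)   # placeholder: project 3D -> 2D
    pred2d = r2d(pred2d, proj3d)
    back2d = torch.randn_like(pred3d)   # placeholder: reproject 2D -> 3D
    pred3d = r3d(pred3d, back2d)

print(pred2d.shape, pred3d.shape)
```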

