Indoor Semantic Scene Understanding Using 2D-3D Fusion

Author(s):  
Muraleekrishna Gopinathan, Giang Truong, Jumana Abu-Khalaf
2020, Vol 169, pp. 337-350
Author(s):
Xiaoman Qi, Panpan Zhu, Yuebin Wang, Liqiang Zhang, Junhuan Peng, ...
2019, Vol 9 (18), pp. 3789
Author(s):
Jiyoun Moon, Beom-Hee Lee

Natural-language-based scene understanding can enable heterogeneous robots to cooperate efficiently in large, unstructured environments. However, studies on symbolic planning rarely consider the problem of acquiring semantic knowledge about the surrounding environment, even though recent deep learning methods show outstanding performance for natural-language-based semantic scene understanding. In this paper, a cooperation framework that connects deep learning techniques with a symbolic planner for heterogeneous robots is proposed. The framework consists of three main components: a scene understanding engine, a planning agent, and a knowledge engine. We employ neural networks for natural-language-based scene understanding to share environmental information among robots, and then generate a sequence of actions for each robot using a Planning Domain Definition Language (PDDL) planner. Jena TDB is used for knowledge storage. The proposed method is validated in simulation with one unmanned aerial vehicle and three ground vehicles.
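To make the planning step concrete, a minimal sketch of emitting a PDDL problem for a heterogeneous-robot task is shown below. The domain name, objects, and predicates (`uav1`, `ugv1`, `at`, `surveyed`) are illustrative assumptions, not taken from the paper:

```python
# Hypothetical sketch: emitting a minimal PDDL problem string for a
# heterogeneous-robot survey task. All names (heterogeneous-robots,
# uav1, ugv1, at, surveyed) are illustrative, not from the paper.

def make_pddl_problem(robots, locations, goal_location):
    """Build a PDDL problem asking that goal_location be surveyed.

    All robots are assumed to start at the first listed location.
    """
    objects = " ".join(robots) + " - robot " + " ".join(locations) + " - location"
    init = " ".join(f"(at {r} {locations[0]})" for r in robots)
    return (
        "(define (problem survey)\n"
        "  (:domain heterogeneous-robots)\n"
        f"  (:objects {objects})\n"
        f"  (:init {init})\n"
        f"  (:goal (surveyed {goal_location})))"
    )

problem = make_pddl_problem(["uav1", "ugv1"], ["base", "siteA"], "siteA")
print(problem)
```

A real PDDL planner would take this problem together with a domain file defining the robots' action schemas and return a per-robot action sequence.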


Author(s):
B. Vishnyakov, Y. Blokhinov, I. Sgibnev, V. Sheverdin, A. Sorokin, ...

Abstract. In this paper we describe a new multi-sensor platform for data collection and algorithm testing. We propose several methods for solving the semantic scene understanding problem for land-based autonomous vehicles, and describe our approaches to automatic camera-LiDAR calibration; three-dimensional scene reconstruction and odometry; semantic segmentation for obstacle recognition and underlying-surface classification; object detection; and point cloud segmentation. We also describe our Unreal Engine-based virtual simulation environment, which can be used for both data collection and algorithm testing. We collected a large database of field and virtual data: more than 1,000,000 real images and more than 3,500,000 simulated images, each with corresponding LiDAR data. All proposed methods were implemented and tested on our autonomous platform, and accuracy estimates were obtained on the collected database.
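The camera-LiDAR calibration mentioned above ultimately serves to project 3D LiDAR points into the image plane. A minimal sketch of that projection step, assuming a standard pinhole model with made-up intrinsics `K` and extrinsics `T` (not the platform's actual calibration values):

```python
import numpy as np

# Illustrative sketch of the projection behind camera-LiDAR fusion:
# map 3D LiDAR points into pixel coordinates given a 4x4 extrinsic
# transform T (LiDAR frame -> camera frame) and 3x3 intrinsics K.
# The matrices below are made-up examples, not calibrated values.

def project_lidar_to_image(points, T, K):
    """points: (N, 3) LiDAR points. Returns (M, 2) pixel coordinates
    for the M points that lie in front of the camera (z > 0)."""
    homog = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    cam = (T @ homog.T).T[:, :3]          # transform into camera frame
    cam = cam[cam[:, 2] > 0]              # discard points behind camera
    pix = (K @ cam.T).T                   # apply intrinsics
    return pix[:, :2] / pix[:, 2:3]       # perspective divide

K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
T = np.eye(4)  # identity extrinsics for the sketch
pts = np.array([[0.0, 0.0, 10.0], [1.0, 0.0, 10.0]])
uv = project_lidar_to_image(pts, T, K)
print(uv)  # a point on the optical axis lands at the principal point
```

Once points are projected this way, per-pixel semantic labels from image segmentation can be transferred onto the point cloud, and vice versa.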


Author(s):
Jens Behley, Martin Garbade, Andres Milioto, Jan Quenzel, Sven Behnke, ...
2021, pp. 027836492110067
Holistic semantic scene understanding that exploits all available sensor modalities is a core capability for mastering self-driving in complex everyday traffic. To this end, we present the SemanticKITTI dataset, which provides point-wise semantic annotations for the Velodyne HDL-64E point clouds of the KITTI Odometry Benchmark. Together with the data, we also published three benchmark tasks covering different aspects of semantic scene understanding: (1) semantic segmentation for point-wise classification using single or multiple point clouds as input; (2) semantic scene completion for predictive reasoning about semantics and occluded regions; and (3) panoptic segmentation, which combines point-wise classification with assigning individual instance identities to separate objects of the same class. In this article, we provide details on our dataset, which contains an unprecedented number of fully annotated point cloud sequences, more information on the labeling process we used to efficiently annotate such a vast amount of point clouds, and lessons learned in the process. The dataset and resources are available at http://www.semantic-kitti.org.
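For readers working with the dataset, the panoptic labels can be decoded as follows. Per the SemanticKITTI documentation, each point's label is a single `uint32` whose lower 16 bits hold the semantic class and whose upper 16 bits hold the instance id; the sketch below assumes that format:

```python
import numpy as np

# Sketch of decoding SemanticKITTI per-point labels: each point has a
# uint32 label whose lower 16 bits are the semantic class and whose
# upper 16 bits are the instance id (per the dataset documentation).

def decode_labels(raw):
    """raw: (N,) uint32 array -> (semantic, instance) uint16 arrays."""
    semantic = (raw & 0xFFFF).astype(np.uint16)
    instance = (raw >> 16).astype(np.uint16)
    return semantic, instance

# Synthetic example: semantic class 10 with instance id 3.
raw = np.array([(3 << 16) | 10], dtype=np.uint32)
sem, inst = decode_labels(raw)
print(sem[0], inst[0])  # 10 3
```

In practice `raw` would come from `np.fromfile(path, dtype=np.uint32)` on one of the dataset's `.label` files, one entry per point of the corresponding scan.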


2012, Vol 12 (02), pp. 1250012
Author(s):
N. Fan, Cheng Jin

Semantic scene understanding is one of the significant goals of robotics. In this paper, we propose a framework that constructs geometric invariants for simultaneous object detection and segmentation using a simple pairwise interactive context term, as a preliminary milestone toward semantic scene understanding. The context is incorporated as pairwise interactions between pixels, imposing a prior on the labeling. Our model formulates multi-class image segmentation as an energy minimization problem and finds a globally optimal solution using belief propagation and a neural network. We experimentally evaluate the proposed method on three publicly available datasets: MSRC-1, CorelB, and the PASCAL VOC database. The results show the applicability and efficacy of the proposed method for the multi-class segmentation problem.
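The energy minimization the abstract refers to typically takes the standard pairwise form over a pixel graph (this is the generic formulation and notation, not necessarily the paper's exact objective):

```latex
E(\mathbf{x}) \;=\; \sum_{i \in \mathcal{V}} \psi_i(x_i)
\;+\; \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(x_i, x_j)
```

Here $\mathcal{V}$ is the set of pixels, $\mathcal{E}$ the set of neighboring pixel pairs, $\psi_i$ the per-pixel data term, and $\psi_{ij}$ the pairwise interactive context term imposing a prior on the labeling $\mathbf{x}$; inference searches for the labeling that minimizes $E$.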


Author(s):
C. Lawrence Zitnick, Ramakrishna Vedantam, Devi Parikh
