GEOMETRIC INVARIANTS CONSTRUCTION FOR SEMANTIC SCENE UNDERSTANDING FROM MULTIPLE VIEWS INSPIRED BY THE HUMAN VISUAL SYSTEM

2012 ◽  
Vol 12 (02) ◽  
pp. 1250012 ◽  
Author(s):  
N. FAN ◽  
CHENG JIN

Semantic scene understanding is one of the significant goals of robotics. In this paper, we propose a framework that constructs geometric invariants for simultaneous object detection and segmentation using a simple pairwise interactive context term, as a preliminary milestone toward semantic scene understanding. The context is incorporated as pairwise interactions between pixels, imposing a prior on the labeling. Our model formulates the multi-class image segmentation task as an energy minimization problem and finds a globally optimal solution using belief propagation and a neural network. We experimentally evaluate the proposed method on three publicly available datasets: MSRC-1, CorelB, and PASCAL VOC. The results show the applicability and efficacy of the proposed method for the multi-class segmentation problem.
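
To make the energy-minimization idea concrete, the following is a minimal sketch (an illustration, not the authors' implementation): exact min-sum belief propagation on a one-dimensional chain of pixels with unary costs and a Potts pairwise context term. On a chain, which is a tree, the result is the globally optimal labeling; the function names and toy costs are hypothetical, and the paper applies the same principle to full images.

    import numpy as np

    def potts(l1, l2, beta=1.0):
        # Pairwise context prior: penalize neighbouring pixels that disagree.
        return 0.0 if l1 == l2 else beta

    def min_sum_chain(unary, beta=1.0):
        """unary: (n_pixels, n_labels) array of unary costs."""
        n, k = unary.shape
        # msg[i, l]: minimal cost of pixels 0..i-1 given pixel i takes label l
        msg = np.zeros((n, k))
        back = np.zeros((n, k), dtype=int)
        for i in range(1, n):
            for l in range(k):
                cand = [unary[i - 1, lp] + msg[i - 1, lp] + potts(lp, l, beta)
                        for lp in range(k)]
                back[i, l] = int(np.argmin(cand))
                msg[i, l] = min(cand)
        labels = np.empty(n, dtype=int)
        labels[-1] = int(np.argmin(unary[-1] + msg[-1]))
        for i in range(n - 1, 0, -1):     # backtrack the optimal labeling
            labels[i - 1] = back[i, labels[i]]
        return labels

    unary = np.array([[0.1, 2.0], [1.5, 0.2], [0.3, 0.4]])  # toy costs, 2 labels
    print(min_sum_chain(unary, beta=0.5))                   # -> [0 1 1]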

2020 ◽  
Vol 34 (05) ◽  
pp. 7333-7340
Author(s):  
Roie Zivan ◽  
Omer Lev ◽  
Rotem Galiki

Belief propagation, an algorithm for solving problems represented by graphical models, has long been known to converge to the optimal solution when the graph is a tree. When the graph representing the problem includes a single cycle, the algorithm either converges to the optimal solution or performs periodic oscillations. While the conditions that trigger these two behaviors have been established, the question of the algorithm's convergence or divergence on graphs that include more than one cycle is still open. Focusing on Max-sum, the version of belief propagation for solving distributed constraint optimization problems (DCOPs), we extend the theory on the behavior of belief propagation in general, and of Max-sum specifically, when solving problems represented by graphs with multiple cycles. This includes: (1) generalizing the results obtained for graphs with a single cycle to graphs with multiple cycles, using backtrack cost trees (BCTs); and (2) proving that when the algorithm is applied to adjacent symmetric cycles, a large enough damping factor guarantees convergence to the optimal solution.
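
The damping mentioned above is a simple convex blend of successive messages. The sketch below (an assumption, not the authors' code; the toy message sequence is hypothetical) shows the damped update and how it suppresses the kind of periodic oscillation that cycles can induce.

    import numpy as np

    def damped_update(prev_msg, new_msg, lam=0.9):
        # With damping factor lam in [0, 1), the sent message is a convex
        # combination of the previous message and the freshly computed one.
        return lam * np.asarray(prev_msg) + (1.0 - lam) * np.asarray(new_msg)

    # Toy illustration: a raw message that flips between two extremes settles
    # near their average once damping is applied.
    msg = np.zeros(2)
    for t in range(50):
        raw = np.array([1.0, 0.0]) if t % 2 == 0 else np.array([0.0, 1.0])
        msg = damped_update(msg, raw, lam=0.9)
    print(msg)   # stays close to [0.5, 0.5] rather than swinging between extremes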


2018 ◽  
Author(s):  
Alexey A. Shvets ◽  
Alexander Rakhlin ◽  
Alexandr A. Kalinin ◽  
Vladimir I. Iglovikov

Semantic segmentation of robotic instruments is an important problem for robot-assisted surgery. One of the main challenges is to correctly detect an instrument's position for tracking and pose estimation in the vicinity of surgical scenes. Accurate pixel-wise instrument segmentation is needed to address this challenge. In this paper we describe our deep-learning-based approach to robotic instrument segmentation. Our approach improves on the state-of-the-art results using several novel deep neural network architectures. It addresses the binary segmentation problem, in which every pixel of the surgical video feed is labeled as instrument or background. In addition, we solve a multi-class segmentation problem, in which we distinguish different instruments, or different parts of an instrument, from the background. In this setting, our approach outperforms other methods for automatic instrument segmentation, thereby providing state-of-the-art results for these problems. The source code for our solution is publicly available.
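
As a rough illustration of the difference between the two settings (an assumption, not the authors' architectures; the feature tensor and channel counts are arbitrary), a binary head predicts one logit per pixel, while a multi-class head predicts one logit per class per pixel:

    import torch
    import torch.nn as nn

    features = torch.randn(1, 64, 128, 160)        # decoder output (toy tensor)

    binary_head = nn.Conv2d(64, 1, kernel_size=1)  # instrument vs. background
    multi_head = nn.Conv2d(64, 4, kernel_size=1)   # e.g. background + 3 instrument parts

    binary_logits = binary_head(features)          # train with BCEWithLogitsLoss
    multi_logits = multi_head(features)            # train with CrossEntropyLoss

    binary_mask = torch.sigmoid(binary_logits) > 0.5   # per-pixel instrument mask
    multi_mask = multi_logits.argmax(dim=1)            # per-pixel class labels
    print(binary_mask.shape, multi_mask.shape)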


2020 ◽  
Vol 169 ◽  
pp. 337-350
Author(s):  
Xiaoman Qi ◽  
Panpan Zhu ◽  
Yuebin Wang ◽  
Liqiang Zhang ◽  
Junhuan Peng ◽  
...  

2019 ◽  
Vol 9 (18) ◽  
pp. 3789 ◽  
Author(s):  
Jiyoun Moon ◽  
Beom-Hee Lee

Natural-language-based scene understanding can enable heterogeneous robots to cooperate efficiently in large, unstructured environments. However, studies on symbolic planning rarely consider the problem of acquiring semantic knowledge about the surrounding environment, while recent deep learning methods show outstanding performance for natural-language-based semantic scene understanding. In this paper, we propose a cooperation framework that connects deep learning techniques with a symbolic planner for heterogeneous robots. The framework is composed of a scene understanding engine, a planning agent, and a knowledge engine. We employ neural networks for natural-language-based scene understanding so that environmental information can be shared among robots, and we generate a sequence of actions for each robot using a Planning Domain Definition Language (PDDL) planner. JENA-TDB is used for knowledge storage. The proposed method is validated in simulation with one unmanned aerial vehicle and three ground vehicles.
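
One way such a pipeline can hand scene knowledge to a symbolic planner is by emitting a PDDL problem file. The sketch below is purely illustrative (the domain, object types, and predicates are hypothetical, not the authors' domain definition) and shows how detected scene facts could be serialized for a planner.

    def to_pddl_problem(objects, facts, goal):
        # Serialize scene facts from the scene understanding engine into a
        # PDDL problem string that a planner can consume.
        objs = " ".join(f"{name} - {typ}" for name, typ in objects)
        init = " ".join(f"({f})" for f in facts)
        return (f"(define (problem scene-task) (:domain cooperation)\n"
                f"  (:objects {objs})\n"
                f"  (:init {init})\n"
                f"  (:goal (and ({goal}))))")

    objects = [("uav1", "aerial-robot"), ("ugv1", "ground-robot"), ("box1", "object")]
    facts = ["at uav1 zone-a", "at ugv1 base", "detected box1 zone-a"]
    print(to_pddl_problem(objects, facts, "delivered box1 base"))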


Author(s):  
CHENG JIN

Geometric invariants have wide applications in computer vision, and their precision has long been an active research topic. In most existing methods, three-dimensional (3D) invariants are obtained by reconstructing the object structure, which requires fundamental matrices between image pairs to be established first. Consequently, additional errors are introduced during invariant construction, and the process can be very time consuming. In this paper, we propose a novel algorithm for calculating 3D projective invariants from multiple images without explicitly reconstructing the object structure. We employ the geometric configuration of points and lines in general position to derive the formulation of the 3D invariants. Our experiments verify that the proposed method is highly accurate when compared with the ground truth and more efficient than reconstruction-based methods.
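
For context, the sketch below (an assumption, not the paper's specific formulation) shows a classical 3D projective invariant of six points in general position, written as a ratio of 4x4 determinants of homogeneous coordinates; the unknown scale factors and the determinant of any projective transformation cancel in the ratio.

    import numpy as np

    def det4(P, idx):
        # Determinant of the 4x4 matrix built from four homogeneous points.
        return np.linalg.det(P[list(idx)])

    def projective_invariant(points3d):
        """points3d: (6, 3) array of 3D points in general position."""
        P = np.hstack([points3d, np.ones((6, 1))])       # homogeneous coordinates
        return (det4(P, (0, 1, 2, 4)) * det4(P, (0, 1, 3, 5))) / \
               (det4(P, (0, 1, 2, 5)) * det4(P, (0, 1, 3, 4)))

    pts = np.random.rand(6, 3)
    H = np.random.rand(4, 4) + np.eye(4)                 # random 3D projectivity
    mapped = (H @ np.hstack([pts, np.ones((6, 1))]).T).T
    mapped = mapped[:, :3] / mapped[:, 3:4]
    print(projective_invariant(pts), projective_invariant(mapped))  # (nearly) equal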


Author(s):  
B. Vishnyakov ◽  
Y. Blokhinov ◽  
I. Sgibnev ◽  
V. Sheverdin ◽  
A. Sorokin ◽  
...  

In this paper we describe a new multi-sensor platform for data collection and algorithm testing. We propose several methods for solving the semantic scene understanding problem for autonomous land vehicles. We describe our approaches to automatic camera and LiDAR calibration; three-dimensional scene reconstruction and odometry computation; semantic segmentation, which provides obstacle recognition and underlying-surface classification; object detection; and point cloud segmentation. We also describe our virtual simulation complex based on Unreal Engine, which can be used for both data collection and algorithm testing. We collected a large database of field and virtual data: more than 1,000,000 real images with corresponding LiDAR data and more than 3,500,000 simulated images with corresponding LiDAR data. All proposed methods were implemented and tested on our autonomous platform, and accuracy estimates were obtained on the collected database.
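
A common building block behind such camera-LiDAR pipelines, sketched below under assumed toy intrinsics and extrinsics (not the authors' calibration values), is projecting LiDAR points into the image so that per-pixel semantic labels can be transferred to the point cloud.

    import numpy as np

    def project_lidar_to_image(points, K, R, t):
        """points: (N, 3) LiDAR points; returns (N, 2) pixel coords and a validity mask."""
        cam = points @ R.T + t            # LiDAR frame -> camera frame
        in_front = cam[:, 2] > 0.1        # keep only points in front of the camera
        uvw = cam @ K.T                   # pinhole projection
        uv = uvw[:, :2] / uvw[:, 2:3]
        return uv, in_front

    K = np.array([[720.0, 0.0, 640.0], [0.0, 720.0, 360.0], [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.array([0.0, -0.1, -0.2])          # toy extrinsics
    pts = np.random.uniform([-5, -2, 1], [5, 2, 30], size=(1000, 3))
    uv, valid = project_lidar_to_image(pts, K, R, t)
    print(uv[valid].shape)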


2021 ◽  
Author(s):  
Muraleekrishna Gopinathan ◽  
Giang Truong ◽  
Jumana Abu-Khalaf

2021 ◽  
pp. 027836492110067
Author(s):  
Jens Behley ◽  
Martin Garbade ◽  
Andres Milioto ◽  
Jan Quenzel ◽  
Sven Behnke ◽  
...  

A holistic semantic scene understanding that exploits all available sensor modalities is a core capability for mastering self-driving in complex everyday traffic. To this end, we present the SemanticKITTI dataset, which provides point-wise semantic annotations of the Velodyne HDL-64E point clouds of the KITTI Odometry Benchmark. Together with the data, we also publish three benchmark tasks covering different aspects of semantic scene understanding: (1) semantic segmentation for point-wise classification using single or multiple point clouds as input; (2) semantic scene completion for predictive reasoning on semantics in occluded regions; and (3) panoptic segmentation, which combines point-wise classification with assigning individual instance identities to separate objects of the same class. In this article, we provide details on our dataset, which contains an unprecedented number of fully annotated point cloud sequences, more information on our labeling process for efficiently annotating such a vast amount of point clouds, and lessons learned in this process. The dataset and resources are available at http://www.semantic-kitti.org.
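
As a brief reading sketch (paths and the helper name are illustrative; the binary layout follows the format documented at semantic-kitti.org): each KITTI Odometry scan stores float32 (x, y, z, remission) quadruples, and each SemanticKITTI label entry packs the instance id into the upper 16 bits and the semantic class id into the lower 16 bits of a uint32.

    import numpy as np

    def read_scan_and_labels(bin_path, label_path):
        scan = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
        labels = np.fromfile(label_path, dtype=np.uint32)
        semantic = labels & 0xFFFF       # lower 16 bits: semantic class id
        instance = labels >> 16          # upper 16 bits: instance id
        return scan, semantic, instance

    # The unpacking itself can be checked on a synthetic label word:
    packed = np.uint32((7 << 16) | 10)    # instance 7 of class 10
    print(packed & 0xFFFF, packed >> 16)  # -> 10 7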

