GEOMETRIC INVARIANTS CONSTRUCTION FOR SEMANTIC SCENE UNDERSTANDING FROM MULTIPLE VIEWS INSPIRED BY THE HUMAN VISUAL SYSTEM

2012 ◽  
Vol 12 (02) ◽  
pp. 1250012 ◽  
Author(s):  
N. FAN ◽  
CHENG JIN

Semantic scene understanding is one of the significant goals of robotics. In this paper, we propose a framework that constructs geometric invariants for simultaneous object detection and segmentation using a simple pairwise interactive context term, as a preliminary milestone toward semantic scene understanding. The context is incorporated as pairwise interactions between pixels, imposing a prior on the labeling. Our model formulates the multi-class image segmentation task as an energy minimization problem and finds a globally optimal solution using belief propagation and a neural network. We experimentally evaluate the proposed method on three publicly available datasets: MSRC-1, CorelB, and PASCAL VOC. The results show the applicability and efficacy of the proposed method for the multi-class segmentation problem.
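
To make the energy-minimization idea concrete, the following is a minimal sketch (an illustration, not the authors' implementation): exact min-sum belief propagation on a one-dimensional chain of pixels with unary costs and a Potts pairwise context term. On a chain, which is a tree, the result is the globally optimal labeling; the function names and toy costs are hypothetical, and the paper applies the same principle to full images.

    import numpy as np

    def potts(l1, l2, beta=1.0):
        # Pairwise context prior: penalize neighbouring pixels that disagree.
        return 0.0 if l1 == l2 else beta

    def min_sum_chain(unary, beta=1.0):
        """unary: (n_pixels, n_labels) array of unary costs."""
        n, k = unary.shape
        # msg[i, l]: minimal cost of pixels 0..i-1 given pixel i takes label l
        msg = np.zeros((n, k))
        back = np.zeros((n, k), dtype=int)
        for i in range(1, n):
            for l in range(k):
                cand = [unary[i - 1, lp] + msg[i - 1, lp] + potts(lp, l, beta)
                        for lp in range(k)]
                back[i, l] = int(np.argmin(cand))
                msg[i, l] = min(cand)
        labels = np.empty(n, dtype=int)
        labels[-1] = int(np.argmin(unary[-1] + msg[-1]))
        for i in range(n - 1, 0, -1):     # backtrack the optimal labeling
            labels[i - 1] = back[i, labels[i]]
        return labels

    unary = np.array([[0.1, 2.0], [1.5, 0.2], [0.3, 0.4]])  # toy costs, 2 labels
    print(min_sum_chain(unary, beta=0.5))                   # -> [0 1 1]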

2020 ◽  
Vol 34 (05) ◽  
pp. 7333-7340
Author(s):  
Roie Zivan ◽  
Omer Lev ◽  
Rotem Galiki

Belief propagation, an algorithm for solving problems represented by graphical models, has long been known to converge to the optimal solution when the graph is a tree. When the graph representing the problem includes a single cycle, the algorithm either converges to the optimal solution or performs periodic oscillations. While the conditions that trigger these two behaviors have been established, the question of the algorithm's convergence or divergence on graphs that include more than one cycle is still open. Focusing on Max-sum, the version of belief propagation for solving distributed constraint optimization problems (DCOPs), we extend the theory on the behavior of belief propagation in general, and of Max-sum specifically, when solving problems represented by graphs with multiple cycles. This includes: (1) generalizing the results obtained for graphs with a single cycle to graphs with multiple cycles, using backtrack cost trees (BCTs); and (2) proving that when the algorithm is applied to adjacent symmetric cycles, a large enough damping factor guarantees convergence to the optimal solution.
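
The damping mentioned above is a simple convex blend of successive messages. The sketch below (an assumption, not the authors' code; the toy message sequence is hypothetical) shows the damped update and how it suppresses the kind of periodic oscillation that cycles can induce.

    import numpy as np

    def damped_update(prev_msg, new_msg, lam=0.9):
        # With damping factor lam in [0, 1), the sent message is a convex
        # combination of the previous message and the freshly computed one.
        return lam * np.asarray(prev_msg) + (1.0 - lam) * np.asarray(new_msg)

    # Toy illustration: a raw message that flips between two extremes settles
    # near their average once damping is applied.
    msg = np.zeros(2)
    for t in range(50):
        raw = np.array([1.0, 0.0]) if t % 2 == 0 else np.array([0.0, 1.0])
        msg = damped_update(msg, raw, lam=0.9)
    print(msg)   # stays close to [0.5, 0.5] rather than swinging between extremes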


2018 ◽  
Author(s):  
Alexey A. Shvets ◽  
Alexander Rakhlin ◽  
Alexandr A. Kalinin ◽  
Vladimir I. Iglovikov

Semantic segmentation of robotic instruments is an important problem for robot-assisted surgery. One of the main challenges is to correctly detect an instrument's position for tracking and pose estimation in the vicinity of surgical scenes. Accurate pixel-wise instrument segmentation is needed to address this challenge. In this paper we describe our deep-learning-based approach to robotic instrument segmentation. Our approach improves on the state-of-the-art results using several novel deep neural network architectures. It addresses the binary segmentation problem, in which every pixel of the surgical video feed is labeled as instrument or background. In addition, we solve a multi-class segmentation problem, in which we distinguish different instruments, or different parts of an instrument, from the background. In this setting, our approach outperforms other methods for automatic instrument segmentation, thereby providing state-of-the-art results for these problems. The source code for our solution is publicly available.
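
As a rough illustration of the difference between the two settings (an assumption, not the authors' architectures; the feature tensor and channel counts are arbitrary), a binary head predicts one logit per pixel, while a multi-class head predicts one logit per class per pixel:

    import torch
    import torch.nn as nn

    features = torch.randn(1, 64, 128, 160)        # decoder output (toy tensor)

    binary_head = nn.Conv2d(64, 1, kernel_size=1)  # instrument vs. background
    multi_head = nn.Conv2d(64, 4, kernel_size=1)   # e.g. background + 3 instrument parts

    binary_logits = binary_head(features)          # train with BCEWithLogitsLoss
    multi_logits = multi_head(features)            # train with CrossEntropyLoss

    binary_mask = torch.sigmoid(binary_logits) > 0.5   # per-pixel instrument mask
    multi_mask = multi_logits.argmax(dim=1)            # per-pixel class labels
    print(binary_mask.shape, multi_mask.shape)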


2020 ◽  
Vol 169 ◽  
pp. 337-350
Author(s):  
Xiaoman Qi ◽  
Panpan Zhu ◽  
Yuebin Wang ◽  
Liqiang Zhang ◽  
Junhuan Peng ◽  
...  

2019 ◽  
Vol 9 (18) ◽  
pp. 3789 ◽  
Author(s):  
Jiyoun Moon ◽  
Beom-Hee Lee

Natural-language-based scene understanding can enable heterogeneous robots to cooperate efficiently in large, unstructured environments. However, studies on symbolic planning rarely consider the problem of acquiring semantic knowledge about the surrounding environment, while recent deep learning methods show outstanding performance for natural-language-based semantic scene understanding. In this paper, we propose a cooperation framework that connects deep learning techniques with a symbolic planner for heterogeneous robots. The framework is composed of a scene understanding engine, a planning agent, and a knowledge engine. We employ neural networks for natural-language-based scene understanding so that environmental information can be shared among robots, and we generate a sequence of actions for each robot using a Planning Domain Definition Language (PDDL) planner. JENA-TDB is used for knowledge storage. The proposed method is validated in simulation with one unmanned aerial vehicle and three ground vehicles.
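
One way such a pipeline can hand scene knowledge to a symbolic planner is by emitting a PDDL problem file. The sketch below is purely illustrative (the domain, object types, and predicates are hypothetical, not the authors' domain definition) and shows how detected scene facts could be serialized for a planner.

    def to_pddl_problem(objects, facts, goal):
        # Serialize scene facts from the scene understanding engine into a
        # PDDL problem string that a planner can consume.
        objs = " ".join(f"{name} - {typ}" for name, typ in objects)
        init = " ".join(f"({f})" for f in facts)
        return (f"(define (problem scene-task) (:domain cooperation)\n"
                f"  (:objects {objs})\n"
                f"  (:init {init})\n"
                f"  (:goal (and ({goal}))))")

    objects = [("uav1", "aerial-robot"), ("ugv1", "ground-robot"), ("box1", "object")]
    facts = ["at uav1 zone-a", "at ugv1 base", "detected box1 zone-a"]
    print(to_pddl_problem(objects, facts, "delivered box1 base"))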


Author(s):  
CHENG JIN

Geometric invariants have wide applications in computer vision, and their precision has long been an active research topic. In most existing methods, three-dimensional (3D) invariants are obtained by reconstructing the object structure, which requires fundamental matrices between image pairs to be established first. Consequently, additional errors are introduced during invariant construction, and the process can be very time consuming. In this paper, we propose a novel algorithm for calculating 3D projective invariants from multiple images without explicitly reconstructing the object structure. We employ the geometric configuration of points and lines in general position to derive the formulation of the 3D invariants. Our experiments verify that the proposed method is highly accurate when compared with the ground truth and more efficient than reconstruction-based methods.
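
For context, the sketch below (an assumption, not the paper's specific formulation) shows a classical 3D projective invariant of six points in general position, written as a ratio of 4x4 determinants of homogeneous coordinates; the unknown scale factors and the determinant of any projective transformation cancel in the ratio.

    import numpy as np

    def det4(P, idx):
        # Determinant of the 4x4 matrix built from four homogeneous points.
        return np.linalg.det(P[list(idx)])

    def projective_invariant(points3d):
        """points3d: (6, 3) array of 3D points in general position."""
        P = np.hstack([points3d, np.ones((6, 1))])       # homogeneous coordinates
        return (det4(P, (0, 1, 2, 4)) * det4(P, (0, 1, 3, 5))) / \
               (det4(P, (0, 1, 2, 5)) * det4(P, (0, 1, 3, 4)))

    pts = np.random.rand(6, 3)
    H = np.random.rand(4, 4) + np.eye(4)                 # random 3D projectivity
    mapped = (H @ np.hstack([pts, np.ones((6, 1))]).T).T
    mapped = mapped[:, :3] / mapped[:, 3:4]
    print(projective_invariant(pts), projective_invariant(mapped))  # (nearly) equal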


Author(s):  
B. Vishnyakov ◽  
Y. Blokhinov ◽  
I. Sgibnev ◽  
V. Sheverdin ◽  
A. Sorokin ◽  
...  

In this paper we describe a new multi-sensor platform for data collection and algorithm testing. We propose several methods for solving the semantic scene understanding problem for autonomous land vehicles. We describe our approaches to automatic camera and LiDAR calibration; three-dimensional scene reconstruction and odometry computation; semantic segmentation, which provides obstacle recognition and underlying-surface classification; object detection; and point cloud segmentation. We also describe our virtual simulation complex based on Unreal Engine, which can be used for both data collection and algorithm testing. We collected a large database of field and virtual data: more than 1,000,000 real images with corresponding LiDAR data and more than 3,500,000 simulated images with corresponding LiDAR data. All proposed methods were implemented and tested on our autonomous platform, and accuracy estimates were obtained on the collected database.
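
A common building block behind such camera-LiDAR pipelines, sketched below under assumed toy intrinsics and extrinsics (not the authors' calibration values), is projecting LiDAR points into the image so that per-pixel semantic labels can be transferred to the point cloud.

    import numpy as np

    def project_lidar_to_image(points, K, R, t):
        """points: (N, 3) LiDAR points; returns (N, 2) pixel coords and a validity mask."""
        cam = points @ R.T + t            # LiDAR frame -> camera frame
        in_front = cam[:, 2] > 0.1        # keep only points in front of the camera
        uvw = cam @ K.T                   # pinhole projection
        uv = uvw[:, :2] / uvw[:, 2:3]
        return uv, in_front

    K = np.array([[720.0, 0.0, 640.0], [0.0, 720.0, 360.0], [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.array([0.0, -0.1, -0.2])          # toy extrinsics
    pts = np.random.uniform([-5, -2, 1], [5, 2, 30], size=(1000, 3))
    uv, valid = project_lidar_to_image(pts, K, R, t)
    print(uv[valid].shape)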


2021 ◽  
Author(s):  
Muraleekrishna Gopinathan ◽  
Giang Truong ◽  
Jumana Abu-Khalaf

2021 ◽  
pp. 027836492110067
Author(s):  
Jens Behley ◽  
Martin Garbade ◽  
Andres Milioto ◽  
Jan Quenzel ◽  
Sven Behnke ◽  
...  

A holistic semantic scene understanding that exploits all available sensor modalities is a core capability for mastering self-driving in complex everyday traffic. To this end, we present the SemanticKITTI dataset, which provides point-wise semantic annotations of the Velodyne HDL-64E point clouds of the KITTI Odometry Benchmark. Together with the data, we also publish three benchmark tasks covering different aspects of semantic scene understanding: (1) semantic segmentation for point-wise classification using single or multiple point clouds as input; (2) semantic scene completion for predictive reasoning on semantics in occluded regions; and (3) panoptic segmentation, which combines point-wise classification with assigning individual instance identities to separate objects of the same class. In this article, we provide details on our dataset, which contains an unprecedented number of fully annotated point cloud sequences, more information on our labeling process for efficiently annotating such a vast amount of point clouds, and lessons learned in this process. The dataset and resources are available at http://www.semantic-kitti.org.
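
As a brief reading sketch (paths and the helper name are illustrative; the binary layout follows the format documented at semantic-kitti.org): each KITTI Odometry scan stores float32 (x, y, z, remission) quadruples, and each SemanticKITTI label entry packs the instance id into the upper 16 bits and the semantic class id into the lower 16 bits of a uint32.

    import numpy as np

    def read_scan_and_labels(bin_path, label_path):
        scan = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
        labels = np.fromfile(label_path, dtype=np.uint32)
        semantic = labels & 0xFFFF       # lower 16 bits: semantic class id
        instance = labels >> 16          # upper 16 bits: instance id
        return scan, semantic, instance

    # The unpacking itself can be checked on a synthetic label word:
    packed = np.uint32((7 << 16) | 10)    # instance 7 of class 10
    print(packed & 0xFFFF, packed >> 16)  # -> 10 7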

