Simple object recognition based on spatial relations and visual features represented using irregular pyramids

2011 ◽  
Vol 63 (3) ◽  
pp. 875-897 ◽  
Author(s):  
Annette Morales-González ◽  
Edel B. García-Reyes

2021 ◽  
Author(s):  
Maryam Nematollahi Arani

Object recognition has become a central topic in computer vision applications such as image search, robotics, and vehicle safety systems. However, it is a challenging task due to the limited discriminative power of low-level visual features in describing the considerably diverse range of high-level visual semantics of objects. The semantic gap between low-level visual features and high-level concepts is a bottleneck in most systems, and new content analysis models need to be developed to bridge it. In this thesis, algorithms based on conditional random fields (CRFs), from the class of probabilistic graphical models, are developed to tackle the problem of multiclass image labeling for object recognition. Image labeling assigns a specific semantic category from a predefined set of object classes to each pixel in the image. By effectively capturing spatial interactions of visual concepts, CRF modeling has proved to be a successful tool for image labeling. This thesis proposes novel approaches to strengthening CRF modeling for robust image labeling. Our primary contributions are twofold. First, to better represent the feature distributions of CRF potentials, new feature functions based on generalized Gaussian mixture models (GGMMs) are designed and their efficacy is investigated. Thanks to its shape parameter, a GGMM can properly fit the multi-modal and skewed distributions of data found in natural images. The new model proves more successful than Gaussian and Laplacian mixture models, and it also outperforms a deep neural network model on the Corel image set by 1% in accuracy. Second, we apply scene-level contextual information to integrate the global visual semantics of the image with the pixel-wise dense inference of a fully connected CRF, both to preserve small objects of foreground classes and to make dense inference robust to initial misclassifications by the unary classifier.
The proposed inference algorithm factorizes the joint probability of the labeling configuration and the image scene type to obtain prediction update equations for labeling individual image pixels as well as the overall scene type of the image. The proposed context-based dense CRF model outperforms the conventional dense CRF model by about 2% in labeling accuracy on the MSRC image set and by 4% on the SIFT Flow image set. The proposed model also obtains the highest scene classification rate, 86%, on the MSRC dataset.
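The generalized Gaussian density underlying the GGMM feature functions can be written down directly: its shape parameter interpolates between Laplacian (β = 1) and Gaussian (β = 2) behavior, which is what lets the mixture fit heavy-tailed and skewed feature distributions. The sketch below is illustrative only — the parameter names and the mixture form are standard textbook definitions, not the thesis's actual CRF potentials:

```python
import numpy as np
from math import gamma

def gen_gaussian_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Generalized Gaussian density.

    beta = 2 gives a Gaussian shape, beta = 1 a Laplacian shape;
    alpha is the scale, mu the location.
    """
    coef = beta / (2.0 * alpha * gamma(1.0 / beta))
    return coef * np.exp(-(np.abs(x - mu) / alpha) ** beta)

def ggmm_pdf(x, weights, mus, alphas, betas):
    """Mixture density: sum_k w_k * GG(x; mu_k, alpha_k, beta_k)."""
    return sum(w * gen_gaussian_pdf(x, m, a, b)
               for w, m, a, b in zip(weights, mus, alphas, betas))
```

With β fixed at 2 and α = √2·σ this reduces exactly to the normal density, which is a convenient sanity check when fitting the shape parameter.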
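The idea of letting a scene-type posterior modulate per-pixel class predictions can be sketched in a few lines. Note this is only a loose illustration of scene-context weighting under assumed array shapes — the function and variable names are hypothetical and the thesis's factorized update equations are not reproduced here:

```python
import numpy as np

def context_weighted_unaries(pixel_probs, scene_probs, class_given_scene):
    """Reweight per-pixel class posteriors by scene-level context.

    pixel_probs:       (N, C) per-pixel class posteriors from the unary classifier
    scene_probs:       (S,)   posterior over scene types for the whole image
    class_given_scene: (S, C) compatibility of each class with each scene type
    Returns (N, C) renormalized class posteriors.
    """
    # Marginal class prior implied by the scene posterior: (C,)
    context = scene_probs @ class_given_scene
    # Modulate each pixel's distribution by the scene-level prior
    weighted = pixel_probs * context[None, :]
    return weighted / weighted.sum(axis=1, keepdims=True)
```

Weighting of this kind is one simple way a confident scene estimate can rescue pixels whose unary classifier output is initially wrong, which is the robustness effect the abstract describes.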


2018 ◽  
Vol 18 (10) ◽  
pp. 414 ◽  
Author(s):  
Drew Linsley ◽  
Dan Shiebler ◽  
Sven Eberhardt ◽  
Andreas Karagounis ◽  
Thomas Serre

2016 ◽  
Vol 205 ◽  
pp. 382-392 ◽  
Author(s):  
Saeed Reza Kheradpisheh ◽  
Mohammad Ganjtabesh ◽  
Timothée Masquelier

Perception ◽  
1993 ◽  
Vol 22 (11) ◽  
pp. 1261-1270 ◽  
Author(s):  
John Duncan

Performance often suffers when two visual discriminations must be made concurrently (‘divided attention’). In the modular primate visual system, different cortical areas analyse different kinds of visual information. Especially important is a distinction between an occipitoparietal ‘where?’ system, analysing spatial relations, and an occipitotemporal ‘what?’ system responsible for object recognition. Though such visual subsystems are anatomically parallel, their functional relationship when ‘what?’ and ‘where?’ discriminations are made concurrently is unknown. In the present experiments, human subjects made concurrent discriminations concerning a brief visual display. Discriminations were either similar (two ‘what?’ or two ‘where?’ discriminations) or dissimilar (one of each), and concerned the same or different objects. When discriminations concerned different objects, there was strong interference between them. This was equally severe whether discriminations were similar—and therefore dependent on the same cortical system—or dissimilar. When concurrent ‘what?’ and ‘where?’ discriminations concerned the same object, however, all interference disappeared. Such results suggest that ‘what?’ and ‘where?’ systems are coordinated in visual attention: their separate outputs can be used simultaneously without cost, but only when they concern one object.

