Development of an Autonomous Visual Perception System for Robots Using Object-Based Visual Attention

Author(s):  
Yuanlong Yu ◽  
George K. I. Mann ◽  
Raymond G.

2015 ◽  
Vol 12 (01) ◽  
pp. 1550009 ◽  
Author(s):  
Francisco Martín ◽  
Carlos E. Agüero ◽  
José M. Cañas

Robots detect and keep track of relevant objects in their environment to accomplish their tasks. Many of them are equipped with mobile cameras as their main sensors; they process the images and maintain an internal representation of the detected objects. We propose a novel active visual memory that moves the camera to detect objects in the robot's surroundings and tracks their positions. This visual memory is based on a combination of multi-modal filters that efficiently integrates partial information. The visual attention subsystem is distributed among the software components in charge of detecting relevant objects. We demonstrate the efficiency and robustness of this perception system in a real humanoid robot participating in the RoboCup SPL competition.
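A visual memory of this kind can be sketched as one recursive filter per tracked object, updated whenever the camera observes that object. The following is a minimal illustrative reconstruction using a constant-velocity Kalman filter; all class names, noise parameters, and the per-label bookkeeping are assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a visual memory tracking object positions with
# per-object Kalman filters (constant-velocity model). Illustrative
# only; names and noise parameters are assumptions.
import numpy as np

class TrackedObject:
    def __init__(self, x, y, dt=0.1):
        # State: [x, y, vx, vy]; start with zero velocity.
        self.state = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4)                 # state covariance
        self.F = np.eye(4)                 # constant-velocity transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))          # we observe position only
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = 0.01 * np.eye(4)          # process noise
        self.R = 0.1 * np.eye(2)           # measurement noise

    def predict(self):
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, measurement):
        # Standard Kalman correction with a new camera observation.
        y = np.asarray(measurement) - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

class VisualMemory:
    """Keeps one filter per detected object, keyed by label."""
    def __init__(self):
        self.objects = {}

    def observe(self, label, x, y):
        if label not in self.objects:
            self.objects[label] = TrackedObject(x, y)
        else:
            obj = self.objects[label]
            obj.predict()
            obj.update((x, y))

    def position(self, label):
        return self.objects[label].state[:2]
```

The predict step lets the memory extrapolate an object's position even while the camera is pointed elsewhere, which is what allows a single mobile camera to serve several tracked objects.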


2008 ◽  
Vol 12 (5) ◽  
pp. 182-186 ◽  
Author(s):  
Barbara G. Shinn-Cunningham

Nanophotonics ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 41-74
Author(s):  
Bernard C. Kress ◽  
Ishan Chatterjee

Abstract
This paper is a review and analysis of the various implementation architectures of diffractive waveguide combiners for augmented reality (AR) and mixed reality (MR) headsets and smart glasses. Extended reality (XR) is another acronym frequently used to refer to all variants across the MR spectrum. Such devices have the potential to revolutionize how we work, communicate, travel, learn, teach, shop, and are entertained. Already, market analysts show very optimistic expectations on return on investment in MR, for both enterprise and consumer applications. Hardware architectures and technologies for AR and MR have made tremendous progress over the past five years, fueled by recent investment hype in start-ups and accelerated mergers and acquisitions by larger corporations. In order to meet such high market expectations, several challenges must be addressed: first, cementing primary use cases for each specific market segment and, second, achieving greater MR performance out of increasingly size-, weight-, cost- and power-constrained hardware. One such crucial component is the optical combiner. Combiners are often considered critical optical elements in MR headsets, as they are the direct window to both the digital content and the real world for the user's eyes.
Two main pillars defining the MR experience are comfort and immersion. Comfort comes in various forms:
- wearable comfort: reducing weight and size, pushing back the center of gravity, addressing thermal issues, and so on
- visual comfort: providing accurate and natural 3-dimensional cues over a large field of view and a high angular resolution
- vestibular comfort: providing stable and realistic virtual overlays that spatially agree with the user's motion
- social comfort: allowing for true eye contact, in a socially acceptable form factor
Immersion can be defined as the multisensory perceptual experience (including audio, display, gestures, haptics) that conveys to the user a sense of realism and envelopment.
In order to effectively address both comfort and immersion challenges through improved hardware architectures and software developments, a deep understanding of the specific features and limitations of the human visual perception system is required. We emphasize the need for a human-centric optical design process, which would allow for the most comfortable headset design (wearable, visual, vestibular, and social comfort) without compromising the user's sense of immersion (display, sensing, and interaction). Matching the specifics of the display architecture to the human visual perception system is key to bounding the hardware constraints, allowing for headset development and mass production at reasonable costs while providing a delightful experience to the end user.


Author(s):  
Steven P. Tipper ◽  
Bruce Weaver ◽  
Loretta M. Jerreat ◽  
Arloene L. Burak

Author(s):  
Kai Essig ◽  
Oleg Strogan ◽  
Helge Ritter ◽  
Thomas Schack

Various computational models of visual attention rely on the extraction of salient points or proto-objects, i.e., discrete units of attention, computed from bottom-up image features. In recent years, different solutions integrating top-down mechanisms were implemented, as research has shown that although eye movements initially are solely influenced by bottom-up information, after some time goal-driven (high-level) processes dominate the guidance of visual attention towards regions of interest (Hwang, Higgins & Pomplun, 2009). However, even these improved modeling approaches are unlikely to generalize to a broader range of application contexts, because basic principles of visual attention, such as cognitive control, learning, and expertise, have thus far not been sufficiently taken into account (Tatler, Hayhoe, Land & Ballard, 2011). In recent work, the authors showed the functional role and representational nature of long-term memory structures for human perceptual skills and motor control. Based on these findings, the chapter extends a widely applied saliency-based model of visual attention (Walther & Koch, 2006) in two ways. First, it computes the saliency map using the cognitive visual attention (CVA) approach, which shows a correspondence between regions of high saliency values and regions of visual interest indicated by participants' eye movements (Oyekoya & Stentiford, 2004). Second, it adds an expertise-based component (Schack, 2012) to represent the influence of the quality of mental representation structures in long-term memory (LTM) and the roles of learning on the visual perception of objects, events, and motor actions.
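The bottom-up stage of such saliency-based models can be illustrated with a toy center-surround computation on an intensity channel, in the spirit of models like Walther & Koch (2006). The sketch below is a simplified assumption-laden illustration, not the cited model: the single intensity channel, the box-blur approximation of Gaussian pyramids, and the chosen scales are all placeholders.

```python
# Toy bottom-up saliency map: intensity center-surround differences
# across scales, summed and normalized. Illustrative only; the real
# Walther & Koch model uses Gaussian pyramids and multiple channels.
import numpy as np

def box_blur(img, radius):
    """Crude separable box blur via cumulative sums (edge-padded)."""
    padded = np.pad(img, radius, mode="edge")
    k = 2 * radius + 1
    # Blur columns (axis 0), then rows (axis 1).
    c = np.cumsum(padded, axis=0)
    rows = (c[k - 1:, :]
            - np.vstack([np.zeros((1, c.shape[1])), c[:-k, :]])) / k
    c = np.cumsum(rows, axis=1)
    out = (c[:, k - 1:]
           - np.hstack([np.zeros((c.shape[0], 1)), c[:, :-k]])) / k
    return out

def saliency_map(intensity, scales=((1, 4), (2, 8))):
    """Sum |center - surround| maps over (center, surround) blur radii."""
    sal = np.zeros_like(intensity, dtype=float)
    for c_r, s_r in scales:
        center = box_blur(intensity, c_r)
        surround = box_blur(intensity, s_r)
        sal += np.abs(center - surround)
    # Normalize to [0, 1] so maps from different images are comparable.
    if sal.max() > 0:
        sal /= sal.max()
    return sal
```

A top-down or expertise-based extension, as described in the chapter, would then reweight or bias this bottom-up map rather than replace it, e.g. by multiplying it with a learned relevance map.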


2019 ◽  
Vol 9 (11) ◽  
pp. 315 ◽  
Author(s):  
Andrea Orlandi ◽  
Alice Mado Proverbio

It has been shown that selective attention enhances activity in the visual regions associated with stimulus processing. The left hemisphere seems to have a prominent role when non-spatial attention is directed towards specific stimulus features (e.g., color, spatial frequency). The present electrophysiological study investigated the time course and neural correlates of object-based attention, under the assumption of left-hemispheric asymmetry. Twenty-nine right-handed participants were presented with 3D graphic images representing the shapes of different object categories (wooden dummies, chairs, structures of cubes) that lacked detail. They were instructed to press a button in response to a target stimulus indicated at the beginning of each run. The perception of non-target stimuli elicited a larger anterior N2 component, which was likely associated with motor inhibition. Conversely, target selection resulted in an enhanced selection negativity (SN) response lateralized over the left occipito-temporal regions, followed by a larger centro-parietal P300 response. These potentials were interpreted as indexing attentional selection and categorization processes, respectively. The standardized weighted low-resolution electromagnetic tomography (swLORETA) source reconstruction showed the engagement of a fronto-temporo-limbic network underlying object-based visual attention. Overall, the SN scalp distribution and relative neural generators hinted at a left-hemispheric advantage for non-spatial object-based visual attention.

