Development of an Autonomous Visual Perception System for Robots Using Object-Based Visual Attention

Author(s):  
Yuanlong Yu ◽  
George K. I. Mann ◽  
Raymond G.

2015 ◽  
Vol 12 (01) ◽  
pp. 1550009 ◽  
Author(s):  
Francisco Martín ◽  
Carlos E. Agüero ◽  
José M. Cañas

Robots detect and keep track of relevant objects in their environment to accomplish their tasks. Many of them are equipped with mobile cameras as their main sensors; they process the images and maintain an internal representation of the detected objects. We propose a novel active visual memory that moves the camera to detect objects in the robot's surroundings and tracks their positions. This visual memory is based on a combination of multi-modal filters that efficiently integrates partial information. The visual attention subsystem is distributed among the software components in charge of detecting relevant objects. We demonstrate the efficiency and robustness of this perception system in a real humanoid robot participating in the RoboCup SPL competition.
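A visual memory of this kind can be sketched as one recursive filter per tracked object, updated whenever the camera observes that object. The following is a minimal illustrative reconstruction using a constant-velocity Kalman filter; all class names, noise parameters, and the per-label bookkeeping are assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a visual memory tracking object positions with
# per-object Kalman filters (constant-velocity model). Illustrative
# only; names and noise parameters are assumptions.
import numpy as np

class TrackedObject:
    def __init__(self, x, y, dt=0.1):
        # State: [x, y, vx, vy]; start with zero velocity.
        self.state = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4)                 # state covariance
        self.F = np.eye(4)                 # constant-velocity transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))          # we observe position only
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = 0.01 * np.eye(4)          # process noise
        self.R = 0.1 * np.eye(2)           # measurement noise

    def predict(self):
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, measurement):
        # Standard Kalman correction with a new camera observation.
        y = np.asarray(measurement) - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

class VisualMemory:
    """Keeps one filter per detected object, keyed by label."""
    def __init__(self):
        self.objects = {}

    def observe(self, label, x, y):
        if label not in self.objects:
            self.objects[label] = TrackedObject(x, y)
        else:
            obj = self.objects[label]
            obj.predict()
            obj.update((x, y))

    def position(self, label):
        return self.objects[label].state[:2]
```

The predict step lets the memory extrapolate an object's position even while the camera is pointed elsewhere, which is what allows a single mobile camera to serve several tracked objects.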


2008 ◽  
Vol 12 (5) ◽  
pp. 182-186 ◽  
Author(s):  
Barbara G. Shinn-Cunningham

Nanophotonics ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 41-74
Author(s):  
Bernard C. Kress ◽  
Ishan Chatterjee

Abstract
This paper is a review and analysis of the various implementation architectures of diffractive waveguide combiners for augmented reality (AR) and mixed reality (MR) headsets and smart glasses. Extended reality (XR) is another acronym frequently used to refer to all variants across the MR spectrum. Such devices have the potential to revolutionize how we work, communicate, travel, learn, teach, shop, and are entertained. Already, market analysts show very optimistic expectations on return on investment in MR, for both enterprise and consumer applications. Hardware architectures and technologies for AR and MR have made tremendous progress over the past five years, fueled by recent investment hype in start-ups and accelerated mergers and acquisitions by larger corporations. In order to meet such high market expectations, several challenges must be addressed: first, cementing primary use cases for each specific market segment and, second, achieving greater MR performance out of increasingly size-, weight-, cost- and power-constrained hardware. One such crucial component is the optical combiner. Combiners are often considered critical optical elements in MR headsets, as they are the direct window to both the digital content and the real world for the user's eyes.
Two main pillars defining the MR experience are comfort and immersion. Comfort comes in various forms:
- wearable comfort: reducing weight and size, pushing back the center of gravity, addressing thermal issues, and so on
- visual comfort: providing accurate and natural 3-dimensional cues over a large field of view and a high angular resolution
- vestibular comfort: providing stable and realistic virtual overlays that spatially agree with the user's motion
- social comfort: allowing for true eye contact, in a socially acceptable form factor
Immersion can be defined as the multisensory perceptual experience (including audio, display, gestures, haptics) that conveys to the user a sense of realism and envelopment.
In order to effectively address both comfort and immersion challenges through improved hardware architectures and software developments, a deep understanding of the specific features and limitations of the human visual perception system is required. We emphasize the need for a human-centric optical design process, which would allow for the most comfortable headset design (wearable, visual, vestibular, and social comfort) without compromising the user's sense of immersion (display, sensing, and interaction). Matching the specifics of the display architecture to the human visual perception system is key to bounding the hardware constraints, allowing for headset development and mass production at reasonable costs while providing a delightful experience to the end user.


Author(s):  
Steven P. Tipper ◽  
Bruce Weaver ◽  
Loretta M. Jerreat ◽  
Arloene L. Burak

Author(s):  
Kai Essig ◽  
Oleg Strogan ◽  
Helge Ritter ◽  
Thomas Schack

Various computational models of visual attention rely on the extraction of salient points or proto-objects, i.e., discrete units of attention, computed from bottom-up image features. In recent years, different solutions integrating top-down mechanisms were implemented, as research has shown that although eye movements initially are solely influenced by bottom-up information, after some time goal-driven (high-level) processes dominate the guidance of visual attention towards regions of interest (Hwang, Higgins & Pomplun, 2009). However, even these improved modeling approaches are unlikely to generalize to a broader range of application contexts, because basic principles of visual attention, such as cognitive control, learning, and expertise, have thus far not been sufficiently taken into account (Tatler, Hayhoe, Land & Ballard, 2011). In recent work, the authors showed the functional role and representational nature of long-term memory structures for human perceptual skills and motor control. Based on these findings, the chapter extends a widely applied saliency-based model of visual attention (Walther & Koch, 2006) in two ways. First, it computes the saliency map using the cognitive visual attention (CVA) approach, which shows a correspondence between regions of high saliency values and regions of visual interest indicated by participants' eye movements (Oyekoya & Stentiford, 2004). Second, it adds an expertise-based component (Schack, 2012) to represent the influence of the quality of mental representation structures in long-term memory (LTM) and the roles of learning on the visual perception of objects, events, and motor actions.
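The bottom-up stage of such saliency-based models can be illustrated with a toy center-surround computation on an intensity channel, in the spirit of models like Walther & Koch (2006). The sketch below is a simplified assumption-laden illustration, not the cited model: the single intensity channel, the box-blur approximation of Gaussian pyramids, and the chosen scales are all placeholders.

```python
# Toy bottom-up saliency map: intensity center-surround differences
# across scales, summed and normalized. Illustrative only; the real
# Walther & Koch model uses Gaussian pyramids and multiple channels.
import numpy as np

def box_blur(img, radius):
    """Crude separable box blur via cumulative sums (edge-padded)."""
    padded = np.pad(img, radius, mode="edge")
    k = 2 * radius + 1
    # Blur columns (axis 0), then rows (axis 1).
    c = np.cumsum(padded, axis=0)
    rows = (c[k - 1:, :]
            - np.vstack([np.zeros((1, c.shape[1])), c[:-k, :]])) / k
    c = np.cumsum(rows, axis=1)
    out = (c[:, k - 1:]
           - np.hstack([np.zeros((c.shape[0], 1)), c[:, :-k]])) / k
    return out

def saliency_map(intensity, scales=((1, 4), (2, 8))):
    """Sum |center - surround| maps over (center, surround) blur radii."""
    sal = np.zeros_like(intensity, dtype=float)
    for c_r, s_r in scales:
        center = box_blur(intensity, c_r)
        surround = box_blur(intensity, s_r)
        sal += np.abs(center - surround)
    # Normalize to [0, 1] so maps from different images are comparable.
    if sal.max() > 0:
        sal /= sal.max()
    return sal
```

A top-down or expertise-based extension, as described in the chapter, would then reweight or bias this bottom-up map rather than replace it, e.g. by multiplying it with a learned relevance map.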


2019 ◽  
Vol 9 (11) ◽  
pp. 315 ◽  
Author(s):  
Andrea Orlandi ◽  
Alice Mado Proverbio

It has been shown that selective attention enhances activity in the visual regions associated with stimulus processing. The left hemisphere seems to have a prominent role when non-spatial attention is directed towards specific stimulus features (e.g., color, spatial frequency). The present electrophysiological study investigated the time course and neural correlates of object-based attention, under the assumption of left-hemispheric asymmetry. Twenty-nine right-handed participants were presented with 3D graphic images representing the shapes of different object categories (wooden dummies, chairs, structures of cubes) that lacked detail. They were instructed to press a button in response to a target stimulus indicated at the beginning of each run. The perception of non-target stimuli elicited a larger anterior N2 component, which was likely associated with motor inhibition. Conversely, target selection resulted in an enhanced selection negativity (SN) response lateralized over the left occipito-temporal regions, followed by a larger centro-parietal P300 response. These potentials were interpreted as indexing attentional selection and categorization processes, respectively. The standardized weighted low-resolution electromagnetic tomography (swLORETA) source reconstruction showed the engagement of a fronto-temporo-limbic network underlying object-based visual attention. Overall, the SN scalp distribution and relative neural generators hinted at a left-hemispheric advantage for non-spatial object-based visual attention.

