The structure of 3-D shape representations in human vision revealed by eye movements during object recognition

2006
Author(s):  
Charles Leek

2021
Author(s):  
Gaurav Malhotra ◽  
Marin Dujmovic ◽  
John Hummel ◽  
Jeffrey S Bowers

The success of Convolutional Neural Networks (CNNs) in classifying objects has led to a surge of interest in using these systems to understand human vision. Recent studies have argued that when CNNs are trained in the correct learning environment, they can emulate a key property of human vision: learning to classify objects based on their shape. While showing a shape bias is indeed a desirable property for any model of human object recognition, it is unclear whether the shape representations these networks learn are human-like. We explored this question in the context of a well-known observation from psychology that humans encode the shape of objects in terms of relations between object features. To check whether this is also true for the representations of CNNs, we ran a series of simulations in which we trained CNNs on datasets of novel shapes and tested them on a set of controlled deformations of these shapes. We found that CNNs show no enhanced sensitivity to deformations that alter relations between features, even when explicitly trained on such deformations. This behaviour contrasts with that of human participants, both in previous studies and in a new experiment reported here. We argue that these results follow from a fundamental difference between how humans and CNNs learn to recognise objects: while CNNs select features that allow them to optimally classify the proximal stimulus, humans select features that they infer to be properties of the distal stimulus. This makes human representations more generalisable to novel contexts and tasks.
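
To make the experimental logic concrete, here is a minimal Python/PyTorch sketch of the kind of deformation-sensitivity test described above. It is an illustrative assumption, not the authors' code: the `SmallCNN` architecture, the random toy data, and the placeholder deformations stand in for the trained models and controlled stimuli of the study, and the training loop is omitted.

```python
# Minimal sketch of a deformation-sensitivity test (illustrative, not the
# authors' code): compare a CNN's classification confidence on deformations
# that alter relations between features versus deformations that do not.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Toy classifier; a real test would use a trained network."""
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def sensitivity(model, base, deformed):
    """Mean drop in probability of the predicted class after deformation.
    A larger drop means higher sensitivity to that deformation type."""
    model.eval()
    with torch.no_grad():
        p_base = model(base).softmax(dim=1)
        p_def = model(deformed).softmax(dim=1)
    pred = p_base.argmax(dim=1, keepdim=True)
    return (p_base.max(dim=1).values - p_def.gather(1, pred).squeeze(1)).mean()

# Toy 64x64 "shape" images; real stimuli would be the controlled deformations.
base = torch.rand(8, 1, 64, 64)
relational = base + 0.1 * torch.randn_like(base)   # placeholder deformation
coordinate = base + 0.1 * torch.randn_like(base)   # placeholder deformation
model = SmallCNN()  # training omitted in this sketch
print(sensitivity(model, base, relational), sensitivity(model, base, coordinate))
```

On the account in the abstract, a human-like model would show a markedly larger confidence drop for the relational deformations; the reported finding is that CNNs do not.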





2017
Vol 14 (03)
pp. 1750006
Author(s):  
Xin Wang ◽  
Pieter Jonker

Using active vision to perceive their surroundings instead of just passively receiving information, humans develop the ability to explore unknown environments. Research on active vision for humanoid robots already spans half a century; it covers a broad range of research areas, and many studies have been conducted. The current trend is to use a stereo setup or a Kinect with neck movements to realize active vision. However, human perception is a combination of eye and neck movements. This paper presents an advanced active vision system that works in a similar way to human vision. The main contributions are: a design for a set of controllers that mimic eye and neck movements, including saccadic, pursuit, vestibulo-ocular reflex, and vergence eye movements; an adaptive selection mechanism that automatically chooses an optimal tracking algorithm based on properties of the tracked objects; and a novel Multimodal Visual Odometry Perception method that combines stereopsis and convergence to enable robots to perform both precise action in action space and scene exploration in personal space. Experimental results demonstrate the effectiveness and robustness of our system. Moreover, the system meets real-time constraints with low-cost cameras and motors, providing an affordable solution for industrial applications.
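
As an illustration of how such controllers might be combined, the following Python sketch unifies a saccade controller, a pursuit controller, and a vestibulo-ocular reflex (VOR) term in a single gaze-update loop. This is a simplified assumption, not the paper's implementation: the gains, thresholds, and one-dimensional geometry are illustrative only.

```python
# Simplified 1-D gaze controller combining saccades, smooth pursuit, and a
# VOR term that counter-rotates the eye against measured head motion.
# All constants are illustrative assumptions, not values from the paper.
from dataclasses import dataclass

@dataclass
class GazeState:
    eye_angle: float   # eye-in-head angle (degrees)
    head_angle: float  # head-in-world angle (degrees)

SACCADE_THRESHOLD = 5.0  # deg: beyond this error, jump instead of pursuing
PURSUIT_GAIN = 0.3       # fraction of retinal error corrected per step
VOR_GAIN = 1.0           # an ideal VOR fully cancels head rotation

def gaze_step(state: GazeState, target_angle: float,
              head_velocity: float, dt: float) -> GazeState:
    gaze = state.eye_angle + state.head_angle      # gaze-in-world
    error = target_angle - gaze                    # retinal error
    if abs(error) > SACCADE_THRESHOLD:
        eye_cmd = error                            # saccade: one-shot jump
    else:
        eye_cmd = PURSUIT_GAIN * error             # pursuit: smooth correction
    eye_cmd -= VOR_GAIN * head_velocity * dt       # VOR: cancel head motion
    return GazeState(state.eye_angle + eye_cmd,
                     state.head_angle + head_velocity * dt)

# Track a target at 20 deg while the head drifts at 2 deg/s.
s = GazeState(eye_angle=0.0, head_angle=0.0)
for _ in range(10):
    s = gaze_step(s, target_angle=20.0, head_velocity=2.0, dt=0.1)
print(round(s.eye_angle + s.head_angle, 2))  # gaze converges toward 20 deg
```

In a full system of the kind the abstract describes, a neck controller would slew the head toward eccentric targets while the VOR keeps gaze stable, and vergence would add a second, depth-related degree of freedom.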



2009
Vol 49 (18)
pp. 2241-2253
Author(s):  
Alexander C. Schütz ◽  
Doris I. Braun ◽  
Karl R. Gegenfurtner


2019
Author(s):  
Vladislav Ayzenberg ◽  
Frederik S. Kamps ◽  
Daniel D. Dilks ◽  
Stella F. Lourenco

Shape perception is crucial for object recognition. However, it remains unknown exactly how shape information is represented and, consequently, used by the visual system. Here, we hypothesized that the visual system represents “shape skeletons” to both (1) perceptually organize contours and component parts into a shape percept, and (2) compare shapes to recognize objects. Using functional magnetic resonance imaging (fMRI) and representational similarity analysis (RSA), we found that a model of skeletal similarity explained significant unique variance in the response profiles of V3 and LO, regions known to be involved in perceptual organization and object recognition, respectively. Moreover, the skeletal model remained predictive in these regions even when controlling for other models of visual similarity that approximate low- to high-level visual features (i.e., Gabor-jet, GIST, HMAX, and AlexNet), and across different surface forms, a manipulation that altered object contours while preserving the underlying skeleton. Together, these findings shed light on the functional roles of shape skeletons in human vision, as well as the computational properties of V3 and LO.
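
The core RSA computation can be sketched as follows, under illustrative assumptions: the data are random placeholders, and generic model RDMs stand in for the actual skeletal, Gabor-jet, GIST, HMAX, and AlexNet models. The idea is to build a neural representational dissimilarity matrix (RDM) for a region, then test whether the skeletal model RDM still correlates with it after the control models are partialled out; residualizing both RDMs on the controls before a Spearman correlation is one common approximation.

```python
# Sketch of RSA with unique-variance (partial) correlation. Placeholder data;
# real inputs would be voxel patterns from V3/LO and the study's model RDMs.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_objects, n_voxels = 30, 200

# Neural RDM: pairwise dissimilarity of object response patterns.
responses = rng.normal(size=(n_objects, n_voxels))
neural_rdm = pdist(responses, metric="correlation")

# Candidate model RDMs (skeletal similarity plus control models).
models = {name: pdist(rng.normal(size=(n_objects, 10)))
          for name in ["skeleton", "gabor_jet", "gist", "hmax", "alexnet"]}

def partial_spearman(y, x, controls):
    """Spearman correlation of y and x after regressing controls out of both."""
    def residualize(v):
        C = np.column_stack([np.ones_like(y)] + controls)
        beta, *_ = np.linalg.lstsq(C, v, rcond=None)
        return v - C @ beta
    return spearmanr(residualize(y), residualize(x)).correlation

controls = [m for name, m in models.items() if name != "skeleton"]
print(partial_spearman(neural_rdm, models["skeleton"], controls))
```

A reliably positive partial correlation in a region would correspond to the abstract's claim that the skeletal model explains unique variance beyond the low- to high-level control models.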



Author(s):  
Anders Petersen ◽  
Søren Kyllingsbæk

In the attentional dwell time paradigm of Duncan, Ward, and Shapiro (1994), two backward-masked targets are presented at different spatial locations, separated by a varying time interval. Results show that report of the second target is severely impaired when the interval is less than 500 ms, an effect that has been taken as a direct measure of the attentional dwell time in human vision. However, we show that eye movements may have confounded the estimate of the dwell time and that the measure may not be as robust as previously suggested. The latter is supported by evidence that intensive training strongly attenuates the dwell time because of habituation to the masks. Thus, this article points to eye movements and masking as two potential methodological pitfalls that should be considered when using the attentional dwell time paradigm to investigate the temporal dynamics of attention.
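
For readers unfamiliar with the paradigm, the following Python sketch lays out the event schedule of a single trial. The durations are illustrative assumptions, not the values used by Duncan, Ward, and Shapiro (1994).

```python
# Sketch of one trial in the attentional dwell time paradigm: two
# backward-masked targets (T1, T2) at different locations, separated by a
# variable stimulus-onset asynchrony (SOA). Durations are assumptions.
from dataclasses import dataclass

@dataclass
class TrialEvent:
    time_ms: int
    event: str

def dwell_time_trial(soa_ms: int, target_ms: int = 50, mask_ms: int = 100):
    """Return the event schedule for one trial with the given T1-T2 SOA."""
    return [
        TrialEvent(0, "T1 onset (location A)"),
        TrialEvent(target_ms, "T1 mask onset (location A)"),
        TrialEvent(soa_ms, "T2 onset (location B)"),
        TrialEvent(soa_ms + target_ms, "T2 mask onset (location B)"),
        TrialEvent(soa_ms + target_ms + mask_ms, "response prompt"),
    ]

# Sample SOAs spanning the impairment window reported above (< 500 ms).
for soa in (100, 300, 500, 900):
    print(f"SOA {soa} ms:", [(e.time_ms, e.event) for e in dwell_time_trial(soa)])
```

Plotting T2 report accuracy against SOA yields the dwell-time curve; the article's point is that eye movements and mask habituation can both distort that curve.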



2019
Author(s):  
Ahmad Yousef

This proposal offers perspectives and challenges aimed at optimizing and socializing humanoid eyes. Its main purpose is to draw the reader’s attention to the importance and sophistication of the human eye and its four-dynamics continuum, namely the continuum comprising saccadic eye movements, pupil variations, and blinks, along with Duchenne markers. We suggest that robotics designers work collaboratively with neuroscientists to mathematically estimate this continuum, so that humanoid eyes/cameras can be properly engineered; we attempt to register the essential elements of such an invention in the present study. This collaboration would be beneficial for a further reason: because human vision activates a great many cortical areas, extending to the prefrontal cortex, it may allow eye trackers to mature into good replacements for expensive brain imaging in certain circumstances.



Author(s):  
Fiona Mulvey

This chapter introduces the basics of eye anatomy, eye movements, and vision. It will explain the concepts behind human vision sufficiently for the reader to understand later chapters in the book on human perception and attention, and their relationship to (and potential measurement with) eye movements. We will first describe the path of light from the environment through the structures of the eye and on to the brain, as an introduction to the physiology of vision. We will then describe the image registered by the eye, and the types of movements the eye makes in order to perceive the environment as a coherent whole. This chapter explains how eye movements can be thought of as the interface between the visual world and the brain, and why eye movement data can be analysed not only in terms of the environment, or what is looked at, but also in terms of the brain, or subjective cognitive and emotional states. These two aspects broadly define the scope and applicability of eye movement technology in research and in human-computer interaction in later sections of the book.




