Vision

Author(s):  
Frances Egan

Vision is the most studied sense. It is our richest source of information about the external world, providing us with knowledge of the shape, size, distance, colour and luminosity of objects around us. Vision is fast, automatic and achieved without conscious effort; however, the apparent ease with which we see is deceptive. Ever since Kepler characterized the formation of the retinal image in the early seventeenth century, vision theorists have known that the image on the retina does not correspond in an obvious manner to the way things look. The retinal image is two-dimensional, yet we see three dimensions; the size and shape of the image that an object casts on the retina varies with the distance and perspective of the observer, yet we experience objects as having constant size and shape. The primary task of a theory of vision is to explain how useful information about the external world is recovered from the changing retinal image. Theories of vision fall roughly into two classes. Indirect theories characterize the processes underlying visual perception in psychological terms, as, for example, inference from prior data or construction of complex percepts from basic sensory components. Direct theories tend to stress the richness of the information available in the retinal image, but, more importantly, they deny that visual processes can be given any correct psychological or mental characterization. Direct theorists, while not denying that the processing underlying vision may be very complex, claim that the complexity is to be explicated merely by reference to nonpsychological, neural processes implemented in the brain. The most influential recent work in vision treats it as an information-processing task, hence as indirect. Many computational models characterize visual processing as the production and decoding of a series of increasingly useful internal representations of the distal scene. These operations are described in computational accounts by precise algorithms. Computer implementations of possible strategies employed by the visual system contribute to our understanding of the problems inherent in complex visual tasks such as edge detection and shape recognition, and make possible the rigorous testing of proposed solutions.
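The entry above points to computer implementations of visual tasks such as edge detection. As a purely illustrative sketch of one generic, textbook strategy (gradient-based edge detection with Sobel kernels), and not a reconstruction of any model discussed in the entry, the following Python snippet marks pixels whose normalised gradient magnitude exceeds a threshold; the threshold value and the synthetic test image are assumptions made for the example.

```python
# Illustrative sketch only: gradient-based edge detection with Sobel kernels.
# A generic textbook technique, not a specific computational model of vision.
import numpy as np
from scipy.signal import convolve2d

def detect_edges(image: np.ndarray, threshold: float = 0.2) -> np.ndarray:
    """Return a boolean map of pixels whose gradient magnitude exceeds
    `threshold` (threshold chosen arbitrarily for illustration)."""
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)
    sobel_y = sobel_x.T
    gx = convolve2d(image, sobel_x, mode="same", boundary="symm")
    gy = convolve2d(image, sobel_y, mode="same", boundary="symm")
    magnitude = np.hypot(gx, gy)
    magnitude /= magnitude.max() + 1e-12   # normalise to [0, 1]
    return magnitude > threshold

# Example: a synthetic image with a single vertical luminance step.
img = np.zeros((32, 32))
img[:, 16:] = 1.0
edges = detect_edges(img)
print(edges.sum(), "edge pixels found")
```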


There is a long-standing tradition of research on vision in Great Britain that goes back at least as far as Newton. The Royal Society is therefore a most suitable venue for a conference on the Psychology of Vision, and it is no accident that two of our distinguished guests from North America are British subjects. In the first 30 years of this century the Gestalt movement brought about a revolution in our ways of thinking about vision, but the subject then remained rather stagnant for two decades. In more recent years, dramatic discoveries and radical new insights have been forthcoming from three different directions. First, neurophysiologists have laid bare some of the highly systematic wiring that subserves the early stages of the processing of the visual input. Secondly, psychologists and psychophysiologists have uncovered some of the intricacies of the mechanisms that underlie such functions as acuity, contrast discrimination, motion detection and stereopsis. It is becoming possible to put together results from these two directions and to show how mechanisms inferred from psychophysical observations are instantiated in known neurophysiological circuits. The two sets of results indicate that visual processing is both more complex and more elegant than had been suspected 50 years ago. Thirdly, the advent of the digital computer has made it possible to build rigorous computational models of the visual system, to explore and to specify more adequately the nature of the task that the visual system must perform, and to demonstrate precisely how the constraints imposed by the nature of the physical world and of its optics make it possible for the brain to use the patterns of light impinging on the retinae to form a useful representation of the external world. Although this last enterprise may strike some as speculative, it has already led to insights into the nature of vision that have changed our ways of looking at the problems and have made the theories of shape recognition put forward in the 1950s and 1970s, including those of one of us, look extremely superficial.


Author(s):  
C J R Sheppard

The confocal microscope is now widely used in both biomedical and industrial applications for imaging, in three dimensions, objects with appreciable depth. A range of different microscopes is now on the market, adopting a variety of designs. The aim of this paper is to explore the effects on imaging performance of design parameters including the method of scanning, the type of detector, and the size and shape of the confocal aperture. It is becoming apparent that there is no such thing as an ideal confocal microscope: all systems have limitations, and the best compromise depends on what the microscope is used for and how it is used. The most important compromise at present is between image quality and speed of scanning, which is particularly apparent when imaging with very weak signals. If great speed is not of importance, then the fundamental limitation for fluorescence imaging is the detection of sufficient numbers of photons before the fluorochrome bleaches.
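To make the trade-off described above concrete, here is a back-of-the-envelope sketch of how pixel dwell time links scan speed, shot-noise-limited signal-to-noise ratio, and the number of frames obtainable before bleaching. All numbers (emission rate, frame size, photon budget) are invented for illustration and are not taken from the paper.

```python
# Back-of-the-envelope sketch of the speed-versus-photon-count trade-off.
# All quantities below are assumed values for illustration only.
import math

emission_rate = 1e6        # detected photons per second from one pixel's fluorophores (assumed)
pixels_per_frame = 512 * 512
photon_budget = 1e3        # photons available per pixel before bleaching (assumed)

for frame_time in (0.1, 1.0, 10.0):            # seconds per frame
    dwell = frame_time / pixels_per_frame      # dwell time per pixel
    photons = emission_rate * dwell            # photons collected per pixel per frame
    snr = math.sqrt(photons)                   # shot-noise-limited SNR
    frames_before_bleach = photon_budget / photons
    print(f"{frame_time:5.1f} s/frame: {photons:6.1f} photons/pixel, "
          f"SNR ~ {snr:4.1f}, ~{frames_before_bleach:7.0f} frames before bleaching")
```

Faster scanning shortens the dwell time and hence lowers the per-frame SNR, while slower scanning improves each image but exhausts the photon budget in fewer frames.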


2020 ◽  
Author(s):  
B R Geib ◽  
R Cabeza ◽  
M G Woldorff

While it is broadly accepted that attention modulates memory, the contribution of specific rapid attentional processes to successful encoding is largely unknown. To investigate this issue, we leveraged the high temporal resolution of electroencephalographic recordings to directly link a cascade of visuo-attentional neural processes to successful encoding: namely, (1) the N2pc (peaking ~200 ms), which reflects stimulus-specific attentional orienting and allocation; (2) the sustained posterior-contralateral negativity (post-N2pc), which has been associated with sustained visual processing; and (3) the contralateral reduction in oscillatory alpha power (>200 ms), which has also been independently related to attentionally sustained visual processing. Each of these visuo-attentional processes was robustly predictive of successful encoding, and, moreover, each enhanced memory independently of the classic, longer-latency, conceptually related difference-due-to-memory (Dm) effect. Early-latency midfrontal theta power also promoted successful encoding, with at least part of this influence being mediated by the later-latency Dm effect. These findings markedly expand current knowledge by helping to elucidate the intimate relationship between attentional modulations of perceptual processing and effective encoding for later memory retrieval.
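For readers unfamiliar with the alpha-power measure mentioned above, the following sketch shows one common way such a quantity can be computed: band-pass filtering an EEG epoch to 8-12 Hz, taking the Hilbert envelope, and contrasting contralateral and ipsilateral channels. This is a generic illustration with synthetic data, not the authors' analysis pipeline; the sampling rate, filter order and band edges are assumptions.

```python
# Minimal sketch (not the authors' pipeline): estimating alpha-band (8-12 Hz)
# power from an EEG epoch and contrasting contralateral vs. ipsilateral sites.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 500.0                                   # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)                # one 1-s epoch

def alpha_power(signal, fs, low=8.0, high=12.0):
    """Mean alpha-band power via band-pass filtering and the Hilbert envelope."""
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, signal)
    envelope = np.abs(hilbert(filtered))
    return np.mean(envelope ** 2)

rng = np.random.default_rng(0)
# Synthetic example: weaker 10 Hz component at the contralateral site.
contra = 0.5 * np.sin(2 * np.pi * 10 * t) + rng.normal(0, 1, t.size)
ipsi   = 1.0 * np.sin(2 * np.pi * 10 * t) + rng.normal(0, 1, t.size)

reduction = alpha_power(ipsi, fs) - alpha_power(contra, fs)
print(f"contralateral alpha reduction: {reduction:.3f} (arbitrary units)")
```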


1999 ◽  
Vol 11 (3) ◽  
pp. 300-311 ◽  
Author(s):  
Edmund T. Rolls ◽  
Martin J. Tovée ◽  
Stefano Panzeri

Backward masking can potentially provide evidence of the time needed for visual processing, a fundamental constraint that must be incorporated into computational models of vision. Although backward masking has been extensively used psychophysically, there is little direct evidence for the effects of visual masking on neuronal responses. Investigating the effects of a backward masking paradigm on the responses of neurons in the temporal visual cortex, we have previously shown that the response of the neurons is interrupted by the mask. Under conditions when humans can just identify the stimulus, with stimulus onset asynchronies (SOA) of 20 msec, neurons in macaques respond to their best stimulus for approximately 30 msec. We now quantify the information that is available from the responses of single neurons under backward masking conditions when two to six faces were shown. We show that the information available is greatly decreased as the mask is brought closer to the stimulus. The decrease is more marked than the decrease in firing rate because it is the selective part of the firing that is especially attenuated by the mask, not the spontaneous firing, and also because the neuronal response is more variable at short SOAs. However, even at the shortest SOA of 20 msec, the information available is on average 0.1 bits. This compares to 0.3 bits with only the 16-msec target stimulus shown, and to a typical value for such neurons of 0.4 to 0.5 bits with a 500-msec stimulus. The results thus show that considerable information is available from neuronal responses even under backward masking conditions that allow the neurons to have their main response in 30 msec. This provides evidence for how rapid the processing of visual information is in a cortical area and provides a fundamental constraint for understanding how cortical information processing operates.
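The bit values quoted above come from an information-theoretic analysis of neuronal responses. As a hedged illustration of the general idea (not the authors' estimator, and without the corrections for limited-sampling bias that published estimates typically include), the following sketch computes the Shannon mutual information between stimulus identity and binned spike counts from a fabricated count table.

```python
# Illustrative sketch: Shannon mutual information (in bits) between stimulus
# identity and a neuron's binned response. The counts are fabricated; real
# analyses also correct for limited-sampling bias.
import numpy as np

def mutual_information(joint_counts: np.ndarray) -> float:
    """I(S;R) in bits from a stimulus x response-bin count table."""
    p = joint_counts / joint_counts.sum()
    ps = p.sum(axis=1, keepdims=True)   # P(stimulus)
    pr = p.sum(axis=0, keepdims=True)   # P(response bin)
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / (ps @ pr)[nz])))

# Rows: stimuli (e.g. different faces); columns: binned spike counts
# in the response window.
counts = np.array([
    [20,  8,  2],
    [ 6, 18,  6],
    [ 2,  7, 21],
])
print(f"I(S;R) = {mutual_information(counts):.2f} bits")
```

The more the response distributions overlap across stimuli, as happens when a mask truncates the selective part of the response, the smaller this quantity becomes.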


2020 ◽  
Vol 11 ◽  
Author(s):  
Peter Gärdenfors

The world as we perceive it is structured into objects, actions and places that form parts of events. In this article, my aim is to explain why these categories are cognitively primary. From an empiricist and evolutionary standpoint, it is argued that the reduction of the complexity of sensory signals is based on the brain's capacity to identify various types of invariances that are evolutionarily relevant for the activities of the organism. The first aim of the article is to explain why places, objects and actions are primary cognitive categories in our constructions of the external world. It is shown that the invariances that determine these categories have their separate characteristics and that they are, by and large, independent of each other. This separation is supported by what is known about the neural mechanisms. The second aim is to show that the category of events can be analyzed as being constituted of the primary categories. The category of numbers is briefly discussed. Some implications for computational models of the categories are also presented.


2010 ◽  
Vol 22 (11) ◽  
pp. 2417-2426 ◽  
Author(s):  
Stephanie A. McMains ◽  
Sabine Kastner

Multiple stimuli that are present simultaneously in the visual field compete for neural representation. At the same time, however, multiple stimuli in cluttered scenes also undergo perceptual organization according to certain rules, such as similarity or proximity, originally defined by the Gestalt psychologists, thereby segmenting scenes into candidate objects. How can these two seemingly orthogonal neural processes that occur early in the visual processing stream be reconciled? One possibility is that competition occurs among perceptual groups rather than at the level of elements within a group. We probed this idea using fMRI by assessing competitive interactions across visual cortex in displays containing varying degrees of perceptual organization or perceptual grouping (Grp). In strong Grp displays, elements were arranged such that either an illusory figure or a group of collinear elements was present, whereas in weak Grp displays the same elements were arranged randomly. Competitive interactions among stimuli were overcome throughout early visual cortex and V4 when elements were grouped, regardless of Grp type. Our findings suggest that context-dependent grouping mechanisms and competitive interactions are linked to provide a bottom-up bias toward candidate objects in cluttered scenes.


i-Perception ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 204166952090355 ◽  
Author(s):  
Peter U. Tse

Binocular disparity can give rise to the perception of open surfaces or closed curved surfaces (volumes) that appear to vary smoothly across discrete depths. Here I build on my recent papers by providing examples where modally completing surfaces not only fill in from one depth layer’s visible contours to another layer’s visible contours within virtual contours in an analog manner, but where modally completing surface curvature is altered by the interpolation of an abutting object perceived to be connected to or embedded within that modally completing surface. Seemingly minor changes in such an abutting object can flip the interpretation of distal regions, for example, turning a distant edge (where a surface ends) into rim (where a surface bends to occlude itself) or turning an open surface into a closed one. In general, the interpolated modal surface appears to deform, warp, or bend in three dimensions to accommodate the abutting object. These demonstrations cannot be easily explained by existing models of visual processing or modal completion and drive home the implausibility of localistic accounts of modal or amodal completion that are based, for example, solely on extending contours in space until they meet behind an occluder or in front of “pacmen.” These demonstrations place new constraints on the holistic surface and volume generation processes that construct our experience of a three-dimensional world of surfaces and objects under normal viewing conditions.


1995 ◽  
Vol 6 (3) ◽  
pp. 182-186 ◽  
Author(s):  
Steven Yantis

The human visual system does not rigidly preserve the properties of the retinal image as neural signals are transmitted to higher areas of the brain. Instead, it generates a representation that captures stable surface properties despite a retinal image that is often fragmented in space and time because of occlusion caused by object and observer motion. The recovery of this coherent representation depends at least in part on input from an abstract representation of three-dimensional (3-D) surface layout. In the two experiments reported, a stereoscopic apparent motion display was used to investigate the perceived continuity of a briefly interrupted visual object. When a surface appeared in front of the object's location during the interruption, the object was more likely to be perceived as persisting through the interruption (behind an occluder) than when the surface appeared behind the object's location under otherwise identical stimulus conditions. The results reveal the influence of 3-D surface-based representations even in very simple visual tasks.


Author(s):  
Bernd J. Kröger

This chapter outlines a comprehensive neurocomputational model of voice and speech perception based on (i) already established computational models, as well as on (ii) neurophysiological data of the underlying neural processes. Neurocomputational models of speech perception comprise auditory as well as cognitive modules, in order to extract sound features as well as linguistic information (linguistic content). A model of voice and speech perception in addition needs to process paralinguistic information such as the gender, age, and emotional or affective state of the speaker. It is argued here that modules of a neurocomputational model of voice and speech perception need to interact with modules which go beyond unimodal auditory processing because, for example, the processing of paralinguistic information is closely related to other modalities such as visual facial perception. Thus, this chapter describes neural modelling of voice and speech perception in relation to general communication and social-interaction processes, which makes it necessary to develop a hypermodal processing approach.
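As a purely schematic sketch of the kind of modular architecture described above, the following Python outline composes an auditory feature module, a cognitive (linguistic) module, and a paralinguistic module that can optionally merge visual-facial input at a hypermodal stage. All class, function, and field names are invented placeholders and do not come from Kröger's model.

```python
# Schematic sketch of a modular voice-and-speech-perception pipeline.
# All names and behaviours are invented placeholders for illustration only.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AuditoryFeatures:
    spectral_envelope: list
    pitch_contour: list

@dataclass
class Percept:
    linguistic_content: str            # e.g. recognised words
    speaker_gender: Optional[str]      # paralinguistic information
    speaker_affect: Optional[str]

def auditory_module(waveform: list) -> AuditoryFeatures:
    # Placeholder: real models would compute spectro-temporal features here.
    return AuditoryFeatures(spectral_envelope=waveform, pitch_contour=waveform)

def cognitive_module(features: AuditoryFeatures) -> str:
    return "recognised utterance"       # placeholder linguistic decoding

def paralinguistic_module(features: AuditoryFeatures,
                          face_input: Optional[list] = None) -> dict:
    # Hypermodal step: paralinguistic cues may combine auditory and visual-facial input.
    modality = "audio-visual" if face_input is not None else "audio-only"
    return {"gender": f"estimated ({modality})", "affect": f"estimated ({modality})"}

def perceive(waveform: list, face_input: Optional[list] = None) -> Percept:
    features = auditory_module(waveform)
    para = paralinguistic_module(features, face_input)
    return Percept(cognitive_module(features), para["gender"], para["affect"])

print(perceive([0.0, 0.1, -0.1], face_input=[1.0]))
```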

