On the Aperture Problem of Binocular 3D Motion Perception

Vision ◽  
2019 ◽  
Vol 3 (4) ◽  
pp. 64
Author(s):  
Martin Lages ◽  
Suzanne Heron

Like many predators, humans have forward-facing eyes that are set a short distance apart, so that an extensive region of the visual field is seen from two different points of view. The human visual system can establish a three-dimensional (3D) percept from the images projected onto the left and right eyes. How the visual system integrates local motion and binocular depth to accomplish 3D motion perception is still under investigation. Here, we propose a geometric-statistical model that combines noisy velocity constraints with a spherical motion prior to solve the aperture problem in 3D. Two psychophysical experiments show that instantiations of this model can explain how human observers disambiguate the direction of 3D line motion behind a circular aperture. We discuss the implications of our results for the processing of motion and dynamic depth in the visual system.
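To make the constraint-plus-prior idea concrete, here is a minimal numerical sketch: each noisy aperture measurement fixes only the velocity component along a unit normal, and an isotropic zero-mean Gaussian prior (a simple stand-in for the spherical motion prior described above, not the authors' exact formulation) regularizes the estimate. Function names and noise parameters are invented for illustration.

```python
import numpy as np

def map_3d_velocity(normals, speeds, sigma_lik=0.1, sigma_prior=1.0):
    """MAP estimate of 3D velocity from noisy aperture constraints.

    Each constraint says: the velocity component along unit normal n_i
    equals the measured normal speed s_i, up to Gaussian noise.
    The prior is an isotropic zero-mean Gaussian, an assumed stand-in
    for the spherical motion prior in the model above.
    """
    A = np.asarray(normals, dtype=float)   # (k, 3) unit constraint normals
    s = np.asarray(speeds, dtype=float)    # (k,) measured normal speeds
    # Gaussian likelihood x Gaussian prior: the posterior mean solves
    # a regularized least-squares system.
    H = A.T @ A / sigma_lik**2 + np.eye(3) / sigma_prior**2
    b = A.T @ s / sigma_lik**2
    return np.linalg.solve(H, b)

# A single constraint (the aperture problem proper): the MAP answer is
# the normal velocity, shrunk slightly toward zero by the prior.
v = map_3d_velocity([[1.0, 0.0, 0.0]], [2.0])
print(v)  # ~[1.98, 0, 0] for these noise settings
```

With one constraint the posterior picks the normal velocity (shrunk by the prior); with several noisy constraints the same solve performs a probabilistic intersection of constraints.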

Author(s):  
Martin Lages ◽  
Suzanne Heron ◽  
Hongfang Wang

The authors discuss local constraints for the perception of three-dimensional (3D) binocular motion in a geometric-probabilistic framework. It is shown that Bayesian models of binocular 3D motion can explain perceptual bias under uncertainty and predict perceived velocity under ambiguity. The models exploit biologically plausible constraints of local motion and disparity processing in a binocular viewing geometry. Results from computer simulations and psychophysical experiments support the idea that local constraints of motion and disparity processing are combined late in the visual processing hierarchy to establish perceived 3D motion direction.
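As a toy illustration of how late-combined motion and disparity signals could yield a 3D motion percept, the sketch below converts the two eyes' angular velocities into lateral and in-depth velocity using small-angle binocular geometry. The sign conventions and the 6.5 cm interocular distance are assumptions for illustration, not the authors' fitted model.

```python
def binocular_3d_velocity(omega_left, omega_right, distance, iod=0.065):
    """Recover lateral and in-depth velocity (m/s) from the two eyes'
    angular velocities (rad/s), using small-angle binocular geometry
    for a point near the midline at the given viewing distance (m).
    A geometric sketch with an assumed interocular distance (iod).
    """
    # The mean angular velocity carries lateral motion: x ~ d * (aL + aR) / 2.
    x_dot = distance * (omega_left + omega_right) / 2.0
    # The interocular velocity difference is the time derivative of
    # disparity, delta ~ iod / d, so z_dot scales with d^2 / iod.
    z_dot = -(distance**2 / iod) * (omega_left - omega_right)
    return x_dot, z_dot

# A target receding along the midline: equal and opposite eye rotations.
print(binocular_3d_velocity(-0.01, 0.01, distance=1.0))  # (0.0, ~0.31 m/s)
```

The quadratic dependence of the in-depth term on viewing distance is one reason motion-in-depth estimates are more uncertain than lateral ones, consistent with the perceptual bias under uncertainty discussed above.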


Perception ◽  
1996 ◽  
Vol 25 (7) ◽  
pp. 797-814 ◽  
Author(s):  
Michiteru Kitazaki ◽  
Shinsuke Shimojo

The generic-view principle (GVP) states that, given a 2-D image, the visual system interprets it as a generic view of a 3-D scene when possible. The GVP was applied to 3-D motion perception to show how the visual system decomposes retinal image motion into three components of 3-D motion: stretch/shrinkage, rotation, and translation. First, the optical process of retinal image motion was analyzed, and predictions were made based on the GVP in the inverse-optical process. Then experiments were conducted in which the subject judged perception of stretch/shrinkage, rotation in depth, and translation in depth for a moving bar stimulus. Retinal-image parameters (2-D stretch/shrinkage, 2-D rotation, and 2-D translation) were manipulated categorically and exhaustively. The results were highly consistent with the predictions. The GVP seems to offer a broad and general framework for understanding the ambiguity-solving process in motion perception. Its relationship to other constraints, such as that of rigidity, is discussed.
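The three image-motion components referred to above can be read off an affine approximation of the retinal flow field. The sketch below shows the standard decomposition into divergence (stretch/shrinkage), curl (rotation), and translation; note that the paper manipulates these components categorically rather than computing them from flow, so this is an assumed formalization.

```python
import numpy as np

def decompose_affine_flow(A, t):
    """Split an affine retinal flow v(x) = A @ x + t into the three
    2-D components the GVP analysis distinguishes: stretch/shrinkage
    (divergence), rotation (curl), and translation.
    """
    A = np.asarray(A, dtype=float)
    divergence = A[0, 0] + A[1, 1]      # isotropic expansion rate
    curl = A[1, 0] - A[0, 1]            # twice the in-plane angular velocity
    translation = np.asarray(t, float)  # uniform image drift
    return divergence, curl, translation

# Pure image rotation at 0.5 rad/s, no expansion, drifting rightward:
A = [[0.0, -0.5], [0.5, 0.0]]
print(decompose_affine_flow(A, [1.0, 0.0]))  # (0.0, 1.0, array([1., 0.]))
```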


2010 ◽  
Vol 23 (3) ◽  
pp. 241-261
Author(s):  
Rachael Thiel ◽  
J. Timothy Petersik

Four experiments and controls were run to determine the ability of the visual system to detect slight changes in three-dimensional (3D) rotating stimuli in comparison to two-dimensional (2D) controls. A small number of observers (between 5 and 8) viewed computerized displays of pixel-defined transparent rotating spheres or circular patches of pixels drifting linearly in opposite directions. Halfway through the circuit of rotation, a letter was briefly displayed and the rotation continued with some change introduced. Our results showed that for horizontal shifts of the stimulus on the X-axis, changes in the axis of rotation, and additions/deletions of pixels, observers were better at detecting the changes associated with 3D motion than with 2D motion. There was no good 2D control for approaching and receding stimuli, but on the basis of other results it was concluded that 3D movement had no advantage. It is suggested that rotation in 3D is more readily monitored by the visual system than simultaneous 2D motions in opposite directions.


Author(s):  
Max R. Dürsteler ◽  
Erika N. Lorincz

When we fixate the center of a rotating three-dimensional structure, such as a physically rotating sectored wheel whose stereo cues are encoded by a static random-dot “texture”, a rather striking global motion illusion occurs: the rotating three-dimensional wheel appears to stand still (stereo rotation standstill). Even with a dynamic (flickering) random-dot texture, it remains impossible to gain a percept of smooth rotation; local motion, however, can still be clearly perceived. When the random-dot texture “overlaying” the wheel is itself rotating, the concealed wheel is perceived as rotating at the same velocity as the texture, regardless of its own velocity (stereo rotation capture). Stereo complex-motion standstill and capture are shown to occur for other categories of complex motion, such as expanding, contracting, and spiraling motions, providing evidence for a dominance of luminance inputs over stereo inputs for complex-motion detectors in the visual system.


2018 ◽  
Vol 30 (12) ◽  
pp. 3355-3392 ◽  
Author(s):  
Jonathan Vacher ◽  
Andrew Isaac Meso ◽  
Laurent U. Perrinet ◽  
Gabriel Peyré

A common practice to account for psychophysical biases in vision is to frame them as consequences of a dynamic process relying on optimal inference with respect to a generative model. The study presented here details the complete formulation of such a generative model, intended to probe visual motion perception with a dynamic texture model. It is derived in a set of axiomatic steps constrained by biological plausibility. We extend previous contributions by detailing three equivalent formulations of this texture model. First, the composite dynamic textures are constructed by the random aggregation of warped patterns, which can be viewed as three-dimensional gaussian fields. Second, these textures are cast as solutions to a stochastic partial differential equation (sPDE). This essential step enables real-time, on-the-fly texture synthesis using time-discretized autoregressive processes. It also allows for the derivation of a local motion-energy model, which corresponds to the log likelihood of the probability density; the log likelihoods are essential for the construction of a Bayesian inference framework. We use the dynamic texture model to psychophysically probe speed perception in humans using zoom-like changes in the spatial frequency content of the stimulus. The human data replicate previous findings showing perceived speed to be positively biased by spatial frequency increments. A Bayesian observer who combines a gaussian likelihood, centered at the true speed with a spatial-frequency-dependent width, with a “slow-speed prior” successfully accounts for the perceptual bias. More precisely, the bias arises from a decrease in the observer's likelihood width, estimated from the experiments, as the spatial frequency increases. Such a trend is compatible with that of the dynamic texture likelihood width.
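The closed-form behavior of such a Bayesian observer is easy to reproduce. In the sketch below, a Gaussian likelihood centered at the true speed, with a width that falls as spatial frequency rises, is combined with a zero-centered slow-speed prior; the posterior mean then shrinks less at higher spatial frequencies, so the percept is relatively faster, qualitatively matching the reported bias. All parameter values and the width-versus-frequency law are invented for illustration, not the values estimated in the paper.

```python
def perceived_speed(v_true, sf, sigma0=1.0, beta=0.5, sigma_prior=2.0):
    """Posterior-mean speed for a Bayesian observer with a gaussian
    likelihood centered at v_true (width shrinking with spatial
    frequency sf) and a zero-centered "slow-speed prior".
    sigma0, beta, sigma_prior are illustrative, assumed parameters.
    """
    sigma_lik = sigma0 * sf ** (-beta)  # assumed: likelihood narrows with sf
    # Product of two gaussians: the posterior mean is a precision-weighted
    # average of the true speed and the prior mean (zero).
    w = sigma_prior**2 / (sigma_prior**2 + sigma_lik**2)
    return w * v_true

# Higher spatial frequency -> narrower likelihood -> less shrinkage
# toward zero, i.e. a relatively faster percept.
for sf in (0.5, 1.0, 2.0, 4.0):
    print(sf, perceived_speed(10.0, sf))  # 6.7, 8.0, 8.9, 9.4 (approx.)
```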


Perception ◽  
1995 ◽  
Vol 24 (11) ◽  
pp. 1247-1256 ◽  
Author(s):  
Yoseph Hermush ◽  
Yehezkel Yeshurun

Motion is perceived whenever a subject is presented with an appropriate spatiotemporal visual pattern. Like many other visual tasks, motion perception involves both local and global processing, and is thus subject to a well-known paradox: local features and observations form the basis for global perception, yet sometimes the global percept cannot be derived from any single local observation, as the aperture problem best exemplifies. Globally, dual (transparent) motion can be readily perceived. Here, spatial limits on the local ability to perceive multiple motions are sought. Using the framework of apparent motion, it is found that dual, orthogonally oriented motions can be perceived only when the dots that constitute the two motions are separated by some spatial limit. For short-range apparent motion, this limit is found to be comparable to Dmax, and the visual system cannot perceive more than a single coherent motion in a local ‘patch’ of radius Dmax. It was also found that this spatial limit on local-motion perception is not constant but depends linearly on the spatial organisation of the stimuli, and vanishes for stimuli with reversed contrast. The lower bound on the ability to perceive multiple motions is compared with some well-known bounds in stereopsis, and a cortical columnar architecture that might account for it is proposed.


2001 ◽  
Vol 10 (3) ◽  
pp. 312-330 ◽  
Author(s):  
Bernard Harper ◽  
Richard Latto

Stereo scene capture and generation is an important facet of presence research in that stereoscopic images have been linked to naturalness as a component of reported presence. Three-dimensional images can be captured and presented in many ways, but it is rare that the most simple and “natural” method is used: full orthostereoscopic image capture and projection. This technique mimics as closely as possible the geometry of the human visual system and uses convergent axis stereography with the cameras separated by the human interocular distance. It simulates human viewing angles, magnification, and convergence so that the point of zero disparity in the captured scene is reproduced without disparity in the display. In a series of experiments, we have used this technique to investigate body image distortion in photographic images. Three psychophysical experiments compared size, weight, or shape estimations (perceived waist-hip ratio) in 2-D and 3-D images for the human form and real or virtual abstract shapes. In all cases, there was a relative slimming effect of binocular disparity. A well-known photographic distortion is the perspective flattening effect of telephoto lenses. A fourth psychophysical experiment using photographic portraits taken at different distances found a fattening effect with telephoto lenses and a slimming effect with wide-angle lenses. We conclude that, where possible, photographic inputs to the visual system should allow it to generate the cyclopean point of view by which we normally see the world. This is best achieved by viewing images made with full orthostereoscopic capture and display geometry. The technique can result in more-accurate estimations of object shape or size and control of ocular suppression. These are assets that have particular utility in the generation of realistic virtual environments.
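For readers wanting to reproduce the convergent-axis geometry, the toy sketch below computes the angular disparity of a point for two pinhole “eyes” separated by an assumed 6.5 cm interocular distance and converged on a fixation point straight ahead: disparity is zero at the convergence point, as described above. It is a geometric illustration, not the authors' capture rig.

```python
import numpy as np

def angular_disparity(point, fixation_dist, iod=0.065):
    """Signed horizontal angular disparity (rad) of a 3-D point for two
    pinhole eyes separated by iod (m) and converged on a point straight
    ahead at fixation_dist (m). Toy convergent-axis geometry with an
    assumed interocular distance.
    """
    target = np.asarray(point, dtype=float)
    fixation = np.array([0.0, 0.0, fixation_dist])
    eyes = np.array([[-iod / 2, 0.0, 0.0], [iod / 2, 0.0, 0.0]])
    angles = []
    for eye in eyes:
        axis = fixation - eye  # optical axis: eye toward fixation point
        ray = target - eye     # line of sight: eye toward target point
        # signed horizontal angle between the ray and the optical axis
        a = np.arctan2(ray[0], ray[2]) - np.arctan2(axis[0], axis[2])
        angles.append(a)
    return angles[1] - angles[0]

print(angular_disparity((0, 0, 1.0), fixation_dist=1.0))  # 0.0 at fixation
print(angular_disparity((0, 0, 1.2), fixation_dist=1.0))  # uncrossed, ~0.011
```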

