Image content is more important than Bouma’s Law for scene metamers

2018 ◽  
Author(s):  
Thomas S. A. Wallis ◽  
Christina M. Funke ◽  
Alexander S. Ecker ◽  
Leon A. Gatys ◽  
Felix A. Wichmann ◽  
...  

Abstract
We subjectively perceive our visual field with high fidelity, yet large peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). A recent paper proposed a model of the mid-level ventral visual stream in which neural responses were averaged over an area of space that increased as a function of eccentricity (scaling). Human participants could not discriminate synthesised model images from each other (they were metamers) when scaling was about half the retinal eccentricity. This result implicated ventral visual area V2 and approximated “Bouma’s Law” of crowding. It has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our rich perceptual experience. However, participants in this experiment never saw the original images. We find that participants can easily discriminate real and model-generated images at V2 scaling. Lower scale factors than even V1 receptive fields may be required to generate metamers. Efficiently explaining why scenes look as they do may require incorporating segmentation processes and global organisational constraints in addition to local pooling.
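The scaling regime described above is linear: the diameter of a pooling region grows in proportion to its retinal eccentricity. A minimal sketch of that arithmetic, using round illustrative scale factors (0.5 for a V2-like, Bouma-like regime and 0.25 for a smaller V1-like regime) rather than values fitted in the paper:

# Illustrative sketch (not the authors' code): pooling-region size under
# eccentricity-dependent scaling, i.e. pooling diameter = scale * eccentricity.
# The scale factors 0.5 (V2-like / Bouma-like) and 0.25 (V1-like) are
# assumed round numbers for illustration, not values taken from the paper.

def pooling_diameter(eccentricity_deg: float, scale: float) -> float:
    """Diameter (deg) of a pooling region centred at a given eccentricity."""
    return scale * eccentricity_deg

for ecc in (2.0, 5.0, 10.0, 20.0):
    v2_like = pooling_diameter(ecc, scale=0.5)   # larger, V2-like pooling
    v1_like = pooling_diameter(ecc, scale=0.25)  # smaller, V1-like pooling
    print(f"ecc {ecc:4.1f} deg: V2-like pool {v2_like:.1f} deg, V1-like pool {v1_like:.1f} deg")

At 10 degrees eccentricity, for example, a scale of 0.5 implies averaging image structure over a region roughly 5 degrees across.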

eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Thomas SA Wallis ◽  
Christina M Funke ◽  
Alexander S Ecker ◽  
Leon A Gatys ◽  
Felix A Wichmann ◽  
...  

We subjectively perceive our visual field with high fidelity, yet peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). Prior work showed that humans could not discriminate images synthesised to match the responses of a mid-level ventral visual stream model when information was averaged in receptive fields with a scaling of about half their retinal eccentricity. This result implicated ventral visual area V2, approximated ‘Bouma’s Law’ of crowding, and has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our perceptual experience. However, this experiment never assessed natural images. We find that humans can easily discriminate real and model-generated images at V2 scaling, requiring scales at least as small as V1 receptive fields to generate metamers. We speculate that explaining why scenes look as they do may require incorporating segmentation and global organisational constraints in addition to local pooling.


2017 ◽  
Author(s):  
Jesse Gomez ◽  
Vaidehi Natu ◽  
Brianna Jeska ◽  
Michael Barnett ◽  
Kalanit Grill-Spector

Abstract
Receptive fields (RFs) processing information in restricted parts of the visual field are a key property of neurons in the visual system. However, how RFs develop in humans is unknown. Using fMRI and population receptive field (pRF) modeling in children and adults, we determined where and how pRFs develop across the ventral visual stream. We find that pRF properties in visual field maps, V1 through VO1, are adult-like by age 5. However, pRF properties in face- and word-selective regions develop into adulthood, increasing the foveal representation and the visual field coverage for faces in the right hemisphere and words in the left hemisphere. Eye-tracking indicates that pRF changes are related to changing fixation patterns on words and faces across development. These findings suggest a link between viewing behavior of faces and words and the differential development of pRFs across visual cortex, potentially due to competition for foveal coverage.
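The pRF modeling referred to here is conventionally a 2D Gaussian receptive field fit per voxel, with the predicted (pre-hemodynamic) response given by the overlap between the stimulus aperture and that Gaussian. A minimal sketch of that forward model; the grid resolution, pRF positions, and sizes below are illustrative assumptions, not the study's settings:

# Minimal sketch of a standard 2D Gaussian pRF forward model.
# All parameter values are illustrative, not taken from the study.
import numpy as np

def gaussian_prf(x0, y0, sigma, extent_deg=10.0, n=101):
    """2D Gaussian pRF centred at (x0, y0) deg with size sigma deg."""
    xs = np.linspace(-extent_deg, extent_deg, n)
    X, Y = np.meshgrid(xs, xs)
    g = np.exp(-((X - x0) ** 2 + (Y - y0) ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def predicted_response(stimulus_aperture, prf):
    """Predicted (pre-HRF) response: overlap of a binary stimulus aperture with the pRF."""
    return float((stimulus_aperture * prf).sum())

# Example: a bar covering the right half of the visual field drives a
# right-of-fixation pRF strongly and a left-of-fixation pRF only weakly.
n = 101
aperture = np.zeros((n, n))
aperture[:, n // 2:] = 1.0
print(predicted_response(aperture, gaussian_prf(3.0, 0.0, 1.5)))
print(predicted_response(aperture, gaussian_prf(-3.0, 0.0, 1.5)))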


2019 ◽  
Author(s):  
David Richter ◽  
Floris P. de Lange

Abstract
Perception and behavior can be guided by predictions, which are often based on learned statistical regularities. Neural responses to expected stimuli are frequently found to be attenuated after statistical learning. However, whether this sensory attenuation following statistical learning occurs automatically or depends on attention remains unknown. In the present fMRI study, we exposed human volunteers to sequentially presented object stimuli, in which the first object predicted the identity of the second object. We observed a strong attenuation of neural activity for expected compared to unexpected stimuli in the ventral visual stream. Crucially, this sensory attenuation was only apparent when stimuli were attended, and vanished when attention was directed away from the predictable objects. These results put important constraints on neurocomputational theories that cast perception as a process of probabilistic integration of prior knowledge and sensory information.


2019 ◽  
Author(s):  
Mariya E. Manahova ◽  
Eelke Spaak ◽  
Floris P. de Lange

Abstract
Familiarity with a stimulus leads to an attenuated neural response to the stimulus. Alongside this attenuation, recent studies have also observed a truncation of stimulus-evoked activity for familiar visual input. One proposed function of this truncation is to rapidly put neurons in a state of readiness to respond to new input. Here, we examined this hypothesis by presenting human participants with target stimuli that were embedded in rapid streams of familiar or novel distractor stimuli at different speeds of presentation, while recording brain activity using magnetoencephalography (MEG) and measuring behavioral performance. We investigated the temporal and spatial dynamics of signal truncation and whether this phenomenon bears any relationship to participants’ ability to categorize target items within a visual stream. Behaviorally, target categorization performance was markedly better when the target was embedded within familiar distractors, and this benefit became more pronounced with increasing speed of presentation. Familiar distractors showed a truncation of neural activity in the visual system, and this truncation was strongest for the fastest presentation speeds. Moreover, neural processing of the target was stronger when it was preceded by familiar distractors. Taken together, these findings suggest that truncation of neural responses for familiar items may result in stronger processing of relevant target information, resulting in superior perceptual performance.

Significance statement
The visual response to familiar input is attenuated more rapidly than for novel input. Here we find that this truncation of the neural response for familiar input is strongest for very fast image presentations. We also find a tentative function for this truncation: the neural response to a target image that is embedded within distractors is much greater when the distractors are familiar than when they are novel. Similarly, target categorization performance is much better when the target is embedded within familiar distractors, and this advantage is most obvious for very fast image presentations. This suggests that neural truncation helps to rapidly put neurons in a state of readiness to respond to new input.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
David Richter ◽  
Floris P de Lange

Perception and behavior can be guided by predictions, which are often based on learned statistical regularities. Neural responses to expected stimuli are frequently found to be attenuated after statistical learning. However, whether this sensory attenuation following statistical learning occurs automatically or depends on attention remains unknown. In the present fMRI study, we exposed human volunteers to sequentially presented object stimuli, in which the first object predicted the identity of the second object. We observed a reliable attenuation of neural activity for expected compared to unexpected stimuli in the ventral visual stream. Crucially, this sensory attenuation was only apparent when stimuli were attended, and vanished when attention was directed away from the predictable objects. These results put important constraints on neurocomputational theories that cast perception as a process of probabilistic integration of prior knowledge and sensory information.


2014 ◽  
Vol 14 (10) ◽  
pp. 717-717
Author(s):  
K. Kay ◽  
K. Weiner ◽  
K. Grill-Spector

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Irina Higgins ◽  
Le Chang ◽  
Victoria Langston ◽  
Demis Hassabis ◽  
Christopher Summerfield ◽  
...  

Abstract
In order to better understand how the brain perceives faces, it is important to know what objective drives learning in the ventral visual stream. To answer this question, we model neural responses to faces in the macaque inferotemporal (IT) cortex with a deep self-supervised generative model, β-VAE, which disentangles sensory data into interpretable latent factors, such as gender or age. Our results demonstrate a strong correspondence between the generative factors discovered by β-VAE and those coded by single IT neurons, beyond that found for the baselines, including the handcrafted state-of-the-art model of face perception, the Active Appearance Model, and deep classifiers. Moreover, β-VAE is able to reconstruct novel face images using signals from just a handful of cells. Together our results imply that optimising the disentangling objective leads to representations that closely resemble those in the IT at the single unit level. This points at disentangling as a plausible learning objective for the visual brain.
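The disentangling objective referred to here is the β-VAE loss: a standard VAE reconstruction term plus a KL term up-weighted by a factor β > 1, which pressures the latent factors to be independent and interpretable. A minimal numpy sketch of that loss under the usual Gaussian assumptions; all array sizes and values are illustrative:

# Minimal numpy sketch of the beta-VAE objective: reconstruction error plus a
# KL term weighted by beta (beta > 1 encourages disentangled latent factors).
# All tensors and values here are illustrative, not the authors' training setup.
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """L = reconstruction term + beta * KL(q(z|x) || N(0, I))."""
    recon = np.sum((x - x_recon) ** 2)  # Gaussian likelihood up to a constant
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return recon + beta * kl

# Toy example with a 10-dimensional latent space.
rng = np.random.default_rng(0)
x = rng.normal(size=64)
x_recon = x + 0.1 * rng.normal(size=64)
mu = rng.normal(scale=0.1, size=10)
log_var = np.full(10, -2.0)
print(beta_vae_loss(x, x_recon, mu, log_var))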


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Elias B Issa ◽  
Charles F Cadieu ◽  
James J DiCarlo

Ventral visual stream neural responses are dynamic, even for static image presentations. However, dynamical neural models of visual cortex are lacking as most progress has been made modeling static, time-averaged responses. Here, we studied population neural dynamics during face detection across three cortical processing stages. Remarkably, ~30 milliseconds after the initially evoked response, we found that neurons in intermediate-level areas decreased their responses to typical configurations of their preferred face parts relative to their response for atypical configurations, even while neurons in higher areas achieved and maintained a preference for typical configurations. These hierarchical neural dynamics were inconsistent with standard feedforward circuits. Rather, recurrent models computing prediction errors between stages captured the observed temporal signatures. This model of neural dynamics, which simply augments the standard feedforward model of online vision, suggests that neural responses to static images may encode top-down prediction errors in addition to bottom-up feature estimates.
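A toy sketch of the kind of recurrent prediction-error dynamic described: an intermediate stage is first driven by feedforward input, then partially suppressed once a delayed prediction fed back from a higher stage begins to cancel that drive. The time constants, feedback weight, and variable names are illustrative assumptions, not the authors' fitted model:

# Toy two-stage prediction-error dynamic: the intermediate stage responds to
# its feedforward drive minus a delayed feedback prediction from the higher
# stage, so its response peaks early and then decays. Parameters are arbitrary.
import numpy as np

steps, dt = 60, 0.005           # 300 ms simulated in 5 ms steps
drive = 1.0                      # constant feedforward drive (e.g., a face part)
inter = np.zeros(steps)          # intermediate-stage response
higher = np.zeros(steps)         # higher-stage response (builds up more slowly)
tau_i, tau_h, fb = 0.02, 0.06, 0.8

for t in range(1, steps):
    prediction = fb * higher[t - 1]           # feedback prediction from above
    error = drive - prediction                # prediction error at the lower stage
    inter[t] = inter[t - 1] + dt / tau_i * (error - inter[t - 1])
    higher[t] = higher[t - 1] + dt / tau_h * (inter[t - 1] - higher[t - 1])

print(f"intermediate response peaks at ~{np.argmax(inter) * 5} ms, then decays as the prediction arrives")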


Author(s):  
Tao He ◽  
David Richter ◽  
Zhiguo Wang ◽  
Floris P. de Lange

Abstract
Both spatial and temporal context play an important role in visual perception and behavior. Humans can extract statistical regularities from both forms of context to help process the present and to construct expectations about the future. Numerous studies have found reduced neural responses to expected stimuli compared to unexpected stimuli, for both spatial and temporal regularities. However, it is largely unclear whether and how these forms of context interact. In the current fMRI study, thirty-three human volunteers were exposed to object stimuli that could be expected or surprising in terms of their spatial and temporal context. We found a reliable independent contribution of both spatial and temporal context in modulating the neural response. Specifically, neural responses to stimuli in expected compared to unexpected contexts were suppressed throughout the ventral visual stream. Interestingly, the modulation by spatial context was stronger in magnitude and more reliable than the modulation by temporal context. These results suggest that while both spatial and temporal context serve as a prior that can modulate sensory processing in a similar fashion, predictions of spatial context may be a more powerful modulator in the visual system.

Significance Statement
Both temporal and spatial context can affect visual perception; however, it is largely unclear if and how these different forms of context interact in modulating sensory processing. When manipulating both temporal and spatial context expectations, we found that they jointly affected sensory processing, evident as a suppression of neural responses for expected compared to unexpected stimuli. Interestingly, the modulation by spatial context was stronger than that by temporal context. Together, our results suggest that spatial context may be a stronger modulator of neural responses than temporal context within the visual system. Thereby, the present study provides new evidence on how different types of predictions jointly modulate perceptual processing.


2021 ◽  
pp. 1-16
Author(s):  
Tao He ◽  
David Richter ◽  
Zhiguo Wang ◽  
Floris P. de Lange

Abstract
Both spatial and temporal context play an important role in visual perception and behavior. Humans can extract statistical regularities from both forms of context to help process the present and to construct expectations about the future. Numerous studies have found reduced neural responses to expected stimuli compared with unexpected stimuli, for both spatial and temporal regularities. However, it is largely unclear whether and how these forms of context interact. In the current fMRI study, 33 human volunteers were exposed to pairs of object stimuli that could be expected or surprising in terms of their spatial and temporal context. We found reliable independent contributions of both spatial and temporal context in modulating the neural response. Specifically, neural responses to stimuli in expected compared with unexpected contexts were suppressed throughout the ventral visual stream. These results suggest that both spatial and temporal context may aid sensory processing in a similar fashion, providing evidence on how different types of context jointly modulate perceptual processing.

