Spatial and Temporal Context Jointly Modulate the Sensory Response within the Ventral Visual Stream

2021 ◽  
pp. 1-16
Author(s):  
Tao He ◽  
David Richter ◽  
Zhiguo Wang ◽  
Floris P. de Lange

Abstract
Both spatial and temporal context play an important role in visual perception and behavior. Humans can extract statistical regularities from both forms of context to help process the present and to construct expectations about the future. Numerous studies have found reduced neural responses to expected stimuli compared with unexpected stimuli, for both spatial and temporal regularities. However, it is largely unclear whether and how these forms of context interact. In the current fMRI study, 33 human volunteers were exposed to pairs of object stimuli that could be expected or surprising in terms of their spatial and temporal context. We found reliable independent contributions of both spatial and temporal context in modulating the neural response. Specifically, neural responses to stimuli in expected compared with unexpected contexts were suppressed throughout the ventral visual stream. These results suggest that both spatial and temporal context may aid sensory processing in a similar fashion, providing evidence on how different types of context jointly modulate perceptual processing.
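A minimal sketch of how such a 2 x 2 expectation-suppression contrast could be quantified, with synthetic numbers standing in for the study's actual response estimates; condition means, trial counts, and effect sizes below are hypothetical, not the reported data.

```python
# Illustrative sketch (hypothetical data, not the authors' analysis):
# expectation suppression = mean response to unexpected minus expected
# stimuli, computed separately for spatial and temporal context.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-subject response estimates (e.g., ROI betas) for a
# 2 x 2 design: spatial context x temporal context, 40 trials per cell.
conditions = {
    ("spatial_expected", "temporal_expected"): rng.normal(1.0, 0.3, 40),
    ("spatial_expected", "temporal_unexpected"): rng.normal(1.2, 0.3, 40),
    ("spatial_unexpected", "temporal_expected"): rng.normal(1.3, 0.3, 40),
    ("spatial_unexpected", "temporal_unexpected"): rng.normal(1.5, 0.3, 40),
}

def mean_over(keys):
    """Average of the per-condition means for the listed cells."""
    return np.mean([conditions[k].mean() for k in keys])

# Main effect of spatial context: unexpected minus expected, averaged over
# temporal context (positive values indicate expectation suppression).
spatial_suppression = (
    mean_over([k for k in conditions if k[0] == "spatial_unexpected"])
    - mean_over([k for k in conditions if k[0] == "spatial_expected"])
)

# Main effect of temporal context, averaged over spatial context.
temporal_suppression = (
    mean_over([k for k in conditions if k[1] == "temporal_unexpected"])
    - mean_over([k for k in conditions if k[1] == "temporal_expected"])
)

print(f"spatial expectation suppression:  {spatial_suppression:.3f}")
print(f"temporal expectation suppression: {temporal_suppression:.3f}")
```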

2019 ◽  
Author(s):  
David Richter ◽  
Floris P. de Lange

Abstract
Perception and behavior can be guided by predictions, which are often based on learned statistical regularities. Neural responses to expected stimuli are frequently found to be attenuated after statistical learning. However, whether this sensory attenuation following statistical learning occurs automatically or depends on attention remains unknown. In the present fMRI study, we exposed human volunteers to sequentially presented object stimuli, in which the first object predicted the identity of the second object. We observed a strong attenuation of neural activity for expected compared to unexpected stimuli in the ventral visual stream. Crucially, this sensory attenuation was only apparent when stimuli were attended, and vanished when attention was directed away from the predictable objects. These results put important constraints on neurocomputational theories that cast perception as a process of probabilistic integration of prior knowledge and sensory information.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
David Richter ◽  
Floris P de Lange

Perception and behavior can be guided by predictions, which are often based on learned statistical regularities. Neural responses to expected stimuli are frequently found to be attenuated after statistical learning. However, whether this sensory attenuation following statistical learning occurs automatically or depends on attention remains unknown. In the present fMRI study, we exposed human volunteers to sequentially presented object stimuli, in which the first object predicted the identity of the second object. We observed a reliable attenuation of neural activity for expected compared to unexpected stimuli in the ventral visual stream. Crucially, this sensory attenuation was only apparent when stimuli were attended, and vanished when attention was directed away from the predictable objects. These results put important constraints on neurocomputational theories that cast perception as a process of probabilistic integration of prior knowledge and sensory information.
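A toy illustration of the attention-by-expectation logic described above: expectation suppression (unexpected minus expected response) is computed separately for attended and unattended stimuli, and their difference is the interaction. All numbers, including the sample size, are invented for illustration and are not the study's data.

```python
# Illustrative sketch (hypothetical numbers, not the study's analysis).
import numpy as np

rng = np.random.default_rng(1)
n_subjects = 30  # hypothetical sample size

# Hypothetical per-subject suppression effects (unexpected - expected betas).
suppression_attended = rng.normal(0.4, 0.2, n_subjects)    # clear suppression
suppression_unattended = rng.normal(0.0, 0.2, n_subjects)  # effect vanishes

# Attention x expectation interaction: how much larger suppression is
# when the predictable objects are attended.
interaction = suppression_attended - suppression_unattended

print(f"suppression (attended):   {suppression_attended.mean():.3f}")
print(f"suppression (unattended): {suppression_unattended.mean():.3f}")
print(f"interaction effect:       {interaction.mean():.3f}")
```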


Author(s):  
Tao He ◽  
David Richter ◽  
Zhiguo Wang ◽  
Floris P. de Lange

Abstract
Both spatial and temporal context play an important role in visual perception and behavior. Humans can extract statistical regularities from both forms of context to help process the present and to construct expectations about the future. Numerous studies have found reduced neural responses to expected stimuli compared to unexpected stimuli, for both spatial and temporal regularities. However, it is largely unclear whether and how these forms of context interact. In the current fMRI study, thirty-three human volunteers were exposed to object stimuli that could be expected or surprising in terms of their spatial and temporal context. We found a reliable independent contribution of both spatial and temporal context in modulating the neural response. Specifically, neural responses to stimuli in expected compared to unexpected contexts were suppressed throughout the ventral visual stream. Interestingly, the modulation by spatial context was stronger in magnitude and more reliable than the modulation by temporal context. These results suggest that while both spatial and temporal context serve as priors that can modulate sensory processing in a similar fashion, predictions based on spatial context may be a more powerful modulator in the visual system.

Significance Statement
Both temporal and spatial context can affect visual perception; however, it is largely unclear if and how these different forms of context interact in modulating sensory processing. When manipulating both temporal and spatial context expectations, we found that they jointly affected sensory processing, evident as a suppression of neural responses for expected compared to unexpected stimuli. Interestingly, the modulation by spatial context was stronger than that by temporal context. Together, our results suggest that spatial context may be a stronger modulator of neural responses than temporal context within the visual system. The present study thereby provides new evidence on how different types of predictions jointly modulate perceptual processing.


2017 ◽  
Author(s):  
David Richter ◽  
Matthias Ekman ◽  
Floris P. de Lange

Abstract
Prediction plays a crucial role in perception, as prominently suggested by predictive coding theories. However, the exact form and mechanism of predictive modulations of sensory processing remain unclear, with some studies reporting a downregulation of the sensory response for predictable input, while others observed an enhanced response. In a similar vein, downregulation of the sensory response for predictable input has been linked to either sharpening or dampening of the sensory representation, which are opposite in nature. In the present study we set out to investigate the neural consequences of perceptual expectation of object stimuli throughout the visual hierarchy, using fMRI in human volunteers. Participants (n=24) were exposed to pairs of sequentially presented object images in a statistical learning paradigm, in which the first object predicted the identity of the second object. Image transitions were not task relevant; thus, all learning of statistical regularities was incidental. We found strong suppression of neural responses to expected compared to unexpected stimuli throughout the ventral visual stream, including primary visual cortex (V1), lateral occipital complex (LOC), and anterior ventral visual areas. Expectation suppression in LOC, but not V1, scaled positively with image preference, lending support to the dampening account of expectation suppression in object perception.

Significance Statement
Statistical regularities permeate our world and help us to perceive and understand our surroundings. It has been suggested that the brain fundamentally relies on predictions and constructs models of the world in order to make sense of sensory information. Previous research on the neural basis of prediction has documented expectation suppression, i.e. suppressed responses to expected compared to unexpected stimuli. In the present study we queried the presence and characteristics of expectation suppression throughout the ventral visual stream. We demonstrate robust expectation suppression in the entire ventral visual pathway and, underlying this suppression, a dampening of the sensory representation in object-selective visual cortex, but not in primary visual cortex. Taken together, our results provide novel evidence in support of theories conceptualizing perception as an active inference process, which selectively dampens cortical representations of predictable objects. This dampening may support our ability to automatically filter out irrelevant, predictable objects.
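The dampening-versus-sharpening argument hinges on the sign of the relationship between image preference and expectation suppression. A small synthetic sketch of that test follows; it is not the authors' analysis, and the stimulus count and effect sizes are invented.

```python
# Illustrative sketch (synthetic data): a positive relationship between
# image preference and expectation suppression would support dampening
# (preferred images lose more response when expected), whereas a negative
# relationship would point to sharpening.
import numpy as np

rng = np.random.default_rng(2)
n_images = 48  # hypothetical stimulus set size

# Hypothetical per-image quantities for one region of interest.
image_preference = rng.normal(0.0, 1.0, n_images)  # z-scored response preference
suppression = 0.3 * image_preference + rng.normal(0.0, 0.5, n_images)

slope = np.polyfit(image_preference, suppression, deg=1)[0]
r = np.corrcoef(image_preference, suppression)[0, 1]

print(f"slope = {slope:.3f}, r = {r:.3f}  (positive -> dampening account)")
```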


2020 ◽  
Author(s):  
Franziska Geiger ◽  
Martin Schrimpf ◽  
Tiago Marques ◽  
James J. DiCarlo

Abstract
After training on large datasets, certain deep neural networks are surprisingly good models of the neural mechanisms of adult primate visual object recognition. Nevertheless, these models are poor models of the development of the visual system because they posit millions of sequential, precisely coordinated synaptic updates, each based on a labeled image. While ongoing research is pursuing the use of unsupervised proxies for labels, we here explore a complementary strategy of reducing the required number of supervised synaptic updates to produce an adult-like ventral visual stream (as judged by the match to V1, V2, V4, IT, and behavior). Such models might require less precise machinery and energy expenditure to coordinate these updates and would thus move us closer to viable neuroscientific hypotheses about how the visual system wires itself up. Relative to the current leading model of the adult ventral stream, we here demonstrate that the total number of supervised weight updates can be substantially reduced using three complementary strategies: First, we find that only 2% of supervised updates (epochs and images) are needed to achieve ~80% of the match to adult ventral stream. Second, by improving the random distribution of synaptic connectivity, we find that 54% of the brain match can already be achieved “at birth” (i.e. no training at all). Third, we find that, by training only ~5% of model synapses, we can still achieve nearly 80% of the match to the ventral stream. When these three strategies are applied in combination, we find that these new models achieve ~80% of a fully trained model’s match to the brain, while using two orders of magnitude fewer supervised synaptic updates. These results reflect first steps in modeling not just primate adult visual processing during inference, but also how the ventral visual stream might be “wired up” by evolution (a model’s “birth” state) and by developmental learning (a model’s updates based on visual experience).
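The third strategy, training only a small fraction of a network's weights while leaving the rest at their random "birth" values, can be sketched roughly as below. The toy network, layer sizes, and learning rate are placeholders, not the actual ventral-stream model used in the paper.

```python
# Illustrative sketch (not the authors' code): freeze every parameter, then
# unfreeze only a small subset, so that supervised updates touch only a
# small fraction of all "synapses".
import torch
import torch.nn as nn

model = nn.Sequential(  # toy stand-in for a ventral-stream model
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 2 * 2, 10),
)

# Freeze everything, then unfreeze only the final readout layer.
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
n_trainable = sum(p.numel() for p in trainable)
n_total = sum(p.numel() for p in model.parameters())
print(f"training {n_trainable}/{n_total} weights "
      f"({100 * n_trainable / n_total:.1f}%)")

optimizer = torch.optim.SGD(trainable, lr=0.01)

# One supervised update on a dummy batch of labeled images.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(images), labels)
loss.backward()
optimizer.step()
```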


2018 ◽  
Author(s):  
Thomas S. A. Wallis ◽  
Christina M. Funke ◽  
Alexander S. Ecker ◽  
Leon A. Gatys ◽  
Felix A. Wichmann ◽  
...  

Abstract
We subjectively perceive our visual field with high fidelity, yet large peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). A recent paper proposed a model of the mid-level ventral visual stream in which neural responses were averaged over an area of space that increased as a function of eccentricity (scaling). Human participants could not discriminate synthesised model images from each other (they were metamers) when scaling was about half the retinal eccentricity. This result implicated ventral visual area V2 and approximated “Bouma’s Law” of crowding. It has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our rich perceptual experience. However, participants in this experiment never saw the original images. We find that participants can easily discriminate real and model-generated images at V2 scaling. Lower scale factors than even V1 receptive fields may be required to generate metamers. Efficiently explaining why scenes look as they do may require incorporating segmentation processes and global organisational constraints in addition to local pooling.
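A simplified sketch of the eccentricity-dependent pooling idea discussed above: responses are averaged over windows whose size grows linearly with eccentricity, with a scale factor of roughly 0.5 for the V2-like case. This is a one-dimensional toy for illustration, not the image-synthesis model evaluated in the paper.

```python
# Illustrative sketch (toy 1-D version of eccentricity-scaled pooling).
import numpy as np

def pooled_response(signal, eccentricity_deg, scale=0.5, min_window_deg=0.25):
    """Average `signal` (sampled along one visual-field meridian) within
    windows whose width is `scale * eccentricity` at each position."""
    pooled = np.empty_like(signal, dtype=float)
    for i, ecc in enumerate(eccentricity_deg):
        half_width = max(scale * ecc, min_window_deg) / 2.0
        in_window = np.abs(eccentricity_deg - ecc) <= half_width
        pooled[i] = signal[in_window].mean()
    return pooled

ecc = np.linspace(0.0, 10.0, 200)       # 0-10 degrees of eccentricity
signal = np.sin(2 * np.pi * ecc)        # toy "image" content along the meridian
print(pooled_response(signal, ecc, scale=0.5)[:5])
```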


2016 ◽  
pp. 1-13 ◽  
Author(s):  
Katrin Döhnel ◽  
Tobias Schuwerk ◽  
Beate Sodian ◽  
Göran Hajak ◽  
Rainer Rupprecht ◽  
...  

2018 ◽  
Vol 38 (34) ◽  
pp. 7452-7461 ◽  
Author(s):  
David Richter ◽  
Matthias Ekman ◽  
Floris P. de Lange

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Irina Higgins ◽  
Le Chang ◽  
Victoria Langston ◽  
Demis Hassabis ◽  
Christopher Summerfield ◽  
...  

Abstract
In order to better understand how the brain perceives faces, it is important to know what objective drives learning in the ventral visual stream. To answer this question, we model neural responses to faces in the macaque inferotemporal (IT) cortex with a deep self-supervised generative model, β-VAE, which disentangles sensory data into interpretable latent factors, such as gender or age. Our results demonstrate a strong correspondence between the generative factors discovered by β-VAE and those coded by single IT neurons, beyond that found for the baselines, including the handcrafted state-of-the-art model of face perception, the Active Appearance Model, and deep classifiers. Moreover, β-VAE is able to reconstruct novel face images using signals from just a handful of cells. Together, our results imply that optimising the disentangling objective leads to representations that closely resemble those in IT at the single-unit level. This points to disentangling as a plausible learning objective for the visual brain.
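For reference, a compact sketch of the β-VAE objective the abstract refers to: a reconstruction term plus a KL term weighted by β, where β > 1 pressures the latent factors toward disentanglement. The architecture and sizes below are placeholders, not the model used in the paper.

```python
# Illustrative sketch of a beta-VAE and its objective (placeholder sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBetaVAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, input_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterise
        return self.decoder(z), mu, logvar

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    recon = F.mse_loss(x_recon, x, reduction="sum")        # reconstruction term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL term
    return recon + beta * kl  # beta > 1 encourages disentangled latents

x = torch.rand(16, 784)  # dummy batch of flattened images
model = TinyBetaVAE()
x_recon, mu, logvar = model(x)
print(beta_vae_loss(x, x_recon, mu, logvar).item())
```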

