Neural dynamics at successive stages of the ventral visual stream are consistent with hierarchical error signals

eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Elias B Issa ◽  
Charles F Cadieu ◽  
James J DiCarlo

Ventral visual stream neural responses are dynamic, even for static image presentations. However, dynamical neural models of visual cortex are lacking, as most progress has been made modeling static, time-averaged responses. Here, we studied population neural dynamics during face detection across three cortical processing stages. Remarkably, ~30 milliseconds after the initially evoked response, we found that neurons in intermediate-level areas decreased their responses to typical configurations of their preferred face parts relative to their responses for atypical configurations, even while neurons in higher areas achieved and maintained a preference for typical configurations. These hierarchical neural dynamics were inconsistent with standard feedforward circuits. Rather, recurrent models computing prediction errors between stages captured the observed temporal signatures. This model of neural dynamics, which simply augments the standard feedforward model of online vision, suggests that neural responses to static images may encode top-down prediction errors in addition to bottom-up feature estimates.
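As a rough illustration of the class of recurrent model the authors describe, the sketch below simulates a two-stage hierarchy in which top-down predictions are subtracted from feedforward drive, so intermediate-stage activity comes to carry a prediction error. The dimensions, weights, and time constants are illustrative assumptions, not the paper's fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and random weights (assumptions, not fitted parameters).
n_in, n_mid, n_top = 50, 30, 10
W_ff1 = rng.normal(scale=0.3, size=(n_mid, n_in))   # input -> intermediate
W_ff2 = rng.normal(scale=0.3, size=(n_top, n_mid))  # intermediate -> top
W_fb2 = W_ff2.T                                     # top-down prediction weights

x = rng.normal(size=n_in)   # features of a static image
r_mid = np.zeros(n_mid)     # intermediate-stage activity
r_top = np.zeros(n_top)     # higher-stage activity
dt, tau = 1.0, 10.0         # Euler step and time constant (ms, assumed)

for t in range(100):
    pred_mid = W_fb2 @ r_top            # prediction of mid-stage activity
    err_mid = W_ff1 @ x - pred_mid      # prediction error at the mid stage
    r_mid += (dt / tau) * (-r_mid + err_mid)        # mid encodes the error
    r_top += (dt / tau) * (-r_top + W_ff2 @ r_mid)  # top integrates the error
```

Early in the simulation r_top is near zero, so the intermediate stage simply reflects its feedforward drive; as the higher stage builds up its estimate, the top-down prediction cancels part of that drive, reproducing the delayed response decrease described above.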

2016 ◽  
Author(s):  
Elias B. Issa ◽  
Charles F. Cadieu ◽  
James J. DiCarlo

Ventral visual stream neural responses are dynamic, even for static image presentations. However, dynamical neural models of visual cortex are lacking, as most progress has been made modeling static, time-averaged responses. Here, we studied population neural dynamics during face detection across three cortical processing stages. Remarkably, ~30 milliseconds after the initially evoked response, we found that neurons in intermediate-level areas decreased their preference for faces, becoming anti-face preferring on average, even while neurons in higher-level areas achieved and maintained a face preference. This pattern of hierarchical neural dynamics was inconsistent with extensions of standard feedforward circuits that implemented recurrence within a cortical stage. Rather, recurrent models computing errors between stages captured the observed temporal signatures. Without additional parameter fitting, this model of neural dynamics, which simply augments the standard feedforward model of online vision to encode errors, also explained seemingly disparate dynamical phenomena in the ventral stream.


2015 ◽  
pp. bhv226 ◽  
Author(s):  
Corinna Pehrs ◽  
Jamil Zaki ◽  
Lorna H. Schlochtermeier ◽  
Arthur M. Jacobs ◽  
Lars Kuchinke ◽  
...  

2021 ◽  
Author(s):  
Aran Nayebi ◽  
Nathan C. L. Kong ◽  
Chengxu Zhuang ◽  
Justin L. Gardner ◽  
Anthony M. Norcia ◽  
...  

Task-optimized deep convolutional neural networks are the most quantitatively accurate models of the primate ventral visual stream. However, such networks are implausible as a model of the mouse visual system, because mouse visual cortex is known to have a shallower hierarchy and the supervised objectives these networks are typically trained with are likely not ethologically relevant in either content or quantity. Here we develop shallow network architectures that are more consistent with anatomical and physiological studies of mouse visual cortex than current models. We demonstrate that hierarchically shallow architectures trained using contrastive objective functions applied to visual-acuity-adapted images achieve neural prediction performance that exceeds that of the same architectures trained in a supervised manner, and result in the most quantitatively accurate models of the mouse visual system. Moreover, these models' neural predictivity significantly surpasses that of supervised, deep architectures that are known to correspond well to the primate ventral visual stream. Finally, we derive a novel measure of inter-animal consistency and show that the best models closely match this quantity across visual areas. Taken together, our results suggest that contrastive objectives operating on shallow architectures with ethologically motivated image transformations may be a biologically plausible computational theory of visual coding in mice.
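The contrastive objectives referred to here belong to the InfoNCE/SimCLR family, which pulls embeddings of two augmented views of the same image together while pushing apart views of different images. Below is a minimal NumPy sketch of such a loss; the embedding size, temperature, and use of cosine similarity are generic assumptions about this family, not the specific objective the authors trained.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """Generic InfoNCE contrastive loss: z1[i] and z2[i] are embeddings of
    two augmented views of image i; other rows serve as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)  # cosine similarity
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                     # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                   # positives on diagonal

rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(8, 128)), rng.normal(size=(8, 128))
print(info_nce_loss(z1, z2))
```

Note that no labels appear anywhere in the loss, which is what makes objectives of this kind plausible as an ethological training signal for a mouse.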


2017 ◽  
Author(s):  
Jesse Gomez ◽  
Vaidehi Natu ◽  
Brianna Jeska ◽  
Michael Barnett ◽  
Kalanit Grill-Spector

Receptive fields (RFs) processing information in restricted parts of the visual field are a key property of neurons in the visual system. However, how RFs develop in humans is unknown. Using fMRI and population receptive field (pRF) modeling in children and adults, we determined where and how pRFs develop across the ventral visual stream. We find that pRF properties in visual field maps V1 through VO1 are adult-like by age 5. However, pRF properties in face- and word-selective regions develop into adulthood, increasing the foveal representation and the visual field coverage for faces in the right hemisphere and words in the left hemisphere. Eye tracking indicates that pRF changes are related to changing fixation patterns on words and faces across development. These findings suggest a link between viewing behavior of faces and words and the differential development of pRFs across visual cortex, potentially due to competition for foveal coverage.
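pRF modeling of this kind typically follows the standard Dumoulin–Wandell formulation: each voxel's response to a stimulus aperture is predicted by the aperture's overlap with a 2D Gaussian over visual space, and the Gaussian's center and size are fit to the measured time course. Here is a minimal sketch of the forward model, with field extent, grid resolution, and the example stimulus all assumed:

```python
import numpy as np

def prf_prediction(stim, x0, y0, sigma, extent=10.0):
    """Predicted response of a voxel with a 2D Gaussian pRF centered at
    (x0, y0) deg, size sigma deg, to binary apertures `stim` of shape
    (n_timepoints, H, W) spanning [-extent, extent] deg of visual angle."""
    n_t, h, w = stim.shape
    ys, xs = np.meshgrid(np.linspace(-extent, extent, h),
                         np.linspace(-extent, extent, w), indexing="ij")
    rf = np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2 * sigma ** 2))
    # Each frame's overlap with the Gaussian gives the neural prediction;
    # a full fit would then convolve this with a hemodynamic response.
    return stim.reshape(n_t, -1) @ rf.ravel()

bar = np.zeros((5, 64, 64))
for t in range(5):                      # a bar sweeping left to right
    bar[t, :, t * 12:t * 12 + 8] = 1.0
print(prf_prediction(bar, x0=2.0, y0=0.0, sigma=1.5))
```

Fitting (x0, y0, sigma) per voxel and comparing the fitted parameters between children and adults is what allows asking where along the stream pRFs are still changing.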


2017 ◽  
Author(s):  
David Richter ◽  
Matthias Ekman ◽  
Floris P. de Lange

Prediction plays a crucial role in perception, as prominently suggested by predictive coding theories. However, the exact form and mechanism of predictive modulations of sensory processing remain unclear, with some studies reporting a downregulation of the sensory response for predictable input and others observing an enhanced response. In a similar vein, downregulation of the sensory response for predictable input has been linked to either sharpening or dampening of the sensory representation, which are opposite in nature. In the present study we set out to investigate the neural consequences of perceptual expectation of object stimuli throughout the visual hierarchy, using fMRI in human volunteers. Participants (n=24) were exposed to pairs of sequentially presented object images in a statistical learning paradigm, in which the first object predicted the identity of the second object. Image transitions were not task relevant; thus all learning of statistical regularities was incidental. We found strong suppression of neural responses to expected compared to unexpected stimuli throughout the ventral visual stream, including primary visual cortex (V1), lateral occipital complex (LOC), and anterior ventral visual areas. Expectation suppression in LOC, but not V1, scaled positively with image preference, lending support to the dampening account of expectation suppression in object perception.

Significance Statement: Statistical regularities permeate our world and help us to perceive and understand our surroundings. It has been suggested that the brain fundamentally relies on predictions and constructs models of the world in order to make sense of sensory information. Previous research on the neural basis of prediction has documented expectation suppression, i.e., suppressed responses to expected compared to unexpected stimuli. In the present study we queried the presence and characteristics of expectation suppression throughout the ventral visual stream. We demonstrate robust expectation suppression in the entire ventral visual pathway and, underlying this suppression, a dampening of the sensory representation in object-selective visual cortex, but not in primary visual cortex. Taken together, our results provide novel evidence in support of theories conceptualizing perception as an active inference process, which selectively dampens cortical representations of predictable objects. This dampening may support our ability to automatically filter out irrelevant, predictable objects.
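The dampening account predicts that suppression should be largest for a unit's preferred images, while sharpening predicts the reverse. A toy sketch of the kind of per-voxel test this implies follows; the data are simulated and all shapes and variable names are hypothetical, not the authors' analysis code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels, n_images = 200, 16

# Simulated per-voxel responses (e.g., GLM betas) to each image when it was
# unexpected vs. expected; expected responses are somewhat suppressed.
resp_unexp = rng.normal(loc=1.0, size=(n_voxels, n_images))
resp_exp = resp_unexp - 0.2 * rng.random((n_voxels, n_images))

suppression = resp_unexp - resp_exp   # expectation suppression per image
preference = resp_unexp               # proxy for each voxel's image preference

# Per-voxel slope of suppression against preference: positive slopes (more
# suppression for preferred images) support dampening; negative, sharpening.
pref_c = preference - preference.mean(axis=1, keepdims=True)
supp_c = suppression - suppression.mean(axis=1, keepdims=True)
slopes = (pref_c * supp_c).sum(axis=1) / (pref_c ** 2).sum(axis=1)
print(f"mean slope across voxels: {slopes.mean():.3f}")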


2019 ◽  
Author(s):  
David Richter ◽  
Floris P. de Lange

Perception and behavior can be guided by predictions, which are often based on learned statistical regularities. Neural responses to expected stimuli are frequently found to be attenuated after statistical learning. However, whether this sensory attenuation following statistical learning occurs automatically or depends on attention remains unknown. In the present fMRI study, we exposed human volunteers to sequentially presented object stimuli, in which the first object predicted the identity of the second object. We observed a strong attenuation of neural activity for expected compared to unexpected stimuli in the ventral visual stream. Crucially, this sensory attenuation was only apparent when stimuli were attended, and vanished when attention was directed away from the predictable objects. These results put important constraints on neurocomputational theories that cast perception as a process of probabilistic integration of prior knowledge and sensory information.


2018 ◽  
Author(s):  
Thomas S. A. Wallis ◽  
Christina M. Funke ◽  
Alexander S. Ecker ◽  
Leon A. Gatys ◽  
Felix A. Wichmann ◽  
...  

We subjectively perceive our visual field with high fidelity, yet large peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). A recent paper proposed a model of the mid-level ventral visual stream in which neural responses were averaged over an area of space that increased as a function of eccentricity (scaling). Human participants could not discriminate synthesised model images from each other (they were metamers) when scaling was about half the retinal eccentricity. This result implicated ventral visual area V2 and approximated “Bouma’s Law” of crowding. It has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our rich perceptual experience. However, participants in this experiment never saw the original images. We find that participants can easily discriminate real and model-generated images at V2 scaling. Lower scale factors than even V1 receptive fields may be required to generate metamers. Efficiently explaining why scenes look as they do may require incorporating segmentation processes and global organisational constraints in addition to local pooling.
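The scaling model at issue pools image statistics over windows whose diameter grows linearly with eccentricity; a scale factor near 0.5 approximates Bouma's law, and the authors' result implies that matching appearance may require much smaller factors. A one-dimensional toy version of that pooling geometry, with all numbers assumed:

```python
import numpy as np

def eccentricity_pooled(signal, extent=16.0, scale=0.5, min_win=0.25):
    """Average a 1D 'image' over windows whose width grows linearly with
    eccentricity (distance from fixation at the array center). scale=0.5
    roughly matches Bouma's law; smaller scales pool less and so preserve
    more of the original signal."""
    positions = np.linspace(-extent, extent, signal.size)  # deg of visual angle
    pooled = np.empty_like(signal)
    for i, pos in enumerate(positions):
        half = max(scale * abs(pos), min_win) / 2          # window half-width
        pooled[i] = signal[np.abs(positions - pos) <= half].mean()
    return pooled

img = np.random.default_rng(0).normal(size=512)
for s in (0.05, 0.25, 0.5):   # fidelity drops as the scale factor grows
    print(s, round(np.corrcoef(img, eccentricity_pooled(img, scale=s))[0, 1], 2))
```

The correlation with the original signal falls as the scale factor grows, which is the intuition behind asking how small the factor must be before model images become true metamers.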


2015 ◽  
Vol 113 (5) ◽  
pp. 1656-1669 ◽  
Author(s):  
Jedediah M. Singer ◽  
Joseph R. Madsen ◽  
William S. Anderson ◽  
Gabriel Kreiman

Visual recognition takes a small fraction of a second and relies on the cascade of signals along the ventral visual stream. Given the rapid path through multiple processing steps between photoreceptors and higher visual areas, information must progress from stage to stage very quickly. This rapid progression of information suggests that fine temporal details of the neural response may be important to the brain's encoding of visual signals. We investigated how changes in the relative timing of incoming visual stimulation affect the representation of object information by recording intracranial field potentials along the human ventral visual stream while subjects recognized objects whose parts were presented with varying asynchrony. Visual responses along the ventral stream were sensitive to timing differences as small as 17 ms between parts. In particular, there was a strong dependency on the temporal order of stimulus presentation, even at short asynchronies. From these observations we infer that the neural representation of complex information in visual cortex can be modulated by rapid dynamics on scales of tens of milliseconds.
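An asynchrony of 17 ms corresponds to a single frame on a 60 Hz display, so part onsets in an experiment of this kind fall on whole-frame boundaries. A hypothetical sketch of generating such onset schedules (the function and its parameters are illustrative, not the authors' code):

```python
FRAME_MS = 1000 / 60  # one frame at 60 Hz, ~16.7 ms

def onset_schedule(n_parts, asynchrony_frames, leading_part=0):
    """Onset time (ms) for each object part: the leading part appears first
    and the remaining parts follow after a whole-frame asynchrony."""
    onsets = [asynchrony_frames * FRAME_MS] * n_parts
    onsets[leading_part] = 0.0
    return onsets

for soa in (0, 1, 2, 4):  # stimulus-onset asynchrony in frames
    print(soa, [round(t, 1) for t in onset_schedule(2, soa)])
```

Varying which part leads, as well as the asynchrony itself, is what allows the order sensitivity described above to be separated from sensitivity to the asynchrony magnitude.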

