Learning Divisive Normalization in Primary Visual Cortex

2019 ◽  
Author(s):  
Max F. Burg ◽  
Santiago A. Cadena ◽  
George H. Denfield ◽  
Edgar Y. Walker ◽  
Andreas S. Tolias ◽  
...  

Abstract
Divisive normalization (DN) is a prominent computational building block in the brain that has been proposed as a canonical cortical operation. Numerous experimental studies have verified its importance for capturing nonlinear response properties to simple, artificial stimuli, and computational studies suggest that DN is also an important component for processing natural stimuli. However, we lack quantitative models of DN that are directly informed by empirical data and applicable to arbitrary stimuli. Here, we developed an image-computable DN model and tested its ability to predict spiking responses of a large number of neurons to natural images. In macaque primary visual cortex (V1), we found that our model outperformed linear-nonlinear and wavelet-based feature representations and performed on par with state-of-the-art convolutional neural network models. Our model learns the pool of normalizing neurons and the magnitude of their contribution end-to-end from the data, answering a long-standing question about the tuning properties of DN: within the classical receptive field, oriented features were normalized preferentially by features with similar orientations rather than non-specifically as currently assumed. Overall, our work refines our view on gain control within the classical receptive field, quantifies the relevance of DN under stimulation with natural images and provides a new, high-performing, and compactly understandable model of V1.

Author summary
Divisive normalization is a computational building block apparent throughout sensory processing in the brain. Numerous studies in the visual cortex have highlighted its importance by explaining nonlinear neural response properties to synthesized simple stimuli like overlapping gratings with varying contrasts. However, we do not know if and how this normalization mechanism plays a role when processing complex stimuli like natural images.
Here, we applied modern machine learning methods to build a general divisive normalization model that is directly informed by data and quantifies the importance of divisive normalization. By learning the normalization mechanism from a data set of natural images and neural responses from macaque primary visual cortex, our model made predictions as accurately as current state-of-the-art convolutional neural networks. Moreover, our model has fewer parameters, and they can be interpreted directly. Specifically, we found that neurons that respond strongly to a specific orientation are preferentially normalized by other neurons that are highly active for similar orientations. Overall, we propose a biologically motivated model of primary visual cortex that is compact and more interpretable, performs on par with standard convolutional neural networks, and refines our view on how normalization operates in visual cortex when processing natural stimuli.
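As a concrete illustration of the computation being learned, here is a minimal numpy sketch of divisive normalization over a pool of rectified filter responses. The orientation-dependent pool weights are invented for illustration and are not the model's learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def divisive_normalization(drive, w, sigma=0.1, n=2.0):
    """Normalize each unit's driving input by a weighted pool.

    drive : (k,) nonnegative filter responses (e.g. rectified oriented filters)
    w     : (k, k) nonnegative normalization weights; w[i, j] is the weight
            with which unit j suppresses unit i
    """
    num = drive ** n
    den = sigma ** n + w @ (drive ** n)
    return num / den

# Toy pool of k "oriented subunits"
k = 8
drive = rng.random(k)

# Orientation-specific pool: units suppress mostly similarly tuned units,
# mimicking the paper's finding (these weights are made up for illustration).
theta = np.linspace(0, np.pi, k, endpoint=False)
w = np.exp(-np.subtract.outer(theta, theta) ** 2 / 0.2)

r = divisive_normalization(drive, w)
assert r.shape == (k,) and np.all(r >= 0)
# Doubling the input less than quadruples the output (n = 2): compressive gain control.
assert np.all(divisive_normalization(2 * drive, w) < 4 * r)
```

Because the pooled term in the denominator grows with the input, the output saturates rather than scaling with the numerator, which is the essence of gain control.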

2021 ◽  
Vol 17 (6) ◽  
pp. e1009028
Author(s):  
Max F. Burg ◽  
Santiago A. Cadena ◽  
George H. Denfield ◽  
Edgar Y. Walker ◽  
Andreas S. Tolias ◽  
...  

Divisive normalization (DN) is a prominent computational building block in the brain that has been proposed as a canonical cortical operation. Numerous experimental studies have verified its importance for capturing nonlinear neural response properties to simple, artificial stimuli, and computational studies suggest that DN is also an important component for processing natural stimuli. However, we lack quantitative models of DN that are directly informed by measurements of spiking responses in the brain and applicable to arbitrary stimuli. Here, we propose a DN model that is applicable to arbitrary input images. We test its ability to predict how neurons in macaque primary visual cortex (V1) respond to natural images, with a focus on nonlinear response properties within the classical receptive field. Our model consists of one layer of subunits followed by learned orientation-specific DN. It outperforms linear-nonlinear and wavelet-based feature representations and makes a significant step towards the performance of state-of-the-art convolutional neural network (CNN) models. Unlike deep CNNs, our compact DN model offers a direct interpretation of the nature of normalization. By inspecting the learned normalization pool of our model, we gained insights into a long-standing question about the tuning properties of DN that update the current textbook description: we found that within the receptive field oriented features were normalized preferentially by features with similar orientation rather than non-specifically as currently assumed.
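For reference, the canonical DN form that such a model parameterizes can be written as follows (generic, Heeger-style textbook notation, not the paper's exact parameterization):

```latex
r_i \;=\; \gamma \,\frac{d_i^{\,n}}{\sigma^{n} \;+\; \sum_{j} w_{ij}\, d_j^{\,n}}
```

where $d_i$ is the driving input of subunit $i$, $\sigma$ plays the role of a semisaturation constant, and the weights $w_{ij}$ define each unit's normalization pool. Learning the $w_{ij}$ end-to-end, rather than fixing them to be uniform, is what allows the model to reveal that the pool is orientation-specific.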


2000 ◽  
Vol 83 (2) ◽  
pp. 1019-1030 ◽  
Author(s):  
Valentin Dragoi ◽  
Mriganka Sur

A fundamental feature of neural circuitry in the primary visual cortex (V1) is the existence of recurrent excitatory connections between spiny neurons, recurrent inhibitory connections between smooth neurons, and local connections between excitatory and inhibitory neurons. We modeled the dynamic behavior of intermixed excitatory and inhibitory populations of cells in V1 that receive input from the classical receptive field (the receptive field center) through feedforward thalamocortical afferents, as well as input from outside the classical receptive field (the receptive field surround) via long-range intracortical connections. A counterintuitive result is that the response of oriented cells can be facilitated beyond optimal levels when the surround stimulus is cross-oriented with respect to the center and suppressed when the surround stimulus is iso-oriented. This effect is primarily due to changes in recurrent inhibition within a local circuit. Cross-oriented surround stimulation leads to a reduction of presynaptic inhibition and a supraoptimal response, whereas iso-oriented surround stimulation has the opposite effect. This mechanism is used to explain the orientation and contrast dependence of contextual interactions in primary visual cortex: responses to a center stimulus can be both strongly suppressed and supraoptimally facilitated as a function of surround orientation, and these effects diminish as stimulus contrast decreases.
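The center-surround circuit described above can be caricatured in a few lines as a two-population rate model in which the surround acts by adding to, or withdrawing drive from, the local inhibitory population. All weights and time constants below are illustrative choices, not the paper's fitted parameters:

```python
import numpy as np

def simulate(center_drive, surround_inhib_drive, T=2000, dt=0.001):
    """Two-population rate model: one excitatory (E) and one inhibitory (I)
    unit receive feedforward center input; the surround is modeled as extra
    drive to (iso-oriented) or withdrawal of drive from (cross-oriented)
    the inhibitory population. Euler integration to steady state."""
    w_ee, w_ei, w_ie, w_ii = 1.2, 2.0, 1.5, 0.5   # illustrative weights
    tau_e, tau_i = 0.02, 0.01                      # seconds
    rE = rI = 0.0
    relu = lambda x: max(x, 0.0)
    for _ in range(T):
        dE = (-rE + relu(center_drive + w_ee * rE - w_ei * rI)) / tau_e
        dI = (-rI + relu(surround_inhib_drive + w_ie * rE - w_ii * rI)) / tau_i
        rE += dt * dE
        rI += dt * dI
    return rE, rI

r_base, _ = simulate(1.0, 0.0)     # center stimulus alone
r_cross, _ = simulate(1.0, -0.3)   # cross-oriented surround: inhibition withdrawn
r_iso, _ = simulate(1.0, 0.3)      # iso-oriented surround: inhibition recruited
# Disinhibition yields supraoptimal facilitation; recruitment yields suppression.
assert r_cross > r_base > r_iso
```

The key mechanism, as in the paper, is that the surround modulates the excitatory response indirectly, through recurrent inhibition within the local circuit.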


2001 ◽  
Vol 86 (5) ◽  
pp. 2559-2570 ◽  
Author(s):  
Masaharu Kinoshita ◽  
Hidehiko Komatsu

The perceived brightness of a surface is determined not only by the luminance of the surface (local information), but also by the luminance of its surround (global information). To better understand the neural representation of surface brightness, we investigated the effects of local and global luminance on the activity of neurons in the primary visual cortex (V1) of awake macaque monkeys. Single- and multiple-unit recordings were made from V1 while the monkeys were performing a visual fixation task. The classical receptive field of each neuron was identified as a region responding to a spot stimulus. Neural responses were assessed using homogeneous surfaces at least three times as large as the receptive field as stimuli. We first examined the sensitivity of neurons to variation in local surface luminance, while the luminance of the surround was held constant. The activity of a large majority of surface-responsive neurons (106/115) varied monotonically with changes in surface luminance; in some the dynamic range was over 3 log units. This monotonic relation between surface luminance and neural activity was more evident later in the stimulus period than early on. The effect of the global luminance on neural activity was then assessed in 81 of the surface-responsive neurons by varying the luminance of the surround while holding the luminance of the surface constant. The activity of one group of neurons (25/81) was unaffected by the luminance of the surround; these neurons appear to encode the physical luminance of a surface covering the receptive field. The responses of the other neurons were affected by the luminance of the surround. The effects of the luminances of the surface and the surround on the activities of 26 of these neurons were in the same direction (either increased or decreased), while the effects on the remaining 25 neurons were in opposite directions. 
The activities of the latter group of neurons seemed to parallel the perceived brightness of the surface, whereas the former seemed to encode the level of illumination. The different classes of neurons also differed in their laminar distribution. These findings indicate that global luminance information significantly modulates the activity of surface-responsive V1 neurons and that not only physical luminance, but also perceived brightness, of a homogeneous surface is represented in V1.


2017 ◽  
Author(s):  
Santiago A. Cadena ◽  
George H. Denfield ◽  
Edgar Y. Walker ◽  
Leon A. Gatys ◽  
Andreas S. Tolias ◽  
...  

Abstract
Despite great efforts over several decades, our best models of primary visual cortex (V1) still predict spiking activity quite poorly when probed with natural stimuli, highlighting our limited understanding of the nonlinear computations in V1. Recently, two approaches based on deep learning have been successfully applied to neural data: on the one hand, transfer learning from networks trained on object recognition worked remarkably well for predicting neural responses in higher areas of the primate ventral stream, but has not yet been used to model spiking activity in early stages such as V1. On the other hand, data-driven models have been used to predict neural responses in the early visual system (retina and V1) of mice, but not primates. Here, we test the ability of both approaches to predict spiking activity in response to natural images in V1 of awake monkeys. Even though V1 is at an early to intermediate stage of the visual system, we found that the transfer learning approach performed similarly well to the data-driven approach, and both outperformed classical linear-nonlinear and wavelet-based feature representations that build on existing theories of V1. Notably, transfer learning using a pre-trained feature space required substantially less experimental time to achieve the same performance. In conclusion, multi-layer convolutional neural networks (CNNs) set the new state of the art for predicting neural responses to natural images in primate V1, and deep features learned for object recognition are better explanations for V1 computation than all previous filter-bank theories. This finding strengthens the case for V1 models that are multiple nonlinearities away from the image domain and supports the idea of explaining early visual cortex in terms of high-level functional goals.

Author summary
Predicting the responses of sensory neurons to arbitrary natural stimuli is of major importance for understanding their function.
Arguably the most studied cortical area is primary visual cortex (V1), where many models have been developed to explain its function. However, the most successful models, built on neurophysiologists' intuitions, still fail to account for spiking responses to natural images. Here, we model spiking activity in primary visual cortex (V1) of monkeys using deep convolutional neural networks (CNNs), which have been successful in computer vision. We both trained CNNs directly to fit the data and used CNNs trained to solve a high-level task (object categorization). With these approaches, we are able to outperform previous models and improve the state of the art in predicting the responses of early visual neurons to natural images. Our results have two important implications. First, since V1 is the result of several nonlinear processing stages, it should be modeled as such. Second, functional models of entire visual pathways, of which V1 is an early stage, not only account for higher areas of those pathways but also provide useful representations for predicting V1 responses.
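Both approaches share a common recipe: a nonlinear feature space (learned from the data or transferred from an object-recognition network) followed by a per-neuron linear readout. A minimal sketch of the readout stage, using random rectified filters purely as a stand-in for pretrained CNN features (all sizes, the ridge penalty, and the simulated neuron are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a pretrained feature space: a fixed bank of random filters
# plus rectification. In the transfer-learning approach these features would
# instead come from a CNN trained on object recognition.
n_pixels, n_features, n_images = 64, 32, 500
filters = rng.standard_normal((n_features, n_pixels))
images = rng.standard_normal((n_images, n_pixels))
features = np.maximum(filters @ images.T, 0).T      # (n_images, n_features)

# Simulated neuron: a noisy linear function of the (to-the-model unknown) features.
true_w = rng.standard_normal(n_features)
responses = features @ true_w + 0.1 * rng.standard_normal(n_images)

# Per-neuron readout fit by ridge regression (closed form).
lam = 1.0
A = features.T @ features + lam * np.eye(n_features)
w_hat = np.linalg.solve(A, features.T @ responses)

pred = features @ w_hat
corr = np.corrcoef(pred, responses)[0, 1]
assert corr > 0.9  # the feature space explains the simulated neuron well
```

The practical point made in the abstract is that when the feature space is pretrained, only this small readout must be estimated per neuron, which is why less experimental time suffices.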


2019 ◽  
Vol 31 (11) ◽  
pp. 2138-2176 ◽  
Author(s):  
Luis Gonzalo Sánchez Giraldo ◽  
Odelia Schwartz

Deep convolutional neural networks (CNNs) are becoming increasingly popular models to predict neural responses in visual cortex. However, contextual effects, which are prevalent in neural processing and in perception, are not explicitly handled by current CNNs, including those used for neural prediction. In primary visual cortex, neural responses are modulated by stimuli spatially surrounding the classical receptive field in rich ways. These effects have been modeled with divisive normalization approaches, including flexible models, where spatial normalization is recruited only to the degree that responses from center and surround locations are deemed statistically dependent. We propose a flexible normalization model applied to midlevel representations of deep CNNs as a tractable way to study contextual normalization mechanisms in midlevel cortical areas. This approach captures nontrivial spatial dependencies among midlevel features in CNNs, such as those present in textures and other visual stimuli, that arise from tiling high-order features geometrically. We expect that the proposed approach can make predictions about when spatial normalization might be recruited in midlevel cortical areas. We also expect this approach to be useful as part of the CNN tool kit, therefore going beyond more restrictive fixed forms of normalization.


2003 ◽  
Vol 20 (3) ◽  
pp. 221-230 ◽  
Author(s):  
BEN S. WEBB ◽  
CHRIS J. TINSLEY ◽  
NICK E. BARRACLOUGH ◽  
AMANDA PARKER ◽  
ANDREW M. DERRINGTON

Gain control is a salient feature of information processing throughout the visual system. Heeger (1991, 1992) described a mechanism that could underpin gain control in primary visual cortex (V1). According to this model, a neuron's response is normalized by dividing its output by the sum of a population of neurons, which are selective for orientations covering a broad range. Gain control in this scheme is manifested as a change in the semisaturation constant (contrast gain) of a V1 neuron. Here we examine how flanking and annular gratings of the same or orthogonal orientation to that preferred by a neuron presented beyond the receptive field modulate gain in V1 neurons in anesthetized marmosets (Callithrix jacchus). To characterize how gain was modulated by surround stimuli, the Michaelis–Menten equation was fitted to response versus contrast functions obtained under each stimulus condition. The modulation of gain by surround stimuli was modelled best as a divisive reduction in response gain. Response gain varied with the orientation of surround stimuli, but was reduced most when the orientation of a large annular grating beyond the classical receptive field matched the preferred orientation of neurons. The strength of surround suppression did not vary significantly with retinal eccentricity or laminar distribution. In the marmoset, as in macaques (Angelucci et al., 2002a, b), gain control over the sort of distances reported here (up to 10 deg) may be mediated by feedback from extrastriate areas.
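The fits described here rest on the Michaelis-Menten (Naka-Rushton) contrast-response function. A small sketch of the "response gain" account of surround suppression, with parameter values invented for illustration:

```python
def naka_rushton(c, r_max, c50, n=2.0, baseline=0.0):
    """Contrast-response function R(c) = R_max * c^n / (c^n + c50^n) + baseline.

    c50 is the semisaturation constant (contrast gain); r_max scales the
    whole curve (response gain).
    """
    return r_max * c ** n / (c ** n + c50 ** n) + baseline

# Response-gain modulation (the best-fitting account in the study):
# the surround divisively shrinks r_max while c50 is unchanged.
# A contrast-gain change would instead shift c50. The factor 1.8 is invented.
r_alone = naka_rushton(0.5, r_max=40.0, c50=0.2)
r_iso = naka_rushton(0.5, r_max=40.0 / 1.8, c50=0.2)
assert r_iso < r_alone
assert abs(r_iso * 1.8 - r_alone) < 1e-9  # purely divisive: same curve, rescaled
```

Under a pure response-gain change the suppressed curve is a scaled copy of the unsuppressed one at every contrast, which is how the two accounts can be distinguished when fitting.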


2019 ◽  
Author(s):  
Federica Capparelli ◽  
Klaus Pawelzik ◽  
Udo Ernst

Abstract
A central goal in visual neuroscience is to understand computational mechanisms and to identify the neural structures responsible for integrating local visual features into global representations. When probed with complex stimuli that extend beyond their classical receptive field, neurons display non-linear behaviours indicative of such integration processes already in early stages of visual processing. Recently, some progress has been made in explaining these effects from first principles by sparse coding models with a neurophysiologically realistic inference dynamics. They reproduce some of the complex response characteristics observed in primary visual cortex, but only when the context is located near the classical receptive field, since the connection scheme they propose includes interactions only among neurons with overlapping input fields. The longer-range interactions required to address the plethora of contextual effects reaching beyond this range are absent. Hence, a satisfactory explanation of contextual phenomena in terms of realistic interactions and dynamics in visual cortex is still missing. Here we propose an extended generative model for visual scenes that includes spatial dependencies among different features. We derive a neurophysiologically realistic inference scheme under the constraint that neurons have direct access to only local image information. The scheme can be interpreted as a network in primary visual cortex where two neural populations are organized in different layers within orientation hypercolumns that are connected by local, short-range, and long-range recurrent interactions. When trained with natural images, the model predicts a connectivity structure linking neurons with similar orientation preferences, matching the typical patterns found for long-ranging horizontal axons and feedback projections in visual cortex.
Subjected to contextual stimuli typically used in empirical studies, our model replicates several hallmark effects of contextual processing and predicts characteristic differences in surround modulation between the two model populations. In summary, our model provides a novel framework for contextual processing in the visual system, proposing a well-defined functional role for horizontal axons and feedback projections.

Author summary
An influential hypothesis about how the brain processes visual information posits that each given stimulus should be efficiently encoded using only a small number of cells. This idea led to the development of a class of models that provided a functional explanation for various response properties of visual neurons, including the non-linear modulations observed when localized stimuli are placed in a broader spatial context. However, it remains to be clarified through which anatomical structures and neural connectivities a network in the cortex could perform the computations that these models require. In this paper we propose a model for encoding spatially extended visual scenes. Imposing the constraint that neurons in visual cortex have direct access only to small portions of the visual field, we derive a simple yet realistic neural population dynamics. Connectivities optimized for natural scenes conform with anatomical findings, and the resulting model reproduces a broad set of physiological observations while exposing the neural mechanisms relevant for spatio-temporal information integration.
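As background for the class of models being extended, here is a minimal sketch of sparse-coding inference: plain ISTA-style proximal-gradient dynamics on a small random dictionary (not the paper's extended model with spatial dependencies; the dictionary and signal are invented):

```python
import numpy as np

rng = np.random.default_rng(2)

def sparse_inference(x, D, lam=0.1, eta=0.1, steps=200):
    """Infer sparse coefficients a minimizing ||x - D a||^2 / 2 + lam * ||a||_1
    by proximal gradient descent (ISTA). The gradient step maps onto a
    recurrent network whose lateral weights are -D^T D, i.e. competition
    between units with overlapping input fields."""
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = D.T @ (D @ a - x)
        a = a - eta * grad
        a = np.sign(a) * np.maximum(np.abs(a) - eta * lam, 0.0)  # soft threshold
    return a

# Toy dictionary with unit-norm columns and a signal built from two atoms.
D = rng.standard_normal((32, 16))
D /= np.linalg.norm(D, axis=0)
x = 2.0 * D[:, 3] + 1.5 * D[:, 7]

a = sparse_inference(x, D)
assert np.linalg.norm(x - D @ a) < np.linalg.norm(x)  # signal is reconstructed
assert np.sum(np.abs(a) > 1e-3) < 16                  # and the code is sparse
```

The limitation the paper addresses is visible in the lateral term: -D^T D couples only units whose dictionary atoms overlap, so interactions beyond the classical receptive field require the extended generative model with explicit spatial dependencies.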

