Deep convolutional models improve predictions of macaque V1 responses to natural images

2017 ◽  
Author(s):  
Santiago A. Cadena ◽  
George H. Denfield ◽  
Edgar Y. Walker ◽  
Leon A. Gatys ◽  
Andreas S. Tolias ◽  
...  

Abstract

Despite great efforts over several decades, our best models of primary visual cortex (V1) still predict spiking activity quite poorly when probed with natural stimuli, highlighting our limited understanding of the nonlinear computations in V1. Recently, two approaches based on deep learning have been applied successfully to neural data. On the one hand, transfer learning from networks trained on object recognition has worked remarkably well for predicting neural responses in higher areas of the primate ventral stream, but it has not yet been used to model spiking activity in early stages such as V1. On the other hand, data-driven models have been used to predict neural responses in the early visual system (retina and V1) of mice, but not of primates. Here, we test the ability of both approaches to predict spiking activity in response to natural images in V1 of awake monkeys. Even though V1 is an early to intermediate stage of the visual system, we found that the transfer learning approach performed similarly well to the data-driven approach, and both outperformed classical linear-nonlinear and wavelet-based feature representations built on existing theories of V1. Notably, transfer learning using a pre-trained feature space required substantially less experimental time to achieve the same performance. In conclusion, multi-layer convolutional neural networks (CNNs) set the new state of the art for predicting neural responses to natural images in primate V1, and deep features learned for object recognition are better explanations for V1 computation than all previous filter-bank theories. This finding strengthens the case for V1 models that are multiple nonlinearities away from the image domain and supports the idea of explaining early visual cortex in terms of high-level functional goals.

Author summary

Predicting the responses of sensory neurons to arbitrary natural stimuli is of major importance for understanding their function. Arguably the most studied cortical area is primary visual cortex (V1), where many models have been developed to explain its function. However, the most successful models built on neurophysiologists' intuitions still fail to account for spiking responses to natural images. Here, we model spiking activity in primary visual cortex (V1) of monkeys using deep convolutional neural networks (CNNs), which have been successful in computer vision. We both trained CNNs directly to fit the data and used CNNs trained to solve a high-level task (object categorization). With these approaches, we are able to outperform previous models and improve the state of the art in predicting the responses of early visual neurons to natural images. Our results have two important implications. First, since V1 is the result of several nonlinear stages, it should be modeled as such. Second, functional models of entire visual pathways, of which V1 is an early stage, account not only for higher areas of those pathways but also provide useful representations for predicting V1 responses.
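The transfer-learning approach described above can be sketched in a few lines: freeze a network trained on object recognition, extract its features for each image, and fit only a per-neuron linear readout. The sketch below uses closed-form ridge regression for the readout; this is an illustrative simplification (the paper's actual readout uses a Poisson loss and an output nonlinearity), and the function name and parameters are assumptions.

```python
import numpy as np

def fit_transfer_readout(features, spikes, alpha=1.0):
    """Fit only a regularized linear readout on top of frozen,
    pre-trained CNN features (closed-form ridge regression).

    features : (n_images, n_features) activations from a fixed CNN layer
    spikes   : (n_images,) recorded responses of one neuron
    alpha    : ridge penalty on the readout weights
    """
    X, y = features, spikes
    # Closed-form ridge solution: w = (X^T X + alpha*I)^(-1) X^T y
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)
    return w

# Toy usage: recover a known readout from synthetic "features".
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
w_true = np.array([1.0, -0.5, 0.0, 2.0, 0.3])
y = X @ w_true
w_hat = fit_transfer_readout(X, y, alpha=1e-6)
```

Because only the readout is trained, far fewer stimulus-response pairs are needed than when fitting a full network, which is consistent with the reduced experimental time reported above.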

2020 ◽  
Author(s):  
Konstantin-Klemens Lurz ◽  
Mohammad Bashiri ◽  
Konstantin Willeke ◽  
Akshay K. Jagadish ◽  
Eric Wang ◽  
...  

Abstract

Deep neural networks (DNNs) have set new standards at predicting responses of neural populations to visual input. Most such DNNs consist of a convolutional network (core), shared across all neurons, which learns a representation of neural computation in visual cortex, and a neuron-specific readout that linearly combines the relevant features of this representation. The goal of this paper is to test whether such a representation is indeed generally characteristic for visual cortex, i.e. whether it generalizes between animals of a species, and what factors contribute to obtaining such a generalizing core. To push all nonlinear computations into the core, where the generalizing cortical features should be learned, we devise a novel readout that reduces the number of parameters per neuron by up to two orders of magnitude compared to the previous state of the art. It does so by taking advantage of retinotopy and learning a Gaussian distribution over each neuron's receptive-field position. With this new readout we train our network on neural responses from mouse primary visual cortex (V1) and obtain a 7% gain in performance over the previous state-of-the-art network. We then investigate whether the convolutional core indeed captures general cortical features by using the core in transfer learning to a different animal. When transferring a core trained on thousands of neurons from various animals and scans, we exceed the performance of training directly on that animal by 12% and outperform a commonly used VGG16 core pre-trained on ImageNet by 33%. In addition, transfer learning with our data-driven core is more data-efficient than direct training, achieving the same performance with only 40% of the data. Our model with its novel readout thus sets a new state of the art for neural response prediction from natural images in mouse visual cortex, generalizes between animals, and better captures characteristic cortical features than current task-driven pre-training approaches such as VGG16.
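The parameter reduction behind the readout can be illustrated with a deterministic sketch: each neuron learns only a receptive-field position, a spread, and one weight per core channel, instead of a full weight per core-feature location. All names here are illustrative, and the actual model learns a distribution over positions and samples from it during training rather than applying a fixed Gaussian mask.

```python
import numpy as np

def gaussian_readout(core_features, mu, sigma, feature_weights):
    """Sketch of a position-Gaussian readout (illustrative names).

    core_features   : (C, H, W) feature maps from the shared core
    mu              : (x, y) receptive-field center in pixels
    sigma           : spread of the spatial Gaussian
    feature_weights : (C,) per-neuron weights over core channels

    A dense readout needs C*H*W weights per neuron; here each neuron
    has only 2 position parameters, a spread, and C feature weights --
    the source of the large parameter reduction.
    """
    C, H, W = core_features.shape
    ys, xs = np.mgrid[0:H, 0:W]
    g = np.exp(-((xs - mu[0]) ** 2 + (ys - mu[1]) ** 2) / (2 * sigma ** 2))
    g /= g.sum()                                   # normalized spatial mask
    pooled = (core_features * g).sum(axis=(1, 2))  # (C,) features at the RF
    return float(feature_weights @ pooled)         # predicted response

# Usage: constant feature maps make the spatial pooling transparent.
features = np.ones((3, 8, 8))
resp = gaussian_readout(features, mu=(4, 4), sigma=2.0,
                        feature_weights=np.array([1.0, 2.0, 3.0]))
```

With constant feature maps the normalized mask pools each channel to 1.0, so the response is simply the sum of the feature weights.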


2019 ◽  
Author(s):  
Max F. Burg ◽  
Santiago A. Cadena ◽  
George H. Denfield ◽  
Edgar Y. Walker ◽  
Andreas S. Tolias ◽  
...  

Abstract

Divisive normalization (DN) is a prominent computational building block in the brain that has been proposed as a canonical cortical operation. Numerous experimental studies have verified its importance for capturing nonlinear response properties to simple, artificial stimuli, and computational studies suggest that DN is also an important component for processing natural stimuli. However, we lack quantitative models of DN that are directly informed by empirical data and applicable to arbitrary stimuli. Here, we developed an image-computable DN model and tested its ability to predict spiking responses of a large number of neurons to natural images. In macaque primary visual cortex (V1), we found that our model outperformed linear-nonlinear and wavelet-based feature representations and performed on par with state-of-the-art convolutional neural network models. Our model learns the pool of normalizing neurons and the magnitude of their contribution end-to-end from the data, answering a long-standing question about the tuning properties of DN: within the classical receptive field, oriented features were normalized preferentially by features with similar orientations, rather than non-specifically as currently assumed. Overall, our work refines our view of gain control within the classical receptive field, quantifies the relevance of DN under stimulation with natural images, and provides a new, high-performing, and compactly understandable model of V1.

Author summary

Divisive normalization is a computational building block apparent throughout sensory processing in the brain. Numerous studies in the visual cortex have highlighted its importance by explaining nonlinear neural response properties to synthesized simple stimuli, such as overlapping gratings with varying contrasts. However, we do not know if and how this normalization mechanism plays a role when processing complex stimuli like natural images. Here, we applied modern machine learning methods to build a general divisive normalization model that is directly informed by data and quantifies the importance of divisive normalization. By learning the normalization mechanism from a data set of natural images and neural responses from macaque primary visual cortex, our model made predictions as accurate as current state-of-the-art convolutional neural networks. Moreover, our model has fewer parameters and offers direct interpretations of them. Specifically, we found that neurons that respond strongly to a specific orientation are preferentially normalized by other neurons that are highly active for similar orientations. Overall, we propose a biologically motivated model of primary visual cortex that is compact and more interpretable, performs on par with standard convolutional neural networks, and refines our view of how normalization operates in visual cortex when processing natural stimuli.


2004 ◽  
Vol 91 (1) ◽  
pp. 206-212 ◽  
Author(s):  
Konrad P. Körding ◽  
Christoph Kayser ◽  
Wolfgang Einhäuser ◽  
Peter König

Sensory areas should be adapted to the properties of their natural stimuli. What are the underlying rules that match the properties of complex cells in primary visual cortex to their natural stimuli? To address this issue, we sampled movies from a camera carried by a freely moving cat, capturing the dynamics of image motion as the animal explored an outdoor environment. We then used these movie sequences as input to simulated neurons. Following the intuition that many meaningful high-level variables, e.g., the identities of visible objects, do not change rapidly in natural visual stimuli, we adapted the neurons to exhibit firing rates that are stable over time. We found that simulated neurons with optimally stable activity display many properties observed in cortical complex cells. Their response is invariant with respect to stimulus translation and reversal of contrast polarity. Furthermore, spatial-frequency selectivity and the aspect ratio of the receptive field quantitatively match the experimentally observed characteristics of complex cells. Hence, the population of complex cells in the primary visual cortex can be described as forming an optimally stable representation of natural stimuli.
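The stability criterion above can be captured by a simple slowness-style measure: penalize frame-to-frame changes in a unit's activity, normalized by its overall variance so a trivially constant unit cannot win. This is a minimal sketch of the intuition, not the paper's exact objective, and the function name is an assumption.

```python
import numpy as np

def temporal_instability(responses):
    """Mean squared change of a unit's activity between consecutive
    frames, divided by its variance. Lower values = more stable;
    the variance term rules out the trivial constant solution.
    responses: (T,) activity of one simulated neuron over a movie."""
    r = responses - responses.mean()
    var = (r ** 2).mean()
    change = (np.diff(r) ** 2).mean()
    return change / (var + 1e-12)

# A slowly drifting signal scores as stable; white noise does not.
t = np.linspace(0.0, 2.0 * np.pi, 500)
slow = np.sin(t)
noise = np.random.default_rng(1).standard_normal(500)
```

Optimizing simulated neurons' filters to minimize such a measure over natural movie input is what yields the complex-cell-like invariances reported above.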


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Bin Wang ◽  
Chuanliang Han ◽  
Tian Wang ◽  
Weifeng Dai ◽  
Yang Li ◽  
...  

Abstract

The stimulus dependence of gamma oscillations (GAMMA, 30–90 Hz) has not been fully understood, but it is important for revealing the neural mechanisms and functions of GAMMA. Here, we recorded spiking activity (MUA) and local field potentials (LFP), driven by a variety of plaids (generated by superimposing two orthogonal gratings with different contrast combinations), in the primary visual cortex of anesthetized cats. We found two distinct narrow-band GAMMAs in the LFPs and a variety of response patterns to plaids. As with MUA, most response patterns showed that the second grating suppressed the GAMMAs driven by the first one. However, there was only a weak site-by-site correlation between cross-orientation interactions in GAMMAs and those in MUAs. We developed a normalization model that could unify the response patterns of both GAMMAs and MUAs. Interestingly, compared with MUAs, the GAMMAs demonstrated a wider range of model parameters and more diverse response patterns to plaids. Further analysis revealed that normalization parameters for high GAMMA, but not those for low GAMMA, were significantly correlated with the discrepancy between the stimulus spatial frequency and the sites' preferred spatial frequency. Consistent with these findings, the normalization parameters and diversity of high GAMMA exhibited a clear transition and regional difference between areas 17 and 18. Our results show that GAMMAs are also regulated in the form of normalization, but the neural mechanisms for these normalizations might differ from those of spiking activity. Normalization in different brain signals could arise from interactions of excitation and inhibition at multiple stages of the visual system.
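The cross-orientation suppression described above is the signature behavior of a contrast-normalization model for plaids: raising the contrast of the second grating adds to the divisive pool and suppresses the response driven by the first. The functional form below is a generic illustration, not the paper's fitted parameterization.

```python
import numpy as np

def plaid_response(c1, c2, r1, r2, sigma=0.1, n=2.0):
    """Generic normalization model for a plaid of two orthogonal
    gratings with contrasts c1, c2 and component responses r1, r2:
    the contrast-weighted drives are divided by pooled contrast
    energy, so increasing c2 suppresses the response to grating 1."""
    num = (c1 ** n) * r1 + (c2 ** n) * r2
    den = c1 ** n + c2 ** n + sigma ** n
    return num / den

# Adding an orthogonal grating suppresses the grating-1 response.
alone = plaid_response(0.5, 0.0, r1=10.0, r2=2.0)
plaid = plaid_response(0.5, 0.5, r1=10.0, r2=2.0)
```

Fitting the pool parameters separately to MUA, low GAMMA, and high GAMMA is what lets a single model family unify the different response patterns while exposing their different parameter ranges.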


2017 ◽  
Vol 117 (1) ◽  
pp. 388-402 ◽  
Author(s):  
Michael A. Cohen ◽  
George A. Alvarez ◽  
Ken Nakayama ◽  
Talia Konkle

Visual search is a ubiquitous visual behavior, and efficient search is essential for survival. Different cognitive models have explained the speed and accuracy of search based either on the dynamics of attention or on the similarity of item representations. Here, we examined the extent to which performance on a visual search task can be predicted from the stable representational architecture of the visual system, independent of attentional dynamics. Participants performed a visual search task with 28 conditions reflecting different pairs of categories (e.g., searching for a face among cars, a body among hammers, etc.). The time it took participants to find the target item varied as a function of category combination. In a separate group of participants, we measured the neural responses to these object categories when items were presented in isolation. Using representational similarity analysis, we then examined whether the similarity of neural responses across different subdivisions of the visual system had the requisite structure to predict visual search performance. Overall, we found strong brain/behavior correlations across most of the higher-level visual system, including both the ventral and dorsal pathways, when considering both macroscale sectors and smaller mesoscale regions. These results suggest that visual search for real-world object categories is well predicted by the stable, task-independent architecture of the visual system.

NEW & NOTEWORTHY Here, we ask which neural regions have response patterns that correlate with behavioral performance in a visual processing task. We found that the representational structure across all of high-level visual cortex has the requisite structure to predict behavior. Furthermore, when directly comparing different neural regions, we found that they all had highly similar category-level representational structures. These results point to a ubiquitous and uniform representational structure in high-level visual cortex underlying visual object processing.
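The representational similarity analysis used here reduces to two steps: build a representational dissimilarity matrix (RDM) over conditions from the neural response patterns, then correlate its off-diagonal entries with pairwise behavioral measures. The sketch below is a minimal version of that pipeline; the function names and the symmetrized search-time matrix are assumptions for illustration.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the response patterns of every pair of conditions.
    patterns: (n_conditions, n_units) array."""
    return 1.0 - np.corrcoef(patterns)

def brain_behavior_correlation(neural_patterns, pairwise_search_times):
    """Correlate off-diagonal neural dissimilarities with pairwise
    search times (an n_conditions x n_conditions symmetric matrix):
    more dissimilar neural patterns should predict easier search."""
    d = rdm(neural_patterns)
    iu = np.triu_indices_from(d, k=1)       # upper triangle, no diagonal
    return np.corrcoef(d[iu], pairwise_search_times[iu])[0, 1]

# Sanity check: if behavior exactly mirrors the neural RDM, r = 1.
patterns = np.random.default_rng(2).standard_normal((5, 20))
r = brain_behavior_correlation(patterns, rdm(patterns))
```

Repeating the correlation for each sector or region of visual cortex is what yields the region-by-region brain/behavior map reported above.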


2012 ◽  
Vol 24 (5) ◽  
pp. 1271-1296 ◽  
Author(s):  
Michael Teichmann ◽  
Jan Wiltschut ◽  
Fred Hamker

The human visual system has the remarkable ability to recognize objects largely invariant of their position, rotation, and scale. A good interpretation of neurobiological findings involves a computational model that simulates the signal processing of the visual cortex. In part, this is likely achieved step by step from early to late areas of visual perception. While several algorithms have been proposed for learning feature detectors, few studies address the biologically plausible learning of such invariance. In this study, a set of Hebbian learning rules based on calcium dynamics and homeostatic regulation of single neurons is proposed. Their performance is verified within a simple model of the primary visual cortex, learning so-called complex cells from a sequence of static images. As a result, the learned complex-cell responses are largely invariant to phase and position.
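The core difficulty such rules solve is that plain Hebbian growth is unstable: without a homeostatic term, correlated pre/post activity drives weights to infinity. As a minimal illustration of how a stabilizing term works, the sketch below uses Oja's multiplicative decay; this is a well-known stand-in, not the paper's calcium-based rules.

```python
import numpy as np

def oja_update(w, x, lr=0.05):
    """One Hebbian step with Oja-style multiplicative decay (an
    illustrative stand-in for homeostatic regulation).
    w: (D,) synaptic weights, x: (D,) presynaptic input."""
    y = w @ x                        # postsynaptic activity
    return w + lr * y * (x - y * w)  # Hebbian growth, bounded by decay

# Repeated exposure to one input direction: the weights converge onto
# that direction with unit norm instead of growing without bound.
v = np.array([0.6, 0.8])             # unit-norm input direction
w = np.array([0.3, 0.1])
for _ in range(2000):
    w = oja_update(w, v)
```

The same bounded-growth principle, implemented via calcium dynamics and per-neuron homeostasis in the paper, is what allows stable learning of complex-cell invariances from image sequences.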


2004 ◽  
Author(s):  
Tatyana Sharpee ◽  
Hiroki Sugihara ◽  
A. V. Kurgansky ◽  
S. Rebrik ◽  
M. P. Stryker ◽  
...  

2003 ◽  
Vol 20 (1) ◽  
pp. 77-84 ◽  
Author(s):  
An Cao ◽  
Peter H. Schiller

Relative motion information, especially the relative speed between different input patterns, is required for solving many complex tasks of the visual system, such as depth perception from motion parallax and motion-induced figure/ground segmentation. However, little is known about the neural substrate for processing relative speed. To explore the neural mechanisms for relative speed, we recorded single-unit responses to relative motion in the primary visual cortex (area V1) of rhesus monkeys while presenting sets of random-dot arrays moving at different speeds. We found that most V1 neurons were sensitive to the existence of a discontinuity in speed; that is, they responded more strongly when relative motion was presented than to homogeneous field motion. Seventy percent of the neurons in our sample responded predominantly to relative rather than absolute speed. Relative-speed tuning curves were similar across different center–surround velocity combinations. These relative-motion-sensitive neurons in macaque area V1 probably contribute to figure/ground segmentation and the detection of motion discontinuities.

