A performance-optimized model of neural responses across the ventral visual stream

2016 ◽  
Author(s):  
Darren Seibert ◽  
Daniel L Yamins ◽  
Diego Ardila ◽  
Ha Hong ◽  
James J DiCarlo ◽  
...  

Human visual object recognition is subserved by a multitude of cortical areas. To make sense of this system, one line of research focused on response properties of primary visual cortex neurons and developed theoretical models of a set of canonical computations such as convolution, thresholding, exponentiating and normalization that could be hierarchically repeated to give rise to more complex representations. Another line of research focused on response properties of high-level visual cortex and linked these to semantic categories useful for object recognition. Here, we hypothesized that the panoply of visual representations in the human ventral stream may be understood as emergent properties of a system constrained both by simple canonical computations and by top-level, object recognition functionality in a single unified framework (Yamins et al., 2014; Khaligh-Razavi and Kriegeskorte, 2014; Guclu and van Gerven, 2015). We built a deep convolutional neural network model optimized for object recognition and used representational similarity analysis to compare representations at various model levels with human functional imaging responses elicited by viewing hundreds of image stimuli. Neural network layers developed representations that corresponded in a hierarchically consistent fashion to visual areas from V1 to LOC. This correspondence increased with optimization of the model's recognition performance. These findings support a unified view of the ventral stream in which representations from the earliest to the latest stages can be understood as being built from basic computations inspired by modeling of early visual cortex and shaped by optimization for high-level object-based performance constraints.
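The comparison step named in this abstract, representational similarity analysis, can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the correlation-distance dissimilarity measure, the rank-based comparison, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def rdm(responses):
    """Representational dissimilarity matrix: 1 - Pearson r between stimulus rows."""
    return 1.0 - np.corrcoef(responses)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def rank(values):
    """Ranks 0..n-1 (ties are not averaged, which is fine for continuous data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def rsa_score(model_features, brain_responses):
    """Rank (Spearman-style) correlation between the two RDMs' upper triangles."""
    a = rank(upper_triangle(rdm(model_features)))
    b = rank(upper_triangle(rdm(brain_responses)))
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-in data (hypothetical): 20 stimuli, 50 model units, 30 voxels.
rng = np.random.default_rng(0)
layer = rng.normal(size=(20, 50))
voxels = layer @ rng.normal(size=(50, 30))  # voxels as a linear readout of the layer

print(rsa_score(layer, layer))   # identical representations give a score near 1
print(rsa_score(layer, voxels))  # score for the simulated voxel responses
```

Each row holds the response pattern to one stimulus, so the two systems never need matching numbers of units or voxels; only their stimulus-by-stimulus dissimilarity structure is compared.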

2019 ◽  
Vol 31 (9) ◽  
pp. 1354-1367
Author(s):  
Yael Holzinger ◽  
Shimon Ullman ◽  
Daniel Harari ◽  
Marlene Behrmann ◽  
Galia Avidan

Visual object recognition is performed effortlessly by humans notwithstanding the fact that it requires a series of complex computations, which are, as yet, not well understood. Here, we tested a novel account of the representations used for visual recognition and their neural correlates using fMRI. The rationale is based on previous research showing that a set of representations, termed “minimal recognizable configurations” (MIRCs), which are computationally derived and have unique psychophysical characteristics, serve as the building blocks of object recognition. We contrasted the BOLD responses elicited by three types of stimuli: MIRC images derived from different categories (faces, objects, and places); sub-MIRCs, which are visually similar to MIRCs but result in poor recognition; and scrambled, unrecognizable images. Stimuli were presented in blocks, and participants indicated yes/no recognition for each image. We confirmed that MIRCs elicited higher recognition performance compared to sub-MIRCs for all three categories. Whereas fMRI activation in early visual cortex for both MIRCs and sub-MIRCs of each category did not differ from that elicited by scrambled images, high-level visual regions exhibited overall greater activation for MIRCs compared to sub-MIRCs or scrambled images. Moreover, MIRCs and sub-MIRCs from each category elicited enhanced activation in corresponding category-selective regions including fusiform face area and occipital face area (faces), lateral occipital cortex (objects), and parahippocampal place area and transverse occipital sulcus (places). These findings reveal the psychological and neural relevance of MIRCs and enable us to make progress in developing a more complete account of object recognition.


2021 ◽  
Author(s):  
Sophia Nestmann ◽  
Hans-Otto Karnath ◽  
Johannes Rennig

Object constancy is one of the most crucial mechanisms of the human visual system, enabling viewpoint-invariant object recognition. However, the neuronal foundations of object constancy are largely unknown. Research has shown that the ventral visual stream is involved in processing of various kinds of object stimuli and that several regions along the ventral stream are possibly sensitive to the orientation of an object in space. To systematically address the question of viewpoint-sensitive object perception, we conducted a study with stroke patients as well as an fMRI experiment with healthy participants applying object stimuli in several spatial orientations, for example in typical and atypical viewing conditions. In the fMRI experiment, we found stronger BOLD signals and above-chance classification accuracies for objects presented in atypical viewing conditions in fusiform face-sensitive and lateral occipito-temporal object-preferring areas. In the behavioral patient study, we observed that lesions of the right fusiform gyrus were associated with lower performance in object recognition for atypical views. The complementary results from both experiments emphasize the contributions of fusiform and lateral-occipital areas to visual object constancy and indicate that visual object constancy is particularly enabled through increased neuronal activity and specific activation patterns for objects in demanding viewing conditions.


2018 ◽  
Author(s):  
Ganesh Elumalai ◽  
Geethanjali Vinodhanand ◽  
Valencia Lasandra Camoya Brown ◽  
Jessica Dasari ◽  
Venkata Hari Krishna Kurra ◽  
...  

Visual perception is the ability to interpret the surrounding environment. A hypothetical ventral stream visual pathway explains how we perceive objects with respect to spatial orientation. The ventral stream fibers extend from the visual cortex to the inferior temporal gyrus. Previous studies have not provided evidence of the structural connectivity of this pathway. This study is designed to trace the neural structural connectivity between the visual cortex and the inferior temporal gyrus using diffusion tensor imaging tractography, and aims to correlate its functional importance with visual object perception. The observational analysis used ultrahigh b-value diffusion MRI datasets of thirty-two healthy adults from an open-access research platform. The datasets include both sexes, aged 20 to 49 years, with a mean age of 30.4 years. The confirmatory observational analysis process includes dataset acquisition, pre-processing, processing, reconstruction, fiber tractography, and analysis using software tools. All the datasets confirmed the fiber structural extension from the visual cortex to the inferior temporal gyrus in both sexes, which may be responsible for the visual perception of objects. This new fiber connectivity evidence justifies the structural relevance of visual perception impairments, such as visual object agnosia. Keywords: Ventral Visual Stream, Dorsal Visual Stream, Visual Agnosia, Where visual pathways


2017 ◽  
Vol 117 (1) ◽  
pp. 388-402 ◽  
Author(s):  
Michael A. Cohen ◽  
George A. Alvarez ◽  
Ken Nakayama ◽  
Talia Konkle

Visual search is a ubiquitous visual behavior, and efficient search is essential for survival. Different cognitive models have explained the speed and accuracy of search based either on the dynamics of attention or on similarity of item representations. Here, we examined the extent to which performance on a visual search task can be predicted from the stable representational architecture of the visual system, independent of attentional dynamics. Participants performed a visual search task with 28 conditions reflecting different pairs of categories (e.g., searching for a face among cars, body among hammers, etc.). The time it took participants to find the target item varied as a function of category combination. In a separate group of participants, we measured the neural responses to these object categories when items were presented in isolation. Using representational similarity analysis, we then examined whether the similarity of neural responses across different subdivisions of the visual system had the requisite structure needed to predict visual search performance. Overall, we found strong brain/behavior correlations across most of the higher-level visual system, including both the ventral and dorsal pathways when considering both macroscale sectors as well as smaller mesoscale regions. These results suggest that visual search for real-world object categories is well predicted by the stable, task-independent architecture of the visual system. NEW & NOTEWORTHY Here, we ask which neural regions have neural response patterns that correlate with behavioral performance in a visual processing task. We found that the representational structure across all of high-level visual cortex has the requisite structure to predict behavior. Furthermore, when directly comparing different neural regions, we found that they all had highly similar category-level representational structures. These results point to a ubiquitous and uniform representational structure in high-level visual cortex underlying visual object processing.


2020 ◽  
Author(s):  
Alexander J.E. Kell ◽  
Sophie L. Bokor ◽  
You-Nah Jeon ◽  
Tahereh Toosi ◽  
Elias B. Issa

The marmoset—a small monkey with a flat cortex—offers powerful techniques for studying neural circuits in a primate. However, it remains unclear whether brain functions typically studied in larger primates can be studied in the marmoset. Here, we asked whether the 300-gram marmosets’ perceptual and cognitive repertoire approaches human levels or is instead closer to rodents’. Using high-level visual object recognition as a testbed, we found that on the same task marmosets substantially outperformed rats and generalized far more robustly across images, all while performing ∼1000 trials/day. We then compared marmosets against the high standard of human behavior. Across the same 400 images, marmosets’ image-by-image recognition behavior was strikingly human-like—essentially as human-like as macaques’. These results demonstrate that marmosets have been substantially underestimated and that high-level abilities have been conserved across simian primates. Consequently, marmosets are a potent small model organism for visual neuroscience, and perhaps beyond.


Author(s):  
Kohitij Kar ◽  
James J DiCarlo

Summary: Distributed neural population spiking patterns in macaque inferior temporal (IT) cortex that support core visual object recognition require additional time to develop for specific (“late-solved”) images, suggesting the necessity of recurrent processing in these computations. Which brain circuit motifs are most responsible for computing and transmitting these putative recurrent signals to IT? To test whether the ventral prefrontal cortex (vPFC) is a critical recurrent circuit node in this system, here we pharmacologically inactivated parts of the vPFC and simultaneously measured IT population activity, while monkeys performed object discrimination tasks. Our results show that vPFC inactivation deteriorated the quality of the late-phase (>150 ms from image onset) IT population code, along with commensurate, specific behavioral deficits for “late-solved” images. Finally, silencing vPFC caused the monkeys’ IT activity patterns and behavior to become more like those produced by feedforward artificial neural network models of the ventral stream. Together with prior work, these results argue that fast recurrent processing through the vPFC is critical to the production of behaviorally-sufficient object representations in IT.


Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 33-33
Author(s):  
G M Wallis ◽  
H H Bülthoff

The view-based approach to object recognition supposes that objects are stored as a series of associated views. Although representation of these views as combinations of 2-D features allows generalisation to similar views, it remains unclear how very different views might be associated together to allow recognition from any viewpoint. One cue present in the real world, other than spatial similarity, is that we usually experience different objects in temporally constrained, coherent order, and not as randomly ordered snapshots. In a series of recent neural-network simulations, Wallis and Baddeley (1997 Neural Computation 9 883–894) describe how the association of views on the basis of temporal as well as spatial correlations is both theoretically advantageous and biologically plausible. We describe an experiment aimed at testing their hypothesis in human object-recognition learning. We investigated recognition performance of faces previously presented in sequences. These sequences consisted of five views of five different people's faces, presented in orderly sequence from left to right profile in 45° steps. According to the temporal-association hypothesis, the visual system should associate the images together and represent them as different views of the same person's face, although in truth they are images of different people's faces. In a same/different task, subjects were asked to say whether two faces seen from different viewpoints were views of the same person or not. In accordance with theory, discrimination errors increased for those faces seen earlier in the same sequence as compared with those faces which were not (p < 0.05).


2019 ◽  
Vol 5 (5) ◽  
pp. eaav7903 ◽  
Author(s):  
Khaled Nasr ◽  
Pooja Viswanathan ◽  
Andreas Nieder

Humans and animals have a “number sense,” an innate capability to intuitively assess the number of visual items in a set, its numerosity. This capability implies that mechanisms to extract numerosity indwell the brain’s visual system, which is primarily concerned with visual object recognition. Here, we show that network units tuned to abstract numerosity, and therefore reminiscent of real number neurons, spontaneously emerge in a biologically inspired deep neural network that was merely trained on visual object recognition. These numerosity-tuned units underlay the network’s number discrimination performance that showed all the characteristics of human and animal number discriminations as predicted by the Weber-Fechner law. These findings explain the spontaneous emergence of the number sense based on mechanisms inherent to the visual system.
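The Weber-Fechner law referenced in this abstract states that perceived magnitude grows with the logarithm of stimulus magnitude, so discrimination depends on the ratio of two numerosities rather than their absolute difference. The sketch below illustrates that property only. The function name, the Gaussian-noise-on-a-log-scale model, and the `noise_sd` value are assumptions for the example, not quantities taken from the paper.

```python
import math

def discriminability(n1, n2, noise_sd=0.2):
    """Separation (d') of two numerosities on a logarithmic internal scale.

    Assumed model: numerosity n is represented as log(n) plus Gaussian noise
    with standard deviation `noise_sd` (an illustrative value).
    """
    return abs(math.log(n2) - math.log(n1)) / noise_sd

# Same ratio (1:2) gives the same discriminability at different magnitudes.
print(math.isclose(discriminability(4, 8), discriminability(8, 16)))

# Same absolute difference (1 item) is harder to discriminate at larger magnitudes.
print(discriminability(8, 9) < discriminability(1, 2))
```

Under this assumed model, both printed comparisons follow directly from the logarithm: log(8) - log(4) equals log(16) - log(8), while log(9) - log(8) is much smaller than log(2) - log(1).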


2021 ◽  
Vol 118 (3) ◽  
pp. e2014196118
Author(s):  
Chengxu Zhuang ◽  
Siming Yan ◽  
Aran Nayebi ◽  
Martin Schrimpf ◽  
Michael C. Frank ◽  
...  

Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today’s best supervised methods and that the mapping of these neural network models’ hidden layers is neuroanatomically consistent across the ventral stream. Strikingly, we find that these methods produce brain-like representations even when trained solely with real human child developmental data collected from head-mounted cameras, despite the fact that these datasets are noisy and limited. We also find that semisupervised deep contrastive embeddings can leverage small numbers of labeled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results illustrate a use of unsupervised learning to provide a quantitative model of a multiarea cortical brain system and present a strong candidate for a biologically plausible computational theory of primate sensory learning.
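To make the idea of a contrastive embedding objective concrete, here is a minimal sketch of an InfoNCE-style loss, one commonly used contrastive objective. It is an assumption that this form resembles what the abstract calls "deep unsupervised contrastive embedding methods"; the abstract does not specify the objective, and the function name, `temperature` value, and synthetic embeddings are hypothetical.

```python
import numpy as np

def info_nce_loss(view_a, view_b, temperature=0.1):
    """Contrastive (InfoNCE-style) loss for two batches of paired embeddings.

    Row i of view_a and row i of view_b embed two views of the same image
    (a positive pair); every other row of view_b acts as a negative for row i.
    """
    # L2-normalize so the dot product is cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-probability assigned to the matching pair.
    return float(-np.mean(np.diag(log_probs)))

# Synthetic embeddings (hypothetical): 8 images, 16 dimensions.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
aligned = anchors + 0.01 * rng.normal(size=(8, 16))  # near-identical second views
shuffled = aligned[::-1]                             # pairs deliberately mismatched

print(info_nce_loss(anchors, aligned) < info_nce_loss(anchors, shuffled))
```

No labels appear anywhere in the loss: the only supervision is which two embeddings came from the same image, which is why objectives of this kind can be trained on unlabeled data.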

