View-tuned and view-invariant face encoding in IT cortex is explained by selected natural image fragments

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yunjun Nam ◽  
Takayuki Sato ◽  
Go Uchida ◽  
Ekaterina Malakhova ◽  
Shimon Ullman ◽  
...  

Abstract: Humans recognize individual faces regardless of variation in facial view. View-tuned face neurons in the inferior temporal (IT) cortex are regarded as the neural substrate for view-invariant face recognition. This study approximated the visual features encoded by these neurons as combinations of local orientations and colors originating from natural image fragments. The resulting features reproduced the preference of these neurons for particular facial views. We also found that faces of one identity were separable from faces of other identities in a space where each axis represented one of these features. These results suggest that view-invariant face representation is established by combining view-sensitive visual features. The face representation with these features further suggests that, with respect to view-invariant face representation, the seemingly complex and deeply layered ventral visual pathway can be approximated by a shallow network comprising layers of low-level processing for local orientations and colors (V1/V2 level) and layers that detect particular sets of low-level elements derived from natural image fragments (IT level).
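As a rough illustration of the fragment-based feature idea, a fragment's response to an image can be scored as its best normalized correlation over all image positions. The sketch below is a minimal stand-in, not the study's actual feature set; sizes and data are illustrative.

```python
import numpy as np

def fragment_response(image, fragment):
    """Maximum normalized cross-correlation of a fragment over all
    image patches -- a minimal sketch of a fragment-based feature
    detector (names and scoring are illustrative)."""
    fh, fw = fragment.shape
    f = (fragment - fragment.mean()) / (fragment.std() + 1e-8)
    best = -np.inf
    H, W = image.shape
    for y in range(H - fh + 1):
        for x in range(W - fw + 1):
            p = image[y:y + fh, x:x + fw]
            p = (p - p.mean()) / (p.std() + 1e-8)
            best = max(best, float((f * p).mean()))
    return best

rng = np.random.default_rng(0)
img = rng.random((16, 16))
frag = img[4:8, 4:8].copy()          # a fragment cut from the image itself
print(fragment_response(img, frag))  # close to 1.0 at the matching location
```

A bank of such fragment detectors, applied to orientation and color maps rather than raw pixels, is one way to picture the shallow V1/V2-to-IT approximation described in the abstract.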

2020 ◽  
Author(s):  
Shijia Fan ◽  
Xiaosha Wang ◽  
Xiaoying Wang ◽  
Tao Wei ◽  
Yanchao Bi

Abstract: Visual object recognition in humans and nonhuman primates is achieved by the ventral visual pathway (ventral occipital-temporal cortex, VOTC), which shows a well-documented object domain structure. An ongoing question has been what type of information processed in higher-order VOTC underlies such observations, with recent evidence suggesting effects of certain visual features. Combining computational vision models, an fMRI experiment using a parametric-modulation approach, and natural image statistics of common objects, we depicted the neural distribution of a comprehensive set of visual features in VOTC, identifying voxel sensitivities to specific feature sets across geometry/shape, Fourier power, and color. The visual feature combination pattern in VOTC is significantly explained by the features' relationships to different types of response-action computation (Fight-or-Flight, Navigation, and Manipulation), as derived from behavioral ratings and natural image statistics. These results offer the first comprehensive visual featural map in VOTC and a plausible theoretical explanation as a mapping onto different types of downstream response-action systems.
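Stripped to its core, the parametric-modulation approach models each voxel's response as a linear function of a per-trial feature value; the fitted weight estimates that voxel's sensitivity to the feature. A minimal GLM sketch with illustrative data (the real analysis convolves regressors with the hemodynamic response function and includes nuisance terms):

```python
import numpy as np

def parametric_modulation_beta(voxel_response, feature_values):
    """Fit voxel response = intercept + beta * feature value and return
    beta, the voxel's sensitivity to the feature. A minimal sketch of a
    parametric modulator in a GLM (data and names are illustrative)."""
    X = np.column_stack([np.ones_like(feature_values), feature_values])
    beta, *_ = np.linalg.lstsq(X, voxel_response, rcond=None)
    return float(beta[1])

# Illustrative trials: a voxel whose response scales with some feature value
feat = np.array([0.1, 0.4, 0.7, 0.2, 0.9, 0.5])
resp = 2.0 + 3.0 * feat                         # noise-free for clarity
print(parametric_modulation_beta(resp, feat))   # recovers the slope, 3.0
```

Mapping such betas across voxels for each feature in the set is one way to picture the "featural map" the abstract describes.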



2020 ◽  
Author(s):  
Haider Al-Tahan ◽  
Yalda Mohsenzadeh

Abstract: While vision evokes a dense network of feedforward and feedback neural processes in the brain, visual processes are primarily modeled with feedforward hierarchical neural networks, leaving the computational role of feedback processes poorly understood. Here, we developed a generative autoencoder neural network model and adversarially trained it on a categorically diverse data set of images. We hypothesized that the feedback processes in the ventral visual pathway can be represented by reconstruction of the visual information performed by the generative model. We compared representational similarity of the activity patterns in the proposed model with temporal (magnetoencephalography) and spatial (functional magnetic resonance imaging) visual brain responses. The proposed generative model identified two segregated neural dynamics in the visual brain: a temporal hierarchy of processes transforming low-level visual information into high-level semantics in the feedforward sweep, and a temporally later dynamic of inverse processes reconstructing low-level visual information from a high-level latent representation in the feedback sweep. Our results add to previous studies on neural feedback processes by presenting new insight into the algorithmic function of, and the information carried by, the feedback processes in the ventral visual pathway. Author summary: It has been shown that the ventral visual cortex consists of a dense network of regions with feedforward and feedback connections. The feedforward path processes visual inputs along a hierarchy of cortical areas that starts in early visual cortex (an area tuned to low-level features, e.g., edges and corners) and ends in inferior temporal cortex (an area that responds to higher-level categorical content, e.g., faces and objects). Conversely, the feedback connections modulate neuronal responses in this hierarchy by broadcasting information from higher to lower areas.
In recent years, deep neural network models trained on object recognition tasks have achieved human-level performance and shown activation patterns similar to those of the visual brain. In this work, we developed a generative neural network model that consists of encoding and decoding sub-networks. By comparing this computational model with the temporal (magnetoencephalography) and spatial (functional magnetic resonance imaging) response patterns of the human brain, we found that the encoder processes resemble the brain's feedforward processing dynamics and that the decoder shares similarity with the brain's feedback processing dynamics. These results provide an algorithmic insight into the spatiotemporal dynamics of feedforward and feedback processes in biological vision.
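The model-to-brain comparison described here rests on representational similarity analysis (RSA): build a representational dissimilarity matrix (RDM) over stimuli for each system, then correlate the two RDMs. A minimal sketch with synthetic data (Pearson correlation is used throughout for simplicity; Spearman is common in practice, and the study's actual pipeline is richer):

```python
import numpy as np

def rdm(patterns):
    """Condensed representational dissimilarity matrix:
    1 - Pearson correlation between every pair of condition patterns."""
    c = np.corrcoef(patterns)
    iu = np.triu_indices_from(c, k=1)
    return 1.0 - c[iu]

def rsa(model_patterns, brain_patterns):
    """Correlate model and brain RDMs -- the core RSA comparison."""
    return float(np.corrcoef(rdm(model_patterns), rdm(brain_patterns))[0, 1])

rng = np.random.default_rng(1)
model = rng.standard_normal((10, 50))                 # 10 stimuli x 50 model units
brain = model + 0.1 * rng.standard_normal((10, 50))   # brain patterns sharing the model geometry
print(round(rsa(model, brain), 3))  # close to 1: shared representational geometry
```

Computing this correlation per MEG time point (or per fMRI region) against the encoder and decoder activations is how feedforward-like and feedback-like dynamics can be teased apart.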


Author(s):  
Daniel Riccio ◽  
Andrea Casanova ◽  
Gianni Fenu

Face recognition in real-world applications is a very difficult task because of image misalignments, pose and illumination variations, and occlusions. Many researchers in this field have investigated both face representation and classification techniques able to deal with these drawbacks. However, none of them is free from limitations. Early algorithms were generally holistic, in the sense that they consider the face as a whole. Recently, challenging benchmarks have demonstrated that these are not adequate for unconstrained environments, despite their good performance in more controlled conditions. Therefore, researchers' attention is now turning to local features, which have been demonstrated to be more robust to a large set of non-monotonic distortions. Nevertheless, although local operators partially overcome some drawbacks, they still raise new questions (e.g., which criteria should be used to select the most representative features?). This is why, among all the others, hybrid approaches are showing high potential in terms of recognition accuracy when applied in uncontrolled settings, as they integrate complementary information from both local and global features. This chapter explores local, global, and hybrid approaches.
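One common way to build a hybrid approach is score-level fusion: match a holistic (global) descriptor and a local descriptor separately, then combine the two similarity scores. The sketch below is illustrative only; the descriptors (e.g., an eigenface projection and a concatenated LBP histogram), their dimensions, and the fusion weight are assumptions, not the chapter's specific method.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def hybrid_score(probe, gallery, w=0.5):
    """Score-level fusion of a holistic match and a local-feature match;
    descriptors and the fusion weight w are illustrative."""
    s_global = cosine(probe["global"], gallery["global"])  # e.g. eigenface projection
    s_local = cosine(probe["local"], gallery["local"])     # e.g. LBP histogram
    return w * s_global + (1 - w) * s_local

rng = np.random.default_rng(2)
enrolled = {"global": rng.random(32), "local": rng.random(64)}
same = {"global": enrolled["global"] + 0.05 * rng.random(32),   # same identity, slight variation
        "local": enrolled["local"] + 0.05 * rng.random(64)}
other = {"global": rng.random(32), "local": rng.random(64)}     # a different identity
print(hybrid_score(same, enrolled) > hybrid_score(other, enrolled))  # True
```

The weight w lets the system lean on the global cue in controlled conditions and on local cues under occlusion, which is the complementarity argument made above.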


2018 ◽  
Vol 30 (11) ◽  
pp. 1590-1605 ◽  
Author(s):  
Alex Clarke ◽  
Barry J. Devereux ◽  
Lorraine K. Tyler

Object recognition requires dynamic transformations of low-level visual inputs into complex semantic representations. Although this process depends on the ventral visual pathway, we lack an incremental account from low-level inputs to semantic representations and the mechanistic details of these dynamics. Here we combine computational models of vision and semantics and test the output of the incremental model against patterns of neural oscillations recorded with magnetoencephalography in humans. Representational similarity analysis showed that visual information was represented in low-frequency activity throughout the ventral visual pathway, and semantic information was represented in theta activity. Furthermore, directed connectivity showed that visual information travels through feedforward connections, whereas visual information is transformed into semantic representations through feedforward and feedback activity centered on the anterior temporal lobe. Our research highlights that the complex transformations between visual and semantic information are driven by feedforward and recurrent dynamics, resulting in object-specific semantics.


2015 ◽  
Vol 15 (7) ◽  
pp. 3 ◽  
Author(s):  
Timothy J. Andrews ◽  
David M. Watson ◽  
Grace E. Rice ◽  
Tom Hartley

2013 ◽  
Vol 25 (8) ◽  
pp. 1261-1269 ◽  
Author(s):  
Yiying Song ◽  
Yu L. L. Luo ◽  
Xueting Li ◽  
Miao Xu ◽  
Jia Liu

Real-world scenes usually contain a set of cluttered and yet contextually related objects. Here we used fMRI to investigate where and how contextually related multiple objects are represented in the human ventral visual pathway. Specifically, we measured the responses in face-selective and body-selective regions along the ventral pathway when faces and bodies were presented either simultaneously or in isolation. We found that, in the posterior regions, the response for the face and body pair was the weighted average of the responses for faces and bodies presented in isolation. In contrast, the anterior regions encoded the face and body pair in a mutually facilitative fashion, with the response for the pair significantly higher than that for its constituent objects. Furthermore, in the right fusiform face area, the face and body pair was represented as one inseparable object, possibly to reduce perceptual load and increase representational efficiency. Therefore, our study suggests that the visual system uses a hierarchical representation scheme to process multiple objects in natural scenes: the averaging mechanism in posterior regions helps retain information about individual objects in clutter, whereas the nonaveraging mechanism in anterior regions uses contextual information to optimize the representation of multiple objects.
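The two response schemes contrasted above can be stated as a toy model: posterior regions predict the pair response as a weighted average of the isolated responses, while the anterior (facilitative) signature is a pair response exceeding both constituents. All values and the weight are illustrative.

```python
def pair_response_average(r_face, r_body, w=0.5):
    """Posterior-region model: response to the face-body pair is a
    weighted average of the isolated responses (w is illustrative)."""
    return w * r_face + (1 - w) * r_body

def is_facilitative(r_pair, r_face, r_body):
    """Anterior-region signature: the pair response exceeds both
    constituent responses."""
    return r_pair > max(r_face, r_body)

print(pair_response_average(2.0, 1.0))  # 1.5 -- between the isolated responses
print(is_facilitative(2.5, 2.0, 1.0))   # True -- mutual facilitation
print(is_facilitative(1.5, 2.0, 1.0))   # False -- consistent with averaging
```

Fitting the weight w per region, and testing whether the observed pair response sits between or above the isolated responses, is the kind of comparison the abstract's conclusions rest on.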


2014 ◽  
Vol 3 (4) ◽  
pp. 325-334 ◽  
Author(s):  
Sheng Huang ◽  
Dan Yang ◽  
Haopeng Zhang ◽  
Luwen Huangfu ◽  
Xiaohong Zhang

2019 ◽  
Author(s):  
Remington Mallett ◽  
Anurima Mummaneni ◽  
Jarrod Lewis-Peacock

Working memory persists in the face of distraction, yet not without consequence. Previous research has shown that memory for low-level visual features is systematically influenced by the maintenance or presentation of a similar distractor stimulus. Responses are frequently biased in stimulus space towards a perceptual distractor, though this has yet to be determined for high-level stimuli. We investigated whether these influences are shared for complex visual stimuli such as faces. To quantify response accuracies for these stimuli, we used a delayed-estimation task with a computer-generated “face space” consisting of eighty faces that varied continuously as a function of age and sex. In a set of three experiments, we found that responses for a target face held in working memory were biased towards a distractor face presented during the maintenance period. The amount of response bias did not vary as a function of distance between target and distractor. Our data suggest that, similar to low-level visual features, high-level face representations in working memory are biased by the processing of related but task-irrelevant information.
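The reported bias can be quantified as the mean signed error, with the sign defined so that positive values indicate a shift toward the distractor. The sketch below assumes a one-dimensional feature axis for simplicity (the study's face space varied continuously in age and sex); the trial values are illustrative.

```python
import numpy as np

def distractor_bias(responses, targets, distractors):
    """Mean signed error toward the distractor: positive values mean
    responses are shifted from the target toward the distractor.
    A 1-D sketch; the actual face space was two-dimensional."""
    responses, targets, distractors = map(np.asarray, (responses, targets, distractors))
    toward = np.sign(distractors - targets)            # direction of the distractor
    return float(np.mean((responses - targets) * toward))

# Illustrative trials: each response nudged slightly toward its distractor
t = [0.2, 0.5, 0.8]      # target positions on the feature axis
d = [0.6, 0.1, 0.9]      # distractor positions
r = [0.25, 0.45, 0.83]   # reported positions
print(distractor_bias(r, t, d))  # positive -> bias toward the distractor
```

Averaging this statistic across trials binned by target-distractor distance is one way to test the abstract's claim that the bias did not vary with distance.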

