Linking image-by-image population dynamics in the macaque inferior temporal cortex to core object recognition behavior

2018 ◽  
Author(s):  
Kohitij Kar ◽  
Kailyn Schmidt ◽  
James DiCarlo
eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Xiaoxuan Jia ◽  
Ha Hong ◽  
Jim DiCarlo

Temporal continuity of object identity is a feature of natural visual input, and is potentially exploited -- in an unsupervised manner -- by the ventral visual stream to build the neural representation in inferior temporal (IT) cortex. Here we investigated whether plasticity of individual IT neurons underlies human core-object-recognition behavioral changes induced with unsupervised visual experience. We built a single-neuron plasticity model combined with a previously established IT population-to-recognition-behavior linking model to predict human learning effects. We found that our model, after constrained by neurophysiological data, largely predicted the mean direction, magnitude and time course of human performance changes. We also found a previously unreported dependency of the observed human performance change on the initial task difficulty. This result adds support to the hypothesis that tolerant core object recognition in human and non-human primates is instructed -- at least in part -- by naturally occurring unsupervised temporal contiguity experience.


2009 ◽  
Vol 102 (1) ◽  
pp. 360-376 ◽  
Author(s):  
Nuo Li ◽  
David D. Cox ◽  
Davide Zoccolan ◽  
James J. DiCarlo

Primates can easily identify visual objects over large changes in retinal position—a property commonly referred to as position “invariance.” This ability is widely assumed to depend on neurons in inferior temporal cortex (IT) that can respond selectively to isolated visual objects over similarly large ranges of retinal position. However, in the real world, objects rarely appear in isolation, and the interplay between position invariance and the representation of multiple objects (i.e., clutter) remains unresolved. At the heart of this issue is the intuition that the representations of nearby objects can interfere with one another and that the large receptive fields needed for position invariance can exacerbate this problem by increasing the range over which interference acts. Indeed, most IT neurons' responses are strongly affected by the presence of clutter. While external mechanisms (such as attention) are often invoked as a way out of the problem, we show (using recorded neuronal data and simulations) that the intrinsic properties of IT population responses, by themselves, can support object recognition in the face of limited clutter. Furthermore, we carried out extensive simulations of hypothetical neuronal populations to identify the essential individual-neuron ingredients of a good population representation. These simulations show that the crucial neuronal property to support recognition in clutter is not preservation of response magnitude, but preservation of each neuron's rank-order object preference under identity-preserving image transformations (e.g., clutter). Because IT neuronal responses often exhibit that response property, while neurons in earlier visual areas (e.g., V1) do not, we suggest that preserving the rank-order object preference regardless of clutter, rather than the response magnitude, more precisely describes the goal of individual neurons at the top of the ventral visual stream.


2002 ◽  
Vol 14 (11) ◽  
pp. 2585-2596 ◽  
Author(s):  
Simon M. Stringer ◽  
Edmund T. Rolls

To form view-invariant representations of objects, neurons in the inferior temporal cortex may associate together different views of an object, which tend to occur close together in time under natural viewing conditions. This can be achieved in neuronal network models of this process by using an associative learning rule with a short-term temporal memory trace. It is postulated that within a view, neurons learn representations that enable them to generalize within variations of that view. When three-dimensional (3D) objects are rotated within small angles (up to, e.g., 30 degrees), their surface features undergo geometric distortion due to the change of perspective. In this article, we show how trace learning could solve the problem of in-depth rotation-invariant object recognition by developing representations of the transforms that features undergo when they are on the surfaces of 3D objects. Moreover, we show that having learned how features on 3D objects transform geometrically as the object is rotated in depth, the network can correctly recognize novel 3D variations within a generic view of an object composed of a new combination of previously learned features. These results are demonstrated in simulations of a hierarchical network model (VisNet) of the visual system that show that it can develop representations useful for the recognition of 3D objects by forming perspective-invariant representations to allow generalization within a generic view.


2004 ◽  
Vol 15 (8) ◽  
pp. 1103-1112 ◽  
Author(s):  
Narihisa Matsumoto ◽  
Masato Okada ◽  
Yasuko Sugase-Miyamoto ◽  
Shigeru Yamane ◽  
Kenji Kawano

Author(s):  
Xiaoxuan Jia ◽  
Ha Hong ◽  
James J. DiCarlo

AbstractTemporal continuity of object identity is a feature of natural visual input, and is potentially exploited -- in an unsupervised manner -- by the ventral visual stream to build the neural representation in inferior temporal (IT) cortex and IT-dependent core object recognition behavior. Here we investigated whether plasticity of individual IT neurons underlies human behavioral changes induced with unsupervised visual experience by building a single-neuron plasticity model combined with a previously established IT population-to-recognition-behavior linking model to predict human learning effects. We found that our model quite accurately predicted the mean direction, magnitude and time course of human performance changes. We also found a previously unreported dependency of the observed human performance change on the initial task difficulty. This result adds support to the hypothesis that tolerant core object recognition in human and non-human primates is instructed -- at least in part -- by naturally occurring unsupervised temporal contiguity experience.


Sign in / Sign up

Export Citation Format

Share Document