Neurally plausible mechanisms for learning selective and invariant representations

2020, Vol 10 (1)
Author(s): Fabio Anselmi, Ankit Patel, Lorenzo Rosasco

Abstract
Coding for visual stimuli in the ventral stream is known to be invariant to object-identity-preserving nuisance transformations. Indeed, much recent theoretical and experimental work suggests that the main challenge for the visual cortex is to build up such nuisance-invariant representations. Recently, artificial convolutional networks have succeeded both in learning such invariant properties and, surprisingly, in predicting cortical responses in macaque and mouse visual cortex with unprecedented accuracy. However, some of the key ingredients that enable such success—supervised learning and the backpropagation algorithm—are neurally implausible. This makes it difficult to relate advances in understanding convolutional networks to the brain. In contrast, many of the existing neurally plausible theories of invariant representations in the brain involve unsupervised learning, and have been strongly tied to specific plasticity rules. To close this gap, we study an instantiation of a simple-complex cell model and show, for a broad class of unsupervised learning rules (including Hebbian learning), that we can learn object representations that are invariant to nuisance transformations belonging to a finite orthogonal group. These findings may have implications for developing neurally plausible theories and models of how the visual cortex or artificial neural networks build selectivity for discriminating objects and invariance to real-world nuisance transformations.
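To make the invariance claim concrete, here is a minimal numerical sketch (not the authors' code; the template, the cyclic-shift group, and the `signature` function are all illustrative choices): pooling the projections of an input onto the orbit of a template under a finite orthogonal group yields a signature that is unchanged when the input itself is transformed.

```python
# Sketch: a group-invariant signature built by pooling projections of an
# input onto the orbit of a template under a finite orthogonal group.
# Cyclic shifts (permutation matrices, hence orthogonal) serve as the group.
import numpy as np

rng = np.random.default_rng(0)
d = 8

def orbit(v):
    """Orbit of v under the cyclic-shift group (each shift is orthogonal)."""
    return np.stack([np.roll(v, k) for k in range(d)])

template = rng.standard_normal(d)   # e.g., learned by an unsupervised rule
x = rng.standard_normal(d)          # input stimulus

def signature(x, template):
    s = orbit(template) @ x         # "simple cells": projections onto transforms
    return np.mean(np.abs(s))       # "complex cell": pooling over the orbit

g_x = np.roll(x, 3)                 # group-transformed version of the input
print(np.isclose(signature(x, template), signature(g_x, template)))  # True
```

Transforming the input only permutes the set of projections, so any pooling function over the orbit (mean, max, moments) gives the same value.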

Author(s): Nicola Strisciuglio, Nicolai Petkov

Abstract
The study of the visual system of the brain has attracted the attention and interest of many neuroscientists, who have derived computational models of some of the types of neurons that compose it. These findings inspired researchers in image processing and computer vision to deploy such models to solve problems of visual data processing.

In this paper, we review approaches for image processing and computer vision whose design is based on neuroscientific findings about the functions of some neurons in the visual cortex. Furthermore, we analyze the connection between the hierarchical organization of the visual system of the brain and the structure of Convolutional Networks (ConvNets). We pay particular attention to the mechanisms of inhibition of the responses of some neurons, which provide the visual system with improved stability to changing input stimuli, and discuss their implementation in image processing operators and in ConvNets.
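As a rough illustration of one such inhibition mechanism, the following sketch implements a push-pull-style operator, in which the response of an excitatory filter is suppressed by the response of a filter of opposite polarity; the kernel construction, the factor `alpha`, and all parameter values are illustrative assumptions, not the specific operators reviewed in the paper.

```python
# Push-pull inhibition sketch: an excitatory (on-polarity) filter response is
# suppressed by an inhibitory (off-polarity) response, which tends to
# stabilize the output under noisy input.
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size=9, theta=0.0, lam=4.0, sigma=2.0, psi=0.0):
    r = np.arange(size) - size // 2
    X, Y = np.meshgrid(r, r)
    xr = X * np.cos(theta) + Y * np.sin(theta)
    yr = -X * np.sin(theta) + Y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam + psi)

def push_pull(image, theta=0.0, alpha=1.0):
    push = gabor_kernel(theta=theta, psi=0.0)       # excitatory polarity
    pull = gabor_kernel(theta=theta, psi=np.pi)     # opposite polarity
    r_push = np.maximum(convolve2d(image, push, mode="same"), 0)
    r_pull = np.maximum(convolve2d(image, pull, mode="same"), 0)
    return np.maximum(r_push - alpha * r_pull, 0)   # inhibited response

noisy_input = np.random.default_rng(0).standard_normal((64, 64))
response = push_pull(noisy_input, theta=np.pi / 4)
```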


2021, Vol 15
Author(s): Edmund T. Rolls

First, neurophysiological evidence for the learning of invariant representations in the inferior temporal visual cortex is described. This includes object and face representations with invariance for position, size, lighting, view and morphological transforms in the temporal lobe visual cortex; global object motion in the cortex in the superior temporal sulcus; and spatial view representations in the hippocampus that are invariant with respect to eye position, head direction, and place. Second, computational mechanisms that enable the brain to learn these invariant representations are proposed. For the ventral visual system, one key adaptation is the use of information available in the statistics of the environment in slow unsupervised learning to learn transform-invariant representations of objects. This contrasts with deep supervised learning in artificial neural networks, which uses training with thousands of exemplars forced into different categories by neuronal teachers. Similar slow learning principles apply to the learning of global object motion in the dorsal visual system leading to the cortex in the superior temporal sulcus. The learning rule that has been explored in VisNet is an associative rule with a short-term memory trace. The feed-forward architecture has four stages, with convergence from stage to stage. This type of slow learning is implemented in the brain in hierarchically organized competitive neuronal networks with convergence from stage to stage, with only 4-5 stages in the hierarchy. Slow learning is also shown to help the learning of coordinate transforms using gain modulation in the dorsal visual system extending into the parietal cortex and retrosplenial cortex. Representations are learned that are in allocentric spatial view coordinates of locations in the world and that are independent of eye position, head direction, and the place where the individual is located. This enables hippocampal spatial view cells to use idiothetic (self-motion) signals for navigation when the view details are obscured for short periods.
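A minimal sketch of such an associative trace rule follows (parameter names and values are illustrative; this is not the VisNet source). The postsynaptic trace carries a short-term memory of recent activity, so the weights come to associate inputs that occur close together in time, such as successive transforms of the same object.

```python
# Trace learning rule sketch: delta_w = alpha * trace * x, where the trace is
# a leaky running average of the postsynaptic rate.
import numpy as np

eta = 0.6       # trace persistence (illustrative)
alpha = 0.05    # learning rate (illustrative)
rng = np.random.default_rng(1)

n_in, n_out = 20, 5
w = rng.random((n_out, n_in)) * 0.1
trace = np.zeros(n_out)

for t in range(100):
    x = rng.random(n_in)                  # presynaptic firing rates
    y = np.maximum(w @ x, 0)              # postsynaptic rates
    trace = (1 - eta) * y + eta * trace   # short-term memory trace
    w += alpha * np.outer(trace, x)       # associative update with the trace
    w /= np.linalg.norm(w, axis=1, keepdims=True)  # keep weights bounded
```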


2015, Vol 113 (9), pp. 3159-3171
Author(s): Caroline D. B. Luft, Alan Meeson, Andrew E. Welchman, Zoe Kourtzi

Learning the structure of the environment is critical for interpreting the current scene and predicting upcoming events. However, the brain mechanisms that support our ability to translate knowledge about scene statistics to sensory predictions remain largely unknown. Here we provide evidence that learning of temporal regularities shapes representations in early visual cortex that relate to our ability to predict sensory events. We tested the participants' ability to predict the orientation of a test stimulus after exposure to sequences of leftward- or rightward-oriented gratings. Using fMRI decoding, we identified brain patterns related to the observers' visual predictions rather than stimulus-driven activity. Decoding of predicted orientations following structured sequences was enhanced after training, while decoding of cued orientations following exposure to random sequences did not change. These predictive representations appear to be driven by the same large-scale neural populations that encode actual stimulus orientation and to be specific to the learned sequence structure. Thus our findings provide evidence that learning temporal structures supports our ability to predict future events by reactivating selective sensory representations as early as in primary visual cortex.
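A hedged sketch of the cross-decoding logic described above: a classifier trained on voxel patterns evoked by actual gratings is tested on patterns recorded while an orientation was merely predicted. The arrays below are simulated stand-ins for preprocessed fMRI data, and all shapes and effect sizes are made up for illustration.

```python
# Cross-decoding sketch: train on stimulus-driven patterns, test on
# prediction-related patterns carrying a weaker version of the same signal.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_trials, n_voxels = 80, 200

# Simulated patterns: label 0 = leftward, 1 = rightward orientation.
stim_y = rng.integers(0, 2, n_trials)
stim_X = rng.standard_normal((n_trials, n_voxels)) + stim_y[:, None] * 0.5
pred_y = rng.integers(0, 2, n_trials)
pred_X = rng.standard_normal((n_trials, n_voxels)) + pred_y[:, None] * 0.3

clf = LinearSVC(dual=False).fit(stim_X, stim_y)  # train on stimulus-driven data
print("cross-decoding accuracy:", clf.score(pred_X, pred_y))
```

Above-chance accuracy here would indicate that the predictive patterns reuse the populations that encode the actual stimulus, which is the inference the study draws.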


2017, Vol 372 (1715), pp. 20160504
Author(s): Megumi Kaneko, Michael P. Stryker

Mechanisms thought of as homeostatic must exist to maintain neuronal activity in the brain within the dynamic range in which neurons can signal. Several distinct mechanisms have been demonstrated experimentally. Three mechanisms that act to restore levels of activity in the primary visual cortex of mice after occlusion and restoration of vision in one eye, which give rise to the phenomenon of ocular dominance plasticity, are discussed. The existence of different mechanisms raises the issue of how these mechanisms operate together to converge on the same set points of activity. This article is part of the themed issue ‘Integrating Hebbian and homeostatic plasticity’.
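As a toy illustration of the set-point idea (an assumption-laden sketch, not one of the experimentally demonstrated mechanisms), homeostatic synaptic scaling can be modeled as a multiplicative weight adjustment that drives average activity back toward a target level after input is reduced:

```python
# Synaptic scaling toward an activity set point, as a toy model.
import numpy as np

target = 1.0    # activity set point (illustrative)
rate = 0.1      # speed of homeostatic adjustment (illustrative)
rng = np.random.default_rng(0)

w = rng.random(50)
inputs = rng.random(50) * 0.2   # weakened input, e.g. after occlusion
for _ in range(200):
    activity = w @ inputs
    w *= 1 + rate * (target - activity) / target  # scale toward the set point
print(round(float(w @ inputs), 3))   # ~1.0: activity restored
```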


2020
Author(s): Yaoda Xu, Maryam Vaziri-Pashkam

Abstract
Any given visual object input is characterized by multiple visual features, such as identity, position and size. Despite the usefulness of identity and nonidentity features in vision and their joint coding throughout the primate ventral visual processing pathway, they have so far been studied relatively independently. Here we document the relative coding strength of object identity and nonidentity features in a brain region and how this may change across the human ventral visual pathway. We examined a total of four nonidentity features, including two Euclidean features (position and size) and two non-Euclidean features (image statistics and the spatial frequency content of an image). Overall, identity representation increased and nonidentity feature representation decreased along the ventral visual pathway, with identity outweighing the non-Euclidean features, but not the Euclidean ones, at higher levels of visual processing. A similar analysis was performed in 14 convolutional neural networks (CNNs) pretrained to perform object categorization, varying in architecture, depth, and the presence or absence of recurrent processing. While the relative coding strength of object identity and nonidentity features in lower CNN layers matched well with that in early human visual areas, the match between higher CNN layers and higher human visual regions was limited. Similar results were obtained regardless of whether a CNN was trained with real-world or stylized object images that emphasized shape representation. Together, by measuring the relative coding strength of object identity and nonidentity features, our approach provides a new tool for characterizing feature coding in the human brain and the correspondence between the brain and CNNs.

Significance Statement
This study documented the relative coding strength of object identity compared to four types of nonidentity features along the human ventral visual processing pathway and compared brain responses with those of 14 CNNs pretrained to perform object categorization. Overall, identity representation increased and nonidentity feature representation decreased along the ventral visual pathway, with the coding strength of the different nonidentity features differing at higher levels of visual processing. While feature coding in lower CNN layers matched well with that of early human visual areas, the match between higher CNN layers and higher human visual regions was limited. Our approach provides a new tool for characterizing feature coding in the human brain and the correspondence between the brain and CNNs.
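One way to make "relative coding strength" concrete (a sketch under assumptions; the paper's actual measure may differ in detail) is to decode an identity feature and a nonidentity feature from the same response patterns and compare the two accuracies:

```python
# Relative coding strength sketch: decode identity and position from the same
# simulated response patterns, then compare decoding accuracies.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, n_units = 120, 150
identity = rng.integers(0, 2, n)   # object identity label
position = rng.integers(0, 2, n)   # nonidentity (Euclidean) feature label

# Simulated responses carry a stronger identity signal than position signal.
X = (rng.standard_normal((n, n_units))
     + identity[:, None] * 0.6
     + position[:, None] * 0.2)

acc_id = cross_val_score(LinearSVC(dual=False), X, identity, cv=5).mean()
acc_pos = cross_val_score(LinearSVC(dual=False), X, position, cv=5).mean()
print("relative identity coding:", acc_id / (acc_id + acc_pos))
```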


2020
Author(s): Yaelan Jung, Dirk B. Walther

Abstract
Natural scenes deliver rich sensory information about the world. Decades of research have shown that the scene-selective network in the visual cortex represents various aspects of scenes. It is, however, unknown how such complex scene information is processed beyond the visual cortex, for example in the prefrontal cortex. It is also unknown how task context impacts the process of scene perception, modulating which scene content is represented in the brain. In this study, we investigate these questions using scene images from four natural scene categories that also depict two types of global scene properties: temperature (warm or cold) and sound level (noisy or quiet). A group of healthy human subjects of both sexes participated in the present fMRI study. Participants viewed scene images under two different task conditions: temperature judgment and sound-level judgment. We analyzed how different scene attributes (scene categories, temperature, and sound-level information) are represented across the brain under these task conditions. Our findings show that global scene properties are represented in the brain, especially in the prefrontal cortex, only when they are task-relevant. Scene categories, in contrast, are represented in the brain, both in the parahippocampal place area and in the prefrontal cortex, regardless of task context. These findings suggest that the prefrontal cortex selectively represents scene content according to task demands, but that this task selectivity depends on the type of scene content: task modulates neural representations of global scene properties but not of scene categories.


2022
Author(s): Andrea Kóbor, Karolina Janacsek, Petra Hermann, Zsofia Zavecz, Vera Varga, ...

Previous research has recognized that humans can extract statistical regularities of the environment to automatically predict upcoming events. However, it has remained unexplored how the brain encodes the distribution of statistical regularities when it continuously changes. To investigate this question, we devised an fMRI paradigm in which participants (N = 32) completed a visual four-choice reaction time (RT) task containing statistical regularities. Two types of blocks involving the same perceptual elements alternated with one another throughout the task: while the distribution of statistical regularities was predictable in one block type, it was unpredictable in the other. Participants were unaware of the presence of statistical regularities and of their changing distribution across the task blocks. Based on the RT results, although statistical regularities were processed similarly in both the predictable and unpredictable blocks, participants acquired less statistical knowledge in the unpredictable than in the predictable blocks. Whole-brain random-effects analyses showed increased activity in the early visual cortex and decreased activity in the precuneus for the predictable as compared with the unpredictable blocks. Therefore, the actual predictability of statistical regularities is likely to be represented already at early stages of visual cortical processing. The decreased precuneus activity, however, suggests that these representations are imperfectly updated to track the multiple shifts in predictability throughout the task. The results also highlight that the processing of statistical regularities in a changing environment may be habitual.
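For intuition, here is an illustrative generator for such a four-choice sequence (the pattern, block length, and alternation scheme are assumptions, not the study's exact design): in the predictable block type, deterministic pattern elements alternate with random ones, so some short chunks of the sequence occur more often than others.

```python
# Illustrative four-choice sequence with embedded statistical regularities.
import random

def make_block(n_trials=80, pattern=(0, 2, 1, 3), predictable=True):
    seq, i = [], 0
    for t in range(n_trials):
        if predictable and t % 2 == 0:
            seq.append(pattern[i % len(pattern)])  # deterministic pattern element
            i += 1
        else:
            seq.append(random.randrange(4))        # random element
    return seq

random.seed(0)
predictable_block = make_block(predictable=True)
unpredictable_block = make_block(predictable=False)
```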


Author(s): A. Jayanthiladevi, S. Murugan, K. Manivel

Today, images and image sequences (videos) make up about 80% of all corporate and public unstructured big data. As the volume of unstructured data grows, analytical systems must assimilate and interpret images and videos as well as they interpret structured data such as text and numbers. To a human, an image is a set of signals sensed by the eye and processed by the visual cortex in the brain, creating a vivid experience of a scene that is instantly associated with concepts and objects previously perceived and recorded in memory. To a computer, an image is either a raster image or a vector image. Simply put, a raster image is a sequence of pixels with discrete numerical values for color; a vector image is a set of color-annotated polygons. To perform analytics on images or videos, this geometric encoding must be transformed into constructs depicting the physical features, objects, and movement represented by the image or video. This chapter explores text, image, and video analytics in fog computing.
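A minimal sketch of the two encodings just described, with a raster image as a grid of discrete pixel values and a vector image as color-annotated polygons (the dictionary layout is an illustrative choice):

```python
# Raster vs. vector encodings of the same red square.
import numpy as np

# Raster: height x width x RGB channels, one discrete value per pixel.
raster = np.zeros((64, 64, 3), dtype=np.uint8)
raster[16:48, 16:48] = (255, 0, 0)   # a red square, stored pixel by pixel

# Vector: shapes as coordinate lists plus a color annotation.
vector = [
    {"polygon": [(16, 16), (48, 16), (48, 48), (16, 48)], "color": (255, 0, 0)},
]
```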


2000, Vol 23 (4), pp. 550-551
Author(s): Mikhail N. Zhadin

The absence of a clear influence of an animal's behavioral responses on Hebbian associative learning in the cerebral cortex requires some changes to the Hebbian learning rules. The participation of the brain's monoaminergic systems in Hebbian associative learning is considered.
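One common way to formalize such a change (a sketch of a so-called three-factor rule, offered as an assumption rather than as Zhadin's specific proposal) is to let a monoaminergic signal gate the Hebbian pre-post correlation, so that weights change only when the behavioral outcome is reinforced:

```python
# Three-factor Hebbian rule sketch: delta_w = eta * modulator * post * pre.
import numpy as np

rng = np.random.default_rng(0)
eta = 0.01
w = rng.random(10) * 0.1

for _ in range(100):
    pre = rng.random(10)                 # presynaptic activity
    post = float(w @ pre)                # postsynaptic activity
    modulator = rng.choice([0.0, 1.0])   # monoaminergic reinforcement signal
    w += eta * modulator * post * pre    # Hebbian term gated by the modulator
```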


2020, Vol 34 (04), pp. 4215-4222
Author(s): Binyuan Hui, Pengfei Zhu, Qinghua Hu

Graph convolutional networks (GCNs) have achieved promising performance in attributed graph clustering and semi-supervised node classification because they are capable of modeling complex graphical structure and jointly learning both the features and the relations of nodes. Inspired by the success of unsupervised learning in the training of deep models, we ask whether graph-based unsupervised learning can collaboratively boost the performance of semi-supervised learning. In this paper, we propose a multi-task graph learning model, called collaborative graph convolutional networks (CGCN). CGCN is composed of an attributed graph clustering network and a semi-supervised node classification network. Because Gaussian mixture models can effectively discover inherent complex data distributions, a new end-to-end attributed graph clustering network is designed by combining a variational graph auto-encoder with Gaussian mixture models (GMM-VGAE) rather than the classic k-means. If the pseudo-label of an unlabeled sample assigned by GMM-VGAE is consistent with the prediction of the semi-supervised GCN, the sample is selected to further boost the performance of semi-supervised learning with the help of its pseudo-label. Extensive experiments on benchmark graph datasets validate the superiority of our proposed GMM-VGAE over state-of-the-art attributed graph clustering networks. The performance of node classification is greatly improved by our proposed CGCN, which verifies that graph-based unsupervised learning can be well exploited to enhance the performance of semi-supervised learning.
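A hedged sketch of the pseudo-label selection step follows (variable names are placeholders for the two models' outputs, and it assumes cluster indices have already been matched to class labels, e.g., by Hungarian matching): an unlabeled node joins the training set only when GMM-VGAE and the GCN agree on its label.

```python
# Pseudo-label agreement sketch: keep unlabeled nodes where the clustering
# network and the semi-supervised classifier assign the same label.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_classes = 100, 4
gmm_vgae_labels = rng.integers(0, n_classes, n_nodes)   # cluster pseudo-labels
gcn_logits = rng.standard_normal((n_nodes, n_classes))  # GCN class scores
gcn_preds = gcn_logits.argmax(axis=1)
unlabeled = np.arange(20, n_nodes)                      # nodes without labels

agree = unlabeled[gmm_vgae_labels[unlabeled] == gcn_preds[unlabeled]]
pseudo_labels = gmm_vgae_labels[agree]
# `agree` and `pseudo_labels` would augment the labeled set for the next
# round of semi-supervised training.
```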

