The Geometry of Concept Learning

2021 ◽  
Author(s):  
Ben Sorscher ◽  
Surya Ganguli ◽  
Haim Sompolinsky

Understanding the neural basis of our remarkable cognitive capacity to accurately learn novel high-dimensional naturalistic concepts from just one or a few sensory experiences constitutes a fundamental problem. We propose a simple, biologically plausible, mathematically tractable, and computationally powerful neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts we can learn given few examples are defined by tightly circumscribed manifolds in the neural firing rate space of higher order sensory areas. We further posit that a single plastic downstream neuron can learn such concepts from few examples using a simple plasticity rule. We demonstrate the computational power of our simple proposal by showing it can achieve high few-shot learning accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network models of these representations, and can even learn novel visual concepts specified only through language descriptions. Moreover, we develop a mathematical theory of few-shot learning that links neurophysiology to behavior by delineating several fundamental and measurable geometric properties of high-dimensional neural representations that can accurately predict the few-shot learning performance of naturalistic concepts across all our experiments. We discuss several implications of our theory for past and future studies in neuroscience, psychology and machine learning.
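The proposed mechanism, a single downstream neuron learning a concept readout from one or a few examples, is closely related to prototype (nearest-centroid) few-shot classification. Below is a minimal sketch on synthetic Gaussian "representations"; the prototype rule, dimensions, and noise levels are our illustrative stand-ins, not the authors' exact plasticity rule or data:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512          # dimensionality of the representation space
k_shot = 5       # examples per novel concept

# Synthetic stand-ins for two concept manifolds: Gaussian clouds
# around random centers (real data would be IT or DNN features).
mu_a, mu_b = rng.normal(size=d), rng.normal(size=d)
train_a = mu_a + 0.5 * rng.normal(size=(k_shot, d))
train_b = mu_b + 0.5 * rng.normal(size=(k_shot, d))

# Hebbian-like "plasticity rule": the readout weight is the
# difference of the class prototypes (mean training responses).
w = train_a.mean(axis=0) - train_b.mean(axis=0)
b = -w @ (train_a.mean(axis=0) + train_b.mean(axis=0)) / 2

def classify(x):
    """Return True if x is assigned to concept A."""
    return w @ x + b > 0

test_a = mu_a + 0.5 * rng.normal(size=(200, d))
test_b = mu_b + 0.5 * rng.normal(size=(200, d))
acc = (np.mean([classify(x) for x in test_a]) +
       np.mean([not classify(x) for x in test_b])) / 2
print(f"few-shot accuracy: {acc:.2f}")
```

In high dimensions, well-separated concept manifolds make even this five-example centroid rule highly accurate, which is the intuition the paper's geometric theory makes quantitative.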

2020 ◽  
Vol 31 (3) ◽  
pp. 287-296
Author(s):  
Ahmed A. Moustafa ◽  
Angela Porter ◽  
Ahmed M. Megreya

Abstract
Many students suffer from anxiety when performing numerical calculations. Mathematics anxiety is a condition that has a negative effect on educational outcomes and future employment prospects. While there are a multitude of behavioral studies on mathematics anxiety, its underlying cognitive and neural mechanisms remain unclear. This article provides a systematic review of cognitive studies that investigated mathematics anxiety. As there are no prior neural network models of mathematics anxiety, this article discusses how previous neural network models of mathematical cognition could be adapted to simulate the neural and behavioral studies of mathematics anxiety. In other words, we provide a novel integrative network theory of the links between mathematics anxiety, cognition, and brain substrates. This theoretical framework may explain the impact of mathematics anxiety on a range of cognitive and neuropsychological tests, and it could therefore improve our understanding of the cognitive and neurological mechanisms underlying mathematics anxiety. It also has important applications: a better understanding of mathematics anxiety could inform more effective therapeutic techniques, which in turn could lead to significant improvements in educational outcomes.


2000 ◽  
Vol 10 (01) ◽  
pp. 59-70 ◽  
Author(s):  
JONATHAN A. MARSHALL ◽  
VISWANATH SRIKANTH

Existing neural network models are capable of tracking linear trajectories of moving visual objects. This paper describes an additional neural mechanism, disfacilitation, that enhances the ability of a visual system to track curved trajectories. The added mechanism combines information about an object's trajectory with information about changes in the object's trajectory, to improve the estimates for the object's next probable location. Computational simulations are presented that show how the neural mechanism can learn to track the speed of objects and how the network operates to predict the trajectories of accelerating and decelerating objects.
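The prediction step has a simple non-neural analogue: combine the current step of the trajectory (velocity information) with the change in step (trajectory-change information) to extrapolate the next location. A minimal numeric sketch of that computation, not the paper's neural implementation:

```python
import numpy as np

def predict_next(positions):
    """Estimate an object's next location from its last three observed
    positions, combining trajectory information (the current step) with
    changes in trajectory (the change in step between observations)."""
    p0, p1, p2 = [np.asarray(p, dtype=float) for p in positions[-3:]]
    v = p2 - p1            # current step (velocity times the time step)
    dv = v - (p1 - p0)     # change in step (curvature of the path)
    return p2 + v + dv

# An object on a curved (parabolic) path: y = x**2.
path = [(x, x**2) for x in range(5)]
pred = predict_next(path)
print(pred)   # exactly (5, 25) on this quadratic path
```

Ignoring the `dv` term recovers a linear-trajectory tracker, which overshoots on curves; the correction term is what the disfacilitation mechanism contributes in neural form.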


Nanophotonics ◽  
2017 ◽  
Vol 6 (3) ◽  
pp. 561-576 ◽  
Author(s):  
Guy Van der Sande ◽  
Daniel Brunner ◽  
Miguel C. Soriano

Abstract
We review a novel paradigm that has emerged in analogue neuromorphic optical computing. The goal is to implement a reservoir computer in optics, where information is encoded in the intensity and phase of the optical field. Reservoir computing is a bio-inspired approach especially suited for processing time-dependent information. The reservoir’s complex and high-dimensional transient response to the input signal is capable of universal computation. The reservoir does not need to be trained, which makes it very well suited for optics. As such, much of the promise of photonic reservoirs lies in their minimal hardware requirements, a tremendous advantage over other hardware-intensive neural network models. We review the two main approaches to optical reservoir computing: networks implemented with multiple discrete optical nodes and the continuous system of a single nonlinear device coupled to delayed feedback.
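Although the reservoirs reviewed here are optical, the computational scheme can be sketched numerically as an echo state network: a fixed random reservoir that is never trained (mirroring the physical medium) and a linear readout fit by ridge regression. The sizes, spectral radius, and sine task below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n_res, n_in = 200, 1

# Fixed random reservoir: never trained, as in the optical case where
# the physical medium supplies the high-dimensional transient response.
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # echo state property

def run_reservoir(u):
    """Drive the reservoir with input sequence u; return all states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in @ np.atleast_1d(u_t))
        states.append(x.copy())
    return np.array(states)

# Task: predict u(t+1) for a sine input. Only the linear readout
# W_out is trained, by ridge regression on the reservoir states.
t = np.arange(400)
u = np.sin(0.2 * t)
X = run_reservoir(u[:-1])
y = u[1:]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
pred = X @ W_out
rmse = np.sqrt(np.mean((pred[200:] - y[200:]) ** 2))
print("test RMSE:", rmse)
```

Confining training to the linear readout is exactly what makes the scheme attractive in optics: the hard-to-modify physical system stays fixed, and only a cheap digital regression is learned.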


2020 ◽  
Vol 25 (1) ◽  
pp. 7 ◽  
Author(s):  
Abdel-Rahman Hedar ◽  
Wael Deabes ◽  
Majid Almaraashi ◽  
Hesham H. Amin

Enhancing Evolutionary Algorithms (EAs) with mathematical elements contributes significantly to their development and helps control the randomness they experience. Automating the primary process steps of EAs also remains one of the hardest problems; in particular, EAs still lack robust automatic termination criteria. Furthermore, the highly random behavior of some evolutionary operations should be controlled, and the methods should invoke advanced learning processes and elements. Accordingly, this research focuses on automating and controlling the search process of EAs using sensing and mathematical mechanisms. These mechanisms provide the search process with the memories and conditions needed to adapt to diversification and intensification opportunities. Moreover, a new quadratic coding and a quadratic search operator are invoked to increase the possibilities for local search improvement. The suggested quadratic search operator uses both regression and Radial Basis Function (RBF) neural network models. Two evolutionary-based methods, built on genetic algorithms and evolution strategies, are proposed to evaluate the performance of the suggested enhancing elements. Results show that both the regression-based and RBF-based quadratic techniques could help approximate high-dimensional functions using a few adjustable parameters for each type of function, and that the automatic termination criteria allow the search process to stop appropriately.
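As an illustration of the kind of surrogate model such a search operator can rely on, here is a minimal Gaussian-RBF least-squares fit of a simple test function. The function, center placement, and width are illustrative choices, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf_fit(X, y, centers, width):
    """Fit a Gaussian radial-basis-function model by least squares
    and return it as a callable predictor."""
    def features(Z):
        d2 = np.sum((Z[:, None, :] - centers[None]) ** 2, axis=2)
        return np.exp(-d2 / (2 * width ** 2))
    w, *_ = np.linalg.lstsq(features(X), y, rcond=None)
    return lambda Z: features(Z) @ w

# Approximate the sphere function f(x) = sum(x_i^2) with only
# 15 adjustable weights (one per RBF center).
f = lambda X: np.sum(X ** 2, axis=1)
X = rng.uniform(-1, 1, size=(100, 2))
centers = rng.uniform(-1, 1, size=(15, 2))
model = rbf_fit(X, f(X), centers, width=0.6)

X_test = rng.uniform(-1, 1, size=(50, 2))
err = np.max(np.abs(model(X_test) - f(X_test)))
print("max abs error:", err)
```

A surrogate of this form is cheap to refit as the population moves, which is what lets a quadratic/RBF search operator propose improving points without extra evaluations of an expensive objective.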


2019 ◽  
Author(s):  
Yue Liu ◽  
Marc W. Howard

Abstract
Sequential neural activity has been observed in many parts of the brain and has been proposed as a neural mechanism for memory. The natural world expresses temporal relationships at a wide range of scales. Because we cannot know the relevant scales a priori, it is desirable that memory, and thus the generated sequences, are scale-invariant. Although recurrent neural network models have been proposed as a mechanism for generating sequences, the requirements for scale-invariant sequences are not known. This paper reports the constraints that enable a linear recurrent neural network model to generate scale-invariant sequential activity. A straightforward eigendecomposition analysis results in two independent conditions that are required for scale invariance for connectivity matrices with real, distinct eigenvalues. First, the eigenvalues of the network must be geometrically spaced. Second, the eigenvectors must be related to one another via translation. These constraints are easily generalizable for matrices that have complex and distinct eigenvalues. Analogous albeit less compact constraints hold for matrices with degenerate eigenvalues. These constraints, along with considerations on initial conditions, provide a general recipe to build linear recurrent neural networks that support scale-invariant sequential activity.


2018 ◽  
Author(s):  
Michael L. Waskom ◽  
Roozbeh Kiani

Summary
When multiple pieces of information bear on a decision, the best approach is to combine the evidence provided by each one. Evidence integration models formalize the computations underlying this process [1–3], explain human perceptual discrimination behavior [4–11], and correspond to neuronal responses elicited by discrimination tasks [12–17]. These findings indicate that evidence integration is key to understanding the neural basis of decision-making [18–21]. Evidence integration has most often been studied with simple tasks that limit the timescale of deliberation to hundreds of milliseconds, but many natural decisions unfold over much longer durations. Because neural network models imply acute limitations on the timescale of evidence integration [22–26], it is unknown whether current computational insights can generalize beyond rapid judgments. Here, we introduce a new psychophysical task and report model-based analyses of human behavior that demonstrate evidence integration at long timescales. Our task requires probabilistic inference using brief samples of visual evidence that are separated in time by long and unpredictable gaps. We show through several quantitative assays how decision-making can approximate a normative integration process that extends over tens of seconds without accruing significant memory leak or noise. These results support the generalization of evidence integration models to a broader class of behaviors while posing new challenges for models of how these computations are implemented in biological networks.
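The normative integration process the task probes can be sketched as a running sum of log-likelihood ratios: a leak-free accumulator is unaffected by gaps between samples, while a leaky one forgets across them. A toy simulation with illustrative parameters, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

mu = 0.3            # evidence strength favoring hypothesis H+
n_samples = 12      # brief samples, arbitrarily spaced in time

def decide(leak=0.0):
    """One trial: accumulate the LLR of each sample, then choose.
    With leak=0 this is the normative running sum; a nonzero leak
    discards part of the decision variable between samples."""
    dv = 0.0
    for _ in range(n_samples):
        llr = rng.normal(mu, 1.0)     # LLR carried by one visual sample
        dv = (1 - leak) * dv + llr    # leaky accumulation
    return dv > 0

# Over many trials the leak-free integrator approaches normative
# accuracy; a strong leak, which forgets across gaps, does worse.
acc_perfect = np.mean([decide(leak=0.0) for _ in range(2000)])
acc_leaky = np.mean([decide(leak=0.8) for _ in range(2000)])
print(acc_perfect, acc_leaky)
```

The study's central behavioral claim corresponds to the leak-free regime: human choices over tens of seconds resemble the running sum, not the forgetful accumulator.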


2020 ◽  
Vol 34 (02) ◽  
pp. 1324-1331
Author(s):  
Arthur Williams ◽  
Joshua Phillips

Transfer learning allows for knowledge to generalize across tasks, resulting in increased learning speed and/or performance. These tasks must have commonalities that allow for knowledge to be transferred. The main goal of transfer learning in the reinforcement learning domain is to train and learn on one or more source tasks in order to learn a target task that exhibits better performance than if transfer was not used (Taylor and Stone 2009). Furthermore, the use of output-gated neural network models of working memory has been shown to increase generalization for supervised learning tasks (Kriete and Noelle 2011; Kriete et al. 2013). We propose that working memory-based generalization plays a significant role in a model's ability to transfer knowledge successfully across tasks. Thus, we extended the Holographic Working Memory Toolkit (HWMtk) (Dubois and Phillips 2017; Phillips and Noelle 2005) to utilize the generalization benefits of output gating within a working memory system. Finally, the model's utility was tested on a temporally extended, partially observable 5x5 2D grid-world maze task that required the agent to learn 3 tasks over the duration of the training period. The results indicate that the addition of output gating increases the initial learning performance of an agent in target tasks and decreases the learning time required to reach a fixed performance threshold.


The process of assigning the weight to each connection is called training. A network can be subject to supervised or unsupervised training. In this chapter, supervised and unsupervised learning are explained, and then supervised training approaches such as the multilayer perceptron (MLP) trained with Back Propagation (BP) are introduced. The unsupervised training algorithm, namely Kohonen's self-organizing map (SOM), is introduced as one of the most popular neural network models. SOMs convert high-dimensional, non-linear statistical relationships into simple geometric relationships in an n-dimensional array.
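The SOM idea can be made concrete in a few lines: each node of a 2-D grid holds a weight vector, the best-matching unit (BMU) and its grid neighbors are pulled toward each sample, and the map's quantization error falls as it organizes. A minimal sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(4)
grid = 10

# One 3-D weight vector per map node, plus each node's grid coordinates.
W = rng.uniform(size=(grid, grid, 3))
coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                              indexing="ij"), axis=-1)

def quant_error(W, X):
    """Mean squared distance from each sample to its best-matching unit."""
    return np.mean([np.min(np.sum((W - x) ** 2, axis=2)) for x in X])

def train(W, X, epochs=20, lr0=0.5, sigma0=3.0):
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)                  # decaying learning rate
        sigma = max(sigma0 * (1 - e / epochs), 0.5)  # shrinking neighborhood
        for x in X:
            # Best-matching unit: node whose weights lie closest to x.
            bmu = np.unravel_index(
                np.argmin(np.sum((W - x) ** 2, axis=2)), (grid, grid))
            # Gaussian neighborhood pulls the BMU and its grid
            # neighbors toward the sample, preserving map topology.
            d2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
            W += lr * np.exp(-d2 / (2 * sigma ** 2))[..., None] * (x - W)

X = rng.uniform(size=(300, 3))
e_before = quant_error(W, X)
train(W, X)
e_after = quant_error(W, X)
print(e_before, e_after)
```

The neighborhood update is what distinguishes a SOM from plain vector quantization: because grid neighbors move together, nearby nodes end up representing nearby inputs, which is exactly the conversion of statistical relationships into geometric ones described above.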


2020 ◽  
Vol 32 (7) ◽  
pp. 1379-1407
Author(s):  
Yue Liu ◽  
Marc W. Howard

Sequential neural activity has been observed in many parts of the brain and has been proposed as a neural mechanism for memory. The natural world expresses temporal relationships at a wide range of scales. Because we cannot know the relevant scales a priori, it is desirable that memory, and thus the generated sequences, is scale invariant. Although recurrent neural network models have been proposed as a mechanism for generating sequences, the requirements for scale-invariant sequences are not known. This letter reports the constraints that enable a linear recurrent neural network model to generate scale-invariant sequential activity. A straightforward eigendecomposition analysis results in two independent conditions that are required for scale invariance for connectivity matrices with real, distinct eigenvalues. First, the eigenvalues of the network must be geometrically spaced. Second, the eigenvectors must be related to one another via translation. These constraints are easily generalizable for matrices that have complex and distinct eigenvalues. Analogous albeit less compact constraints hold for matrices with degenerate eigenvalues. These constraints, along with considerations on initial conditions, provide a general recipe to build linear recurrent neural networks that support scale-invariant sequential activity.
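The two conditions admit a direct construction. Below is a minimal sketch using an illustrative two-entry eigenvector template [1, -1]; the template choice is ours, not the letter's. With geometrically spaced eigenvalues and translated eigenvectors, neuron i+1's time course is an exact time-rescaled copy of neuron i's:

```python
import numpy as np

n = 8
alpha = 0.7                              # geometric ratio of eigenvalues
lams = -(alpha ** np.arange(n))          # lambda_k = -alpha**k, geometric
V = np.eye(n) - np.eye(n, k=-1)          # column k: template [1, -1] shifted down by k
W = V @ np.diag(lams) @ np.linalg.inv(V) # connectivity for dx/dt = W x

x0 = V.sum(axis=1)                       # unit weight on every eigenmode

def activity(t):
    """x(t) = V exp(Dt) V^{-1} x(0), the solution of dx/dt = W x."""
    return V @ (np.exp(lams * t) * np.linalg.solve(V, x0))

t = np.linspace(0.0, 20.0, 200)
X = np.array([activity(ti) for ti in t])            # (time, neuron)
Xs = np.array([activity(alpha * ti) for ti in t])   # rescaled time axis
diff = np.max(np.abs(X[:, 4] - Xs[:, 3]))
print(diff)   # ~0: neuron 4's time course equals neuron 3's, rescaled
```

Each neuron here carries the difference of two adjacent exponential modes, so its activity rises and then falls; translation of the eigenvectors shifts which pair of modes a neuron sees, and the geometric eigenvalue spacing turns that shift into a pure rescaling of time, which is the scale invariance the letter characterizes.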


2020 ◽  
Author(s):  
Yang Liu ◽  
Hansaim Lim ◽  
Lei Xie

Abstract
Motivation: Drug discovery is time-consuming and costly. Machine learning, especially deep learning, shows great potential for accelerating the drug discovery process and reducing its cost. A big challenge in developing robust and generalizable deep learning models for drug design is the lack of a large amount of data with high-quality, balanced labels. To address this challenge, we developed a self-training method, PLANS, that exploits millions of unlabeled chemical compounds as well as partially labeled pharmacological data to improve the performance of neural network models.
Results: We evaluated self-training with PLANS on the Cytochrome P450 binding-activity prediction task and showed that our method could improve the performance of the neural network model by a large margin. Compared with the baseline deep neural network model, the PLANS-trained neural network model improved accuracy, precision, recall, and F1 score by 13.4%, 12.5%, 8.3%, and 10.3%, respectively. Self-training with PLANS is model-agnostic and can be applied to any deep learning architecture. Thus, PLANS provides a general solution for utilizing unlabeled and partially labeled data to improve predictive modeling for drug discovery.
Availability: The code that implements PLANS is available at https://github.com/XieResearchGroup/PLANS
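PLANS's specifics are in the paper, but the generic self-training pattern it builds on (train on the labeled set, pseudo-label the most confident unlabeled points, fold them in, retrain) can be sketched compactly. Logistic regression on synthetic 2-D data stands in for the neural network; all parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

def fit_logreg(X, y, epochs=200, lr=0.1):
    """Plain gradient-ascent logistic regression (no intercept)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w += lr * X.T @ (y - p) / len(y)
    return w

def self_train(X_lab, y_lab, X_unlab, rounds=3, thresh=0.9):
    """Iteratively add confidently pseudo-labeled points and retrain."""
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        w = fit_logreg(X, y)
        p = 1 / (1 + np.exp(-X_unlab @ w))
        confident = (p > thresh) | (p < 1 - thresh)
        X = np.vstack([X_lab, X_unlab[confident]])
        y = np.concatenate([y_lab, (p[confident] > 0.5).astype(float)])
    return fit_logreg(X, y)

# Two Gaussian classes; only 10 labeled points, 300 unlabeled.
mu = np.array([1.5, 1.5])
X_all = np.vstack([rng.normal(mu, 1, (255, 2)),
                   rng.normal(-mu, 1, (255, 2))])
y_all = np.concatenate([np.ones(255), np.zeros(255)])
idx = rng.permutation(510)
lab, unlab, test = idx[:10], idx[10:310], idx[310:]
w = self_train(X_all[lab], y_all[lab], X_all[unlab])
acc = np.mean((X_all[test] @ w > 0) == (y_all[test] == 1))
print(f"test accuracy: {acc:.2f}")
```

The confidence threshold is the key knob: too low and label noise propagates into later rounds, too high and the unlabeled pool contributes nothing, which is why methods like PLANS add structure beyond this bare loop.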

