Rich and lazy learning of task representations in brains and neural networks

AbstractHow do neural populations code for multiple, potentially conflicting tasks? Here, we used computational simulations involving neural networks to define “lazy” and “rich” coding solutions to this multitasking problem, which trade off learning speed for robustness. During lazy learning the input dimensionality is expanded by random projections to the network hidden layer, whereas in rich learning hidden units acquire structured representations that privilege relevant over irrelevant features. For context-dependent decision-making, one rich solution is to project task representations onto low-dimensional and orthogonal manifolds. Using behavioural testing and neuroimaging in humans, and analysis of neural signals from macaque prefrontal cortex, we report evidence for neural coding patterns in biological brains whose dimensionality and neural geometry are consistent with the rich learning regime.

Download Full-text

Convergence Behavior of DNNs with Mutual-Information-Based Regularization

Entropy ◽

10.3390/e22070727 ◽

2020 ◽

Vol 22 (7) ◽

pp. 727 ◽

Cited By ~ 1

Author(s):

Hlynur Jónsson ◽

Giovanni Cherubini ◽

Evangelos Eleftheriou

Keyword(s):

Neural Networks ◽

Mutual Information ◽

Low Complexity ◽

High Dimensional ◽

Test Accuracy ◽

Compression Phase ◽

Hidden Layer ◽

Low Dimensional ◽

Fully Connected ◽

Fully Connected Networks

Information theory concepts are leveraged with the goal of better understanding and improving Deep Neural Networks (DNNs). The information plane of neural networks describes the behavior during training of the mutual information at various depths between input/output and hidden-layer variables. Previous analysis revealed that most of the training epochs are spent on compressing the input, in some networks where finiteness of the mutual information can be established. However, the estimation of mutual information is nontrivial for high-dimensional continuous random variables. Therefore, the computation of the mutual information for DNNs and its visualization on the information plane mostly focused on low-complexity fully connected networks. In fact, even the existence of the compression phase in complex DNNs has been questioned and viewed as an open problem. In this paper, we present the convergence of mutual information on the information plane for a high-dimensional VGG-16 Convolutional Neural Network (CNN) by resorting to Mutual Information Neural Estimation (MINE), thus confirming and extending the results obtained with low-dimensional fully connected networks. Furthermore, we demonstrate the benefits of regularizing a network, especially for a large number of training epochs, by adopting mutual information estimates as additional terms in the loss function characteristic of the network. Experimental results show that the regularization stabilizes the test accuracy and significantly reduces its variance.

Download Full-text

Gearbox Fault Detection Using Neural Networks Technology

Volume 7A: 17th Biennial Conference on Mechanical Vibration and Noise ◽

10.1115/detc99/vib-8329 ◽

1999 ◽

Author(s):

Qiao Sun ◽

Xiaolei Li ◽

Baoyun Xu

Keyword(s):

Genetic Algorithm ◽

Neural Networks ◽

Feature Extraction ◽

Fault Diagnosis ◽

Learning Algorithm ◽

Extraction Methods ◽

Bp Network ◽

Speed Increase ◽

Learning Speed ◽

Hidden Layer

Abstract This paper describes the application of neural networks to gearbox fault diagnosis. In order to increase learning speed of BP network, a modified learning algorithm was presented. Considering of the difficulty of choosing neural networks’ architecture, genetic algorithm was employed. The discussion of the effect of hidden layer nodes shows that with the increase of the number of nodes, the learning speed increase also yet result in poor generalization ability. The test of fault tolerance ability tells that neural networks have ‘bench type’ tolerance ability. This ensures that when signals were contaminated by noise or feature extraction methods were not effective, the result can still be acceptable. To test the performance of the application of neural networks on gearbox fault diagnosis, experiments of single fault and multi-faults were both implemented and diagnosed by neural networks. The results were satisfied.

Download Full-text

Analysis of Non-Linear Activation Functions for Classification Tasks Using Convolutional Neural Networks

Recent Patents on Computer Science ◽

10.2174/2213275911666181025143029 ◽

2019 ◽

Vol 12 (3) ◽

pp. 156-161 ◽

Cited By ~ 3

Author(s):

Aman Dureja ◽

Payal Pahwa

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Primary Objective ◽

Experimental Comparison ◽

Activation Functions ◽

Practical Applications ◽

Network Activation ◽

Non Linear ◽

Hidden Layer

Background: In making the deep neural network, activation functions play an important role. But the choice of activation functions also affects the network in term of optimization and to retrieve the better results. Several activation functions have been introduced in machine learning for many practical applications. But which activation function should use at hidden layer of deep neural networks was not identified. Objective: The primary objective of this analysis was to describe which activation function must be used at hidden layers for deep neural networks to solve complex non-linear problems. Methods: The configuration for this comparative model was used by using the datasets of 2 classes (Cat/Dog). The number of Convolutional layer used in this network was 3 and the pooling layer was also introduced after each layer of CNN layer. The total of the dataset was divided into the two parts. The first 8000 images were mainly used for training the network and the next 2000 images were used for testing the network. Results: The experimental comparison was done by analyzing the network by taking different activation functions on each layer of CNN network. The validation error and accuracy on Cat/Dog dataset were analyzed using activation functions (ReLU, Tanh, Selu, PRelu, Elu) at number of hidden layers. Overall the Relu gave best performance with the validation loss at 25th Epoch 0.3912 and validation accuracy at 25th Epoch 0.8320. Conclusion: It is found that a CNN model with ReLU hidden layers (3 hidden layers here) gives best results and improve overall performance better in term of accuracy and speed. These advantages of ReLU in CNN at number of hidden layers are helpful to effectively and fast retrieval of images from the databases.

Download Full-text

Hardware implementation of radial-basis neural networks with Gaussian activation functions on FPGA

Neural Computing and Applications ◽

10.1007/s00521-021-05706-3 ◽

2021 ◽

Author(s):

Volodymyr Shymkovych ◽

Sergii Telenyk ◽

Petro Kravets

Keyword(s):

Neural Networks ◽

Hardware Implementation ◽

Gaussian Function ◽

Activation Function ◽

Rbf Neural Networks ◽

Activation Functions ◽

Rbf Network ◽

Combination Scheme ◽

Radial Basis ◽

Hidden Layer

AbstractThis article introduces a method for realizing the Gaussian activation function of radial-basis (RBF) neural networks with their hardware implementation on field-programmable gaits area (FPGAs). The results of modeling of the Gaussian function on FPGA chips of different families have been presented. RBF neural networks of various topologies have been synthesized and investigated. The hardware component implemented by this algorithm is an RBF neural network with four neurons of the latent layer and one neuron with a sigmoid activation function on an FPGA using 16-bit numbers with a fixed point, which took 1193 logic matrix gate (LUTs—LookUpTable). Each hidden layer neuron of the RBF network is designed on an FPGA as a separate computing unit. The speed as a total delay of the combination scheme of the block RBF network was 101.579 ns. The implementation of the Gaussian activation functions of the hidden layer of the RBF network occupies 106 LUTs, and the speed of the Gaussian activation functions is 29.33 ns. The absolute error is ± 0.005. The Spartan 3 family of chips for modeling has been used to get these results. Modeling on chips of other series has been also introduced in the article. RBF neural networks of various topologies have been synthesized and investigated. Hardware implementation of RBF neural networks with such speed allows them to be used in real-time control systems for high-speed objects.

Download Full-text

Deep learning for AI

Communications of the ACM ◽

10.1145/3448250 ◽

2021 ◽

Vol 64 (7) ◽

pp. 58-65

Author(s):

Yoshua Bengio ◽

Yann Lecun ◽

Geoffrey Hinton

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Internal Representations ◽

The Rich

How can neural networks learn the rich internal representations required for difficult tasks such as recognizing objects or understanding language?

Download Full-text

Air quality prediction in Milan: feed-forward neural networks, pruned neural networks and lazy learning

Ecological Modelling ◽

10.1016/j.ecolmodel.2005.01.008 ◽

2005 ◽

Vol 185 (2-4) ◽

pp. 513-529 ◽

Cited By ~ 126

Author(s):

Giorgio Corani

Keyword(s):

Neural Networks ◽

Air Quality ◽

Quality Prediction ◽

Lazy Learning ◽

Feed Forward ◽

Air Quality Prediction ◽

Feed Forward Neural Networks

Download Full-text

Exploiting heterogeneity in operational neural networks by synaptic plasticity

Neural Computing and Applications ◽

10.1007/s00521-020-05543-w ◽

2021 ◽

Author(s):

Serkan Kiranyaz ◽

Junaid Malik ◽

Habib Ben Abdallah ◽

Turker Ince ◽

Alexandros Iosifidis ◽

...

Keyword(s):

Neural Networks ◽

Synaptic Plasticity ◽

Network Model ◽

Neuron Model ◽

Linear Operators ◽

Training Data ◽

Learning Performance ◽

Minimal Network ◽

Hidden Layer ◽

Hidden Neurons

AbstractThe recently proposed network model, Operational Neural Networks (ONNs), can generalize the conventional Convolutional Neural Networks (CNNs) that are homogenous only with a linear neuron model. As a heterogenous network model, ONNs are based on a generalized neuron model that can encapsulate any set of non-linear operators to boost diversity and to learn highly complex and multi-modal functions or spaces with minimal network complexity and training data. However, the default search method to find optimal operators in ONNs, the so-called Greedy Iterative Search (GIS) method, usually takes several training sessions to find a single operator set per layer. This is not only computationally demanding, also the network heterogeneity is limited since the same set of operators will then be used for all neurons in each layer. To address this deficiency and exploit a superior level of heterogeneity, in this study the focus is drawn on searching the best-possible operator set(s) for the hidden neurons of the network based on the “Synaptic Plasticity” paradigm that poses the essential learning theory in biological neurons. During training, each operator set in the library can be evaluated by their synaptic plasticity level, ranked from the worst to the best, and an “elite” ONN can then be configured using the top-ranked operator sets found at each hidden layer. Experimental results over highly challenging problems demonstrate that the elite ONNs even with few neurons and layers can achieve a superior learning performance than GIS-based ONNs and as a result, the performance gap over the CNNs further widens.

Download Full-text

Task representations in neural networks trained to perform many cognitive tasks

Nature Neuroscience ◽

10.1038/s41593-018-0310-2 ◽

2019 ◽

Vol 22 (2) ◽

pp. 297-306 ◽

Cited By ~ 61

Author(s):

Guangyu Robert Yang ◽

Madhura R. Joglekar ◽

H. Francis Song ◽

William T. Newsome ◽

Xiao-Jing Wang

Keyword(s):

Neural Networks ◽

Cognitive Tasks ◽

Task Representations

Download Full-text

CHAOS AND HYPERCHAOS IN A CLASS OF SIMPLE CELLULAR NEURAL NETWORKS MODELED BY O.D.E.

International Journal of Bifurcation and Chaos ◽

10.1142/s0218127406016409 ◽

2006 ◽

Vol 16 (09) ◽

pp. 2729-2736 ◽

Cited By ~ 5

Author(s):

XIAO-SONG YANG ◽

YAN HUANG

Keyword(s):

Neural Networks ◽

Differential Equations ◽

Ordinary Differential Equations ◽

Lyapunov Exponents ◽

Cellular Neural Networks ◽

New Class ◽

Connection Matrices ◽

Low Dimensional

This paper presents a new class of chaotic and hyperchaotic low dimensional cellular neural networks modeled by ordinary differential equations with some simple connection matrices. The chaoticity of these neural networks is indicated by positive Lyapunov exponents calculated by a computer.

Download Full-text

Generalization-Based Acquisition of Training Data for Motor Primitive Learning by Neural Networks

Applied Sciences ◽

10.3390/app11031013 ◽

2021 ◽

Vol 11 (3) ◽

pp. 1013

Author(s):

Zvezdan Lončarević ◽

Rok Pahič ◽

Aleš Ude ◽

Andrej Gams

Keyword(s):

Neural Networks ◽

Dimensionality Reduction ◽

Gaussian Process Regression ◽

Search Space ◽

Robot Learning ◽

Training Data ◽

Practical Applications ◽

Latent Space ◽

Real Robot ◽

Low Dimensional

Autonomous robot learning in unstructured environments often faces the problem that the dimensionality of the search space is too large for practical applications. Dimensionality reduction techniques have been developed to address this problem and describe motor skills in low-dimensional latent spaces. Most of these techniques require the availability of a sufficiently large database of example task executions to compute the latent space. However, the generation of many example task executions on a real robot is tedious, and prone to errors and equipment failures. The main result of this paper is a new approach for efficient database gathering by performing a small number of task executions with a real robot and applying statistical generalization, e.g., Gaussian process regression, to generate more data. We have shown in our experiments that the data generated this way can be used for dimensionality reduction with autoencoder neural networks. The resulting latent spaces can be exploited to implement robot learning more efficiently. The proposed approach has been evaluated on the problem of robotic throwing at a target. Simulation and real-world results with a humanoid robot TALOS are provided. They confirm the effectiveness of generalization-based database acquisition and the efficiency of learning in a low-dimensional latent space.

Download Full-text