The information theory of developmental pruning: Optimizing global network architectures using local synaptic rules

2021
Vol 17 (10)
pp. e1009458
Author(s):  
Carolin Scholl ◽  
Michael E. Rule ◽  
Matthias H. Hennig

During development, biological neural networks produce more synapses and neurons than needed. Many of these synapses and neurons are later removed in a process known as neural pruning. Why networks should initially be over-populated, and the processes that determine which synapses and neurons are ultimately pruned, remain unclear. We study the mechanisms and significance of neural pruning in model neural networks. In a deep Boltzmann machine model of sensory encoding, we find that (1) synaptic pruning is necessary to learn efficient network architectures that retain computationally-relevant connections, (2) pruning by synaptic weight alone does not optimize network size, and (3) pruning based on a locally-available measure of importance derived from Fisher information allows the network to identify structurally important vs. unimportant connections and neurons. This locally-available measure of importance has a biological interpretation in terms of the correlations between presynaptic and postsynaptic neurons, and implies an efficient activity-driven pruning rule. Overall, we show how local activity-dependent synaptic pruning can solve the global problem of optimizing a network architecture. We relate these findings to biology as follows: (I) Synaptic over-production is necessary for activity-dependent connectivity optimization. (II) In networks that have more neurons than needed, cells compete for activity, and only the most important and selective neurons are retained. (III) Cells may also be pruned due to a loss of synapses on their axons. This occurs when the information they convey is not relevant to the target population.
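As a rough illustration of such a locally available, activity-dependent pruning rule, the sketch below scores each synapse by the variability of its presynaptic-postsynaptic coincidence statistics and removes the lowest-scoring fraction. The data, network sizes, and scoring formula are illustrative stand-ins, not the paper's exact Fisher-information construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: binary pre/post activity recorded over T samples
# (a stand-in for samples from a trained Boltzmann machine).
T, n_pre, n_post = 5000, 20, 10
W = rng.normal(0, 1, size=(n_pre, n_post))            # synaptic weights
pre = (rng.random((T, n_pre)) < 0.3).astype(float)    # presynaptic spikes
post = (rng.random((T, n_post)) < 0.3).astype(float)  # postsynaptic spikes

# Locally available importance proxy: variability of the pre*post
# coincidence term, one ingredient of the Fisher-information-based
# measure described in the abstract (illustrative, not the exact formula).
coincidence = pre[:, :, None] * post[:, None, :]      # (T, n_pre, n_post)
importance = coincidence.var(axis=0)

# Prune the fraction of synapses with the lowest importance.
prune_frac = 0.5
threshold = np.quantile(importance, prune_frac)
mask = importance > threshold
W_pruned = W * mask
print(f"kept {mask.mean():.0%} of synapses")
```

The point of the sketch is only that the importance signal is computable from locally observable activity, so each synapse can in principle decide its own fate.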


Author(s):  
Ergin Kilic ◽  
Melik Dolen

This study focuses on slip prediction in a cable-drum system using artificial neural networks, with the prospect of developing a linear-motion sensing scheme for such mechanisms. Both feed-forward and recurrent artificial neural network architectures are considered to capture the slip dynamics of cable-drum mechanisms. In the article, the network development is presented in a progressive (step-by-step) fashion, both to make the design process transparent to readers and to highlight the challenges associated with the design phase (i.e. selection of architecture, network size, training process parameters, etc.). Prediction performances of the devised networks are evaluated rigorously via an experimental study. Finally, a structured neural network, which embodies the network with the best prediction performance, is further developed to overcome the drift observed at low velocity. The study illustrates that the resulting structured neural network can predict the slip in the mechanism within an error band of 100 µm when an absolute reference is utilized.
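A minimal sketch of the kind of recurrent predictor the study considers is given below, assuming slip is regressed from a short window of drum signals; the input features, window length, and layer sizes are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

# Minimal sketch of a recurrent slip predictor; inputs such as drum
# velocity/position histories and all sizes are illustrative assumptions.
class SlipRNN(nn.Module):
    def __init__(self, n_inputs=2, hidden=16):
        super().__init__()
        self.rnn = nn.RNN(n_inputs, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # predicted slip at the last step

    def forward(self, x):                  # x: (batch, time, n_inputs)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])

model = SlipRNN()
x = torch.randn(8, 50, 2)                  # 8 windows of 50 time steps
slip_hat = model(x)                        # (8, 1)
print(slip_hat.shape)
```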


10.28945/2931
2005
Author(s):  
Mohammed A. Otair ◽  
Walid A. Salameh

There are many successful applications of backpropagation (BP) for training multilayer neural networks. However, BP has notable shortcomings: learning often takes a long time to converge, and it may fall into local minima. One possible remedy for escaping local minima is to use a very small learning rate, which, however, slows down the learning process. The algorithm proposed in this study trains a multilayer neural network with a very small learning rate, especially when using a large training set. It can be applied in a generic manner for any network size that uses a backpropagation algorithm through an optical time (seen time). The paper describes the proposed algorithm and how it can improve the performance of backpropagation. The feasibility of the proposed algorithm is demonstrated through a number of experiments on different network architectures.
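For context, a baseline sketch of standard backpropagation training with a very small learning rate, the regime the proposed algorithm is meant to accelerate, might look as follows (the proposed speed-up itself is not reproduced here):

```python
import torch
import torch.nn as nn

# Standard backpropagation baseline with a very small learning rate.
# Architecture, data, and hyperparameters are illustrative assumptions.
model = nn.Sequential(nn.Linear(10, 32), nn.Sigmoid(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-4)  # very small step size
loss_fn = nn.MSELoss()

X = torch.randn(256, 10)
y = torch.randn(256, 1)
for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()     # backpropagation of the error signal
    opt.step()          # tiny weight update; slow but stable convergence
```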


Author(s):  
Shiva Prasad Kasiviswanathan ◽  
Nina Narodytska ◽  
Hongxia Jin

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: given a target network architecture, can we design a "smaller" network architecture that "approximates" the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever-increasing storage and memory requirements of these networks pose a problem in resource-constrained environments. In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.
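A generic flavor of the idea, sketched below for a single fully connected layer: replace the dense weight matrix W with a small trainable matrix applied after a fixed random sketch of the input. This uses a plain Gaussian random projection as a stand-in for the paper's randomized tensor sketch, and the fitting step is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Replace y = W x with y ≈ U (S x), where S is a fixed random sketching
# matrix and U is a small trainable matrix (here fit by least squares).
d_in, d_out, k = 1024, 512, 64               # k << d_in controls compression
W = rng.normal(size=(d_out, d_in))           # original dense layer
S = rng.normal(size=(k, d_in)) / np.sqrt(k)  # fixed random sketch
U = W @ np.linalg.pinv(S)                    # least-squares fit of U S ≈ W

x = rng.normal(size=d_in)
y_full = W @ x
y_sketch = U @ (S @ x)

orig_params = W.size
sketch_params = U.size                       # S is regenerable from a seed
print(orig_params, sketch_params, np.linalg.norm(y_full - y_sketch))
```

In a real compressed network, U would be trained directly on the task rather than fit to a pre-trained W, which is what makes the smaller network trainable end to end.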


2017
Vol 115 (2)
pp. 254-259
Author(s):  
Daniël M. Pelt ◽  
James A. Sethian

Deep convolutional neural networks have been successfully applied to many image-processing problems in recent work. Popular network architectures often add operations and connections to the standard architecture to enable training deeper networks. To achieve accurate results in practice, a large number of trainable parameters are often required. Here, we introduce a network architecture based on using dilated convolutions to capture features at different image scales and densely connecting all feature maps with each other. The resulting architecture is able to achieve accurate results with relatively few parameters, and consists of a single set of operations, making it easier to implement, train, and apply in practice; it also adapts automatically to different problems. We compare results of the proposed network architecture with popular existing architectures on several segmentation problems, showing that the proposed architecture achieves accurate results with fewer parameters and a reduced risk of overfitting the training data.
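A minimal sketch of such a densely connected, dilated-convolution architecture is shown below; the depth, width, and dilation schedule are illustrative assumptions rather than the exact published configuration.

```python
import torch
import torch.nn as nn

# Densely connected dilated-convolution block: each layer sees all earlier
# feature maps (including the input) and uses a different dilation to
# capture a different image scale. Sizes are illustrative.
class MixedScaleDense(nn.Module):
    def __init__(self, in_ch=1, width=1, depth=8):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for i in range(depth):
            d = (i % 8) + 1                  # cycle through dilations 1..8
            self.layers.append(
                nn.Conv2d(ch, width, 3, padding=d, dilation=d))
            ch += width                      # dense connectivity grows input
        self.final = nn.Conv2d(ch, 1, 1)     # 1x1 convolution as output layer

    def forward(self, x):
        feats = [x]
        for conv in self.layers:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return self.final(torch.cat(feats, dim=1))

net = MixedScaleDense()
out = net(torch.randn(1, 1, 64, 64))         # spatial size is preserved
print(out.shape)
```

Because every feature map is reused by all later layers, the width per layer can stay very small, which is where the parameter savings come from.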


2007
Vol 362 (1479)
pp. 403-410
Author(s):  
Raffaele Calabretta

The aim of this paper is to propose an interdisciplinary evolutionary-connectionism approach to the study of the evolution of modularity. It is argued that neural networks as a model of the nervous system, and genetic algorithms as simulative models of biological evolution, allow us to formulate a clear and operative definition of module and to simulate the different evolutionary scenarios proposed for the origin of modularity. I present a recent model in which the evolution of primate cortical visual streams is possible starting from non-modular neural networks. Simulation results not only confirm the existence of the phenomenon of neural interference in non-modular network architectures but also, for the first time, reveal the existence of another kind of interference at the genetic level, i.e. genetic interference, a new population-genetic mechanism that is independent of the network architecture. Our simulations clearly show that genetic interference reduces the evolvability of visual neural networks, and that sexual reproduction can at least partially solve the problem of genetic interference. Finally, it is shown that entrusting the task of finding the neural network architecture to evolution, and that of finding the network connection weights to learning, is a way to completely avoid the problem of genetic interference. On the basis of this evidence, it is possible to formulate a new hypothesis on the origin of structural modularity, and thus to overcome the traditional dichotomy between innatist and empiricist theories of mind.
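The division of labor described in the last step, evolution searching over architectures while learning fits the connection weights, can be caricatured in a few lines; the genome encoding, fitness function, and "learning" step below are toy stand-ins, not the model used in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy split: a genetic algorithm evolves architectures (binary connection
# masks), while weights on the enabled connections are fit by learning
# (here, a least-squares solve). Entirely illustrative.
X = rng.normal(size=(200, 8))
true_w = np.array([1.0, -2.0, 0, 0, 3.0, 0, 0, 0])
y = X @ true_w + 0.1 * rng.normal(size=200)

def fitness(mask):
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return -np.inf
    w, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)  # "learning"
    err = np.mean((X[:, cols] @ w - y) ** 2)
    return -err - 0.01 * cols.size       # penalize dense architectures

pop = rng.integers(0, 2, size=(30, 8))   # population of genomes
for gen in range(50):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]        # select the fittest
    children = parents[rng.integers(0, 10, size=30)].copy()
    flips = rng.random(children.shape) < 0.05      # mutation
    children[flips] ^= 1
    pop = children

best = pop[np.argmax([fitness(m) for m in pop])]
print("evolved connection mask:", best)
```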


2021
Vol 3 (1)
pp. 84-94
Author(s):  
Liang Zhang ◽  
Jingqun Li ◽  
Bin Zhou ◽  
Yan Jia

Identifying fake news on social media has become an important issue, especially given the wide spread of rumors on popular social networks such as Twitter. Various techniques have been proposed for automatic rumor detection. In this work, we study the application of graph neural networks to rumor classification at a lower level, instead of applying existing neural network architectures wholesale to detect rumors. The responses to true rumors and false rumors display distinct characteristics, which suggests that it is essential to capture such interactions effectively for a deep learning network to achieve better rumor detection performance. To this end, we present a simplified aggregation graph neural network architecture. Experiments on publicly available Twitter datasets demonstrate that the proposed network performs on a par with or even better than state-of-the-art graph convolutional networks, while significantly reducing the computational complexity.
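A minimal sketch of what a simplified neighborhood-aggregation layer can look like is given below: node features are mean-aggregated over a reply graph and passed through a single linear transform. The graph, feature sizes, and single-layer design are assumptions for illustration, not the proposed architecture.

```python
import torch

# Simplified aggregation over a reply cascade: average each node's
# neighborhood features, then apply one linear transform.
def aggregate(A, X, W):
    # A: (n, n) adjacency with self-loops, X: (n, d) features, W: (d, k)
    deg = A.sum(dim=1, keepdim=True)
    return torch.relu((A @ X) / deg @ W)   # mean-aggregate, then transform

n, d, k = 5, 8, 2                          # 5 posts in a reply cascade
A = torch.eye(n)
A[0, 1:] = A[1:, 0] = 1.0                  # replies attach to the source post
X = torch.randn(n, d)                      # e.g., encoded post/response text
W = torch.randn(d, k)
H = aggregate(A, X, W)
logits = H.mean(dim=0)                     # pool to a cascade-level score
print(logits)                              # e.g., rumor vs. non-rumor
```

Dropping per-layer learned attention and deep stacking in favor of fixed mean aggregation is the kind of simplification that reduces computational cost.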


Author(s):  
Prince M Abudu

Applications that require heterogeneous sensor deployments continue to face practical challenges owing to resource constraints within their operating environments (i.e. energy efficiency, computational power, and reliability). This has motivated the need for effective ways of selecting a sensing strategy that maximizes detection accuracy for events of interest using available resources and data-driven approaches. Motivated by these limitations, we ask a fundamental question: can state-of-the-art Recurrent Neural Networks observe different series of data and communicate their hidden states to collectively solve an objective in a distributed fashion? We answer it by conducting a series of systematic analyses of a Communicating Recurrent Neural Network architecture across varying time steps, objective functions, and numbers of nodes. The experimental setup we employ models tasks synonymous with those in Wireless Sensor Networks. Our results show that Recurrent Neural Networks can communicate through their hidden states, and we achieve promising results.
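One simple reading of such a Communicating Recurrent Neural Network, sketched below, is a pair of recurrent cells that each observe their own series and receive the other's hidden state as extra input at every step; the cell type, sizes, and wiring are assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn

# Two recurrent "sensor nodes" that exchange hidden states each step.
hidden = 8
cell_a = nn.GRUCell(1 + hidden, hidden)    # own input + other node's state
cell_b = nn.GRUCell(1 + hidden, hidden)

T = 20
xa, xb = torch.randn(T, 1, 1), torch.randn(T, 1, 1)  # two sensor streams
ha = torch.zeros(1, hidden)
hb = torch.zeros(1, hidden)
for t in range(T):
    ha_next = cell_a(torch.cat([xa[t], hb], dim=1), ha)  # A hears B
    hb_next = cell_b(torch.cat([xb[t], ha], dim=1), hb)  # B hears A
    ha, hb = ha_next, hb_next

joint = torch.cat([ha, hb], dim=1)  # combined state for a shared objective
print(joint.shape)
```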


2020
Vol 498 (3)
pp. 3713-3719
Author(s):  
M A Aragon-Calvo ◽  
J C Carvajal

Estimating the parameters of a model describing a set of observations using a neural network is, in general, solved in a supervised way. In cases where we do not have access to the model's true parameters, this approach cannot be applied. Standard unsupervised learning techniques, on the other hand, do not produce meaningful or semantic representations that can be associated with the model's parameters. Here we introduce a novel self-supervised hybrid network architecture that combines traditional neural network elements with analytic or numerical models, which represent a physical process to be learned by the system. Self-supervised learning is achieved by generating an internal representation equivalent to the parameters of the physical model. This semantic representation is used to evaluate the model and compare it to the input data during training. The semantic autoencoder architecture described here shares the robustness of neural networks while including an explicit model of the data, learns in an unsupervised way, and estimates, by construction, parameters with a direct physical interpretation. As an illustrative application, we perform unsupervised learning for 2D model fitting of exponential light profiles and evaluate the performance of the network as a function of network size and noise.
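The following sketch illustrates the semantic-autoencoder idea under stated assumptions: an encoder maps an image to two physically meaningful parameters of a 2D exponential light profile, and the analytic profile itself serves as the decoder, so the network trains without parameter labels. The parametrization, encoder, and optimizer settings are illustrative.

```python
import torch
import torch.nn as nn

# Semantic autoencoder sketch: encoder -> physical parameters -> analytic
# model -> reconstruction loss. No parameter labels are needed.
N = 32
yy, xx = torch.meshgrid(torch.arange(N, dtype=torch.float32),
                        torch.arange(N, dtype=torch.float32), indexing="ij")

def exponential_profile(params):      # params: (batch, 2) = (log I0, log r_s)
    I0 = params[:, 0:1].exp()
    rs = params[:, 1:2].exp()
    r = ((xx - N / 2) ** 2 + (yy - N / 2) ** 2).sqrt().flatten()
    return I0 * torch.exp(-r / rs)    # (batch, N*N) flattened image

encoder = nn.Sequential(nn.Linear(N * N, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

img = exponential_profile(torch.tensor([[1.0, 1.5]])) \
      + 0.05 * torch.randn(1, N * N)  # noisy synthetic observation
for step in range(200):
    opt.zero_grad()
    params = encoder(img)             # semantic bottleneck
    loss = ((exponential_profile(params) - img) ** 2).mean()
    loss.backward()
    opt.step()
print(encoder(img).detach())          # recovered (log I0, log r_s)
```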


2020
Author(s):  
Orlando Gonçalves Brito ◽  
Valter Carvalho de Andrade Júnior ◽  
Alcinei Mistico Azevedo ◽  
Maria Thereza Netta Lopes Silva ◽  
Ludimila Geiciane de Sá ◽  
...  

The objective of this study was to assess the genetic divergence among kale genotypes, to propose a methodology based on SOM-type neural networks, and to test its efficiency through Anderson's discriminant analysis. We evaluated 33 half-sibling families of kale and three commercial cultivars. The design was a randomized block with four replications and six plants per plot. A total of 14 plant-level quantitative traits were evaluated. Genetic values were predicted at the family level via REML/BLUP. For the study of divergence, neural networks of the SOM (Self-Organizing Map) type were adopted. We evaluated different network architectures, and the consistency of the resulting clusters was assessed by Anderson's discriminant analysis and by the number of empty clusters. After selecting the best network configuration, a dissimilarity matrix was obtained, from which a dendrogram was constructed using the UPGMA method. The best network architecture was formed by five rows and one column, totaling five neurons and consequently five clusters. The greatest dissimilarity was established between clusters I and V. Crosses between the genotypes of cluster I and those belonging to clusters III and V are the most recommended, since they recombine families with characteristics of interest to the breeding program and high dissimilarity. Anderson's discriminant analysis showed that the genotype classification was 100% correct, indicating the efficiency of the methodology used.
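A toy version of the SOM clustering step, matching the five-neuron (5x1) architecture reported as best, might look like the sketch below; the data are synthetic stand-ins for the predicted genetic values of the 14 traits.

```python
import numpy as np

rng = np.random.default_rng(3)

# 5x1 self-organizing map: five neurons on a line, each genotype's trait
# vector is assigned to its best-matching unit (cluster). Synthetic data.
n_genotypes, n_traits, n_units = 36, 14, 5
data = rng.normal(size=(n_genotypes, n_traits))
weights = rng.normal(size=(n_units, n_traits))

lr0, sigma0, iters = 0.5, 1.0, 2000
for t in range(iters):
    x = data[rng.integers(n_genotypes)]
    bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
    lr = lr0 * np.exp(-t / iters)                      # decaying learning rate
    sigma = sigma0 * np.exp(-t / iters)                # shrinking neighborhood
    dist = np.abs(np.arange(n_units) - bmu)            # 1D map topology
    h = np.exp(-dist ** 2 / (2 * sigma ** 2))
    weights += lr * h[:, None] * (x - weights)         # pull units toward x

clusters = [int(np.argmin(((weights - x) ** 2).sum(axis=1))) for x in data]
print(np.bincount(clusters, minlength=n_units))        # cluster sizes
```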

