Reverse-Engineering Neural Networks to Characterize Their Cost Functions

2020 ◽  
Vol 32 (11) ◽  
pp. 2085-2121
Author(s):  
Takuya Isomura ◽  
Karl Friston

This letter considers a class of biologically plausible cost functions for neural networks, where the same cost function is minimized by both neural activity and plasticity. We show that such cost functions can be cast as a variational bound on model evidence under an implicit generative model. Using generative models based on partially observed Markov decision processes (POMDP), we show that neural activity and plasticity perform Bayesian inference and learning, respectively, by maximizing model evidence. Using mathematical and numerical analyses, we establish the formal equivalence between neural network cost functions and variational free energy under some prior beliefs about latent states that generate inputs. These prior beliefs are determined by particular constants (e.g., thresholds) that define the cost function. This means that the Bayes optimal encoding of latent or hidden states is achieved when the network's implicit priors match the process that generates its inputs. This equivalence is potentially important because it suggests that any hyperparameter of a neural network can itself be optimized—by minimization with respect to variational free energy. Furthermore, it enables one to characterize a neural network formally, in terms of its prior beliefs.
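As a rough illustration of the quantity at stake, the sketch below computes variational free energy for a single discrete observation under a toy two-state generative model; the likelihood matrix A, prior D, and posterior q are illustrative placeholders, not anything specified in the letter.

```python
import numpy as np

# Hypothetical discrete generative model (illustrative, not the authors' code):
# likelihood matrix A gives p(o|s); D is the prior over hidden states.
A = np.array([[0.9, 0.2],    # p(o=0 | s)
              [0.1, 0.8]])   # p(o=1 | s)
D = np.array([0.5, 0.5])     # prior belief about hidden states

def free_energy(q, o):
    """Variational free energy F = E_q[ln q(s)] - E_q[ln p(o, s)].

    F upper-bounds surprise: F >= -ln p(o), with equality at the exact posterior.
    """
    joint = A[o] * D                     # p(o, s) for the observed outcome o
    return np.sum(q * (np.log(q) - np.log(joint)))

o = 0                                    # a single observed outcome
post = A[o] * D / np.sum(A[o] * D)       # exact posterior over hidden states
print(free_energy(post, o), -np.log(np.sum(A[o] * D)))   # the two values coincide
```

At the exact posterior the free energy equals the surprise, which is the sense in which minimizing such a cost maximizes model evidence.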

2019 ◽  
Author(s):  
Takuya Isomura ◽  
Karl Friston

Abstract: This work considers a class of biologically plausible cost functions for neural networks, where the same cost function is minimised by both neural activity and plasticity. We show that such cost functions can be cast as a variational bound on model evidence under an implicit generative model. Using generative models based on Markov decision processes (MDP), we show, analytically, that neural activity and plasticity perform Bayesian inference and learning, respectively, by maximising model evidence. Using mathematical and numerical analyses, we then confirm that biologically plausible cost functions—used in neural networks—correspond to variational free energy under some prior beliefs about the prevalence of latent states that generate inputs. These prior beliefs are determined by particular constants (i.e., thresholds) that define the cost function. This means that the Bayes optimal encoding of latent or hidden states is achieved when, and only when, the network’s implicit priors match the process that generates the inputs. Our results suggest that when a neural network minimises its cost function, it is implicitly minimising variational free energy under optimal or sub-optimal prior beliefs. This insight is potentially important because it suggests that any free parameter of a neural network’s cost function can itself be optimised—by minimisation with respect to variational free energy.
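To illustrate the central claim that one cost function can drive both inference and learning, here is a minimal predictive-coding-style sketch (an assumption of this note, not the authors' derivation): neural activity and synaptic weights both perform gradient descent on the same free-energy-like cost, at fast and slow rates respectively.

```python
import numpy as np

rng = np.random.default_rng(0)
o = rng.normal(size=4)                 # sensory input (illustrative)
W = rng.normal(size=(4, 2)) * 0.1      # synaptic weights (generative mapping)
x = np.zeros(2)                        # neural activity encoding hidden states
prior = np.zeros(2)                    # prior expectation about hidden states

def F(x, W):
    """A predictive-coding style cost: prediction error plus deviation from the prior."""
    return 0.5 * np.sum((o - W @ x) ** 2) + 0.5 * np.sum((x - prior) ** 2)

for step in range(200):
    eps = o - W @ x                         # prediction error
    x += 0.1 * (W.T @ eps - (x - prior))    # neural activity: fast descent on F (inference)
    W += 0.01 * np.outer(eps, x)            # plasticity: slow descent on F (learning)
print(F(x, W))                              # the shared cost decreases under both updates
```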


2002 ◽  
Vol 12 (01) ◽  
pp. 31-43 ◽  
Author(s):  
GARY YEN ◽  
HAIMING LU

In this paper, we propose a genetic-algorithm-based design procedure for a multi-layer feed-forward neural network. A hierarchical genetic algorithm is used to evolve both the neural network's topology and weighting parameters. Compared with traditional genetic-algorithm-based designs for neural networks, the hierarchical approach addresses several deficiencies, including the feasibility check highlighted in the literature. A multi-objective cost function is used herein to optimize the performance and topology of the evolved neural network simultaneously. In the prediction of the Mackey–Glass chaotic time series, the networks designed by the proposed approach prove to be competitive with, or even superior to, multi-layer perceptron and radial-basis-function networks trained by traditional learning algorithms. Based upon the chosen cost function, a linear weight-combination decision-making approach has been applied to derive an approximated Pareto-optimal solution set. Designing a set of neural networks can therefore be considered as solving a two-objective optimization problem.
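As a concrete sketch of the linear weight-combination step (illustrative only; the objective definitions, normalization, and weighting scheme are assumptions rather than the paper's exact formulation), sweeping the scalarization weight over candidate networks traces out an approximated Pareto front:

```python
import numpy as np

def scalarized_fitness(pred_error, n_connections, w, max_connections=200):
    """Linear weight combination of the two objectives (both minimized):
    prediction error and topology complexity. w lies in [0, 1]."""
    complexity = n_connections / max_connections            # normalize to [0, 1]
    return w * pred_error + (1.0 - w) * complexity

# Candidate (error, connection-count) pairs for evolved networks; values are illustrative.
candidates = [(0.08, 150), (0.12, 90), (0.20, 40)]
for w in np.linspace(0.1, 0.9, 5):
    best = min(candidates, key=lambda c: scalarized_fitness(c[0], c[1], w))
    print(f"w = {w:.1f} -> error {best[0]}, connections {best[1]}")
```

The non-dominated candidates selected across the sweep approximate the Pareto-optimal solution set described in the abstract.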


2017 ◽  
Author(s):  
Michelle J Wu ◽  
Johan OL Andreasson ◽  
Wipapat Kladwang ◽  
William J Greenleaf ◽  
Rhiju Das ◽  
...  

Abstract: RNA is a functionally versatile molecule that plays key roles in genetic regulation and in emerging technologies to control biological processes. Computational models of RNA secondary structure are well developed but often fall short in making quantitative predictions of the behavior of multi-RNA complexes. Recently, large datasets characterizing hundreds of thousands of individual RNA complexes have emerged as rich sources of information about RNA energetics. Meanwhile, advances in machine learning have enabled the training of complex neural networks from large datasets. Here, we assess whether a recurrent neural network model, Ribonet, can learn from high-throughput binding data, using simulation and experimental studies not only to test model accuracy but also to determine whether the model learned meaningful information about the biophysics of RNA folding. We began by evaluating the model on energetic values predicted by the Turner model to assess whether the neural network could learn a representation that recovered known biophysical principles. First, we trained Ribonet to predict the simulated free energy of an RNA in complex with multiple input RNAs. Our model not only accurately predicts free energies of new sequences but also shows evidence of having learned base-pairing information, as assessed by in silico double-mutant analysis. Next, we extended this model to predict the simulated affinity between an arbitrary RNA sequence and a reporter RNA. While these more indirect measurements precluded the learning of basic principles of RNA biophysics, the resulting model achieved sub-kcal/mol accuracy and enabled the design of simple RNA-input-responsive riboswitches with high activation ratios predicted by the Turner model from which the training data were generated. Finally, we compiled and trained on a dataset comprising over 600,000 experimental affinity measurements published on the Eterna open laboratory. Though our tests revealed that the model likely did not learn a physically realistic representation of RNA interactions, it nevertheless achieved good performance (0.76 kcal/mol) on test sets with the application of transfer learning and novel sequence-specific data augmentation strategies. These results suggest that recurrent neural network architectures, despite being naïve to the physics of RNA folding, have the potential to capture complex biophysical information. However, more diverse datasets, ideally involving more direct free energy measurements, may be necessary to train de novo predictive models that are consistent with the fundamentals of RNA biophysics.

Author Summary: The precise design of RNA interactions is essential to gaining greater control over RNA-based biotechnology tools, including designer riboswitches and CRISPR-Cas9 gene editing. However, the classic model for the energetics governing these interactions fails to quantitatively predict the behavior of RNA molecules. We developed a recurrent neural network model, Ribonet, to quantitatively predict these values from sequence alone. Using simulated data, we show that this model is able to learn simple base-pairing rules, despite having no a priori knowledge about RNA folding encoded in the network architecture. This model also enables the design of new switching RNAs that are predicted to be effective by the “ground truth” simulated model. We applied transfer learning to retrain Ribonet using hundreds of thousands of RNA-RNA affinity measurements and demonstrate simple data augmentation techniques that improve model performance. At the same time, the data diversity currently available sets limits on Ribonet’s accuracy. Recurrent neural networks are a promising tool for modeling nucleic acid biophysics and may enable the design of complex RNAs for novel applications.
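The abstract does not specify Ribonet's architecture in detail; the following minimal PyTorch sketch illustrates the general recipe of a recurrent encoder over a one-hot RNA sequence regressing a scalar free energy. The layer sizes, hidden dimension, and example sequence are placeholders, not the published model.

```python
import torch
import torch.nn as nn

BASES = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as a (length, 4) one-hot tensor."""
    idx = torch.tensor([BASES.index(b) for b in seq])
    return torch.nn.functional.one_hot(idx, num_classes=4).float()

class FreeEnergyRNN(nn.Module):
    """Recurrent regressor mapping an RNA sequence to a scalar free energy (kcal/mol)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(input_size=4, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, length, 4)
        _, h = self.rnn(x)                     # h: (1, batch, hidden)
        return self.head(h[-1]).squeeze(-1)    # (batch,) predicted free energies

model = FreeEnergyRNN()
x = one_hot("GGGAAACCC").unsqueeze(0)          # toy hairpin-like sequence
print(model(x))                                # untrained prediction; training would regress measured values
```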


Author(s):  
M Venkata Krishna Reddy* ◽  
Pradeep S.



Author(s):  
TAO WANG ◽  
XIAOLIANG XING ◽  
XINHUA ZHUANG

In this paper, we describe an optimal learning algorithm for designing one-layer neural networks by means of global minimization. Taking the properties of a well-defined neural network into account, we derive a cost function that quantitatively measures the goodness of the network. The connection weights are determined by the gradient descent rule to minimize the cost function. The optimal learning algorithm is formulated as either an unconstrained or a constrained minimization problem. It ensures the realization of each desired associative mapping with the best noise-reduction ability in the sense of optimization. We also investigate, analytically, the storage capacity of the neural network, the degree of noise reduction for a desired associative mapping, and the convergence of the learning algorithm. Finally, a large number of computer-experiment results are presented.
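A minimal sketch of the kind of procedure described, using a generic quadratic cost and the gradient descent rule on the connection weights (the cost actually derived in the paper is more specific, so treat this as an assumption-laden illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(20, 16))       # 20 input patterns, 16 input units
Y = rng.choice([-1.0, 1.0], size=(20, 8))        # desired associative outputs, 8 output units

W = np.zeros((16, 8))                            # connection weights of the one-layer network
lr = 0.05
for _ in range(500):
    H = np.tanh(X @ W)                           # realized associative mapping
    E = H - Y                                    # deviation from the desired mapping
    cost = 0.5 * np.sum(E ** 2) / len(X)         # quadratic cost per pattern (illustrative)
    W -= lr * X.T @ (E * (1 - H ** 2)) / len(X)  # gradient descent rule on the cost
print(cost)                                      # decreases toward a (local) minimum
```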


1990 ◽  
Vol 01 (03) ◽  
pp. 237-245 ◽  
Author(s):  
Edgardo A. Ferrán ◽  
Roberto P. J. Perazzo

A model is proposed in which the synaptic efficacies of a feedforward neural network are adapted using a cost function that vanishes if the Boolean function represented by the network has the same symmetry properties as the target function. The function chosen according to this procedure is thus taken as an archetype of the whole symmetry class. Several examples are presented showing how this type of partial learning can produce behaviour of the net that is reminiscent of that of dyslexic persons.
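A rough sketch of the idea (illustrative, not the authors' formulation): compare the invariance signature of the represented Boolean function with the target's over a symmetry group of input permutations, so that the cost vanishes exactly when the two functions share the same symmetries.

```python
from itertools import permutations, product

def is_invariant(f, n, perm):
    """True if the Boolean function f of n inputs is unchanged under the input permutation perm."""
    return all(f(x) == f(tuple(x[i] for i in perm)) for x in product((0, 1), repeat=n))

def symmetry_cost(f_net, f_target, n):
    """Counts permutations under which the network and target differ in invariance.
    Vanishes iff the represented function shares the target's symmetry properties."""
    return sum(is_invariant(f_net, n, p) != is_invariant(f_target, n, p)
               for p in permutations(range(n)))

target = lambda x: x[0] ^ x[1] ^ x[2]            # parity: symmetric under every input permutation
net    = lambda x: x[0] & (x[1] | x[2])          # symmetric only in the last two inputs
print(symmetry_cost(net, target, 3))             # > 0: the two functions belong to different symmetry classes
```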


2021 ◽  
Vol 2094 (3) ◽  
pp. 032041
Author(s):  
S I Bartsev ◽  
G M Markova

Abstract: This study compares two methods for identifying the stimulus received by an artificial neural network from the neural activity pattern that corresponds to the period during which information about that stimulus is held in working memory. We used simple recurrent neural networks trained to pass the delayed matching-to-sample test. Neural activity was recorded during the pause between stimuli. The analysis of neural excitation patterns showed that, during the delayed matching-to-sample test, the networks encoded variables relevant to the task and that their activity patterns were dynamic. The centroid method identified the type of received stimulus with up to 75% accuracy, while the neural-network-based decoder achieved 100% accuracy. The latter method was also applied to determine the minimal set of neurons whose activity is most significant for stimulus recognition.
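A minimal sketch of the centroid method as it might be implemented (assumed details: delay-period hidden-activity vectors, one centroid per stimulus class, nearest-centroid assignment; the data below are synthetic):

```python
import numpy as np

def fit_centroids(activity, labels):
    """Average the delay-period activity vectors for each stimulus class."""
    return {c: activity[labels == c].mean(axis=0) for c in np.unique(labels)}

def identify_stimulus(centroids, pattern):
    """Assign a new activity pattern to the class with the nearest centroid."""
    return min(centroids, key=lambda c: np.linalg.norm(pattern - centroids[c]))

rng = np.random.default_rng(1)
activity = rng.normal(size=(200, 30))            # 200 trials x 30 hidden units (synthetic)
labels = rng.integers(0, 2, size=200)            # two stimulus types
activity[labels == 1] += 0.5                     # make the classes weakly separable
cents = fit_centroids(activity, labels)
print(identify_stimulus(cents, activity[0]), labels[0])
```

A trained decoder network replaces the nearest-centroid rule with a learned classifier over the same activity vectors, which is the second method compared in the study.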


Entropy ◽  
2021 ◽  
Vol 23 (11) ◽  
pp. 1481
Author(s):  
Yang Sun ◽  
Hangdong Zhao ◽  
Jonathan Scarlett

In recent years, neural network based image priors have been shown to be highly effective for linear inverse problems, often significantly outperforming conventional methods that are based on sparsity and related notions. While pre-trained generative models are perhaps the most common, it has additionally been shown that even untrained neural networks can serve as excellent priors in various imaging applications. In this paper, we seek to broaden the applicability and understanding of untrained neural network priors by investigating the interaction between architecture selection, measurement models (e.g., inpainting vs. denoising vs. compressive sensing), and signal types (e.g., smooth vs. erratic). We motivate the problem via statistical learning theory, and provide two practical algorithms for tuning architectural hyperparameters. Using experimental evaluations, we demonstrate that the optimal hyperparameters may vary significantly between tasks and can exhibit large performance gaps when tuned for the wrong task. In addition, we investigate which hyperparameters tend to be more important, and which are robust to deviations from the optimum.
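To fix ideas, here is a bare-bones sketch of an untrained-network prior applied to a denoising-type problem; the 1-D signal, network depth, width, and iteration count are placeholders standing in for the architectural hyperparameters the paper proposes to tune.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x_true = torch.sin(torch.linspace(0, 6.28, 128)).unsqueeze(0).unsqueeze(0)  # smooth 1-D signal
y = x_true + 0.3 * torch.randn_like(x_true)                                  # noisy measurement

# Untrained convolutional network acting as the prior; depth and width are
# examples of the architectural hyperparameters whose choice matters.
net = nn.Sequential(
    nn.Conv1d(1, 32, 5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 32, 5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 1, 5, padding=2),
)
z = torch.randn(1, 1, 128)                       # fixed random input code
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):                          # early stopping acts as implicit regularization
    opt.zero_grad()
    loss = ((net(z) - y) ** 2).mean()            # fit the noisy measurement, not the clean signal
    loss.backward()
    opt.step()
print(((net(z) - x_true) ** 2).mean().item())    # reconstruction error against the clean signal
```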


Author(s):  
Takuya Isomura

The mutual information between the state of a neural network and the state of the external world represents the amount of information stored in the neural network that is associated with the external world. In contrast, the surprise of the sensory input indicates the unpredictability of the current input; in other words, surprise is a measure of inference ability, and an upper bound on the surprise is known as the variational free energy. According to the free-energy principle (FEP), a neural network continuously minimizes the free energy to perceive the external world. For the survival of animals, inference ability is considered to be more important than simply memorized information. In this study, the free energy is shown to represent the gap between the amount of information stored in the neural network and that available for inference. This concept involves both the FEP and the infomax principle, and will be a useful measure for quantifying the amount of information available for inference.
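The bound mentioned above can be stated compactly with the standard definitions (this is textbook material rather than a result specific to the paper): for an observation o and approximate posterior q(s) over hidden states,

```latex
% Variational free energy F upper-bounds the surprise (negative log evidence) of an input o:
F \;=\; \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o,s)\right]
  \;=\; \underbrace{-\ln p(o)}_{\text{surprise}}
  \;+\; \underbrace{D_{\mathrm{KL}}\!\left[\,q(s)\,\|\,p(s\mid o)\,\right]}_{\ge\,0}
  \;\;\ge\;\; -\ln p(o).
```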

