Neurons learn by predicting future activity

2020 ◽  
Author(s):  
Artur Luczak ◽  
Bruce L. McNaughton ◽  
Yoshimasa Kubo

Abstract
Plasticity mechanisms in the brain are still not well understood. Here we demonstrate that the ability of a neuron to predict its future activity may provide an effective mechanism for learning in the brain. We show that comparing a neuron's predicted activity with its actual activity provides a useful learning signal for modifying synaptic weights. Interestingly, this predictive learning rule can be derived from a metabolic principle, whereby neurons need to minimize their own synaptic activity (cost) while maximizing their impact on local blood supply by recruiting other neurons. This reveals an unexpected connection: learning in neural networks could result from each neuron simply maximizing its energy balance. We validated this predictive learning rule in neural network simulations and in data recorded from awake animals. We found that in the sensory cortex it is indeed possible to predict a neuron's activity ~10–20 ms into the future. Moreover, in response to stimuli, cortical neurons changed their firing rate to minimize surprise, i.e. the difference between actual and expected activity, as predicted by our model. Our results also suggest that spontaneous brain activity provides "training data" for neurons to learn to predict cortical dynamics. Thus, this work demonstrates that the ability of a neuron to predict its future inputs could be an important missing element for understanding computation in the brain.

Significance statement
Understanding how the brain learns may lead to machines with human-like intellectual capacities. Donald Hebb proposed the influential idea that the brain's learning algorithm is based on correlated firing, a.k.a. "cells that fire together wire together". However, Hebb's rule and other biologically inspired learning algorithms are still likely missing some important components needed to replicate brain learning mechanisms.
Here we provide evidence for a predictive learning rule: a neuron predicts its future activity and adjusts its incoming weights to minimize surprise, i.e. the difference between actual and expected activity. Interestingly, we show that such a rule is equivalent to maximizing the neuron's energy balance, which could be paraphrased as: cells adjust their weights to achieve maximum impact with minimum activity.
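A minimal sketch of the surprise-minimizing idea, assuming a linear neuron with a fixed self-prediction (illustrative only; the paper's network model and prediction mechanism are more elaborate). The neuron compares its actual activity with its own prediction and moves each weight to reduce the gap.

```python
# Predictive learning sketch: weight update proportional to the
# (actual - predicted) error, pushing future activity toward the prediction.

def predictive_update(weights, inputs, predicted, lr=0.1):
    """One update step: shrink the gap between actual and predicted activity."""
    actual = sum(w * x for w, x in zip(weights, inputs))
    surprise = actual - predicted              # the learning signal
    new_weights = [w - lr * surprise * x for w, x in zip(weights, inputs)]
    return new_weights, surprise

# Toy run: with a fixed input pattern, repeated updates drive the
# neuron's activity toward its prediction, so the surprise decays to zero.
w = [0.5, -0.3, 0.8]
x = [1.0, 0.5, -1.0]
predicted = 0.2                                # hypothetical predicted activity
for _ in range(200):
    w, surprise = predictive_update(w, x, predicted)
```

After convergence the neuron's response to this input equals its prediction, which is the "minimize surprise" behavior the abstract reports in cortical data.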

Author(s):  
Yukifumi Shigematsu ◽  
Hiroshi Okamoto ◽  
Kazuhisa Ichikawa ◽  
Gen Matsumoto ◽  
...  

We introduce a model of a temporal-event-associated, output-dependent learning rule, genetically acquired and expressed in a single neuron. Such a rule is essentially indispensable for the brain to acquire, by itself, algorithms for processing its self-selected information. The proposed learning rule is a revised Hebbian rule with a synaptic history trace that correlates one temporal event with others. Temporal events are memorized at the synaptic input sites in the form of asymmetric synaptic-strength corrections associated with those events. This learning algorithm has the advantage of associating one temporal event with others, giving the neuron predictive ability while also making recall flexible: recall under this rule is independent of the timing that is supposed to be crucial during learning. We discuss the underlying molecular mechanisms of the proposed learning rule and identify three important factors: 1) back-propagating action potentials, experimentally observed in single neurons, play a crucial role in output-dependent learning; 2) temporally associated, nonlinear couplings are modeled at the molecular level with glutamate receptors, voltage-dependent channels, intracellular calcium concentration, protein kinases and phosphatases; and 3) the intracellular concentration of inositol trisphosphate (IP3) is the memory substrate of the synaptic history.
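An illustrative sketch of the rule's core ingredients as I read them (my construction, not the authors' equations): a Hebbian update augmented with a decaying synaptic history trace, so that a presynaptic event can still be associated with the neuron's output several time steps later, making the update output-dependent.

```python
# Revised-Hebbian sketch with a synaptic history trace bridging time.

def trace_hebbian_step(trace, x, y, w, decay=0.9, lr=0.05):
    """One time step: update the history trace, then the weight."""
    trace = decay * trace + x        # decaying record of presynaptic events
    w = w + lr * y * trace           # update gated by the neuron's output y
    return trace, w

# Input event at t=0, output spike only at t=3: the trace bridges the gap,
# so the weight still grows (temporal association across events).
trace, w = 0.0, 0.0
inputs  = [1.0, 0.0, 0.0, 0.0]
outputs = [0.0, 0.0, 0.0, 1.0]
for x, y in zip(inputs, outputs):
    trace, w = trace_hebbian_step(trace, x, y, w)
```

A plain Hebbian rule (lr * y * x) would leave this weight at zero, since input and output never coincide; the trace is what makes the temporal association possible.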


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Jung-Sing Jwo ◽  
Ching-Sheng Lin ◽  
Cheng-Hsiung Lee ◽  
Ya-Ching Lo

Previous studies have shown that training a reinforcement learning model for the sorting problem takes a very long time, even for small datasets. To study whether transfer learning could improve the training process of reinforcement learning, we employ Q-learning as the base reinforcement learning algorithm, use the sorting problem as a case study, and assess performance from two aspects: time expense and brain capacity. We compare the total number of training steps between non-transfer and transfer methods to study their efficiency, and evaluate their differences in brain capacity (i.e., the percentage of updated Q-values in the Q-table). According to our experimental results, the difference in the total number of training steps becomes smaller as the size of the set to be sorted increases. Our results also show that the brain capacities of transfer and non-transfer reinforcement learning are similar when both reach a similar training level.
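A hedged sketch of the measured quantities: tabular Q-learning where "brain capacity" is the fraction of Q-table entries that have been updated. The toy MDP, rewards, and episode structure below are simplified stand-ins for the paper's sorting environment; "transfer" would mean passing in a Q-table learned on a smaller instance instead of starting from zeros.

```python
# Tabular Q-learning with a "brain capacity" measure (fraction of
# Q-table cells touched during training).
import random

def q_learning(n_states, n_actions, episodes, q=None, alpha=0.5, gamma=0.9):
    if q is None:                                   # non-transfer: from scratch
        q = [[0.0] * n_actions for _ in range(n_states)]
    touched = set()
    rng = random.Random(0)                          # seeded for reproducibility
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(10):                         # bounded episode length
            a = rng.randrange(n_actions)
            s2 = rng.randrange(n_states)            # toy transition
            r = 1.0 if s2 == n_states - 1 else 0.0  # toy reward: reach goal state
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            touched.add((s, a))
            s = s2
    brain_capacity = len(touched) / (n_states * n_actions)
    return q, brain_capacity

q_small, cap = q_learning(n_states=12, n_actions=3, episodes=40)
```

Reusing `q_small` as the `q` argument for a larger run is the transfer variant the abstract compares against training from scratch.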


2016 ◽  
Vol 33 (S1) ◽  
pp. S492-S493
Author(s):  
N. Ichikawa ◽  
Y. Okamoto ◽  
G. Okada ◽  
G. Lisi ◽  
N. Yahata ◽  
...  

Introduction: Recent studies have shown that it is important to understand brain mechanisms by focusing on the functional connectivity that is common across, and unique to, each disorder, including depression.
Objectives: To specify a biomarker of major depressive disorder (MDD), we applied a sparse machine learning algorithm to classify several types of affective disorders using resting-state fMRI data collected at multiple sites; this study reports the depression results as part of that work.
Aims: The aim of this study is to identify a specific pattern of functional connectivity in MDD that would support the diagnosis of depression and the development of focused, personalized treatments in the future.
Methods: Neuroimaging data from patients with major depressive disorder (MDD, n = 100) and healthy control adults (HC, n = 100) from multiple sites were used as the training dataset. A completely separate dataset (n = 16) was kept aside for testing. After preprocessing of the fMRI data, based on one hundred and forty anatomical regions of interest (ROIs), 9730 resting-state functional connectivities were prepared as input to the sparse machine learning algorithm.
Results: Twenty functional connectivities were selected, with a classification accuracy of 83.0% (sensitivity: 81.0%, specificity: 85.0%). The test data, which were completely separate from the training data, yielded an accuracy of 83.3%.
Conclusions: The functional connectivities selected by the sparse machine learning algorithm included brain regions that have been associated with depression.
Disclosure of interest: The authors have not supplied their declaration of competing interest.
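The study selects ~20 of 9730 connectivities with a sparse classifier. A minimal sketch of the sparsifying mechanism behind such selection is L1 (lasso-style) soft-thresholding, shown here on a toy weight vector; the specific algorithm and weights below are illustrative, not the paper's.

```python
# Soft-thresholding: the proximal step of L1 regularization. Weak weights
# are driven exactly to zero, which is what yields feature selection.
def soft_threshold(w, lam):
    """Shrink each weight toward zero by lam; magnitudes below lam vanish."""
    return [max(abs(v) - lam, 0.0) * (1 if v > 0 else -1) for v in w]

# Toy classifier weights over five candidate functional connectivities.
weights = [0.9, -0.05, 0.4, 0.02, -0.7]
sparse = soft_threshold(weights, 0.1)
# Only the strong connectivities survive; the rest are zeroed out.
```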


Electronics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 112
Author(s):  
Fangzhou Xu ◽  
Fenqi Rong ◽  
Yunjing Miao ◽  
Yanan Sun ◽  
Gege Dong ◽  
...  

This study describes a method for classifying electrocorticograms (ECoGs) based on motor imagery (MI) in a brain–computer interface (BCI) system. The method differs from the traditional feature extraction and classification approach: it employs a deep learning algorithm for feature extraction and a traditional algorithm for classification. Specifically, we use a convolutional neural network (CNN) to extract features from the training data and then classify those features by combining the CNN with a gradient boosting (GB) algorithm. This comprehensive study of CNN and GB algorithms helps us obtain richer feature information from brain activity, enabling classification of the corresponding motor actions. The performance of the proposed framework has been evaluated on dataset I of BCI Competition III. Furthermore, the combination of deep learning and traditional algorithms provides ideas for future research on BCI systems.
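A conceptual sketch of the front half of this hybrid pipeline. A real implementation would use e.g. PyTorch for the CNN and XGBoost for gradient boosting; here a hand-rolled 1-D convolution plus max-pooling stands in for the learned feature extractor, and the resulting features would then be handed to the boosted classifier. The signals and filter are toy values.

```python
# 1-D convolution + max-pool as a stand-in for CNN feature extraction.

def conv1d(signal, kernel):
    n = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(n))
            for i in range(len(signal) - n + 1)]

def extract_feature(signal):
    edge = conv1d(signal, [1.0, -1.0])   # edge-detecting filter
    return max(edge)                      # max-pool over time

# Two toy ECoG-like traces: a flat one and one with a sharp transient.
flat  = [0.1, 0.1, 0.1, 0.1]
spiky = [0.1, 0.9, 0.1, 0.1]
features = [extract_feature(flat), extract_feature(spiky)]
# `features` is the kind of vector that would be fed to gradient boosting.
```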


2021 ◽  
Vol 1 (1) ◽  
pp. 281-290
Author(s):  
Rifki Indra Perwira ◽  
Hari Prapcoyo

SDN is a new network technology in which the data plane is separated from the control plane, the "brain" that regulates data forwarding; this makes the controller a target for DDoS attacks. Detecting DDoS attacks is an important topic in network security because of the difficulty of distinguishing normal traffic from anomalous attack traffic. According to data from helpnetsecurity.com, there were 4.83 million attempted DoS/DDoS attacks on various services in 2020, which shows how important network security is. Various methods have been used to detect DDoS attacks, such as thresholding passing network traffic against the average traffic size plus three times the standard deviation; the weakness of this method is that a spike in normal traffic is detected as an attack, increasing false positives. To maintain security on an SDN network, a system is therefore needed that detects DDoS attacks anomalously: it learns the patterns that normally appear in the system and declares a DDoS attack when traffic deviates from them. The SVM method is used to categorize the traffic data obtained from the controller and detect whether it constitutes a DDoS attack. Based on tests conducted with 500 training samples, the accuracy is 99.2%. The conclusion of this paper is that the RBF-kernel SVM is very good at detecting anomalous DDoS attacks.
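The classifier here is an SVM with an RBF kernel. A sketch of the kernel itself, the similarity measure the SVM optimizes over; the traffic feature names are illustrative assumptions, and a real system would use e.g. scikit-learn's `sklearn.svm.SVC(kernel="rbf")` rather than hand-rolling anything.

```python
# Gaussian (RBF) kernel: similarity is 1 for identical feature vectors
# and decays toward 0 as traffic vectors diverge.
import math

def rbf_kernel(x, y, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical traffic features: [packets per second, ratio of new source IPs].
normal = [120.0, 0.4]
flood  = [9000.0, 0.97]
sim = rbf_kernel(normal, flood)   # near zero: flood traffic looks nothing
                                  # like the learned normal pattern
```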


2019 ◽  
Author(s):  
Andrew Medford ◽  
Shengchun Yang ◽  
Fuzhu Liu

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies on the high coverage and mixed coverages of reaction intermediates are still challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CHx, NHx and OHx species on the oxygen-vacancy and pristine rutile TiO2(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N^1.12) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.
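A hedged sketch of the iterative (active) training loop described above: at each round, the candidate whose model uncertainty is highest is added to the DFT training set. The structure labels and uncertainty values below are made up for illustration, and the scores stand in for the Gaussian process predictive variance.

```python
# Uncertainty-driven selection of the next DFT calculations.

def active_selection(candidates, uncertainty, n_pick):
    """Pick the n_pick most uncertain candidates (most valuable training data)."""
    ranked = sorted(candidates, key=lambda c: uncertainty[c], reverse=True)
    return ranked[:n_pick]

# Hypothetical adsorbate configurations and their GP predictive std (eV).
structures = ["CHx@vac", "NHx@pristine", "OHx@vac", "CHx+OHx@vac"]
sigma = {"CHx@vac": 0.05, "NHx@pristine": 0.31,
         "OHx@vac": 0.12, "CHx+OHx@vac": 0.44}
next_batch = active_selection(structures, sigma, 2)
# The two highest-variance structures are computed next, then the GP is
# retrained; repeating this loop is what keeps the DFT budget under 25-30%.
```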


2020 ◽  
Vol 36 (9) ◽  
pp. 2690-2696
Author(s):  
Jarkko Toivonen ◽  
Pratyush K Das ◽  
Jussi Taipale ◽  
Esko Ukkonen

Abstract Motivation Position-specific probability matrices (PPMs, also called position-specific weight matrices) have been the dominating model for transcription factor (TF)-binding motifs in DNA. There is, however, increasing recent evidence of better performance of higher order models such as Markov models of order one, also called adjacent dinucleotide matrices (ADMs). ADMs can model dependencies between adjacent nucleotides, unlike PPMs. A modeling technique and software tool that would estimate such models simultaneously both for monomers and their dimers have been missing. Results We present an ADM-based mixture model for monomeric and dimeric TF-binding motifs and an expectation maximization algorithm MODER2 for learning such models from training data and seeds. The model is a mixture that includes monomers and dimers, built from the monomers, with a description of the dimeric structure (spacing, orientation). The technique is modular, meaning that the co-operative effect of dimerization is made explicit by evaluating the difference between expected and observed models. The model is validated using HT-SELEX and generated datasets, and by comparing to some earlier PPM and ADM techniques. The ADM models explain data slightly better than PPM models for 314 tested TFs (or their DNA-binding domains) from four families (bHLH, bZIP, ETS and Homeodomain), the ADM mixture models by MODER2 being the best on average. Availability and implementation Software implementation is available from https://github.com/jttoivon/moder2. Supplementary information Supplementary data are available at Bioinformatics online.
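To make the PPM-vs-ADM distinction concrete: an ADM is a first-order Markov model, so it scores each base conditioned on the previous one, which a PPM cannot do. The tiny initial and transition probabilities below are made up for illustration; MODER2's actual models are learned from SELEX data.

```python
# Scoring a sequence under an adjacent dinucleotide model (ADM).

def adm_prob(seq, init, trans):
    """P(seq) = P(first base) * product of adjacent-pair transitions."""
    p = init[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p

uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
init = dict(uniform)
trans = {
    "A": {"A": 0.1, "C": 0.2, "G": 0.6, "T": 0.1},   # A is usually followed by G
    "C": dict(uniform),
    "G": dict(uniform),
    "T": dict(uniform),
}
p_ag = adm_prob("AG", init, trans)
p_at = adm_prob("AT", init, trans)
# A PPM with uniform columns would assign AG and AT the same probability;
# the ADM captures the A->G dependency and ranks AG higher.
```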


Entropy ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. 126
Author(s):  
Sharu Theresa Jose ◽  
Osvaldo Simeone

Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model-agnostic meta-learning (MAML), or joint within-task training and test sets, like Reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.
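For orientation, the conventional-learning result that this paper extends is the mutual-information generalization bound (a sketch in standard notation; the meta-learning bounds add a task level on top of it). For a hypothesis $W$ trained on a set $S$ of $N$ i.i.d. samples with $\sigma$-sub-Gaussian loss,

```latex
\left|\,\mathbb{E}\!\left[\operatorname{gen}(W,S)\right]\right|
\;\le\;
\sqrt{\frac{2\sigma^{2}}{N}\, I(W;S)}
```

The bounds derived in the paper have the same shape, with $I(W;S)$ replaced by the MI between the meta-learner's output and the meta-training data, plus, for the joint within-task training/test class, the additional per-task MI term described above.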


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tianqi Tu ◽  
Xueling Wei ◽  
Yue Yang ◽  
Nianrong Zhang ◽  
Wei Li ◽  
...  

Abstract Background Common subtypes seen in Chinese patients with membranous nephropathy (MN) include idiopathic membranous nephropathy (IMN) and hepatitis B virus-related membranous nephropathy (HBV-MN). However, the morphologic differences are not visible under the light microscope in certain renal biopsy tissues. Methods We propose here a deep learning-based framework for processing hyperspectral images of renal biopsy tissue to define the difference between IMN and HBV-MN based on the component of their immune complex deposition. Results The proposed framework can achieve an overall accuracy of 95.04% in classification, which also leads to better performance than support vector machine (SVM)-based algorithms. Conclusion IMN and HBV-MN can be correctly separated via the deep learning framework using hyperspectral imagery. Our results suggest the potential of the deep learning algorithm as a new method to aid in the diagnosis of MN.
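The framework classifies immune-complex deposits by their hyperspectral signature. As a minimal stand-in for the deep network (not the authors' architecture), a nearest-centroid rule over toy 4-band spectra shows the basic idea of separating IMN-like from HBV-MN-like deposits; the band values and centroids are invented for illustration.

```python
# Nearest-centroid classification of a per-pixel spectrum.

def classify(spectrum, centroids):
    """Assign the label whose mean spectrum is closest (squared distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(spectrum, centroids[label]))

# Hypothetical mean spectra (4 bands) per subtype, learned from labeled tissue.
centroids = {"IMN": [0.8, 0.2, 0.1, 0.4], "HBV-MN": [0.3, 0.7, 0.6, 0.2]}
label = classify([0.75, 0.25, 0.15, 0.35], centroids)
```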


2021 ◽  
pp. 1-11
Author(s):  
Yanan Huang ◽  
Yuji Miao ◽  
Zhenjing Da

Methods for multi-modal English event detection from a single data source, and for transfer-learning-based isomorphic event detection across different English data sources, still need improvement. To improve the efficiency of event detection across English data sources, this paper proposes, based on a transfer learning algorithm, multi-modal event detection for a single data source and transfer-learning-based isomorphic event detection for different data sources. Moreover, by stacking multiple classification models, the method fuses features with one another and conducts adversarial training driven by the difference between the two classifiers, further making the distributions of data from different sources similar. In addition, to verify the proposed algorithm, a multi-source English event detection dataset is collected. Finally, this dataset is used to validate the proposed method and compare it with the most mainstream transfer learning methods. Experimental analysis, convergence analysis, visual analysis and parameter evaluation demonstrate the effectiveness of the proposed algorithm.
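My reading of the adversarial step (an assumption, since the abstract gives no equations): as in maximum-classifier-discrepancy style domain adaptation, the disagreement between two classifiers on target-domain samples serves as the adversarial signal that the feature extractor learns to minimize. A toy discrepancy measure:

```python
# Mean absolute difference between two classifiers' class-probability outputs;
# large values flag target samples where the source-trained classifiers disagree.

def discrepancy(p1, p2):
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)

clf_a = [0.9, 0.1]   # classifier A's probabilities on one target sample
clf_b = [0.6, 0.4]   # classifier B's probabilities on the same sample
d = discrepancy(clf_a, clf_b)
```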

