An Exponential Tail Bound for the Deleted Estimate

Author(s):  
Karim Abou–Moustafa ◽  
Csaba Szepesvári

There is accumulating evidence in the literature that stability of learning algorithms is a key characteristic that permits a learning algorithm to generalize. Despite various insightful results in this direction, there seems to be an overlooked dichotomy in the type of stability-based generalization bounds we have in the literature. On one hand, the literature seems to suggest that exponential generalization bounds for the estimated risk, which are optimal, can only be obtained through stringent, distribution-independent, and computationally intractable notions of stability such as uniform stability. On the other hand, it seems that weaker notions of stability such as hypothesis stability, although distribution dependent and more amenable to computation, can only yield polynomial generalization bounds for the estimated risk, which are suboptimal. In this paper, we address the gap between these two regimes of results. In particular, the main question we address here is whether it is possible to derive exponential generalization bounds for the estimated risk using a notion of stability that is computationally tractable and distribution dependent, but weaker than uniform stability. Using recent advances in concentration inequalities, and a notion of stability that is weaker than uniform stability but distribution dependent and amenable to computation, we derive an exponential tail bound for the concentration of the estimated risk of a hypothesis returned by a general learning rule, where the estimated risk is expressed in terms of the deleted estimate. Interestingly, we note that our final bound has similarities to previous exponential generalization bounds for the deleted estimate, in particular the result of Bousquet and Elisseeff (2002) for the regression case.
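The deleted estimate mentioned above is the classical leave-one-out estimate of risk. The sketch below is purely illustrative: the generic `fit` and `loss` interfaces and the toy mean-predicting learning rule are invented names, not anything from the paper.

```python
# Illustrative sketch of the deleted (leave-one-out) estimate of risk for
# a generic learning rule `fit` and loss function `loss`. The names and
# the toy learning rule below are invented, not taken from the paper.

def deleted_estimate(data, fit, loss):
    """Average loss of hypotheses trained with one example deleted,
    each evaluated on the example that was held out."""
    n = len(data)
    total = 0.0
    for i in range(n):
        h = fit(data[:i] + data[i + 1:])  # train without the i-th point
        x, y = data[i]                    # test on the deleted point
        total += loss(h(x), y)
    return total / n

# Toy usage: a "learning rule" that always predicts the mean label,
# scored with squared loss.
data = [(0, 1.0), (1, 2.0), (2, 3.0)]
fit = lambda s: (lambda x, m=sum(y for _, y in s) / len(s): m)
loss = lambda p, y: (p - y) ** 2
print(deleted_estimate(data, fit, loss))  # 1.5
```

Each summand depends on a hypothesis trained on a slightly different sample, which is exactly why stability of the learning rule governs how tightly this estimate concentrates.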

2019 ◽  
Vol 116 (16) ◽  
pp. 7723-7731 ◽  
Author(s):  
Dmitry Krotov ◽  
John J. Hopfield

It is widely believed that end-to-end training with the backpropagation algorithm is essential for learning good feature detectors in early layers of artificial neural networks, so that these detectors are useful for the task performed by the higher layers of that neural network. At the same time, the traditional form of backpropagation is biologically implausible. In the present paper we propose an unusual learning rule, which has a degree of biological plausibility and which is motivated by Hebb’s idea that change of the synapse strength should be local—i.e., should depend only on the activities of the pre- and postsynaptic neurons. We design a learning algorithm that utilizes global inhibition in the hidden layer and is capable of learning early feature detectors in a completely unsupervised way. These learned lower-layer feature detectors can be used to train higher-layer weights in a usual supervised way so that the performance of the full network is comparable to the performance of standard feedforward networks trained end-to-end with a backpropagation algorithm on simple tasks.
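The flavor of a local, Hebbian-style update with global inhibition can be sketched as below. This is a deliberate simplification: the winner-take-all step and the decay term are illustrative stand-ins, and the rule actually proposed in the paper is more elaborate.

```python
import numpy as np

# Toy sketch of an unsupervised, purely local learning step with global
# inhibition: after lateral competition, only the most active hidden
# unit (the "winner") moves its weights toward the current input. A
# simplification for illustration; the paper's rule is more elaborate.

rng = np.random.default_rng(0)
n_inputs, n_hidden = 16, 8
W = rng.normal(size=(n_hidden, n_inputs))

def local_step(W, x, lr=0.1):
    h = W @ x                    # hidden pre-activations
    winner = int(np.argmax(h))   # global inhibition leaves one winner
    W = W.copy()
    # Hebbian move toward the input, with decay keeping weights bounded;
    # the update uses only pre- and postsynaptic quantities (locality).
    W[winner] += lr * (x - W[winner])
    return W

for _ in range(200):
    W = local_step(W, rng.normal(size=n_inputs))
```

Because each weight change depends only on the input and the winning unit's own activity, no error signal has to be propagated backward through the network.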


1993 ◽  
pp. 47-56
Author(s):  
Mohamed Othman ◽  
Mohd. Hassan Selamat ◽  
Zaiton Muda ◽  
Lili Norliya Abdullah

This paper discusses the modeling of the Tower of Hanoi using the concepts of neural networks. The basic idea of the backpropagation learning algorithm in Artificial Neural Systems is then described. While similar in some ways, Artificial Neural System learning deviates from tradition in its dependence on the modification of individual weights to bring about changes in a knowledge representation distributed across connections in a network. This unique form of learning is analyzed from two aspects: the selection of an appropriate network architecture for representing the problem, and the choice of a suitable learning rule capable of reproducing the desired function within the given network. Key words: Tower of Hanoi; Backpropagation Algorithm; Knowledge Representation


Materials ◽  
2019 ◽  
Vol 12 (21) ◽  
pp. 3482 ◽  
Author(s):  
Marta Pedró ◽  
Javier Martín-Martínez ◽  
Marcos Maestro-Izquierdo ◽  
Rosana Rodríguez ◽  
Montserrat Nafría

A fully unsupervised learning algorithm for reaching self-organization in neuromorphic architectures is provided in this work. We experimentally demonstrate spike-timing-dependent plasticity (STDP) in Oxide-based Resistive Random Access Memory (OxRAM) devices, and propose a set of waveforms to induce symmetric conductivity changes. An empirical model is used to describe the observed plasticity. A neuromorphic system based on the tested devices is simulated, and the developed learning algorithm is tested on it, with STDP as the local learning rule. The design of the system and learning scheme permits the concatenation of multiple neuromorphic layers, where autonomous hierarchical computing can be performed.
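For readers unfamiliar with STDP, the classic pair-based window can be sketched as follows. Note this is the textbook asymmetric form with invented constants, not the fitted OxRAM model or the symmetric waveform scheme the paper proposes.

```python
import math

# Illustrative pair-based STDP window: potentiation when the presynaptic
# spike precedes the postsynaptic one, depression otherwise. This is the
# classic asymmetric window; amplitudes and time constant are invented
# values, not the paper's fitted OxRAM model or symmetric waveforms.

def stdp_dw(dt, a_plus=0.10, a_minus=0.12, tau=20.0):
    """Weight change for spike-time difference dt = t_post - t_pre (ms)."""
    if dt > 0:   # pre fires before post -> strengthen the synapse
        return a_plus * math.exp(-dt / tau)
    return -a_minus * math.exp(dt / tau)  # otherwise weaken it
```

The sign of the change depends only on the relative timing of the two spikes at one synapse, which is what makes STDP a purely local learning rule.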


2013 ◽  
Vol 2013 ◽  
pp. 1-13 ◽  
Author(s):  
Falah Y. H. Ahmed ◽  
Siti Mariyam Shamsuddin ◽  
Siti Zaiton Mohd Hashim

A spiking neural network encodes information in the timing of individual spikes. A novel supervised learning rule for SpikeProp is derived to overcome the discontinuities introduced by spike thresholding. This algorithm is based on an error-backpropagation learning rule suited for supervised learning of spiking neurons that use exact spike-time coding. SpikeProp demonstrates that spiking neurons can perform complex nonlinear classification through fast temporal coding. This study proposes enhancements of the SpikeProp learning algorithm for supervised training of spiking networks that can deal with complex patterns. The proposed methods include SpikeProp with particle swarm optimization (PSO) and an angle-driven dependency learning rate. These methods are applied to the SpikeProp network to enhance multilayer learning and optimize weights. Input and output patterns are encoded as spike trains of precisely timed spikes, and the network learns to transform the input trains into target output trains. With these enhancements, our proposed methods outperformed other conventional neural network architectures.


2017 ◽  
Vol 27 (03) ◽  
pp. 1750002 ◽  
Author(s):  
Lilin Guo ◽  
Zhenzhong Wang ◽  
Mercedes Cabrerizo ◽  
Malek Adjouadi

This study introduces a novel learning algorithm for spiking neurons, called CCDS, which is able to learn and reproduce arbitrary spike patterns in a supervised fashion, allowing the processing of spatiotemporal information encoded in the precise timing of spikes. Unlike the Remote Supervised Method (ReSuMe), synapse delays and axonal delays in CCDS are variables that are modulated together with weights during learning. The CCDS rule is both biologically plausible and computationally efficient. The properties of this learning rule are investigated extensively through experimental evaluations in terms of reliability, adaptive learning performance, generality to different neuron models, learning in the presence of noise, effects of its learning parameters, and classification performance. The results presented show that the CCDS learning method achieves learning accuracy and learning speed comparable with ReSuMe, but improves classification accuracy when compared to both the Spike Pattern Association Neuron (SPAN) learning rule and the Tempotron learning rule. The merit of the CCDS rule is further validated on a practical example involving the automated detection of interictal spikes in EEG records of patients with epilepsy. Results again show that, with proper encoding, the CCDS rule achieves good recognition performance.


2013 ◽  
Vol 25 (6) ◽  
pp. 1472-1511 ◽  
Author(s):  
Yan Xu ◽  
Xiaoqin Zeng ◽  
Shuiming Zhong

The purpose of supervised learning with temporal encoding for spiking neurons is to make the neurons emit a specific spike train encoded by the precise firing times of spikes. If only running time is considered, supervised learning for a spiking neuron is equivalent to distinguishing the times of desired output spikes from all other times during the neuron's run by adjusting synaptic weights, which can be regarded as a classification problem. Based on this idea, this letter proposes a new supervised learning method for spiking neurons with temporal encoding; it first transforms the supervised learning task into a classification problem and then solves that problem using the perceptron learning rule. The experimental results show that the proposed method achieves higher learning accuracy and efficiency than existing learning methods, so it is more powerful for solving complex and real-time problems.
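The core idea above can be rendered in a toy form: treat each time step as a binary classification (should the neuron fire or not?) and apply the perceptron rule. The threshold neuron and all names below are simplified stand-ins, not the letter's actual spiking-neuron model.

```python
import numpy as np

# Toy rendering of the letter's idea: at each time step the fire/no-fire
# decision is a binary classification trained with the perceptron rule.
# The threshold unit here is a simplified stand-in for the spiking
# neuron model used in the letter.

def perceptron_spike_step(w, drive, desired_fire, theta=1.0, lr=0.05):
    """drive: per-synapse input at this time step; desired_fire: 0 or 1."""
    fired = 1 if float(w @ drive) >= theta else 0
    # Perceptron rule: adjust weights only when the decision is wrong.
    w = w + lr * (desired_fire - fired) * drive
    return w, fired
```

Repeating this step over a run pushes the weighted drive above threshold at desired spike times and below it everywhere else, which is exactly the classification view described in the abstract.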


2017 ◽  
Vol 114 (19) ◽  
pp. E3859-E3868 ◽  
Author(s):  
Florent Meyniel ◽  
Stanislas Dehaene

Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This “confidence weighting” implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain’s learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences.
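The contrast between a constant-rate delta rule and confidence weighting can be sketched minimally. Here the "confidence" is the crude 1/(n+1) running-average rate; the paper's ideal observer is fully Bayesian and hierarchical, which this toy does not capture.

```python
# Minimal contrast: a delta rule with a constant learning rate versus a
# confidence-weighted update whose effective learning rate shrinks as
# evidence accumulates (the simple 1/(n+1) running-average rate; the
# paper's ideal observer is fully Bayesian and hierarchical).

def delta_rule(p, outcome, lr=0.1):
    """Constant rate: every observation moves the estimate equally."""
    return p + lr * (outcome - p)

def confidence_weighted(p, outcome, n):
    """n counts observations already seen; more evidence, smaller step."""
    return p + (outcome - p) / (n + 1), n + 1

# Estimating a probability from a 0/1 sequence:
p, n = 0.0, 0
for outcome in [1, 0, 1, 1]:
    p, n = confidence_weighted(p, outcome, n)
print(p)  # converges to the running mean of the outcomes
```

The constant-rate rule keeps chasing noise with the same step size forever, whereas the confidence-weighted rule trusts accumulated knowledge more and more, which is the qualitative behavior the fMRI study probes.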


2021 ◽  
Vol 4 ◽  
Author(s):  
Florian Beck ◽  
Johannes Fürnkranz

Inductive rule learning is arguably among the most traditional paradigms in machine learning. Although we have seen considerable progress over the years in learning rule-based theories, all state-of-the-art learners still learn descriptions that directly relate the input features to the target concept. In the simplest case, concept learning, this is a disjunctive normal form (DNF) description of the positive class. While it is clear that this is sufficient from a logical point of view, because every logical expression can be reduced to an equivalent DNF expression, it could nevertheless be the case that more structured representations, which form deep theories by forming intermediate concepts, could be easier to learn, in very much the same way as deep neural networks are able to outperform shallow networks, even though the latter are also universal function approximators. However, there are several non-trivial obstacles that need to be overcome before a sufficiently powerful deep rule learning algorithm could be developed and compared to the state of the art in inductive rule learning. In this paper, we therefore take a different approach: we empirically compare deep and shallow rule sets that have been optimized with a uniform, general, mini-batch-based optimization algorithm. In our experiments on both artificial and real-world benchmark data, deep rule networks outperformed their shallow counterparts, which we take as an indication that it is worthwhile to devote more effort to learning deep rule structures from data.
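The shallow-versus-deep distinction above can be made concrete with a tiny invented concept: a flat DNF rule set versus a "deep" rule set that names intermediate concepts which later rules reuse. Both functions below are illustrative, not from the paper's benchmarks.

```python
from itertools import product

# Tiny invented concept showing the two representations: a shallow DNF
# rule set versus a deep rule set with named intermediate concepts.

def shallow_dnf(a, b, c, d):
    # one flat disjunction of conjunctions over the raw inputs
    return (a and b) or (c and not d)

def deep_rules(a, b, c, d):
    # intermediate concepts first, then a rule over those concepts
    pair_ok = a and b
    fallback = c and not d
    return pair_ok or fallback

# The two representations agree on every input.
assert all(shallow_dnf(*x) == deep_rules(*x)
           for x in product([False, True], repeat=4))
```

Logically the two are equivalent, as the abstract notes; the paper's question is whether the structured, intermediate-concept form is easier to *learn*.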


Author(s):  
XIANGYU CHANG ◽  
ZONGBEN XU ◽  
BIN ZOU ◽  
HAI ZHANG

A main issue in machine learning research is to analyze the generalization performance of a learning machine. Most classical results on the generalization performance of regularization algorithms are derived solely from the complexity of the hypothesis space or the stability property of the learning algorithm. However, in practical applications, the performance of a learning algorithm is not affected by a single factor alone, such as the complexity of the hypothesis space, the stability of the algorithm, or data quality. Therefore, in this paper, we develop a framework for evaluating the generalization performance of regularization algorithms jointly in terms of hypothesis space complexity, algorithmic stability, and data quality. We establish new bounds on the learning rate of regularization algorithms based on the measure of uniform stability and the empirical covering number for general types of loss functions. As applications of the generic results, we evaluate the learning rates of support vector machines and regularization networks, and propose a new strategy for regularization parameter setting.


2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Sahar H. El-Khafif ◽  
Mohamed A. El-Brawany

The ECG signal is well known for its nonlinear dynamic behavior, and a key characteristic utilized in this research is that the nonlinear component of its dynamics changes more significantly between normal and abnormal conditions than the linear one does. As higher-order statistics (HOS) preserve phase information, this study makes use of one-dimensional slices from the higher-order spectral domain of normal and ischemic subjects. A feedforward multilayer neural network (NN) with the error back-propagation (BP) learning algorithm was used as an automated ECG classifier to investigate the possibility of recognizing ischemic heart disease from normal ECG signals. Different NN structures were tested using two data sets extracted from polyspectrum slices and polycoherence indices of the ECG signals. ECG signals from the MIT/BIH CD-ROM, the Normal Sinus Rhythm Database (NSR-DB), and the European ST-T database have been utilized in this paper. The best classification rates obtained are 93% and 91.9%, using the EDBD learning rule with two hidden layers for the first structure and one hidden layer for the second structure, respectively. The results successfully showed that the presented NN-based classifier can be used for diagnosis of ischemic heart disease.

