The Eighty Five Percent Rule for Optimal Learning

2018 ◽  
Author(s):  
Robert C. Wilson ◽  
Amitai Shenhav ◽  
Mark Straccia ◽  
Jonathan D. Cohen

Abstract Researchers and educators have long wrestled with the question of how best to teach their clients, be they human, animal or machine. Here we focus on the role of a single variable, the difficulty of training, and examine its effect on the rate of learning. In many situations we find that there is a sweet spot in which training is neither too easy nor too hard, and where learning progresses most quickly. We derive conditions for this sweet spot for a broad class of learning algorithms in the context of binary classification tasks, in which ambiguous stimuli must be sorted into one of two classes. For all of these gradient-descent based learning algorithms we find that the optimal error rate for training is around 15.87% or, conversely, that the optimal training accuracy is about 85%. We demonstrate the efficacy of this ‘Eighty Five Percent Rule’ for artificial neural networks used in AI and biologically plausible neural networks thought to describe human and animal learning.
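
The seemingly precise figure of 15.87% is not arbitrary: under the paper's Gaussian-noise assumptions, the optimal training error rate is the standard normal cumulative distribution function evaluated at −1, Φ(−1) ≈ 0.1587. A one-line check in Python (assuming SciPy is available):

    from scipy.stats import norm

    # Under Gaussian decision noise, the optimal training error
    # rate is the standard normal CDF evaluated at -1.
    optimal_error = norm.cdf(-1.0)          # ~0.1587
    optimal_accuracy = 1.0 - optimal_error  # ~0.8413
    print(f"optimal error rate: {optimal_error:.4f}")
    print(f"optimal accuracy:   {optimal_accuracy:.4f}")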

2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Robert C. Wilson ◽  
Amitai Shenhav ◽  
Mark Straccia ◽  
Jonathan D. Cohen

Abstract Researchers and educators have long wrestled with the question of how best to teach their clients, be they humans, non-human animals or machines. Here, we examine the effect of a single variable, the difficulty of training, on the rate of learning. In many situations we find that there is a sweet spot in which training is neither too easy nor too hard, and where learning progresses most quickly. We derive conditions for this sweet spot for a broad class of learning algorithms in the context of binary classification tasks. For all of these stochastic gradient-descent based learning algorithms, we find that the optimal error rate for training is around 15.87% or, conversely, that the optimal training accuracy is about 85%. We demonstrate the efficacy of this ‘Eighty Five Percent Rule’ for artificial neural networks used in AI and biologically plausible neural networks thought to describe animal learning.
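
To make the setup concrete, here is a minimal simulation sketch (our illustration, not the authors' code; the dimensionality, learning rate and trial count are assumptions). A logistic unit is trained by stochastic gradient descent on a binary classification task in which each trial's difficulty is chosen so that the learner's current error rate matches a fixed target; the script then reports how well the learned weights align with the true decision direction:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)

    def train(target_error, dim=50, trials=500, lr=0.02):
        u = np.zeros(dim); u[0] = 1.0        # true decision direction (unknown to learner)
        w = rng.normal(size=dim)
        w /= np.linalg.norm(w)               # keep unit norm, so alignment = w @ u
        for _ in range(trials):
            a = np.clip(w @ u, 0.05, 0.999)  # current alignment (floored for stability)
            # Choose the trial difficulty so that the learner's predicted
            # error rate, Phi(-difficulty * alignment), equals the target.
            difficulty = -norm.ppf(target_error) / a
            y = rng.choice([-1.0, 1.0])                    # true label
            x = y * difficulty * u + rng.normal(size=dim)  # noisy stimulus
            p = 1.0 / (1.0 + np.exp(-np.clip(w @ x, -30, 30)))  # P(label = +1)
            w += lr * ((y + 1) / 2 - p) * x  # SGD step on the logistic loss
            w /= np.linalg.norm(w)
        return w @ u

    for err in (0.05, 0.1587, 0.30, 0.45):
        print(f"target error {err:.1%}: final alignment {train(err):.3f}")

Qualitatively, target error rates near 15.87% (85% accuracy) tend to yield the fastest alignment, while much easier or much harder training regimes learn more slowly.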


Author(s):  
TAO WANG ◽  
XIAOLIANG XING ◽  
XINHUA ZHUANG

In this paper, we describe an optimal learning algorithm for designing one-layer neural networks by means of global minimization. Taking the properties of a well-defined neural network into account, we derive a cost function that quantitatively measures the goodness of the network. The connection weights are determined by the gradient descent rule so as to minimize the cost function. The optimal learning algorithm is formulated as either an unconstrained or a constrained minimization problem. It ensures the realization of each desired associative mapping with the best noise-reduction ability in the sense of optimization. We also investigate analytically the storage capacity of the neural network, the degree of noise reduction for a desired associative mapping, and the convergence of the learning algorithm. Finally, results from a large number of computer experiments are presented.
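
As a rough illustration of this recipe (the quadratic cost and bipolar patterns below are our assumptions, not the paper's exact formulation), one can design a one-layer network by gradient descent on a cost that measures how well each desired associative mapping is realized:

    import numpy as np

    rng = np.random.default_rng(0)

    # Desired associative mappings: input pattern X[i] should map to Y[i].
    X = rng.choice([-1.0, 1.0], size=(8, 16))   # 8 bipolar input patterns
    Y = rng.choice([-1.0, 1.0], size=(8, 4))    # their desired outputs
    W = rng.normal(scale=0.1, size=(16, 4))     # one-layer connection weights

    lr = 0.05
    for epoch in range(500):
        error = X @ W - Y            # residual of each desired mapping
        grad = X.T @ error / len(X)  # gradient of the cost 0.5 * ||XW - Y||^2
        W -= lr * grad               # gradient descent rule

    recalled = np.sign(X @ W)        # recall the stored associations
    print("recall accuracy:", (recalled == Y).mean())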


2021 ◽  
Author(s):  
Anasse HANAFI ◽  
Mohammed BOUHORMA ◽  
Lotfi ELAACHAK

Abstract Machine learning (ML) is a large field of study that overlaps with, and inherits ideas from, many related fields such as artificial intelligence (AI). The main focus of the field is learning from previous experience. Classification in ML is a supervised learning method in which a computer program learns from the data given to it and makes new classifications. There are many different types of classification tasks in ML, each with dedicated modeling approaches. For example, classification predictive modeling involves assigning a class label to input samples; binary classification refers to predicting one of two classes; and multi-class classification involves predicting one of more than two categories. Recurrent Neural Networks (RNNs) are very powerful sequence models for classification problems. In this paper, however, we use RNNs as generative models: they learn the sequences of a problem and can then generate entirely new sequences for the problem domain. The hope is to gain better control over the output of the generated text, because it is not always possible to learn the exact distribution of the data, either implicitly or explicitly.
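
As a minimal illustration of the generative use of an RNN (a sketch in PyTorch; the toy corpus, model size and sampling temperature are placeholder assumptions, not the paper's setup), the model below is trained to predict the next character and is then sampled to generate an entirely new sequence:

    import torch
    import torch.nn as nn

    text = "hello world. hello machine learning. hello recurrent networks. " * 50
    chars = sorted(set(text))
    stoi = {c: i for i, c in enumerate(chars)}
    data = torch.tensor([stoi[c] for c in text])

    class CharRNN(nn.Module):
        def __init__(self, vocab, hidden=128):
            super().__init__()
            self.embed = nn.Embedding(vocab, hidden)
            self.rnn = nn.GRU(hidden, hidden, batch_first=True)
            self.head = nn.Linear(hidden, vocab)

        def forward(self, x, h=None):
            out, h = self.rnn(self.embed(x), h)
            return self.head(out), h

    model = CharRNN(len(chars))
    opt = torch.optim.Adam(model.parameters(), lr=3e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Train: predict the next character at every position of a random chunk.
    for step in range(200):
        i = torch.randint(0, len(data) - 65, (1,)).item()
        chunk = data[i : i + 65]
        x, y = chunk[:-1].unsqueeze(0), chunk[1:]
        logits, _ = model(x)
        loss = loss_fn(logits.squeeze(0), y)
        opt.zero_grad(); loss.backward(); opt.step()

    # Generate: sample a new sequence one character at a time.
    idx = torch.tensor([[stoi["h"]]])
    h, out = None, "h"
    for _ in range(100):
        logits, h = model(idx, h)
        probs = torch.softmax(logits[0, -1] / 0.8, dim=0)  # temperature 0.8
        idx = torch.multinomial(probs, 1).unsqueeze(0)
        out += chars[idx.item()]
    print(out)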


Entropy ◽  
2020 ◽  
Vol 22 (1) ◽  
pp. 101
Author(s):  
Rita Fioresi ◽  
Pratik Chaudhari ◽  
Stefano Soatto

This paper is a step towards developing a geometric understanding of a popular algorithm for training deep neural networks, namely stochastic gradient descent (SGD). We build upon a recent result which observed that the noise in SGD while training typical networks is highly non-isotropic. That motivated a deterministic model in which the trajectories of the dynamical system are described via geodesics of a family of metrics arising from a certain diffusion matrix, namely the covariance of the stochastic gradients in SGD. Our model is analogous to models in general relativity: the role played by the electromagnetic field in the latter is played in the former by the gradient of the loss function of a deep network.
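
The central object here, the covariance of the stochastic gradients, is easy to estimate empirically. The toy sketch below (a stand-in linear-regression model, not the deep networks studied in the paper) computes per-sample gradients and inspects the eigenvalues of their covariance; the wide eigenvalue spread illustrates the kind of non-isotropic noise the paper builds on:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy linear regression: per-sample loss 0.5 * (w @ x_i - y_i)^2.
    n, d = 512, 20
    X = rng.normal(size=(n, d)) * rng.uniform(0.1, 3.0, size=d)  # anisotropic features
    y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
    w = rng.normal(size=d)

    # Per-sample gradients: g_i = (w @ x_i - y_i) * x_i.
    G = (X @ w - y)[:, None] * X

    # Diffusion matrix: the covariance of the stochastic gradients.
    D = np.cov(G, rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(D))[::-1]
    print("top 5 eigenvalues:   ", np.round(eig[:5], 2))
    print("bottom 5 eigenvalues:", np.round(eig[-5:], 4))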


Author(s):  
Rehab M. Duwairi ◽  
Saad A. Al-Zboon ◽  
Rami A. Al-Dwairi ◽  
Ahmad Obaidi

The rapid development of artificial neural network techniques, especially convolutional neural networks, has encouraged researchers to adapt such techniques to the medical domain, specifically to provide assistive tools that help professionals diagnose patients. The main problem faced by researchers in the medical domain is the lack of available annotated datasets that can be used to train and evaluate large and complex deep neural networks. In this paper, to assist researchers who are interested in applying deep learning techniques to aid ophthalmologists in diagnosing eye-related diseases, we provide an optical coherence tomography (OCT) dataset created in collaboration with ophthalmologists from the King Abdullah University Hospital, Irbid, Jordan. This dataset consists of 21,991 OCT images distributed over seven eye diseases in addition to normal images (no disease), namely Choroidal Neovascularisation, Full Macular Hole (Full Thickness), Partial Macular Hole, Central Serous Retinopathy, Geographic Atrophy, Macular Retinal Oedema, and Vitreomacular Traction. To the best of our knowledge, this dataset is the largest of its kind, where the images belong to actual patients from Jordan and the annotation was carried out by ophthalmologists. Two classification tasks were applied to this dataset: a binary classification to distinguish between images of healthy eyes (normal) and images of diseased eyes (abnormal), and a multi-class classification in which the deep neural network is trained to distinguish between the seven diseases listed above in addition to the normal case. In both classification tasks, the U-Net neural network was modified and subsequently utilised. This modification adds a block of layers to the original U-Net model so that it can handle classification, since the original network is designed for image segmentation. The binary classification achieved 84.90% accuracy and 69.50% quadratic weighted kappa, whereas the multi-class classification achieved 63.68% accuracy and 66.06% quadratic weighted kappa.
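
A sketch of the kind of modification described (in PyTorch; the encoder depth and layer sizes are placeholders, not the authors' exact architecture): a classification block, global average pooling followed by a fully connected layer, is attached to a U-Net-style encoder so that the network outputs class scores rather than a segmentation map.

    import torch
    import torch.nn as nn

    def conv_block(cin, cout):
        # Standard U-Net double-convolution block.
        return nn.Sequential(
            nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
        )

    class UNetClassifier(nn.Module):
        """U-Net-style encoder with an added classification block."""
        def __init__(self, n_classes=8):     # 7 diseases + normal
            super().__init__()
            self.enc1 = conv_block(1, 32)    # OCT scans treated as grayscale
            self.enc2 = conv_block(32, 64)
            self.enc3 = conv_block(64, 128)
            self.pool = nn.MaxPool2d(2)
            # Added block: global pooling + dense layer producing class scores.
            self.classify = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(128, n_classes),
            )

        def forward(self, x):
            x = self.pool(self.enc1(x))
            x = self.pool(self.enc2(x))
            x = self.enc3(x)
            return self.classify(x)

    model = UNetClassifier()
    scores = model(torch.randn(2, 1, 224, 224))  # batch of 2 dummy scans
    print(scores.shape)                          # torch.Size([2, 8])

For the binary task, the final layer would have two outputs instead of eight.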


2004 ◽  
Author(s):  
Lyle E. Bourne ◽  
Alice F. Healy ◽  
James A. Kole ◽  
William D. Raymond
