Statistical Physics Theory of Supervised Learning and Generalization

Supervised learning in the presence of concept drift: a modelling framework

Neural Computing and Applications ◽

10.1007/s00521-021-06035-1 ◽

2021 ◽

Author(s):

M. Straat ◽

F. Abadi ◽

Z. Kan ◽

C. Göpfert ◽

B. Hammer ◽

...

Keyword(s):

Neural Networks ◽

Supervised Learning ◽

Statistical Physics ◽

Concept Drift ◽

Activation Function ◽

High Dimensional ◽

Weight Decay ◽

Modelling Framework ◽

Different Types ◽

Gradient Based

Abstract: We present a modelling framework for the investigation of supervised learning in non-stationary environments. Specifically, we model two example types of learning systems: prototype-based learning vector quantization (LVQ) for classification and shallow, layered neural networks for regression tasks. We investigate so-called student–teacher scenarios in which the systems are trained from a stream of high-dimensional, labeled data. Properties of the target task are considered to be non-stationary due to drift processes while the training is performed. Different types of concept drift are studied, which affect the density of example inputs only, the target rule itself, or both. By applying methods from statistical physics, we develop a modelling framework for the mathematical analysis of the training dynamics in non-stationary environments. Our results show that standard LVQ algorithms are already suitable for training in non-stationary environments to a certain extent. However, the application of weight decay as an explicit mechanism of forgetting does not improve the performance under the considered drift processes. Furthermore, we investigate gradient-based training of layered neural networks with sigmoidal activation functions and compare with the use of rectified linear units. Our findings show that the sensitivity to concept drift and the effectiveness of weight decay differ significantly between the two types of activation function.
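The setting described in the abstract (online LVQ training on a stream whose class centres drift, with optional weight decay as a forgetting mechanism) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-cluster Gaussian data model, the random-walk drift of the cluster direction, the 1/N scaling of the learning rate, the exact form of the decay term, and all function and parameter names (`lvq1_step`, `drifting_stream`, `run`, `eta`, `gamma`, `lam`, `drift`).

```python
import math
import random


def lvq1_step(prototypes, labels, x, y, eta, gamma=0.0):
    """One LVQ1 update with optional weight decay (illustrative form).

    Mutates `prototypes` in place and returns the index of the winning
    prototype, determined before any update is applied.
    """
    n = len(x)
    # Winner: prototype closest to x in squared Euclidean distance.
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in prototypes]
    k = min(range(len(prototypes)), key=dists.__getitem__)
    sign = 1.0 if labels[k] == y else -1.0
    step = eta / n  # assumed 1/N scaling of the learning rate
    # Assumed weight decay: shrink all prototypes slightly at every step.
    for w in prototypes:
        for i in range(n):
            w[i] *= 1.0 - step * gamma
    # LVQ1: attract the winner if its label is correct, repel it otherwise.
    w = prototypes[k]
    for i in range(n):
        w[i] += sign * step * (x[i] - w[i])
    return k


def drifting_stream(n, steps, lam=2.0, drift=0.01, seed=0):
    """Yield (x, y) pairs from two Gaussian clusters at +/- lam * centre,
    where the unit vector `centre` performs a random walk (assumed drift)."""
    rng = random.Random(seed)
    centre = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        centre = [c + drift * rng.gauss(0.0, 1.0) for c in centre]
        norm = math.sqrt(sum(c * c for c in centre))
        centre = [c / norm for c in centre]
        y = rng.choice((0, 1))
        s = 1.0 if y == 1 else -1.0
        x = [s * lam * c + rng.gauss(0.0, 1.0) for c in centre]
        yield x, y


def run(n=20, steps=2000, eta=1.0, gamma=0.0, seed=0):
    """Return the test-then-train error rate over the whole stream."""
    rng = random.Random(seed + 1)
    prototypes = [[0.01 * rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(2)]
    labels = [0, 1]
    errors = 0
    for x, y in drifting_stream(n, steps, seed=seed):
        # The returned winner is the prediction made before the update.
        k = lvq1_step(prototypes, labels, x, y, eta, gamma)
        errors += labels[k] != y
    return errors / steps
```

Calling `run(gamma=0.0)` and `run(gamma=0.5)` with the same seed gives a rough way to compare training with and without the decay term on this toy stream; such a simulation does not reproduce the paper's analytical treatment.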


Dynamics of Visual Supervised Learning: A Statistical-Physics Approach

Perception ◽

10.1068/v96l0709 ◽

1996 ◽

Vol 25 (1_suppl) ◽

pp. 165-165

Author(s):

A Unzicker ◽

M Jüttner ◽

I Rentschler

Keyword(s):

Supervised Learning ◽

Statistical Physics ◽

Energy Gap ◽

Feature Space ◽

Mathematical Structure ◽

Thermodynamic System ◽

Classification Performance ◽

Classification Error ◽

Wide Range ◽

Supervised Learning And Classification

We analysed human supervised learning and classification performance for compound Gabor gray-level patterns. We found that internal visual representations for supervised learning and classification may not be constructed in a smooth process of gradual development (Jüttner and Rentschler, 1996 Vision Research in press). Rather, it seemed that certain learning states (‘stereotypes’) recur that may be considered as ‘perceptual hypotheses’. Such effects have a transient character and cannot, therefore, be studied on the basis of cumulative learning data, which allow smoothing at the expense of temporal resolution. Thus, we analyse classification behaviour in terms of the evolution of a thermodynamic system, that is, a system characterised by Gibbs statistics. Here it is assumed that a classification error occurs when a noise-influenced decision process passes an ‘energy gap’ related to the distance of signals in feature space. This approach has been extended to a wide range of distance-based models originating in different fields, such as classical psychometrics, signal detection theory, technical pattern recognition, and connectionism. We made use of the finding that all these models can be related to a uniform mathematical structure (Unzicker et al, 1995 Perception 24 Supplement, 95). The subjects' performance can then be described as a cooling process that reveals adaptive feature extraction during learning.
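As a rough illustration of the ‘energy gap’ picture, consider a generic two-state system governed by Gibbs statistics. The abstract does not state the authors' formula, so the form below is an assumption: the symbols $\Delta E_{ij}$ (energy gap between signals $i$ and $j$), $d_{ij}$ (their distance in feature space), $f$ (an unspecified increasing function) and $T$ (an effective noise temperature) are introduced here only for illustration.

```latex
% Assumed two-state Gibbs model, not the authors' stated formula.
\[
  p_{\mathrm{err}}(i \to j)
  = \frac{e^{-\Delta E_{ij}/T}}{1 + e^{-\Delta E_{ij}/T}},
  \qquad
  \Delta E_{ij} = f(d_{ij}) \ge 0 .
\]
% Limits: T -> infinity gives p_err -> 1/2 (chance level);
% T -> 0 with Delta E_ij > 0 gives p_err -> 0.
```

Under this assumed form, lowering $T$ at a fixed gap reduces the error probability, which is one way to read the description of learning as a cooling process.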
