Accelerating DNN Training Through Selective Localized Learning

2022 ◽  
Vol 15 ◽  
Author(s):  
Sarada Krithivasan ◽  
Sanchari Sen ◽  
Swagath Venkataramani ◽  
Anand Raghunathan

Training Deep Neural Networks (DNNs) places immense compute requirements on the underlying hardware platforms, expending large amounts of time and energy. We propose LoCal+SGD, a new algorithmic approach to accelerate DNN training by selectively combining localized or Hebbian learning within a Stochastic Gradient Descent (SGD) based training framework. Back-propagation is a computationally expensive process that requires 2 Generalized Matrix Multiply (GEMM) operations to compute the error and weight gradients for each layer. We alleviate this by selectively updating some layers' weights using localized learning rules that require only 1 GEMM operation per layer. Further, since localized weight updates are performed during the forward pass itself, the layer activations for such layers do not need to be stored until the backward pass, resulting in a reduced memory footprint. Localized updates can substantially boost training speed, but need to be used judiciously in order to preserve accuracy and convergence. We address this challenge through a Learning Mode Selection Algorithm, which gradually selects and moves layers to localized learning as training progresses. Specifically, for each epoch, the algorithm identifies a Localized→SGD transition layer that delineates the network into two regions. Layers before the transition layer use localized updates, while the transition layer and later layers use gradient-based updates. We propose both static and dynamic approaches to the design of the learning mode selection algorithm. The static algorithm utilizes a pre-defined scheduler function to identify the position of the transition layer, while the dynamic algorithm analyzes the dynamics of the weight updates made to the transition layer to determine how the boundary between SGD and localized updates is shifted in future epochs. We also propose a low-cost weak supervision mechanism that controls the learning rate of localized updates based on the overall training loss. 
We applied LoCal+SGD to 8 image recognition CNNs (including ResNet50 and MobileNetV2) across 3 datasets (Cifar10, Cifar100, and ImageNet). Our measurements on an Nvidia GTX 1080Ti GPU demonstrate up to 1.5× improvement in end-to-end training time with ~0.5% loss in Top-1 classification accuracy.
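The per-layer cost difference described above can be illustrated with a rough numpy sketch (illustrative only, not the paper's implementation): back-propagation requires two GEMMs per layer, while a Hebbian-style localized rule needs a single GEMM that can be computed during the forward pass, so the layer's activations need not be kept for the backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 64))        # batch of pre-synaptic activations
W = rng.standard_normal((64, 16)) * 0.1  # layer weights

# Forward pass for one layer.
y = x @ W

# SGD/back-propagation needs two GEMMs per layer:
grad_y = rng.standard_normal(y.shape)    # error arriving from the layers above
grad_W = x.T @ grad_y                    # GEMM 1: weight gradient
grad_x = grad_y @ W.T                    # GEMM 2: error for the layer below

# A localized (Hebbian-style) update needs only one GEMM, applied
# immediately during the forward pass:
eta = 1e-3
W_local = W + eta * (x.T @ y)            # single GEMM, no stored activations
```

Because `W_local` is computed before the backward pass begins, `x` can be discarded right away for localized layers, which is the source of the reduced memory footprint.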

Author(s):  
Zejian Zhou ◽  
Yingmeng Xiang ◽  
Hao Xu ◽  
Yishen Wang ◽  
Di Shi ◽  
...  

Non-intrusive load monitoring (NILM) is a critical technique for advanced smart grid management, owing to the convenience of monitoring and analysing individual appliances’ power consumption in a non-intrusive fashion. Inspired by emerging machine learning technologies, many recent NILM studies have adopted artificial neural networks (ANNs) to disaggregate appliances’ power from non-intrusive sensors’ measurements. However, back-propagation ANNs have a limited ability to disaggregate appliances because of their long training times and uncertain convergence, which are critical flaws for low-cost devices. In this paper, a novel self-organizing probabilistic neural network (SPNN)-based NILM algorithm is developed specifically for low-cost residential measuring devices. The proposed SPNN estimates the probability density functions that classify the different types of appliances. Compared to back-propagation ANNs, the SPNN requires fewer iterative synaptic weight updates and provides guaranteed convergence. Meanwhile, the novel SPNN has lower space complexity than conventional PNNs, thanks to a self-organizing mechanism that automatically adjusts the number of neurons. These advantages make the algorithm especially favourable for low-cost residential NILM devices. The effectiveness of the proposed algorithm is demonstrated through numerical simulation using the public REDD dataset. Performance comparisons with well-known benchmark algorithms are also provided in the experiment section.
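The core idea of a PNN-style classifier can be sketched as follows (a minimal illustration with made-up appliance signatures; the paper's SPNN additionally self-organizes its neuron count): each class's density at a query measurement is estimated with a Gaussian kernel over stored training patterns, and the class with the highest density wins.

```python
import numpy as np

def pnn_classify(x, class_patterns, sigma=50.0):
    """Pick the class whose Gaussian kernel density at x is highest."""
    best_label, best_density = None, -np.inf
    for label, patterns in class_patterns.items():
        d2 = ((patterns - x) ** 2).sum(axis=1)          # squared distances
        density = np.exp(-d2 / (2 * sigma ** 2)).mean()  # Parzen estimate
        if density > best_density:
            best_label, best_density = label, density
    return best_label

# Hypothetical power signatures: (active power in W, power factor).
patterns = {
    "fridge": np.array([[120.0, 0.90], [130.0, 0.85]]),
    "kettle": np.array([[2000.0, 1.00], [1900.0, 0.95]]),
}
print(pnn_classify(np.array([125.0, 0.88]), patterns))  # → fridge
```

No iterative weight training is needed: adding a class simply means storing its patterns, which is why convergence is guaranteed; the self-organizing step in the paper then prunes or merges pattern neurons to keep the memory footprint small.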


2020 ◽  
Vol 39 (5) ◽  
pp. 6419-6430
Author(s):  
Dusan Marcek

To forecast time series data, two methodological frameworks, statistical and computational intelligence modelling, are considered. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competitor to the statistical forecasting models, we use the popular classic perceptron-type neural network (NN). To train the NN, the Back-Propagation (BP) algorithm and heuristics such as the genetic and micro-genetic algorithms (GA and MGA) are implemented on a large data set. A comparative analysis of the selected learning methods is performed and evaluated. Our experiments indicate that a population size of 20 yields the lowest training time among all NNs trained by the evolutionary algorithms, with a somewhat lower, but still managerially acceptable, level of prediction accuracy.
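Evolutionary training of a small forecasting NN can be sketched as below (a toy series and simple truncation-selection GA of my own choosing, not the paper's setup; only the population size of 20 is taken from the reported finding). Instead of back-propagating gradients, the GA evaluates a fitness (here, one-step-ahead MSE) for each candidate weight vector and mutates the best ones.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy one-step-ahead forecasting data: predict y[t] from y[t-1], y[t-2].
series = np.sin(np.arange(60) * 0.3)
X = np.stack([series[:-2], series[1:-1]], axis=1)
y = series[2:]

def mse(w):
    # Tiny perceptron: one tanh hidden unit, weights packed in w (length 5).
    hidden = np.tanh(X @ w[:2] + w[2])
    return np.mean((hidden * w[3] + w[4] - y) ** 2)

# Minimal GA: population of 20, truncation selection + Gaussian mutation.
pop = rng.standard_normal((20, 5))
for generation in range(200):
    fitness = np.array([mse(w) for w in pop])
    parents = pop[np.argsort(fitness)[:10]]           # keep the best half
    children = parents + 0.1 * rng.standard_normal(parents.shape)
    pop = np.vstack([parents, children])

best = pop[np.argmin([mse(w) for w in pop])]
```

A GA trades per-iteration cost (many fitness evaluations) for robustness: it needs no gradient and cannot diverge, which matches the paper's observation that evolutionary training is fast for small populations at some cost in accuracy.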


2004 ◽  
Vol 7 (1) ◽  
pp. 35-36 ◽  
Author(s):  
BRIAN MACWHINNEY

Truscott and Sharwood Smith (henceforth T&SS) attempt to show how second language acquisition can occur without any learning. In their APT model, change depends only on the tuning of innate principles through the normal course of processing of L2. There are some features of their model that I find attractive. Specifically, their acceptance of the concepts of competition and activation strength brings them in line with standard processing accounts like the Competition Model (Bates and MacWhinney, 1982; MacWhinney, 1987, in press). At the same time, their reliance on parameters as the core constructs guiding learning leaves this model squarely within the framework of Chomsky's theory of Principles and Parameters (P&P). As such, it stipulates that the specific functional categories of Universal Grammar serve as the fundamental guide to both first and second language acquisition. Like other accounts in the P&P framework, this model attempts to view second language acquisition as involving no real learning beyond the deductive process of parameter-setting based on the detection of certain triggers. The specific innovation of the APT model is that changes in activation strength during processing function as the trigger to the setting of parameters. Unlike other P&P models, APT does not set parameters in an absolute fashion, allowing their activation weight to change by the processing of new input over time. The use of the concept of activation in APT is far more restricted than its use in connectionist models that allow for Hebbian learning, self-organizing feature maps, or back-propagation.


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 500 ◽  
Author(s):  
Sergey A. Lobov ◽  
Andrey V. Chernyshov ◽  
Nadia P. Krilova ◽  
Maxim O. Shamshin ◽  
Victor B. Kazantsev

One of the modern trends in the design of human–machine interfaces (HMI) is to involve so-called spiking neural networks (SNNs) in signal processing. SNNs can be trained by simple and efficient biologically inspired algorithms. In particular, we have shown that sensory neurons in the input layer of an SNN can simultaneously encode the input signal both in the spiking frequency rate and in the latency of spike generation. With such mixed temporal-rate coding, the SNN must implement learning that works properly for both types of coding. Based on this, we investigate how a single neuron can be trained with pure rate and temporal patterns, and then build a universal SNN that is trained using mixed coding. In particular, we study Hebbian and competitive learning in SNNs in the context of temporal and rate coding problems. We show that Hebbian learning through pair-based and triplet-based spike-timing-dependent plasticity (STDP) rules is feasible for temporal coding, but not for rate coding. Synaptic competition that depresses poorly used synapses is required to ensure neural selectivity in rate coding. This kind of competition can be implemented by a so-called forgetting function that depends on neuron activity. We show that the combined use of triplet-based STDP and synaptic competition with the forgetting function is sufficient for rate coding. Next, we propose an SNN capable of classifying electromyographic (EMG) patterns using an unsupervised learning procedure. Neuron competition achieved via lateral inhibition ensures the “winner takes all” principle among classifier neurons. The SNN also provides a gradual output response dependent on muscular contraction strength. Furthermore, we modify the SNN to implement a supervised learning method based on stimulating the target classifier neuron synchronously with the network input.
In a problem of discriminating three EMG patterns, the SNN with supervised learning achieves a median accuracy of 99.5%, close to the result demonstrated by a multi-layer perceptron trained by error back-propagation.
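The pair-based STDP rule and the forgetting-style synaptic competition discussed above can be sketched as follows (parameter values and the exact form of the forgetting function are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def stdp_pair(delta_t, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP kernel. delta_t = t_post - t_pre in milliseconds:
    pre-before-post (delta_t > 0) potentiates the synapse,
    post-before-pre (delta_t < 0) depresses it, both decaying with |delta_t|."""
    if delta_t > 0:
        return a_plus * np.exp(-delta_t / tau)
    return -a_minus * np.exp(delta_t / tau)

def forget(w, activity, rate=0.001):
    """Activity-dependent forgetting (assumed exponential form): the more
    active the post-synaptic neuron, the faster unused weights decay,
    which induces the synaptic competition needed for rate coding."""
    return w - rate * activity * w
```

With pure STDP, a synapse driven at a high rate but with uncorrelated timing receives little net change; adding `forget` makes weakly contributing synapses lose weight whenever the neuron fires, so only the most-used inputs survive.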


2020 ◽  
Vol 10 (2) ◽  
pp. 19
Author(s):  
Alfio Di Mauro ◽  
Hamed Fatemi ◽  
Jose Pineda de Gyvez ◽  
Luca Benini

Power management is a crucial concern in micro-controller platforms for the Internet of Things (IoT) edge. Many applications present a variable and hard-to-predict workload profile, usually driven by external inputs. Dynamically tuning power consumption to the application requirements is therefore a viable approach to saving energy. In this paper, we propose the implementation of a power management strategy for a novel low-cost, low-power heterogeneous dual-core SoC for the IoT edge, fabricated in 28 nm FD-SOI technology. As with more complex power management policies implemented on high-end application processors, we propose a strategy in which the power mode is dynamically selected to meet a user-specified idleness target. We demonstrate that the dynamic power mode selection introduced by our power manager achieves more than 43% power consumption reduction with respect to a static worst-case power mode selection, without any significant performance penalty for the running application.
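Idleness-driven power mode selection of the kind described above can be sketched as follows (mode names, power figures, and wake-up latencies are hypothetical, not those of the fabricated SoC): given a predicted idle window, the manager picks the lowest-power mode whose wake-up overhead still leaves the requested fraction of the window actually idle.

```python
# Each mode: (name, idle power in mW, wake-up latency in µs). Hypothetical values.
MODES = [
    ("active",       10.0,     0),
    ("clock-gated",   2.0,    50),
    ("retentive",     0.3,   500),
    ("deep-sleep",    0.05, 5000),
]

def select_mode(predicted_idle_us, target_idleness=0.9):
    """Lowest-power mode whose wake-up overhead keeps the effective
    idle fraction of the predicted window at or above the target."""
    best = MODES[0]  # worst case: stay active
    for name, power, latency in MODES:
        if predicted_idle_us == 0:
            break
        if 1 - latency / predicted_idle_us >= target_idleness and power < best[1]:
            best = (name, power, latency)
    return best[0]
```

A static worst-case policy would always pick the shallowest mode that works for the shortest idle window; the dynamic policy instead drops into deeper modes whenever long idle periods are predicted, which is where the reported savings come from.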


2011 ◽  
Vol 21 (01) ◽  
pp. 31-47 ◽  
Author(s):  
NOEL LOPES ◽  
BERNARDETE RIBEIRO

The Graphics Processing Unit (GPU), originally designed for rendering graphics and difficult to program for other tasks, has since evolved into a device suitable for general-purpose computations. As a result, graphics hardware has become progressively more attractive, yielding unprecedented performance at a relatively low cost. Thus, it is the ideal candidate to accelerate a wide variety of data-parallel tasks in many fields, such as Machine Learning (ML). As problems become more and more demanding, parallel implementations of learning algorithms are crucial for practical applications. In particular, implementing Neural Networks (NNs) on GPUs can significantly reduce the long training times of the learning process. In this paper we present a GPU parallel implementation of the Back-Propagation (BP) and Multiple Back-Propagation (MBP) algorithms, and describe the GPU kernels needed for this task. The results obtained on well-known benchmarks show faster training times and improved performance compared to implementations on traditional hardware, due to maximized floating-point throughput and memory bandwidth. Moreover, a preliminary GPU-based Autonomous Training System (ATS) is developed, which aims at automatically finding high-quality NN-based solutions for a given problem.
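The dense matrix products that such GPU kernels parallelize can be shown in plain numpy on a toy XOR task (an illustrative sketch of standard batch BP, not the paper's CUDA implementation): every line marked as a product below maps naturally onto a GPU GEMM or element-wise kernel.

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1 = rng.standard_normal((2, 8))   # input -> hidden weights
W2 = rng.standard_normal((8, 1))   # hidden -> output weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    h = sigmoid(X @ W1)                  # forward: one dense product per layer
    y = sigmoid(h @ W2)
    d2 = (y - t) * y * (1 - y)           # backward: output-layer delta
    d1 = (d2 @ W2.T) * h * (1 - h)       # backward: hidden-layer delta
    W2 -= 0.5 * h.T @ d2                 # weight-gradient products
    W1 -= 0.5 * X.T @ d1
```

On a GPU the batch dimension and the weight matrices are processed by thousands of threads at once, so the same algorithm scales to far larger batches and layers than this toy example.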

