Neural networks with a self-refreshing memory: Knowledge transfer in sequential learning tasks without catastrophic forgetting

2000 ◽  
Vol 12 (1) ◽  
pp. 1-19 ◽  
Author(s):  
Bernard Ans ◽  
Stephane Rousset


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are usually left unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
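The abstract's central object, a similarity graph built from one batch's intermediate representations, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the cosine-similarity measure, the k-nearest-neighbour rule, and the name `latent_geometry_graph` are all assumptions.

```python
import numpy as np

def latent_geometry_graph(batch_latents, k=2):
    """Build a k-nearest-neighbour similarity graph (adjacency matrix)
    from one batch of intermediate representations.

    batch_latents: (n, d) array, one row per input in the batch.
    Returns an (n, n) symmetric 0/1 adjacency matrix.
    """
    # Cosine similarity between every pair of representations.
    norms = np.linalg.norm(batch_latents, axis=1, keepdims=True)
    unit = batch_latents / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops

    # Keep the k strongest similarities per node, then symmetrise.
    adj = np.zeros_like(sim)
    idx = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(sim.shape[0]), k)
    adj[rows, idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)

# Toy batch: two tight clusters of representations.
batch = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
A = latent_geometry_graph(batch, k=1)
```

Constraining such graphs then amounts to penalising the discrepancy between, e.g., a teacher's and a student's adjacency matrices on the same batch.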


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 222
Author(s):  
Baigan Zhao ◽  
Yingping Huang ◽  
Hongjian Wei ◽  
Xing Hu

Visual odometry (VO) refers to the incremental estimation of the motion state of an agent (e.g., a vehicle or robot) using image information, and is a key component of modern localization and navigation systems. Addressing the monocular VO problem, this paper presents a novel end-to-end network for estimating camera ego-motion. The network learns the latent subspace of optical flow (OF) and models sequential dynamics so that the motion estimation is constrained by the relations between consecutive images. We compute the OF field of consecutive images and extract the latent OF representation in a self-encoding manner. A recurrent neural network is then applied to examine the OF changes, i.e., to conduct sequential learning. The extracted sequential OF subspace is used to regress the 6-dimensional pose vector. We derive three models with different network structures and different training schemes: LS-CNN-VO, LS-AE-VO, and LS-RCNN-VO. In particular, we train the encoder separately in an unsupervised manner. By this means, we avoid non-convergence during training of the whole network and obtain a more generalized and effective feature representation. Extensive experiments have been conducted on the KITTI and Malaga datasets, and the results demonstrate that our LS-RCNN-VO outperforms existing learning-based VO approaches.
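The described pipeline (encode each OF field into a latent subspace, track it with a recurrent unit, regress a 6-D pose) can be sketched with a toy numerical analogue. This is a hypothetical miniature, not the LS-RCNN-VO architecture: the class name, dimensions, and random weights are all assumptions, and the real encoder is a trained convolutional auto-encoder rather than one projection matrix.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class TinyFlowPoseNet:
    """Toy analogue of an OF-to-pose pipeline: an 'encoder' projects each
    optical-flow field into a latent subspace, a GRU cell tracks the
    sequential dynamics, and a linear head regresses the 6-D pose
    (3 translation + 3 rotation components)."""

    def __init__(self, flow_dim, latent_dim, rng):
        self.W_enc = rng.standard_normal((latent_dim, flow_dim)) * 0.1
        # GRU parameters: update gate z, reset gate r, candidate state.
        self.Wz = rng.standard_normal((latent_dim, 2 * latent_dim)) * 0.1
        self.Wr = rng.standard_normal((latent_dim, 2 * latent_dim)) * 0.1
        self.Wh = rng.standard_normal((latent_dim, 2 * latent_dim)) * 0.1
        self.W_pose = rng.standard_normal((6, latent_dim)) * 0.1

    def forward(self, flow_sequence):
        h = np.zeros(self.W_enc.shape[0])
        for flow in flow_sequence:
            x = np.tanh(self.W_enc @ flow)          # latent OF subspace
            xh = np.concatenate([x, h])
            z = sigmoid(self.Wz @ xh)               # update gate
            r = sigmoid(self.Wr @ xh)               # reset gate
            h_cand = np.tanh(self.Wh @ np.concatenate([x, r * h]))
            h = (1 - z) * h + z * h_cand            # sequential learning step
        return self.W_pose @ h                      # 6-D pose vector

rng = np.random.default_rng(0)
net = TinyFlowPoseNet(flow_dim=32, latent_dim=16, rng=rng)
flows = [rng.standard_normal(32) for _ in range(5)]  # 5 consecutive OF fields
pose = net.forward(flows)
```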


2020 ◽  
Vol 12 (6) ◽  
pp. 21-32
Author(s):  
Muhammad Zulqarnain ◽  
Rozaida Ghazali ◽  
Muhammad Ghulam Ghouse ◽  
Yana Mazwin Mohmad Hassim ◽  
...  

Financial time-series prediction has long been one of the most challenging problems in financial market analysis. Deep neural networks are among the most effective data-mining approaches and have received great attention from researchers in several areas of time-series prediction over the last decade. Convolutional neural network (CNN) and recurrent neural network (RNN) models have become the mainstream methods for financial prediction. In this paper, we propose a combined architecture that exploits the advantages of CNNs and RNNs simultaneously for the prediction of trading signals. Our model first passes the financial time series through a CNN layer, whose output is fed directly into a gated recurrent unit (GRU) layer to capture long-term signal dependencies. GRU models perform better on sequential learning tasks and mitigate the vanishing- and exploding-gradient problems of standard RNNs. We evaluate our model on three stock-index datasets, the Hang Seng Index (HSI), the Deutscher Aktienindex (DAX), and the S&P 500 Index, covering 2008 to 2016, and compare the GRU-CNN-based approach with existing deep learning models. Experimental results show that the proposed GRU-CNN model obtained the best prediction accuracy: 56.2% on the HSI dataset, 56.1% on the DAX dataset, and 56.3% on the S&P 500 dataset.
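The CNN-then-GRU chain described above can be sketched numerically. This is a minimal illustration with random weights, not the paper's model: the filter width, state size, a simplified GRU without a reset gate, and the final sign-based "signal" are all assumptions.

```python
import numpy as np

def conv1d_valid(series, kernels):
    """CNN stage: each kernel slides over the return series and extracts
    one local-pattern feature channel ('valid' padding, ReLU activation)."""
    n, k = len(series), kernels.shape[1]
    out = np.empty((kernels.shape[0], n - k + 1))
    for c, w in enumerate(kernels):
        for t in range(n - k + 1):
            out[c, t] = np.maximum(series[t:t + k] @ w, 0.0)
    return out

def gru_aggregate(features, Wz, Wh):
    """Simplified GRU stage (no reset gate): the gate z decides how much
    of each new CNN feature vector to blend into the running state,
    letting the model retain long-term signal dependencies."""
    h = np.zeros(Wh.shape[0])
    for x in features.T:  # iterate over time steps
        xh = np.concatenate([x, h])
        z = 1.0 / (1.0 + np.exp(-(Wz @ xh)))
        h = (1 - z) * h + z * np.tanh(Wh @ xh)
    return h

rng = np.random.default_rng(1)
prices = rng.standard_normal(30)             # toy daily-return series
kernels = rng.standard_normal((4, 5)) * 0.3  # 4 conv filters, width 5
feats = conv1d_valid(prices, kernels)        # shape (4, 26)
Wz = rng.standard_normal((8, 12)) * 0.3
Wh = rng.standard_normal((8, 12)) * 0.3
state = gru_aggregate(feats, Wz, Wh)         # final signal embedding
signal = "buy" if state.sum() > 0 else "sell"
```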


2018 ◽  
Author(s):  
Νικόλαος Πασσαλής

Recent advances in the field of Deep Learning have provided powerful data-analysis tools. However, the high computational complexity of Deep Learning methods significantly limits their applicability, especially when the available computational resources are limited. Moreover, the flexibility of many deep learning methods is significantly restricted by their inability to be combined effectively with classical Machine Learning methods. The main goal of this doctoral dissertation is to develop Deep Learning methods that can be used effectively to solve various data-analysis problems (classification, clustering, regression, etc.) on different kinds of data (images, video, text, time series), while effectively addressing the aforementioned problems. To this end, a neural extension of the Bag-of-Features model was first developed and combined with many different feature extractors, including Deep Convolutional Neural Networks. This significantly increased both the accuracy of the networks and their robustness to changes in the input distribution, while reducing the number of parameters required compared with competing methods. Next, a representation-learning method was proposed that is capable of producing representations tailored to the problem of information retrieval, significantly increasing the performance of the representations on the corresponding tasks. Then, a flexible and efficient knowledge-transfer method was proposed, which is able to "distill" the knowledge of a large and complex neural network into a faster and smaller one. The effectiveness of the proposed method was verified using many different evaluation protocols.
It was also shown that the problem of dimensionality reduction can be expressed as a problem of transferring knowledge from a suitably defined Probability Density Function (PDF) into a Machine Learning model using the previously described method. This makes it possible to define a general dimensionality-reduction framework, which was also combined with Deep Learning models in order to extract representations optimized for clustering problems. Finally, an open-source library was developed that implements the above dimensionality-reduction method, as well as a method for stabilizing the convergence of stochastic optimization techniques for Deep Learning architectures.
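The knowledge-transfer ("distillation") idea the thesis builds on can be illustrated generically. This sketch shows only the textbook form of distillation, matching softened teacher and student output distributions; it is not the thesis's specific representation-based transfer method, and the temperature value and logits below are made up.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T flattens the distribution."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between the softened teacher and student
    distributions; the temperature exposes the teacher's knowledge
    about inter-class similarities, which a smaller, faster student
    can then absorb."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return -np.mean(np.sum(p_t * np.log(p_s + 1e-12), axis=-1))

teacher = np.array([[8.0, 2.0, -1.0]])        # confident large-network output
good_student = np.array([[6.0, 1.5, -0.5]])   # similar class ranking
bad_student = np.array([[-1.0, 0.0, 5.0]])    # contradicts the teacher
loss_good = distillation_loss(good_student, teacher)
loss_bad = distillation_loss(bad_student, teacher)
```

Minimising such a loss drives the student toward the teacher's function while keeping the student's own, cheaper architecture.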


2020 ◽  
Vol 20 (11) ◽  
pp. 6603-6608 ◽  
Author(s):  
Sung-Tae Lee ◽  
Suhwan Lim ◽  
Jong-Ho Bae ◽  
Dongseok Kwon ◽  
Hyeong-Su Kim ◽  
...  

Deep learning achieves state-of-the-art results in various machine learning tasks, but for applications that require real-time inference, the high computational cost of deep neural networks becomes a bottleneck for efficiency. To overcome this cost, spiking neural networks (SNNs) have been proposed. Herein, we propose a hardware implementation of an SNN with gated Schottky diodes as synaptic devices. In addition, we apply L1 regularization for connection pruning of deep spiking neural networks that use gated Schottky diodes as synaptic devices. Applying L1 regularization eliminates the need for a re-training procedure because it prunes the weights based on the cost function. The compressed hardware-based SNN is energy efficient while achieving a classification accuracy of 97.85%, comparable to the 98.13% of the software deep neural network (DNN).
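The pruning mechanism the abstract relies on, an L1 penalty that shrinks unneeded connections toward zero during training so they can be cut without re-training, can be demonstrated on a toy linear model. This is a generic illustration, not the hardware SNN: the subgradient optimiser, threshold, and data are all assumptions.

```python
import numpy as np

def train_with_l1(X, y, lam=0.05, lr=0.1, steps=500):
    """L1-regularised least squares by subgradient descent.
    The lam * sign(w) term drives unneeded weights toward zero
    during training, so pruning needs no separate re-training pass."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(X.shape[1]) * 0.1
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y) + lam * np.sign(w)
        w -= lr * grad
    return w

def prune(w, threshold=0.02):
    """Connection pruning: zero out the weights the L1 penalty has
    already pushed below the threshold."""
    return np.where(np.abs(w) < threshold, 0.0, w)

# Toy data: y depends on only 2 of the 10 input features.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 10))
y = X[:, 0] * 2.0 - X[:, 3] * 1.5
w = train_with_l1(X, y)
w_pruned = prune(w)
sparsity = float(np.mean(w_pruned == 0.0))  # fraction of removed connections
```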


2011 ◽  
pp. 81-104 ◽  
Author(s):  
G. Camps-Valls ◽  
J. F. Guerrero-Martinez

In this chapter, we review the vast field of application of artificial neural networks in cardiac pathology discrimination based on electrocardiographic signals. We discuss advantages and drawbacks of neural and adaptive systems in cardiovascular medicine and catch a glimpse of forthcoming developments in machine learning models for the real clinical environment. Some problems are identified in the learning tasks of beat detection, feature selection/extraction, and classification, and some proposals and suggestions are given to alleviate the problems of interpretability, overfitting, and adaptation. These have become important problems in recent years and will surely constitute the basis of some investigations in the immediate future.


2020 ◽  
Vol 396 ◽  
pp. 534-541
Author(s):  
Sam Leroux ◽  
Bert Vankeirsbilck ◽  
Tim Verbelen ◽  
Pieter Simoens ◽  
Bart Dhoedt

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 198600-198614
Author(s):  
Masood Aslam ◽  
Tariq M. Khan ◽  
Syed Saud Naqvi ◽  
Geoff Holmes ◽  
Rafea Naffa

Author(s):  
Anthony Robins ◽  
Marcus Frean ◽  

In this paper, we explore the concept of sequential learning and the efficacy of global and local neural network learning algorithms on a sequential learning task. Pseudorehearsal, a method developed by Robins [19] to solve the catastrophic forgetting problem that arises from the excessive plasticity of neural networks, is significantly more effective than other local learning algorithms on the sequential task. We further consider the concept of local learning and suggest that pseudorehearsal is so effective because it works directly at the level of the learned function, and not indirectly on the representation of the function within the network. We also briefly explore the effect of local learning on generalization within the task.
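Pseudorehearsal's key move, preserving the learned function by rehearsing self-generated pseudo-items alongside the new items, can be sketched on a toy linear network. This is an illustrative reconstruction of the idea, not Robins' original experiments: the one-layer network, data sizes, and training schedule are all assumptions.

```python
import numpy as np

def train(w, X, Y, lr=0.1, epochs=2000):
    """Delta-rule (gradient) training of a one-layer linear network."""
    for _ in range(epochs):
        w = w - lr * X.T @ (X @ w - Y) / len(X)
    return w

rng = np.random.default_rng(0)

# Task A: the network first learns a noiseless input-output mapping.
w_true = np.array([[1.0], [-1.0], [0.5], [0.0]])
Xa = rng.standard_normal((20, 4))
Ya = Xa @ w_true
w = train(np.zeros((4, 1)), Xa, Ya)

# Pseudorehearsal: sample random inputs and label them with the
# *current* network.  The pseudo-items capture the learned function
# itself, not its internal representation within the network.
X_pseudo = rng.standard_normal((20, 4))
Y_pseudo = X_pseudo @ w

# New task B is trained while rehearsing the pseudo-items...
Xb = rng.standard_normal((5, 4))
Yb = rng.standard_normal((5, 1))
w_rehearsed = train(w.copy(), np.vstack([Xb, X_pseudo]),
                    np.vstack([Yb, Y_pseudo]))

# ...whereas training on task B alone overwrites task A, which shows
# up as a much larger task-A error (catastrophic forgetting).
w_naive = train(w.copy(), Xb, Yb)
err_rehearsed = float(np.mean((Xa @ w_rehearsed - Ya) ** 2))
err_naive = float(np.mean((Xa @ w_naive - Ya) ** 2))
```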

