Neural networks with a self-refreshing memory: Knowledge transfer in sequential learning tasks without catastrophic forgetting

2000 ◽  
Vol 12 (1) ◽  
pp. 1-19 ◽  
Author(s):  
Bernard Ans ◽  
Stephane Rousset


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are usually left unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
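The abstract's central object, a similarity graph built from one batch's intermediate representations, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the cosine-similarity measure, the k-nearest-neighbour rule, and the name `latent_geometry_graph` are all assumptions.

```python
import numpy as np

def latent_geometry_graph(batch_latents, k=2):
    """Build a k-nearest-neighbour similarity graph (adjacency matrix)
    from one batch of intermediate representations.

    batch_latents: (n, d) array, one row per input in the batch.
    Returns an (n, n) symmetric 0/1 adjacency matrix.
    """
    # Cosine similarity between every pair of representations.
    norms = np.linalg.norm(batch_latents, axis=1, keepdims=True)
    unit = batch_latents / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops

    # Keep the k strongest similarities per node, then symmetrise.
    adj = np.zeros_like(sim)
    idx = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(sim.shape[0]), k)
    adj[rows, idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)

# Toy batch: two tight clusters of representations.
batch = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
A = latent_geometry_graph(batch, k=1)
```

Constraining such graphs then amounts to penalising the discrepancy between, e.g., a teacher's and a student's adjacency matrices on the same batch.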


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 222
Author(s):  
Baigan Zhao ◽  
Yingping Huang ◽  
Hongjian Wei ◽  
Xing Hu

Visual odometry (VO) refers to the incremental estimation of the motion state of an agent (e.g., a vehicle or robot) using image information, and is a key component of modern localization and navigation systems. Addressing the monocular VO problem, this paper presents a novel end-to-end network for estimating camera ego-motion. The network learns the latent subspace of optical flow (OF) and models sequential dynamics so that the motion estimation is constrained by the relations between consecutive images. We compute the OF field of consecutive images and extract the latent OF representation in a self-encoding manner. A recurrent neural network is then applied to examine the OF changes, i.e., to conduct sequential learning. The extracted sequential OF subspace is used to regress the 6-dimensional pose vector. We derive three models with different network structures and different training schemes: LS-CNN-VO, LS-AE-VO, and LS-RCNN-VO. In particular, we train the encoder separately in an unsupervised manner. By this means, we avoid non-convergence during training of the whole network and obtain a more generalized and effective feature representation. Extensive experiments have been conducted on the KITTI and Malaga datasets, and the results demonstrate that our LS-RCNN-VO outperforms existing learning-based VO approaches.
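The described pipeline (encode each OF field into a latent subspace, track it with a recurrent unit, regress a 6-D pose) can be sketched with a toy numerical analogue. This is a hypothetical miniature, not the LS-RCNN-VO architecture: the class name, dimensions, and random weights are all assumptions, and the real encoder is a trained convolutional auto-encoder rather than one projection matrix.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class TinyFlowPoseNet:
    """Toy analogue of an OF-to-pose pipeline: an 'encoder' projects each
    optical-flow field into a latent subspace, a GRU cell tracks the
    sequential dynamics, and a linear head regresses the 6-D pose
    (3 translation + 3 rotation components)."""

    def __init__(self, flow_dim, latent_dim, rng):
        self.W_enc = rng.standard_normal((latent_dim, flow_dim)) * 0.1
        # GRU parameters: update gate z, reset gate r, candidate state.
        self.Wz = rng.standard_normal((latent_dim, 2 * latent_dim)) * 0.1
        self.Wr = rng.standard_normal((latent_dim, 2 * latent_dim)) * 0.1
        self.Wh = rng.standard_normal((latent_dim, 2 * latent_dim)) * 0.1
        self.W_pose = rng.standard_normal((6, latent_dim)) * 0.1

    def forward(self, flow_sequence):
        h = np.zeros(self.W_enc.shape[0])
        for flow in flow_sequence:
            x = np.tanh(self.W_enc @ flow)          # latent OF subspace
            xh = np.concatenate([x, h])
            z = sigmoid(self.Wz @ xh)               # update gate
            r = sigmoid(self.Wr @ xh)               # reset gate
            h_cand = np.tanh(self.Wh @ np.concatenate([x, r * h]))
            h = (1 - z) * h + z * h_cand            # sequential learning step
        return self.W_pose @ h                      # 6-D pose vector

rng = np.random.default_rng(0)
net = TinyFlowPoseNet(flow_dim=32, latent_dim=16, rng=rng)
flows = [rng.standard_normal(32) for _ in range(5)]  # 5 consecutive OF fields
pose = net.forward(flows)
```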


2020 ◽  
Vol 12 (6) ◽  
pp. 21-32
Author(s):  
Muhammad Zulqarnain ◽  
Rozaida Ghazali ◽  
Muhammad Ghulam Ghouse ◽  
Yana Mazwin Mohmad Hassim ◽  
...  

Financial time-series prediction has long been one of the most challenging problems in financial market analysis. Deep neural networks are among the most effective data-mining approaches and have received great attention from researchers in several areas of time-series prediction over the last decade. Convolutional neural network (CNN) and recurrent neural network (RNN) models have become the mainstream methods for financial prediction. In this paper, we propose a combined architecture that exploits the advantages of CNNs and RNNs simultaneously for the prediction of trading signals. Our model first passes the financial time series through a CNN layer, whose output is fed directly into a gated recurrent unit (GRU) layer to capture long-term signal dependencies. GRU models perform better on sequential learning tasks and mitigate the vanishing- and exploding-gradient problems of standard RNNs. We evaluate our model on three stock-index datasets, the Hang Seng Index (HSI), the Deutscher Aktienindex (DAX), and the S&P 500 Index, covering 2008 to 2016, and compare the GRU-CNN-based approach with existing deep learning models. Experimental results show that the proposed GRU-CNN model obtained the best prediction accuracy: 56.2% on the HSI dataset, 56.1% on the DAX dataset, and 56.3% on the S&P 500 dataset.
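The CNN-then-GRU chain described above can be sketched numerically. This is a minimal illustration with random weights, not the paper's model: the filter width, state size, a simplified GRU without a reset gate, and the final sign-based "signal" are all assumptions.

```python
import numpy as np

def conv1d_valid(series, kernels):
    """CNN stage: each kernel slides over the return series and extracts
    one local-pattern feature channel ('valid' padding, ReLU activation)."""
    n, k = len(series), kernels.shape[1]
    out = np.empty((kernels.shape[0], n - k + 1))
    for c, w in enumerate(kernels):
        for t in range(n - k + 1):
            out[c, t] = np.maximum(series[t:t + k] @ w, 0.0)
    return out

def gru_aggregate(features, Wz, Wh):
    """Simplified GRU stage (no reset gate): the gate z decides how much
    of each new CNN feature vector to blend into the running state,
    letting the model retain long-term signal dependencies."""
    h = np.zeros(Wh.shape[0])
    for x in features.T:  # iterate over time steps
        xh = np.concatenate([x, h])
        z = 1.0 / (1.0 + np.exp(-(Wz @ xh)))
        h = (1 - z) * h + z * np.tanh(Wh @ xh)
    return h

rng = np.random.default_rng(1)
prices = rng.standard_normal(30)             # toy daily-return series
kernels = rng.standard_normal((4, 5)) * 0.3  # 4 conv filters, width 5
feats = conv1d_valid(prices, kernels)        # shape (4, 26)
Wz = rng.standard_normal((8, 12)) * 0.3
Wh = rng.standard_normal((8, 12)) * 0.3
state = gru_aggregate(feats, Wz, Wh)         # final signal embedding
signal = "buy" if state.sum() > 0 else "sell"
```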


2018 ◽  
Author(s):  
Νικόλαος Πασσαλής

Recent advances in the field of Deep Learning have provided powerful data-analysis tools. However, the high computational complexity of Deep Learning methods significantly limits their applicability, especially when the available computational resources are limited. Moreover, the flexibility of many deep learning methods is significantly restricted by their inability to be combined effectively with classical Machine Learning methods. The main goal of this doctoral dissertation is to develop Deep Learning methods that can be used effectively to solve various data-analysis problems (classification, clustering, regression, etc.) on different kinds of data (images, video, text, time series), while effectively addressing the aforementioned problems. To this end, a neural extension of the Bag-of-Features model was first developed and combined with many different feature extractors, including Deep Convolutional Neural Networks. This significantly increased both the accuracy of the networks and their robustness to changes in the input distribution, while reducing the number of parameters required compared with competing methods. Next, a representation-learning method was proposed that is capable of producing representations tailored to the problem of information retrieval, significantly increasing the performance of the representations on the corresponding tasks. Then, a flexible and efficient knowledge-transfer method was proposed, which is able to "distill" the knowledge of a large and complex neural network into a faster and smaller one. The effectiveness of the proposed method was verified using many different evaluation protocols.
It was also shown that the problem of dimensionality reduction can be expressed as a problem of transferring knowledge from a suitably defined Probability Density Function (PDF) into a Machine Learning model using the previously described method. This makes it possible to define a general dimensionality-reduction framework, which was also combined with Deep Learning models in order to extract representations optimized for clustering problems. Finally, an open-source library was developed that implements the above dimensionality-reduction method, as well as a method for stabilizing the convergence of stochastic optimization techniques for Deep Learning architectures.
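The knowledge-transfer ("distillation") idea the thesis builds on can be illustrated generically. This sketch shows only the textbook form of distillation, matching softened teacher and student output distributions; it is not the thesis's specific representation-based transfer method, and the temperature value and logits below are made up.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T flattens the distribution."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between the softened teacher and student
    distributions; the temperature exposes the teacher's knowledge
    about inter-class similarities, which a smaller, faster student
    can then absorb."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return -np.mean(np.sum(p_t * np.log(p_s + 1e-12), axis=-1))

teacher = np.array([[8.0, 2.0, -1.0]])        # confident large-network output
good_student = np.array([[6.0, 1.5, -0.5]])   # similar class ranking
bad_student = np.array([[-1.0, 0.0, 5.0]])    # contradicts the teacher
loss_good = distillation_loss(good_student, teacher)
loss_bad = distillation_loss(bad_student, teacher)
```

Minimising such a loss drives the student toward the teacher's function while keeping the student's own, cheaper architecture.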


2020 ◽  
Vol 20 (11) ◽  
pp. 6603-6608 ◽  
Author(s):  
Sung-Tae Lee ◽  
Suhwan Lim ◽  
Jong-Ho Bae ◽  
Dongseok Kwon ◽  
Hyeong-Su Kim ◽  
...  

Deep learning achieves state-of-the-art results in various machine learning tasks, but for applications that require real-time inference, the high computational cost of deep neural networks becomes a bottleneck for efficiency. To overcome this cost, spiking neural networks (SNNs) have been proposed. Herein, we propose a hardware implementation of an SNN with gated Schottky diodes as synaptic devices. In addition, we apply L1 regularization for connection pruning of deep spiking neural networks that use gated Schottky diodes as synaptic devices. Applying L1 regularization eliminates the need for a re-training procedure because it prunes the weights based on the cost function. The compressed hardware-based SNN is energy efficient while achieving a classification accuracy of 97.85%, comparable to the 98.13% of the software deep neural network (DNN).
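The pruning mechanism the abstract relies on, an L1 penalty that shrinks unneeded connections toward zero during training so they can be cut without re-training, can be demonstrated on a toy linear model. This is a generic illustration, not the hardware SNN: the subgradient optimiser, threshold, and data are all assumptions.

```python
import numpy as np

def train_with_l1(X, y, lam=0.05, lr=0.1, steps=500):
    """L1-regularised least squares by subgradient descent.
    The lam * sign(w) term drives unneeded weights toward zero
    during training, so pruning needs no separate re-training pass."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(X.shape[1]) * 0.1
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y) + lam * np.sign(w)
        w -= lr * grad
    return w

def prune(w, threshold=0.02):
    """Connection pruning: zero out the weights the L1 penalty has
    already pushed below the threshold."""
    return np.where(np.abs(w) < threshold, 0.0, w)

# Toy data: y depends on only 2 of the 10 input features.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 10))
y = X[:, 0] * 2.0 - X[:, 3] * 1.5
w = train_with_l1(X, y)
w_pruned = prune(w)
sparsity = float(np.mean(w_pruned == 0.0))  # fraction of removed connections
```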


2011 ◽  
pp. 81-104 ◽  
Author(s):  
G. Camps-Valls ◽  
J. F. Guerrero-Martinez

In this chapter, we review the vast field of application of artificial neural networks in cardiac pathology discrimination based on electrocardiographic signals. We discuss advantages and drawbacks of neural and adaptive systems in cardiovascular medicine and catch a glimpse of forthcoming developments in machine learning models for the real clinical environment. Some problems are identified in the learning tasks of beat detection, feature selection/extraction, and classification, and some proposals and suggestions are given to alleviate the problems of interpretability, overfitting, and adaptation. These have become important problems in recent years and will surely constitute the basis of some investigations in the immediate future.


2020 ◽  
Vol 396 ◽  
pp. 534-541
Author(s):  
Sam Leroux ◽  
Bert Vankeirsbilck ◽  
Tim Verbelen ◽  
Pieter Simoens ◽  
Bart Dhoedt

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 198600-198614
Author(s):  
Masood Aslam ◽  
Tariq M. Khan ◽  
Syed Saud Naqvi ◽  
Geoff Holmes ◽  
Rafea Naffa

Author(s):  
Anthony Robins ◽  
Marcus Frean ◽  

In this paper, we explore the concept of sequential learning and the efficacy of global and local neural network learning algorithms on a sequential learning task. Pseudorehearsal, a method developed by Robins [19] to solve the catastrophic forgetting problem that arises from the excessive plasticity of neural networks, is significantly more effective than other local learning algorithms on the sequential task. We further consider the concept of local learning and suggest that pseudorehearsal is so effective because it works directly at the level of the learned function, and not indirectly on the representation of the function within the network. We also briefly explore the effect of local learning on generalization within the task.
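Pseudorehearsal's key move, preserving the learned function by rehearsing self-generated pseudo-items alongside the new items, can be sketched on a toy linear network. This is an illustrative reconstruction of the idea, not Robins' original experiments: the one-layer network, data sizes, and training schedule are all assumptions.

```python
import numpy as np

def train(w, X, Y, lr=0.1, epochs=2000):
    """Delta-rule (gradient) training of a one-layer linear network."""
    for _ in range(epochs):
        w = w - lr * X.T @ (X @ w - Y) / len(X)
    return w

rng = np.random.default_rng(0)

# Task A: the network first learns a noiseless input-output mapping.
w_true = np.array([[1.0], [-1.0], [0.5], [0.0]])
Xa = rng.standard_normal((20, 4))
Ya = Xa @ w_true
w = train(np.zeros((4, 1)), Xa, Ya)

# Pseudorehearsal: sample random inputs and label them with the
# *current* network.  The pseudo-items capture the learned function
# itself, not its internal representation within the network.
X_pseudo = rng.standard_normal((20, 4))
Y_pseudo = X_pseudo @ w

# New task B is trained while rehearsing the pseudo-items...
Xb = rng.standard_normal((5, 4))
Yb = rng.standard_normal((5, 1))
w_rehearsed = train(w.copy(), np.vstack([Xb, X_pseudo]),
                    np.vstack([Yb, Y_pseudo]))

# ...whereas training on task B alone overwrites task A, which shows
# up as a much larger task-A error (catastrophic forgetting).
w_naive = train(w.copy(), Xb, Yb)
err_rehearsed = float(np.mean((Xa @ w_rehearsed - Ya) ** 2))
err_naive = float(np.mean((Xa @ w_naive - Ya) ** 2))
```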

