Progressive Convolutional Neural Network for Incremental Learning

Zahid Ali Siddiqui; Unsang Park

doi:10.3390/electronics10161879

Progressive Convolutional Neural Network for Incremental Learning

Electronics ◽

10.3390/electronics10161879 ◽

2021 ◽

Vol 10 (16) ◽

pp. 1879

Author(s):

Zahid Ali Siddiqui ◽

Unsang Park

Keyword(s):

Neural Network ◽

Incremental Learning ◽

Network Architecture ◽

Structural Changes ◽

Training Time ◽

Learning Technique ◽

Incremental Training ◽

Fully Connected ◽

Fine Tune ◽

Entire Network

In this paper, we present a novel incremental learning technique to solve the catastrophic forgetting problem observed in the CNN architectures. We used a progressive deep neural network to incrementally learn new classes while keeping the performance of the network unchanged on old classes. The incremental training requires us to train the network only for new classes and fine-tune the final fully connected layer, without needing to train the entire network again, which significantly reduces the training time. We evaluate the proposed architecture extensively on image classification task using Fashion MNIST, CIFAR-100 and ImageNet-1000 datasets. Experimental results show that the proposed network architecture not only alleviates catastrophic forgetting but can also leverages prior knowledge via lateral connections to previously learned classes and their features. In addition, the proposed scheme is easily scalable and does not require structural changes on the network trained on the old task, which are highly required properties in embedded systems.

Download Full-text

Image reconstruction through a multimode fiber with a simple neural network architecture

Scientific Reports ◽

10.1038/s41598-020-79646-8 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Changyan Zhu ◽

Eng Aik Chan ◽

You Wang ◽

Weina Peng ◽

Ruixiang Guo ◽

...

Keyword(s):

Neural Network ◽

Image Reconstruction ◽

Network Architecture ◽

Neural Network Architecture ◽

Training Time ◽

Modal Dispersion ◽

Multimode Fibers ◽

Speckle Patterns ◽

Hidden Layer ◽

Simple Neural Network

AbstractMultimode fibers (MMFs) have the potential to carry complex images for endoscopy and related applications, but decoding the complex speckle patterns produced by mode-mixing and modal dispersion in MMFs is a serious challenge. Several groups have recently shown that convolutional neural networks (CNNs) can be trained to perform high-fidelity MMF image reconstruction. We find that a considerably simpler neural network architecture, the single hidden layer dense neural network, performs at least as well as previously-used CNNs in terms of image reconstruction fidelity, and is superior in terms of training time and computing resources required. The trained networks can accurately reconstruct MMF images collected over a week after the cessation of the training set, with the dense network performing as well as the CNN over the entire period.

Download Full-text

NASIL: Neural Network Architecture Searching for Incremental Learning in Image Classification

Parallel Architectures, Algorithms and Programming - Communications in Computer and Information Science ◽

10.1007/978-981-16-0010-4_7 ◽

2021 ◽

pp. 68-80

Author(s):

Xianya Fu ◽

Wenrui Li ◽

Qiurui Chen ◽

Lianyi Zhang ◽

Kai Yang ◽

...

Keyword(s):

Neural Network ◽

Image Classification ◽

Incremental Learning ◽

Network Architecture ◽

Neural Network Architecture

Download Full-text

Is One Teacher Model Enough to Transfer Knowledge to a Student Model?

Algorithms ◽

10.3390/a14110334 ◽

2021 ◽

Vol 14 (11) ◽

pp. 334

Author(s):

Nicola Landro ◽

Ignazio Gallo ◽

Riccardo La Grassa

Keyword(s):

Neural Network ◽

Transfer Learning ◽

Learning Problem ◽

Learning Techniques ◽

Starting Point ◽

Learning Technique ◽

Interesting Approach ◽

Classification Tasks ◽

Learned Features ◽

Fine Tune

Nowadays, the transfer learning technique can be successfully applied in the deep learning field through techniques that fine-tune the CNN’s starting point so it may learn over a huge dataset such as ImageNet and continue to learn on a fixed dataset to achieve better performance. In this paper, we designed a transfer learning methodology that combines the learned features of different teachers to a student network in an end-to-end model, improving the performance of the student network in classification tasks over different datasets. In addition to this, we tried to answer the following questions which are in any case directly related to the transfer learning problem addressed here. Is it possible to improve the performance of a small neural network by using the knowledge gained from a more powerful neural network? Can a deep neural network outperform the teacher using transfer learning? Experimental results suggest that neural networks can transfer their learning to student networks using our proposed architecture, designed to bring to light a new interesting approach for transfer learning techniques. Finally, we provide details of the code and the experimental settings.

Download Full-text

STOCHASTIC PSEUDOSPIN NEURAL NETWORK WITH TRIDIAGONAL SYNAPTIC CONNECTIONS

Radio Electronics Computer Science Control ◽

10.15588/1607-3274-2021-2-12 ◽

2021 ◽

pp. 114-122

Author(s):

R. М. Peleshchak ◽

V. V. Lytvyn ◽

О. І. Cherniak ◽

І. R. Peleshchak ◽

М. V. Doroshenko

Keyword(s):

Neural Network ◽

Network Architecture ◽

Connection Matrix ◽

Synaptic Connections ◽

Hessenberg Matrix ◽

Tuning Time ◽

Hebb Rule ◽

The Matrix ◽

Dipole Glass ◽

Fully Connected

Context. To reduce the computational resource time in the problems of diagnosing and recognizing distorted images based on a fully connected stochastic pseudospin neural network, it becomes necessary to thin out synaptic connections between neurons, which is solved using the method of diagonalizing the matrix of synaptic connections without losing interaction between all neurons in the network. Objective. To create an architecture of a stochastic pseudo-spin neural network with diagonal synaptic connections without loosing the interaction between all the neurons in the layer to reduce its learning time. Method. The paper uses the Hausholder method, the method of compressing input images based on the diagonalization of the matrix of synaptic connections and the computer mathematics system MATLAB for converting a fully connected neural network into a tridiagonal form with hidden synaptic connections between all neurons. Results. We developed a model of a stochastic neural network architecture with sparse renormalized synaptic connections that take into account deleted synaptic connections. Based on the transformation of the synaptic connection matrix of a fully connected neural network into a Hessenberg matrix with tridiagonal synaptic connections, we proposed a renormalized local Hebb rule. Using the computer mathematics system “WolframMathematica 11.3”, we calculated, as a function of the number of neurons N, the relative tuning time of synaptic connections (per iteration) in a stochastic pseudospin neural network with a tridiagonal connection Matrix, relative to the tuning time of synaptic connections (per iteration) in a fully connected synaptic neural network. Conclusions. We found that with an increase in the number of neurons, the tuning time of synaptic connections (per iteration) in a stochastic pseudospin neural network with a tridiagonal connection Matrix, relative to the tuning time of synaptic connections (per iteration) in a fully connected synaptic neural network, decreases according to a hyperbolic law. Depending on the direction of pseudospin neurons, we proposed a classification of a renormalized neural network with a ferromagnetic structure, an antiferromagnetic structure, and a dipole glass.

Download Full-text

FnnmOS-ELM: A Flexible Neural Network Mixed Online Sequential Elm

Applied Sciences ◽

10.3390/app9183772 ◽

2019 ◽

Vol 9 (18) ◽

pp. 3772

Author(s):

Xiali Li ◽

Shuai He ◽

Junzhi Yu ◽

Licheng Wu ◽

Zhao Yue

Keyword(s):

Neural Network ◽

Network Performance ◽

Classification Performance ◽

Feature Representation ◽

Mixed Structure ◽

Training Time ◽

Learning Speed ◽

Learning Machine ◽

Fully Connected ◽

The Relationship

The learning speed of online sequential extreme learning machine (OS-ELM) algorithms is much higher than that of convolutional neural networks (CNNs) or recurrent neural network (RNNs) on regression and simple classification datasets. However, the general feature extraction of OS-ELM makes it difficult to conveniently and effectively perform classification on some large and complex datasets, e.g., CIFAR. In this paper, we propose a flexible OS-ELM-mixed neural network, termed as fnnmOS-ELM. In this mixed structure, the OS-ELM can replace a part of fully connected layers in CNNs or RNNs. Our framework not only exploits the strong feature representation of CNNs or RNNs, but also performs at a fast speed in terms of classification. Additionally, it avoids the problem of long training time and large parameter size of CNNs or RNNs to some extent. Further, we propose a method for optimizing network performance by splicing OS-ELM after CNN or RNN structures. Iris, IMDb, CIFAR-10, and CIFAR-100 datasets are employed to verify the performance of the fnnmOS-ELM. The relationship between hyper-parameters and the performance of the fnnmOS-ELM is explored, which sheds light on the optimization of network performance. Finally, the experimental results demonstrate that the fnnmOS-ELM has a stronger feature representation and higher classification performance than contemporary methods.

Download Full-text

Recurrent neural network architecture with pre-synaptic inhibition for incremental learning

Neural Networks ◽

10.1016/j.neunet.2006.06.005 ◽

2006 ◽

Vol 19 (8) ◽

pp. 1106-1119 ◽

Cited By ~ 6

Author(s):

Hiroyuki Ohta ◽

Yukio Pegio Gunji

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Incremental Learning ◽

Network Architecture ◽

Synaptic Inhibition ◽

Neural Network Architecture

Download Full-text

Classification of Plant Leaf Diseases Based on Improved Convolutional Neural Network

Sensors ◽

10.3390/s19194161 ◽

2019 ◽

Vol 19 (19) ◽

pp. 4161 ◽

Cited By ~ 11

Author(s):

Hang ◽

Zhang ◽

Chen ◽

Zhang ◽

Wang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Convergence Time ◽

Model Parameters ◽

Data Set ◽

Training Time ◽

Plant Leaf ◽

The Neural Network ◽

Target Disease ◽

Fully Connected

Plant leaf diseases are closely related to people's daily life. Due to the wide variety of diseases, it is not only time-consuming and labor-intensive to identify and classify diseases by artificial eyes, but also easy to be misidentified with having a high error rate. Therefore, we proposed a deep learning-based method to identify and classify plant leaf diseases. The proposed method can take the advantages of the neural network to extract the characteristics of diseased parts, and thus to classify target disease areas. To address the issues of long training convergence time and too-large model parameters, the traditional convolutional neural network was improved by combining a structure of inception module, a squeeze-and-excitation (SE) module and a global pooling layer to identify diseases. Through the Inception structure, the feature data of the convolutional layer were fused in multi-scales to improve the accuracy on the leaf disease dataset. Finally, the global average pooling layer was used instead of the fully connected layer to reduce the number of model parameters. Compared with some traditional convolutional neural networks, our model yielded better performance and achieved an accuracy of 91.7% on the test data set. At the same time, the number of model parameters and training time have also been greatly reduced. The experimental classification on plant leaf diseases indicated that our method is feasible and effective.

Download Full-text

A NEW NEURAL NETWORK ARCHITECTURE WITH ASSOCIATIVE MEMORY, PRUNING AND ORDER-SENSITIVE LEARNING

International Journal of Neural Systems ◽

10.1142/s0129065799000332 ◽

1999 ◽

Vol 09 (04) ◽

pp. 351-370 ◽

Cited By ~ 4

Author(s):

M. SREENIVASA RAO ◽

ARUN K. PUJARI

Keyword(s):

Neural Network ◽

Associative Memory ◽

Network Architecture ◽

Hopfield Network ◽

New Paradigm ◽

Stable States ◽

Neural Network Architecture ◽

Language Understanding ◽

Structure Database ◽

Learning Technique

A new paradigm of neural network architecture is proposed that works as associative memory along with capabilities of pruning and order-sensitive learning. The network has a composite structure wherein each node of the network is a Hopfield network by itself. The Hopfield network employs an order-sensitive learning technique and converges to user-specified stable states without having any spurious states. This is based on geometrical structure of the network and of the energy function. The network is so designed that it allows pruning in binary order as it progressively carries out associative memory retrieval. The capacity of the network is 2n, where n is the number of basic nodes in the network. The capabilities of the network are demonstrated by experimenting on three different application areas, namely a Library Database, a Protein Structure Database and Natural Language Understanding.

Download Full-text

AN INCREMENTAL LEARNING NEURAL NETWORK FOR PATTERN CLASSIFICATION

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001499000501 ◽

1999 ◽

Vol 13 (06) ◽

pp. 913-928 ◽

Cited By ~ 2

Author(s):

CHENG-AN HUNG ◽

SHENG-FUU LIN

Keyword(s):

Neural Network ◽

Incremental Learning ◽

Network Architecture ◽

Character Recognition ◽

Feature Space ◽

Neural Network Architecture ◽

Single Class ◽

Fast Learning ◽

Fuzzy Adaptive ◽

Iris Data

A neural network architecture that incorporates a supervised mechanism into a fuzzy adaptive Hamming net (FAHN) is presented. The FAHN constructs hyper-rectangles that represent template weights in an unsupervised learning paradigm. Learning in the FAHN consists of creating and adjusting hyper-rectangles in feature space. By aggregating multiple hyper-rectangles into a single class, we can build a classifier, to be henceforth termed as a supervised fuzzy adaptive Hamming net (SFAHN), that discriminates between nonconvex and even discontinuous classes. The SFAHN can operate at a fast-learning rate in online (incremental) or offline (batch) applications, without becoming unstable. The performance of the SFAHN is tested on the Fisher iris data and on an online character recognition problem.

Download Full-text

Neural Network Identifiability for a Family of Sigmoidal Nonlinearities

Constructive Approximation ◽

10.1007/s00365-021-09544-3 ◽

2021 ◽

Author(s):

Verner Vlačić ◽

Helmut Bölcskei

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Uniform Norm ◽

Feed Forward Neural Network ◽

The Subject ◽

Hidden Layer ◽

Arbitrary Precision ◽

Necessary And Sufficient ◽

Fully Connected

AbstractThis paper addresses the following question of neural network identifiability: Does the input–output map realized by a feed-forward neural network with respect to a given nonlinearity uniquely specify the network architecture, weights, and biases? The existing literature on the subject (Sussman in Neural Netw 5(4):589–593, 1992; Albertini et al. in Artificial neural networks for speech and vision, 1993; Fefferman in Rev Mat Iberoam 10(3):507–555, 1994) suggests that the answer should be yes, up to certain symmetries induced by the nonlinearity, and provided that the networks under consideration satisfy certain “genericity conditions.” The results in Sussman (1992) and Albertini et al. (1993) apply to networks with a single hidden layer and in Fefferman (1994) the networks need to be fully connected. In an effort to answer the identifiability question in greater generality, we derive necessary genericity conditions for the identifiability of neural networks of arbitrary depth and connectivity with an arbitrary nonlinearity. Moreover, we construct a family of nonlinearities for which these genericity conditions are minimal, i.e., both necessary and sufficient. This family is large enough to approximate many commonly encountered nonlinearities to within arbitrary precision in the uniform norm.

Download Full-text