Improving Learning Performance in Neural Networks

2021 ◽  
Vol 14 (1) ◽  
pp. 27-42
Author(s):  
Falah Al-akashi

Author(s):  
Serkan Kiranyaz ◽  
Junaid Malik ◽  
Habib Ben Abdallah ◽  
Turker Ince ◽  
Alexandros Iosifidis ◽  
...  

The recently proposed network model, Operational Neural Networks (ONNs), generalizes conventional Convolutional Neural Networks (CNNs), which are homogeneous and built only on a linear neuron model. As a heterogeneous network model, ONNs are based on a generalized neuron model that can encapsulate any set of non-linear operators to boost diversity and to learn highly complex and multi-modal functions or spaces with minimal network complexity and training data. However, the default search method to find optimal operators in ONNs, the so-called Greedy Iterative Search (GIS) method, usually takes several training sessions to find a single operator set per layer. This is not only computationally demanding, it also limits network heterogeneity, since the same set of operators is then used for all neurons of each layer. To address this deficiency and exploit a superior level of heterogeneity, this study focuses on searching for the best-possible operator set(s) for the hidden neurons of the network based on the “Synaptic Plasticity” paradigm, the essential learning theory of biological neurons. During training, each operator set in the library can be evaluated by its synaptic plasticity level, ranked from worst to best, and an “elite” ONN can then be configured using the top-ranked operator sets found for each hidden layer. Experimental results over highly challenging problems demonstrate that elite ONNs, even with few neurons and layers, achieve superior learning performance to GIS-based ONNs and, as a result, the performance gap over CNNs widens further.
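A minimal sketch of the selection idea described in this abstract: probe each candidate operator set, score it per hidden layer with a plasticity measure, and assemble an "elite" configuration from the top-ranked set at every layer. The names (OperatorSet, plasticity_score, elite_configuration) and the placeholder scoring are illustrative assumptions, not the authors' implementation.

    # Illustrative Python sketch; the plasticity score is a placeholder where a
    # short training probe would measure, e.g., the magnitude of useful weight
    # updates produced by each operator set at a given hidden layer.
    from dataclasses import dataclass
    from typing import Callable, Dict, List
    import math
    import random

    @dataclass(frozen=True)
    class OperatorSet:
        name: str
        nodal: Callable[[float, float], float]   # generalizes the x * w product of a CNN neuron
        pool: Callable[[List[float]], float]     # generalizes the summation
        activation: Callable[[float], float]

    LIBRARY = [
        OperatorSet("linear", lambda x, w: x * w, sum, math.tanh),
        OperatorSet("sine",   lambda x, w: math.sin(x * w), sum, math.tanh),
        OperatorSet("expo",   lambda x, w: math.exp(-(x - w) ** 2), max, math.tanh),
    ]

    def plasticity_score(op: OperatorSet, layer: int) -> float:
        """Placeholder for a short training probe that returns a synaptic-plasticity
        measure for this operator set at the given hidden layer."""
        return random.random()

    def elite_configuration(num_hidden_layers: int) -> Dict[int, OperatorSet]:
        """Rank the library per layer and keep the top-scoring operator set."""
        return {
            layer: max(LIBRARY, key=lambda op: plasticity_score(op, layer))
            for layer in range(num_hidden_layers)
        }

    print(elite_configuration(num_hidden_layers=3))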


Computing ◽  
2019 ◽  
Vol 101 (6) ◽  
pp. 587-604 ◽  
Author(s):  
Xizhe Wang ◽  
Pengze Wu ◽  
Guang Liu ◽  
Qionghao Huang ◽  
Xiaoling Hu ◽  
...  

1996 ◽  
Vol 8 (3) ◽  
pp. 625-628 ◽  
Author(s):  
Peter L. Bartlett ◽  
Robert C. Williamson

We give upper bounds on the Vapnik-Chervonenkis dimension and pseudodimension of two-layer neural networks that use the standard sigmoid function or radial basis function and have inputs from {−D, …, D}^n. In Valiant's probably approximately correct (PAC) learning framework for pattern classification, and in Haussler's generalization of this framework to nonlinear regression, the results imply that the number of training examples necessary for satisfactory learning performance grows no more rapidly than W log(WD), where W is the number of weights. The previous best bound for these networks was O(W^4).
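As a purely illustrative comparison (values not from the paper), the two growth rates can be tabulated for a few weight counts, ignoring the constant and logarithmic factors hidden in the O-notation:

    # Compare the improved W*log(W*D) growth rate with the previous O(W^4) bound
    # for illustrative values of W (number of weights) and D; natural log is used
    # and all constants are ignored.
    import math

    D = 16                                   # inputs from {-D, ..., D}^n (assumed value)
    for W in (10, 100, 1000):
        new_bound = W * math.log(W * D)
        old_bound = W ** 4
        print(f"W={W:5d}   W*log(W*D)={new_bound:10.1f}   W^4={old_bound:.1e}")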


2020 ◽  
Author(s):  
Friedemann Zenke ◽  
Tim P. Vogels

Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. In comparison, the functional capabilities of models of spiking networks are still rudimentary. This shortcoming is mainly due to the lack of insight and practical algorithms to construct the necessary connectivity. Any such algorithm typically attempts to build networks by iteratively reducing the error compared to a desired output. But assigning credit to hidden units in multi-layered spiking networks has remained challenging due to the non-differentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity in spiking network models. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients impact learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative's scale can substantially affect learning performance. When we combine surrogate gradients with a suitable activity regularization technique, robust information processing can be achieved in spiking networks even in the sparse-activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.
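A minimal sketch of the surrogate-gradient idea in PyTorch, assuming a Heaviside spike in the forward pass and a fast-sigmoid surrogate derivative in the backward pass; the scale parameter beta stands in for the derivative scale that the study identifies as the critical design choice. The class and parameter names are illustrative, not the authors' code.

    import torch

    class SurrogateSpike(torch.autograd.Function):
        beta = 10.0  # surrogate scale; the shape of the surrogate derivative matters less

        @staticmethod
        def forward(ctx, membrane_potential):
            ctx.save_for_backward(membrane_potential)
            return (membrane_potential > 0).float()   # non-differentiable spike

        @staticmethod
        def backward(ctx, grad_output):
            (u,) = ctx.saved_tensors
            # derivative of a fast sigmoid, used in place of the true (zero/undefined) derivative
            surrogate = 1.0 / (SurrogateSpike.beta * u.abs() + 1.0) ** 2
            return grad_output * surrogate

    spike_fn = SurrogateSpike.apply
    u = torch.randn(5, requires_grad=True)
    spike_fn(u).sum().backward()
    print(u.grad)   # nonzero gradients flow through the spike nonlinearity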


2021 ◽  
Vol 3 ◽  
Author(s):  
Agon Serifi ◽  
Tobias Günther ◽  
Nikolina Ban

Numerical weather and climate simulations nowadays produce terabytes of data, and the data volume continues to increase rapidly, since an increase in resolution greatly benefits the simulation of weather and climate. In practice, however, data is often available only at lower resolution, for many practical reasons: data coarsening to meet memory constraints, limited computational resources, favoring multiple low-resolution ensemble simulations over a few high-resolution simulations, and the limits of sensing instruments in observations. In order to enable a more insightful analysis, we investigate the capabilities of neural networks to reconstruct high-resolution data from given low-resolution simulations. For this, we phrase the data reconstruction as a super-resolution problem from multiple data sources, tailored toward meteorological and climatological data. We therefore investigate supervised machine learning using multiple deep convolutional neural network architectures to test the limits of data reconstruction for various spatial and temporal resolutions, low-frequency and high-frequency input data, and the generalization to numerical and observed data. Once such downscaling networks are trained, they serve two purposes: first, legacy low-resolution simulations can be downscaled to reconstruct high-resolution detail; second, past observations taken at lower resolutions can be brought to higher resolutions, opening new analysis possibilities. For the downscaling of high-frequency fields like precipitation, we show that error-predicting networks are far less suitable than deconvolutional neural networks due to their poor learning performance. We demonstrate that deep convolutional downscaling has the potential to become a building block of modern weather and climate analysis in both research and operational forecasting, and show that the ideal choice of network architecture depends on the type of data to predict, i.e., there is no single best architecture for all variables.
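A hedged sketch of a deconvolutional downscaling network of the kind this abstract contrasts with error-predicting networks: a few convolutions on the coarse grid followed by transposed convolutions that quadruple the spatial resolution. The channel counts, kernel sizes, and 4x factor are illustrative assumptions, not one of the architectures evaluated in the paper.

    import torch
    import torch.nn as nn

    class Downscaler(nn.Module):
        def __init__(self, in_channels=1, out_channels=1):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 2x resolution
                nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 4x resolution
                nn.Conv2d(16, out_channels, kernel_size=3, padding=1),
            )

        def forward(self, coarse_field):
            return self.net(coarse_field)

    coarse = torch.randn(8, 1, 32, 32)   # e.g. a batch of low-resolution fields
    fine = Downscaler()(coarse)
    print(fine.shape)                    # torch.Size([8, 1, 128, 128])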


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 147 ◽  
Author(s):  
Jun Ye ◽  
Wenhua Cui

Neural networks are powerful universal approximation tools. They have been utilized for function/data approximation, classification, pattern recognition, and various related applications. Uncertain or interval values result from the incompleteness of measurements, human observation, and estimation in the real world. Thus, a neutrosophic number (NsN) can represent both certain and uncertain information in an indeterminate setting and implies a changeable interval depending on its indeterminate range. In NsN settings, however, existing interval neural networks cannot handle uncertain problems involving NsNs. Therefore, this original study proposes, for the first time, a neutrosophic compound orthogonal neural network (NCONN), containing NsN weight values, NsN input and output, and hidden-layer neutrosophic neuron functions, to approximate neutrosophic functions/NsN data. In the proposed NCONN model, the single input and single output neurons are the transmission nodes of NsN data, and the hidden-layer neutrosophic neurons are constructed as compound functions of both the Chebyshev neutrosophic orthogonal polynomial and the neutrosophic sigmoid function. In addition, illustrative and actual examples are provided to verify the effectiveness and learning performance of the proposed NCONN model in approximating neutrosophic nonlinear functions and NsN data. The contribution of this study is that the proposed NCONN can handle approximation problems of neutrosophic nonlinear functions and NsN data. Its main advantage is that the proposed NCONN implies a simple learning algorithm, faster learning convergence, and higher learning accuracy in indeterminate/NsN environments.
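A heavily hedged sketch of one plausible reading of such a compound hidden unit: an NsN input is treated as the interval implied by its indeterminate range, and each hidden neuron applies a Chebyshev polynomial composed with a sigmoid to both interval endpoints. The composition order, the rescaling of the sigmoid output into the Chebyshev domain, and the endpoint-wise interval handling are assumptions made for illustration, not the authors' exact construction.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def chebyshev(n, x):
        # Chebyshev polynomial of the first kind, T_n(x), via the recurrence
        # T_0 = 1, T_1 = x, T_n = 2*x*T_{n-1} - T_{n-2}.
        t_prev, t_curr = 1.0, x
        if n == 0:
            return t_prev
        for _ in range(n - 1):
            t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        return t_curr

    def compound_hidden_unit(nsn_interval, order):
        # Map an interval-valued input endpoint-wise through T_order(2*sigmoid(x) - 1);
        # the affine rescaling keeps the argument inside the Chebyshev domain [-1, 1].
        lo, hi = nsn_interval
        outputs = sorted(chebyshev(order, 2.0 * sigmoid(x) - 1.0) for x in (lo, hi))
        return (outputs[0], outputs[-1])

    print(compound_hidden_unit((0.2, 0.5), order=3))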


Author(s):  
Hiromitsu Awano ◽  
Shun Nishide ◽  
Hiroaki Arie ◽  
Jun Tani ◽  
Toru Takahashi ◽  
...  

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Michael Franco-Garcia ◽  
Alex Benasutti ◽  
Larry Pearlstein ◽  
Mohammed Alabsi

Intelligent fault diagnosis utilizing deep learning algorithms has been widely investigated recently. Although previous results demonstrated excellent performance, the features learned by Deep Neural Networks (DNNs) remain part of a large black box. Consequently, a lack of understanding of the physical meaning embedded in the features can lead to poor performance when a model is applied to different but related datasets, i.e., transfer-learning applications. This study investigates the transfer-learning performance of a Convolutional Neural Network (CNN) across four different operating conditions. Utilizing the Case Western Reserve University (CWRU) bearing dataset, the CNN is trained to classify 12 classes, each representing a distinct fault scenario with varying severity, e.g., inner-race faults of 0.007” and 0.014” diameter. Initially, zero-load data are used for model training, and the model is tuned until a testing accuracy above 99% is obtained. Model performance is then evaluated by feeding in vibration data collected when the load is varied to 1, 2, and 3 HP. Initial results indicate that the classification accuracy degrades substantially. Hence, this paper visualizes convolution kernels in the time and frequency domains and investigates the influence of changing loads on fault characteristics, the network's classification mechanism, and activation strength.
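A hedged sketch of the kind of kernel inspection this abstract describes: take the weights of a trained 1-D convolution layer and inspect their magnitude spectra to see which vibration frequencies each filter responds to. The untrained layer below merely stands in for a trained one, and the kernel length and 12 kHz sampling rate (typical of many CWRU recordings) are illustrative assumptions.

    import numpy as np
    import torch
    import torch.nn as nn

    conv = nn.Conv1d(in_channels=1, out_channels=8, kernel_size=64)  # stand-in for a trained layer
    kernels = conv.weight.detach().numpy()[:, 0, :]                   # shape (8, 64)

    fs = 12_000                                                       # Hz, assumed sampling rate
    freqs = np.fft.rfftfreq(kernels.shape[1], d=1.0 / fs)
    for i, k in enumerate(kernels):
        spectrum = np.abs(np.fft.rfft(k))
        peak = freqs[np.argmax(spectrum)]
        print(f"filter {i}: dominant frequency ~ {peak:.0f} Hz")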

