Improving Learning Performance in Neural Networks

2021 ◽  
Vol 14 (1) ◽  
pp. 27-42
Author(s):  
Falah Al-akashi

Author(s):  
Serkan Kiranyaz ◽  
Junaid Malik ◽  
Habib Ben Abdallah ◽  
Turker Ince ◽  
Alexandros Iosifidis ◽  
...  

The recently proposed network model, Operational Neural Networks (ONNs), generalizes conventional Convolutional Neural Networks (CNNs), which are homogeneous and built only on a linear neuron model. As a heterogeneous network model, ONNs are based on a generalized neuron model that can encapsulate any set of non-linear operators to boost diversity and to learn highly complex and multi-modal functions or spaces with minimal network complexity and training data. However, the default search method to find optimal operators in ONNs, the so-called Greedy Iterative Search (GIS) method, usually takes several training sessions to find a single operator set per layer. This is not only computationally demanding, it also limits network heterogeneity, since the same set of operators is then used for all neurons of each layer. To address this deficiency and exploit a superior level of heterogeneity, this study focuses on searching for the best-possible operator set(s) for the hidden neurons of the network based on the “Synaptic Plasticity” paradigm, the essential learning theory of biological neurons. During training, each operator set in the library can be evaluated by its synaptic plasticity level, ranked from worst to best, and an “elite” ONN can then be configured using the top-ranked operator sets found for each hidden layer. Experimental results over highly challenging problems demonstrate that elite ONNs, even with few neurons and layers, achieve superior learning performance to GIS-based ONNs and, as a result, the performance gap over CNNs widens further.
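A minimal sketch of the selection idea described in this abstract: probe each candidate operator set, score it per hidden layer with a plasticity measure, and assemble an "elite" configuration from the top-ranked set at every layer. The names (OperatorSet, plasticity_score, elite_configuration) and the placeholder scoring are illustrative assumptions, not the authors' implementation.

    # Illustrative Python sketch; the plasticity score is a placeholder where a
    # short training probe would measure, e.g., the magnitude of useful weight
    # updates produced by each operator set at a given hidden layer.
    from dataclasses import dataclass
    from typing import Callable, Dict, List
    import math
    import random

    @dataclass(frozen=True)
    class OperatorSet:
        name: str
        nodal: Callable[[float, float], float]   # generalizes the x * w product of a CNN neuron
        pool: Callable[[List[float]], float]     # generalizes the summation
        activation: Callable[[float], float]

    LIBRARY = [
        OperatorSet("linear", lambda x, w: x * w, sum, math.tanh),
        OperatorSet("sine",   lambda x, w: math.sin(x * w), sum, math.tanh),
        OperatorSet("expo",   lambda x, w: math.exp(-(x - w) ** 2), max, math.tanh),
    ]

    def plasticity_score(op: OperatorSet, layer: int) -> float:
        """Placeholder for a short training probe that returns a synaptic-plasticity
        measure for this operator set at the given hidden layer."""
        return random.random()

    def elite_configuration(num_hidden_layers: int) -> Dict[int, OperatorSet]:
        """Rank the library per layer and keep the top-scoring operator set."""
        return {
            layer: max(LIBRARY, key=lambda op: plasticity_score(op, layer))
            for layer in range(num_hidden_layers)
        }

    print(elite_configuration(num_hidden_layers=3))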


Computing ◽  
2019 ◽  
Vol 101 (6) ◽  
pp. 587-604 ◽  
Author(s):  
Xizhe Wang ◽  
Pengze Wu ◽  
Guang Liu ◽  
Qionghao Huang ◽  
Xiaoling Hu ◽  
...  

1996 ◽  
Vol 8 (3) ◽  
pp. 625-628 ◽  
Author(s):  
Peter L. Bartlett ◽  
Robert C. Williamson

We give upper bounds on the Vapnik-Chervonenkis dimension and pseudodimension of two-layer neural networks that use the standard sigmoid function or radial basis function and have inputs from {−D, …, D}^n. In Valiant's probably approximately correct (PAC) learning framework for pattern classification, and in Haussler's generalization of this framework to nonlinear regression, the results imply that the number of training examples necessary for satisfactory learning performance grows no more rapidly than W log(WD), where W is the number of weights. The previous best bound for these networks was O(W^4).
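As a purely illustrative comparison (values not from the paper), the two growth rates can be tabulated for a few weight counts, ignoring the constant and logarithmic factors hidden in the O-notation:

    # Compare the improved W*log(W*D) growth rate with the previous O(W^4) bound
    # for illustrative values of W (number of weights) and D; natural log is used
    # and all constants are ignored.
    import math

    D = 16                                   # inputs from {-D, ..., D}^n (assumed value)
    for W in (10, 100, 1000):
        new_bound = W * math.log(W * D)
        old_bound = W ** 4
        print(f"W={W:5d}   W*log(W*D)={new_bound:10.1f}   W^4={old_bound:.1e}")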


2020 ◽  
Author(s):  
Friedemann Zenke ◽  
Tim P. Vogels

Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. In comparison, the functional capabilities of models of spiking networks are still rudimentary. This shortcoming is mainly due to the lack of insight and practical algorithms to construct the necessary connectivity. Any such algorithm typically attempts to build networks by iteratively reducing the error compared to a desired output. But assigning credit to hidden units in multi-layered spiking networks has remained challenging due to the non-differentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity in spiking network models. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients impact learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative's scale can substantially affect learning performance. When we combine surrogate gradients with a suitable activity regularization technique, robust information processing can be achieved in spiking networks even in the sparse-activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.
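A minimal sketch of the surrogate-gradient idea in PyTorch, assuming a Heaviside spike in the forward pass and a fast-sigmoid surrogate derivative in the backward pass; the scale parameter beta stands in for the derivative scale that the study identifies as the critical design choice. The class and parameter names are illustrative, not the authors' code.

    import torch

    class SurrogateSpike(torch.autograd.Function):
        beta = 10.0  # surrogate scale; the shape of the surrogate derivative matters less

        @staticmethod
        def forward(ctx, membrane_potential):
            ctx.save_for_backward(membrane_potential)
            return (membrane_potential > 0).float()   # non-differentiable spike

        @staticmethod
        def backward(ctx, grad_output):
            (u,) = ctx.saved_tensors
            # derivative of a fast sigmoid, used in place of the true (zero/undefined) derivative
            surrogate = 1.0 / (SurrogateSpike.beta * u.abs() + 1.0) ** 2
            return grad_output * surrogate

    spike_fn = SurrogateSpike.apply
    u = torch.randn(5, requires_grad=True)
    spike_fn(u).sum().backward()
    print(u.grad)   # nonzero gradients flow through the spike nonlinearity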


2021 ◽  
Vol 3 ◽  
Author(s):  
Agon Serifi ◽  
Tobias Günther ◽  
Nikolina Ban

Numerical weather and climate simulations nowadays produce terabytes of data, and the data volume continues to increase rapidly, since an increase in resolution greatly benefits the simulation of weather and climate. In practice, however, data is often available only at lower resolution, for many practical reasons: data coarsening to meet memory constraints, limited computational resources, favoring multiple low-resolution ensemble simulations over a few high-resolution simulations, and the limits of sensing instruments in observations. In order to enable a more insightful analysis, we investigate the capabilities of neural networks to reconstruct high-resolution data from given low-resolution simulations. For this, we phrase the data reconstruction as a super-resolution problem from multiple data sources, tailored toward meteorological and climatological data. We therefore investigate supervised machine learning using multiple deep convolutional neural network architectures to test the limits of data reconstruction for various spatial and temporal resolutions, low-frequency and high-frequency input data, and the generalization to numerical and observed data. Once such downscaling networks are trained, they serve two purposes: first, legacy low-resolution simulations can be downscaled to reconstruct high-resolution detail; second, past observations taken at lower resolutions can be brought to higher resolutions, opening new analysis possibilities. For the downscaling of high-frequency fields like precipitation, we show that error-predicting networks are far less suitable than deconvolutional neural networks due to their poor learning performance. We demonstrate that deep convolutional downscaling has the potential to become a building block of modern weather and climate analysis in both research and operational forecasting, and show that the ideal choice of network architecture depends on the type of data to predict, i.e., there is no single best architecture for all variables.
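A hedged sketch of a deconvolutional downscaling network of the kind this abstract contrasts with error-predicting networks: a few convolutions on the coarse grid followed by transposed convolutions that quadruple the spatial resolution. The channel counts, kernel sizes, and 4x factor are illustrative assumptions, not one of the architectures evaluated in the paper.

    import torch
    import torch.nn as nn

    class Downscaler(nn.Module):
        def __init__(self, in_channels=1, out_channels=1):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 2x resolution
                nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 4x resolution
                nn.Conv2d(16, out_channels, kernel_size=3, padding=1),
            )

        def forward(self, coarse_field):
            return self.net(coarse_field)

    coarse = torch.randn(8, 1, 32, 32)   # e.g. a batch of low-resolution fields
    fine = Downscaler()(coarse)
    print(fine.shape)                    # torch.Size([8, 1, 128, 128])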


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 147 ◽  
Author(s):  
Jun Ye ◽  
Wenhua Cui

Neural networks are powerful universal approximation tools. They have been utilized for function/data approximation, classification, pattern recognition, and various related applications. Uncertain or interval values result from the incompleteness of measurements, human observation, and estimation in the real world. Thus, a neutrosophic number (NsN) can represent both certain and uncertain information in an indeterminate setting and implies a changeable interval depending on its indeterminate range. In NsN settings, however, existing interval neural networks cannot handle uncertain problems involving NsNs. Therefore, this original study proposes, for the first time, a neutrosophic compound orthogonal neural network (NCONN), containing NsN weight values, NsN input and output, and hidden-layer neutrosophic neuron functions, to approximate neutrosophic functions/NsN data. In the proposed NCONN model, the single input and single output neurons are the transmission nodes of NsN data, and the hidden-layer neutrosophic neurons are constructed as compound functions of both the Chebyshev neutrosophic orthogonal polynomial and the neutrosophic sigmoid function. In addition, illustrative and actual examples are provided to verify the effectiveness and learning performance of the proposed NCONN model in approximating neutrosophic nonlinear functions and NsN data. The contribution of this study is that the proposed NCONN can handle approximation problems of neutrosophic nonlinear functions and NsN data. Its main advantage is that the proposed NCONN implies a simple learning algorithm, faster learning convergence, and higher learning accuracy in indeterminate/NsN environments.
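A heavily hedged sketch of one plausible reading of such a compound hidden unit: an NsN input is treated as the interval implied by its indeterminate range, and each hidden neuron applies a Chebyshev polynomial composed with a sigmoid to both interval endpoints. The composition order, the rescaling of the sigmoid output into the Chebyshev domain, and the endpoint-wise interval handling are assumptions made for illustration, not the authors' exact construction.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def chebyshev(n, x):
        # Chebyshev polynomial of the first kind, T_n(x), via the recurrence
        # T_0 = 1, T_1 = x, T_n = 2*x*T_{n-1} - T_{n-2}.
        t_prev, t_curr = 1.0, x
        if n == 0:
            return t_prev
        for _ in range(n - 1):
            t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        return t_curr

    def compound_hidden_unit(nsn_interval, order):
        # Map an interval-valued input endpoint-wise through T_order(2*sigmoid(x) - 1);
        # the affine rescaling keeps the argument inside the Chebyshev domain [-1, 1].
        lo, hi = nsn_interval
        outputs = sorted(chebyshev(order, 2.0 * sigmoid(x) - 1.0) for x in (lo, hi))
        return (outputs[0], outputs[-1])

    print(compound_hidden_unit((0.2, 0.5), order=3))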


Author(s):  
Hiromitsu Awano ◽  
Shun Nishide ◽  
Hiroaki Arie ◽  
Jun Tani ◽  
Toru Takahashi ◽  
...  

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Michael Franco-Garcia ◽  
Alex Benasutti ◽  
Larry Pearlstein ◽  
Mohammed Alabsi

Intelligent fault diagnosis utilizing deep learning algorithms has been widely investigated recently. Although previous results demonstrated excellent performance, the features learned by Deep Neural Networks (DNNs) remain part of a large black box. Consequently, a lack of understanding of the physical meaning embedded in the features can lead to poor performance when a model is applied to different but related datasets, i.e., transfer-learning applications. This study investigates the transfer-learning performance of a Convolutional Neural Network (CNN) across four different operating conditions. Utilizing the Case Western Reserve University (CWRU) bearing dataset, the CNN is trained to classify 12 classes, each representing a distinct fault scenario with varying severity, e.g., inner-race faults of 0.007” and 0.014” diameter. Initially, zero-load data are used for model training, and the model is tuned until a testing accuracy above 99% is obtained. Model performance is then evaluated by feeding in vibration data collected when the load is varied to 1, 2, and 3 HP. Initial results indicate that the classification accuracy degrades substantially. Hence, this paper visualizes convolution kernels in the time and frequency domains and investigates the influence of changing loads on fault characteristics, the network's classification mechanism, and activation strength.
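A hedged sketch of the kind of kernel inspection this abstract describes: take the weights of a trained 1-D convolution layer and inspect their magnitude spectra to see which vibration frequencies each filter responds to. The untrained layer below merely stands in for a trained one, and the kernel length and 12 kHz sampling rate (typical of many CWRU recordings) are illustrative assumptions.

    import numpy as np
    import torch
    import torch.nn as nn

    conv = nn.Conv1d(in_channels=1, out_channels=8, kernel_size=64)  # stand-in for a trained layer
    kernels = conv.weight.detach().numpy()[:, 0, :]                   # shape (8, 64)

    fs = 12_000                                                       # Hz, assumed sampling rate
    freqs = np.fft.rfftfreq(kernels.shape[1], d=1.0 / fs)
    for i, k in enumerate(kernels):
        spectrum = np.abs(np.fft.rfft(k))
        peak = freqs[np.argmax(spectrum)]
        print(f"filter {i}: dominant frequency ~ {peak:.0f} Hz")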

