Regularisation of neural networks by enforcing Lipschitz continuity

Author(s):  
Henry Gouk ◽  
Eibe Frank ◽  
Bernhard Pfahringer ◽  
Michael J. Cree

Abstract: We investigate the effect of explicitly enforcing the Lipschitz continuity of neural networks with respect to their inputs. To this end, we provide a simple technique for computing an upper bound to the Lipschitz constant (for multiple p-norms) of a feed-forward neural network composed of commonly used layer types. Our technique is then used to formulate training a neural network with a bounded Lipschitz constant as a constrained optimisation problem that can be solved using projected stochastic gradient methods. Our evaluation study shows that the performance of the resulting models exceeds that of models trained with other common regularisers. We also provide evidence that the hyperparameters are intuitive to tune, demonstrate how the choice of norm for computing the Lipschitz constant impacts the resulting model, and show that the performance gains provided by our method are particularly noticeable when only a small amount of training data is available.
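The idea of bounding the Lipschitz constant and projecting the weights back into the feasible set can be illustrated with a minimal sketch. It assumes a network made only of dense layers with 1-Lipschitz activations, so that the product of the layers' operator norms is an upper bound, and it handles only p = 1 and p = infinity. The function names, the rescaling rule and the bound `lam` are illustrative assumptions, not details taken from the abstract.

```python
import math

def operator_norm(weights, p):
    """Induced matrix norm of a dense layer's weight matrix (list of rows).

    p=1 is the maximum absolute column sum; p=math.inf is the maximum
    absolute row sum. Other norms are left out of this sketch.
    """
    if p == 1:
        return max(sum(abs(row[j]) for row in weights)
                   for j in range(len(weights[0])))
    if p == math.inf:
        return max(sum(abs(v) for v in row) for row in weights)
    raise ValueError("this sketch only handles p=1 and p=inf")

def lipschitz_upper_bound(layers, p):
    """Product of per-layer operator norms (assumes 1-Lipschitz activations)."""
    bound = 1.0
    for weights in layers:
        bound *= operator_norm(weights, p)
    return bound

def project(weights, p, lam):
    """Rescale a weight matrix so that its operator norm is at most lam."""
    scale = max(1.0, operator_norm(weights, p) / lam)
    return [[v / scale for v in row] for row in weights]

# Hypothetical two-layer network: a 2x2 layer followed by a 1x2 layer.
layers = [[[2.0, -1.0], [0.5, 3.0]], [[1.0, 1.0]]]
projected = [project(w, math.inf, lam=1.0) for w in layers]
print(lipschitz_upper_bound(layers, math.inf))
print(lipschitz_upper_bound(projected, math.inf))
```

In a projected stochastic gradient loop, a step like `project` would be applied to each layer after every parameter update, so the bound never exceeds the product of the per-layer limits.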


Author(s):  
Prof. Ahlam Ansari ◽  
Ashhar Shaikh ◽  
Faraz Shaikh ◽  
Faisal Sayed

Artificial neural networks, usually just called neural networks, are computing systems loosely inspired by biological neural networks, and they are widely used in both research and industry. It is critical to design quantum neural networks for fully quantum learning tasks. In this project, we propose a computational neural network model based on the principles of quantum mechanics, which forms a quantum feed-forward neural network capable of universal quantum computation. This structure takes input from one layer of qubits and passes that input on to another layer of qubits. This layer of qubits evaluates the information and passes the output on to the next layer. Eventually, the path leads to the final layer of qubits. The layers do not have to be of the same width, meaning a layer need not have the same number of qubits as the layer before or after it. This assembly is trained on which path to take, similarly to a classical ANN. The project can be summarized by the following points: 1. Training of the quantum neural network using the fidelity as a cost function, providing both conventional and efficient quantum implementations. 2. Use of methods that enable fast optimization with reduced memory requirements. 3. Benchmarking our proposal on the quantum task of learning an unknown unitary, where we find strong generalization and remarkable robustness to noisy training data.
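As a rough illustration of using the fidelity as a cost function, the sketch below averages the fidelity between target states and network outputs over a set of training pairs. It assumes normalized pure states stored as lists of complex amplitudes. The function names and the two example states are hypothetical, and the network in the abstract may work with mixed output states, which this sketch does not cover.

```python
import math

def fidelity(target, output):
    """Fidelity |<target|output>|^2 between two normalized pure states."""
    overlap = sum(t.conjugate() * o for t, o in zip(target, output))
    return abs(overlap) ** 2

def mean_fidelity(pairs):
    """Average fidelity over (target, output) pairs; 1.0 is a perfect fit."""
    return sum(fidelity(t, o) for t, o in pairs) / len(pairs)

# Hypothetical single-qubit states |0> and |+>.
zero = [1 + 0j, 0j]
plus = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]

# One perfectly reproduced target and one partially reproduced target.
pairs = [(zero, zero), (zero, plus)]
print(mean_fidelity(pairs))
```

Training would then aim to maximize this average, which reaches 1 only when every output matches its target up to a global phase.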



1992 ◽  
Vol 26 (9-11) ◽  
pp. 2461-2464 ◽  
Author(s):  
R. D. Tyagi ◽  
Y. G. Du

A steady-state mathematical model of an activated sludge process with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for operational prediction and determination.



2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images when only a limited quantity of input data is available. The use of a limited set of learning data was made possible by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article compares detection results from the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) could be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The best results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision) at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, the number of input samples has a significantly smaller influence on its results than in the case of other CNN models, which, in the authors' assessment, is a desired feature in the case of a limited training set.



2002 ◽  
Vol 12 (01) ◽  
pp. 31-43 ◽  
Author(s):  
GARY YEN ◽  
HAIMING LU

In this paper, we propose a genetic algorithm based design procedure for a multi-layer feed-forward neural network. A hierarchical genetic algorithm is used to evolve both the neural network's topology and weighting parameters. Compared with traditional genetic algorithm based designs for neural networks, the hierarchical approach addresses several deficiencies, including a feasibility check highlighted in the literature. A multi-objective cost function is used herein to optimize the performance and topology of the evolved neural network simultaneously. In the prediction of the Mackey–Glass chaotic time series, the networks designed by the proposed approach prove to be competitive with, or even superior to, traditional learning algorithms for multi-layer perceptron networks and radial-basis function networks. Based upon the chosen cost function, a linear weight combination decision-making approach has been applied to derive an approximated Pareto-optimal solution set. Therefore, designing a set of neural networks can be considered as solving a two-objective optimization problem.
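A linear weight combination over two objectives can be sketched as follows. The candidate networks, their (error, size) scores and the weight grid are made-up values for illustration, and the helper name is hypothetical; the abstract does not give the actual cost function or weights.

```python
def weighted_sum_front(candidates, weights):
    """Collect the candidates that minimize w*error + (1-w)*size for some w."""
    chosen = []
    for w in weights:
        best = min(candidates, key=lambda c: w * c[0] + (1 - w) * c[1])
        if best not in chosen:
            chosen.append(best)
    return chosen

# Hypothetical (prediction error, normalized network size) pairs.
candidates = [(0.10, 0.90), (0.20, 0.40), (0.50, 0.10), (0.60, 0.60)]

# Sweep the trade-off weight from "size only" (0.0) to "error only" (1.0).
weights = [i / 10 for i in range(11)]
print(weighted_sum_front(candidates, weights))
```

Sweeping the weight recovers only the trade-off points lying on the convex part of the Pareto front, which is one reason a set obtained this way is an approximation of the full Pareto-optimal set.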



2020 ◽  
Vol 49 (4) ◽  
pp. 482-494
Author(s):  
Jurgita Kapočiūtė-Dzikienė ◽  
Senait Gebremichael Tesfagergish

Deep Neural Networks (DNNs) have proven to be especially successful in the area of Natural Language Processing (NLP) and Part-Of-Speech (POS) tagging, the process of mapping words to their corresponding POS labels depending on the context. Despite the recent development of language technologies, low-resourced languages (such as the East African Tigrinya language) have received too little attention. We investigate the effectiveness of Deep Learning (DL) solutions for the low-resourced Tigrinya language of the Northern-Ethiopic branch. We have selected Tigrinya as the testbed example and have tested state-of-the-art DL approaches, seeking to build the most accurate POS tagger. We have evaluated DNN classifiers, namely the Feed Forward Neural Network (FFNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Convolutional Neural Network (CNN), on top of neural word2vec word embeddings with a small training corpus known as the Nagaoka Tigrinya Corpus. To determine the best DNN classifier type, its architecture, and its hyper-parameter set, both manual and automatic hyper-parameter tuning have been performed. The BiLSTM method proved to be the most suitable for our task: it achieved the highest accuracy, equal to 92%, which is 65% above the random baseline.



SINERGI ◽  
2020 ◽  
Vol 24 (1) ◽  
pp. 29
Author(s):  
Widi Aribowo

Load shedding plays a key part in avoiding power system outages. Frequency and voltage instability leads to the splitting of a power system into sub-systems, and to outages as well as severe breakdown of the system utility. In recent years, neural networks have been very successful in several signal processing and control applications. Recurrent neural networks are capable of handling complex and non-linear problems. This paper provides an algorithm for load shedding using Elman Recurrent Neural Networks (RNN). Elman proposed a partially recurrent neural network, in which the feedforward connections are modifiable and the recurrent connections are fixed. The research is implemented in MATLAB and the performance is tested with a 6-bus system. The results are compared with the Genetic Algorithm (GA), the combination of a Genetic Algorithm with a Feed Forward Neural Network (hybrid), and RNN. The proposed method is capable of determining the required amount of load shedding and is more efficient than the other methods.
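The partially recurrent structure can be sketched as a forward pass in which context units receive a fixed one-to-one copy of the previous hidden state, while all weighted connections remain trainable. This is a generic illustration of an Elman-style network with hypothetical weights and a tanh activation chosen as an assumption; it is not the load-shedding model from the paper.

```python
import math

def elman_forward(inputs, w_in, w_ctx, w_out):
    """Run a sequence through a tiny Elman network and return its outputs."""
    hidden_size = len(w_in)
    context = [0.0] * hidden_size          # context units start at zero
    outputs = []
    for x in inputs:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[h], x))
                + sum(w * c for w, c in zip(w_ctx[h], context))
            )
            for h in range(hidden_size)
        ]
        outputs.append([sum(w * hv for w, hv in zip(row, hidden))
                        for row in w_out])
        context = list(hidden)             # fixed copy-back, never trained
    return outputs

# Hypothetical weights: 1 input, 2 hidden units, 1 output.
w_in = [[0.5], [-0.3]]
w_ctx = [[0.1, 0.0], [0.0, 0.2]]
w_out = [[1.0, 1.0]]
print(elman_forward([[1.0], [1.0]], w_in, w_ctx, w_out))
```

Because the copy-back step has no parameters, training only has to adjust `w_in`, `w_ctx` and `w_out`, which is what lets such a network be trained much like a feedforward one.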



2018 ◽  
Vol 26 (3) ◽  
pp. 349-368 ◽  
Author(s):  
Alemdar Hasanov

Abstract: This paper studies the Lipschitz continuity of the Fréchet gradient of the Tikhonov functional $J(k):=(1/2)\lVert u(0,\cdot\,;k)-f\rVert^{2}_{L^{2}(0,T)}$ corresponding to an inverse coefficient problem for the 1D parabolic equation $u_{t}=(k(x)u_{x})_{x}$ with the Neumann boundary conditions $-k(0)u_{x}(0,t)=g(t)$ and $u_{x}(l,t)=0$. In addition, compactness and Lipschitz continuity of the input-output operator $\Phi[k]:=u(x,t;k)\rvert_{x=0^{+}}$, $\Phi[\,\cdot\,]:\mathcal{K}\subset H^{1}(0,l)\mapsto H^{1}(0,T)$, as well as solvability of the regularized inverse problem and the Lipschitz continuity of the Fréchet gradient of the Tikhonov functional are proved. Furthermore, relationships between the sufficient conditions for the Lipschitz continuity of the Fréchet gradient and the regularity of the weak solution of the direct problem as well as the measured output $f(t):=u(0,t;k)$ are established. One of the derived lemmas also introduces a useful application of the Lipschitz continuity of the Fréchet gradient. This lemma shows that an important advantage of gradient methods comes when dealing with the functionals of class $C^{1,1}(\mathcal{K})$. Specifically, this lemma asserts that if $J\in C^{1,1}(\mathcal{K})$ and $\{k^{(n)}\}\subset\mathcal{K}$ is the sequence of iterations obtained by the Landweber iteration algorithm $k^{(n+1)}=k^{(n)}+\omega_{n}J^{\prime}(k^{(n)})$, then for $\omega_{n}\in(0,2/L_{g})$, where $L_{g}>0$ is the Lipschitz constant, the sequence $\{J(k^{(n)})\}$ is monotonically decreasing and $\lim_{n\to\infty}\lVert J^{\prime}(k^{(n)})\rVert=0$.
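The monotone decrease for step sizes in $(0,2/L_{g})$ can be checked numerically on a toy problem. The sketch below uses a simple finite-dimensional quadratic functional in place of the Tikhonov functional, and it writes the update with the conventional descent sign (the iterate minus the step size times the gradient), which is an assumption about the sign convention behind the abstract's formula. All names and numbers are illustrative.

```python
def landweber(grad, k0, omega, steps):
    """Iterate k <- k - omega * grad(k) and return every iterate."""
    iterates = [list(k0)]
    for _ in range(steps):
        k = iterates[-1]
        g = grad(k)
        iterates.append([ki - omega * gi for ki, gi in zip(k, g)])
    return iterates

# Toy functional J(k) = 0.5 * sum(a_i * (k_i - t_i)^2). Its gradient is
# Lipschitz continuous with constant L_g = max(a_i).
a = [4.0, 1.0]
t = [1.0, -2.0]

def J(k):
    return 0.5 * sum(ai * (ki - ti) ** 2 for ai, ki, ti in zip(a, k, t))

def grad(k):
    return [ai * (ki - ti) for ai, ki, ti in zip(a, k, t)]

L_g = max(a)
omega = 1.5 / L_g                      # a fixed step inside (0, 2 / L_g)
values = [J(k) for k in landweber(grad, [0.0, 0.0], omega, 30)]
print(values[0], values[-1])
```

Choosing `omega` above `2 / L_g` in this toy problem makes the first coordinate overshoot further on every step, so the values of `J` grow instead of shrinking.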



2021 ◽  
Vol 4 (1) ◽  
pp. 71-79
Author(s):  
Borys Igorovych Tymchenko

Nowadays, means of preventive management in various spheres of human life are actively developing. The task of automated screening is to detect hidden problems at an early stage without human intervention, while the cost of responding to them is still low. Visual inspection is often used to perform a screening task. Deep artificial neural networks are especially popular in image processing. One of the main problems when working with them is the need for a large amount of well-labeled training data. In automated screening systems, available neural network approaches are limited in the reliability of their predictions by the lack of accurately labeled training data, as obtaining quality labels from professionals is very expensive and sometimes not possible in principle. Therefore, there is a tension between the increasing requirements for the precision of neural network predictions without increasing the time spent, on the one hand, and the need to reduce the cost of labeling training data, on the other. In this paper, we propose a parametric model of a segmentation dataset, which can be used to generate training data for model selection and benchmarking, and a multi-task learning method for training and inference of deep neural networks for semantic segmentation. Based on the proposed method, we develop a semi-supervised approach for segmentation of salient regions for the classification task. The main advantage of the proposed method is that it uses semantically similar general tasks that have better labeling than the original one, which allows users to reduce the cost of the labeling process. We propose to use classification as a more general task relative to semantic segmentation: whereas semantic segmentation aims to classify each pixel in the input image, classification aims to assign a single class to all of the pixels in the input image.
We evaluate our methods using the proposed dataset model, observing a Dice score improvement of seventeen percent. Additionally, we evaluate the robustness of the proposed method to different amounts of label noise and observe consistent improvement over the baseline version.
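For reference, the Dice score used as the evaluation metric can be computed for binary masks as in the sketch below. The masks are made-up examples, and returning 1.0 when both masks are empty is a convention assumed here rather than one stated in the abstract.

```python
def dice_score(pred, target):
    """Dice coefficient 2*|A & B| / (|A| + |B|) for two binary masks."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    # Two empty masks agree perfectly (assumed convention).
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical 2x3 predicted and ground-truth masks.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(dice_score(pred, target))
```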



Author(s):  
Tsung-Chih Lin ◽  
Yi-Ming Chang ◽  
Tun-Yuan Lee

This paper proposes a novel fuzzy modeling approach for identification of dynamic systems. A fuzzy model, the recurrent interval type-2 fuzzy neural network (RIT2FNN), is constructed by using a recurrent neural network in which the recurrent weights and the means and standard deviations of the membership functions are updated. The complete back propagation (BP) tuning equations for the antecedent and consequent parameters of the interval type-2 fuzzy neural networks (IT2FNNs) are developed to handle training data corrupted by noise or rule uncertainties in nonlinear system identification involving external disturbances. Using only the current inputs and the most recent outputs of the input layers, the system can be completely identified based on RIT2FNNs. In order to show that the IT2FNNs can handle measurement uncertainties, the training data are corrupted by white Gaussian noise with a signal-to-noise ratio (SNR) of 20 dB. Simulation results obtained for the identification of a nonlinear system show better performance than those obtained using recurrent type-1 fuzzy neural networks (RT1FNNs).
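Corrupting training data with white Gaussian noise at a prescribed SNR amounts to scaling the noise variance to the signal power. The sketch below shows one way to do this for a 20 dB SNR. The sine-wave signal, the seed and the function name are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def add_white_gaussian_noise(signal, snr_db, rng):
    """Return signal plus zero-mean Gaussian noise at the requested SNR."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]

rng = random.Random(0)                 # fixed seed for repeatability
clean = [math.sin(2 * math.pi * n / 50) for n in range(5000)]
noisy = add_white_gaussian_noise(clean, snr_db=20, rng=rng)

# Measure the SNR that was actually realized on this sample.
noise = [y - x for x, y in zip(clean, noisy)]
measured = 10 * math.log10(sum(x * x for x in clean)
                           / sum(e * e for e in noise))
print(round(measured, 1))
```

The measured value fluctuates slightly around the 20 dB target because the noise power is estimated from a finite sample.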



2022 ◽  
pp. 1559-1575
Author(s):  
Mário Pereira Véstias

Machine learning is the study of algorithms and models that allow computing systems to perform tasks based on pattern identification and inference. When it is difficult or infeasible to develop an algorithm for a particular task, machine learning algorithms can provide an output based on previous training data. A well-known machine learning model is deep learning. The most recent deep learning models are based on artificial neural networks (ANN). There exist several types of artificial neural networks, including the feedforward neural network, the Kohonen self-organizing neural network, the recurrent neural network, the convolutional neural network, and the modular neural network. This article focuses on convolutional neural networks, with a description of the model, the training and inference processes, and its applicability. It also gives an overview of the most widely used CNN models and what to expect from the next generation of CNN models.


