Deep Learning Theory and Software

In the past decade, deep learning has achieved significant breakthroughs. Beyond the introduction of convolution, the most important advance is the self-learning capability of deep neural networks: adaptive kernel weights and built-in parameters (interconnections) are modified automatically so that the error rate decreases over the course of learning and the recognition rate improves. By emulating mechanisms of the brain, such a network can attain accurate recognition ability after learning. One of the most important self-learning methods is back-propagation (BP). The modern BP method is a systematic way of computing the gradient of the loss with respect to the adaptive interconnections. The core of the gradient descent method is to modify the weights in proportion to the negative of the gradient of the loss function, thereby reducing the error of the network's response relative to the target answer. The basic assumption of this type of gradient-based self-learning is that the loss function is first-order differentiable.
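The weight-update rule described above can be sketched on a toy one-parameter model; the learning rate and loss here are illustrative assumptions, not anything from a specific network.

```python
# Minimal sketch of the gradient-descent update: the weight is moved
# negatively proportional to the gradient of the loss at each step.
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step the weight against the gradient of the loss."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)  # update proportional to the negative gradient
    return w

# Loss L(w) = (w - 3)^2 has gradient dL/dw = 2*(w - 3); its minimum is at w = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

Each step multiplies the distance to the minimum by a constant factor below one, so the iterate converges geometrically toward w = 3.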

2020 ◽  
Vol 44 (2) ◽  
pp. 282-289
Author(s):  
I.M. Kulikovskikh

Previous research in deep learning indicates that iterations of gradient descent over separable data converge toward the L2 maximum-margin solution. Even in the absence of explicit regularization, the decision boundary keeps changing even after the classification error on the training set reaches zero. This feature of so-called "implicit regularization" allows gradient methods to use more aggressive learning rates, resulting in substantial computational savings. However, even though gradient descent generalizes well as it approaches the optimal solution, its rate of convergence to that solution is much slower than the convergence rate of the loss function itself with a fixed step size. The present study puts forward a generalized logistic loss function whose hyperparameters are optimized, yielding a faster convergence rate while keeping the same regret bound as the gradient descent method. Computational experiments on the MNIST and Fashion-MNIST image classification benchmarks demonstrated the viability of the proposed approach for reducing computational costs and outlined directions for future research.
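The two phenomena the abstract relies on can be seen on a toy 1-D separable dataset (an illustrative assumption, not the paper's setup): gradient descent on the standard logistic loss drives the training error to zero early, yet the weight keeps growing afterwards, which is the implicit drift toward the max-margin direction.

```python
import math

# Separable 1-D data: negative points on the left, positive on the right.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-1, -1, 1, 1]

def grad(w):
    # d/dw of mean_i log(1 + exp(-y_i * w * x_i))
    g = 0.0
    for x, y in zip(xs, ys):
        g += -y * x / (1.0 + math.exp(y * w * x))
    return g / len(xs)

w = 0.0
snapshots = []
for step in range(2000):
    w -= 0.5 * grad(w)
    if step in (99, 1999):
        snapshots.append(w)  # record the weight early and late in training

# Training error is zero as soon as w > 0, yet w keeps increasing after that.
errors = sum(1 for x, y in zip(xs, ys) if y * w * x <= 0)
```

Here `snapshots[1] > snapshots[0]` shows the weight (hence the margin scale) still moving long after the error hit zero, which is exactly why the loss converges much faster than the iterate does.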


The Electronic Voting Machine (EVM) is an interconnected network of discrete components that record and count voters' votes. The EVM system consists of four main subsystems, namely the computer motherboard, the voting keys, the database storage system, and the power supply (AC and DC), each with various operating conditions as well as failure modes. Deficiencies or failures of the system arise from its hardware components, software, and human mismanagement. It is essential to reduce the complexity of the interconnected components and increase system reliability. Reliability analysis helps identify technical situations that may affect the system and predict its future life. The aim of this research paper is to analyze the reliability parameters of an EVM system using a computational-intelligence approach, the neural network (NN). Probabilistic equations of the system states and other reliability parameters are established for the proposed EVM model using the neural network approach, which is useful for predicting various reliability parameters and improves their accuracy and consistency. To guarantee system reliability, a Back-Propagation Neural Network (BPNN) architecture is used to learn weight updates that produce optimal parameter values. Numerical examples are considered to validate the results for reliability, unreliability, and the profit function. To minimize the error and optimize the reliability output with the gradient descent method, the authors iterate in MATLAB code until an error precision of 0.0001 is reached. These parameters are of immense help in real-time applications of Electronic Voting Machines during elections.
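The tolerance-based stopping rule described above (iterate until the error falls below 0.0001) can be sketched as follows, assuming a toy scalar model rather than the authors' full BPNN reliability model:

```python
# Gradient descent with a precision-based stopping criterion: iterate
# repeatedly until the error drops below the tolerance of 0.0001.
def train_until_tolerance(grad, loss, w0, lr=0.2, tol=1e-4, max_iter=10000):
    w = w0
    for i in range(max_iter):
        if loss(w) < tol:  # stop once the required precision is reached
            return w, i
        w -= lr * grad(w)
    return w, max_iter

# Toy error surface: loss L(w) = (w - 1)^2, gradient dL/dw = 2*(w - 1).
loss = lambda w: (w - 1.0) ** 2
grad = lambda w: 2.0 * (w - 1.0)
w_fit, n_iter = train_until_tolerance(grad, loss, w0=0.0)
```

The loop terminates early, long before the iteration cap, once the error satisfies the 0.0001 precision.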


2015 ◽  
Vol 7 (2) ◽  
pp. 89-103 ◽  
Author(s):  
Jian Wang ◽  
Guoling Yang ◽  
Shan Liu ◽  
Jacek M. Zurada

Abstract The gradient descent method is one of the most popular methods for training feedforward neural networks. Batch and incremental modes are the two most common ways to implement gradient-based training for such networks in practice. Furthermore, since generalization is an important property and quality criterion of a trained network, pruning algorithms with added regularization terms have been widely used as an efficient way to achieve good generalization. In this paper, we review the convergence properties and other performance aspects of recently studied training approaches based on different penalty terms. In addition, we present the smoothing approximation tricks used when the penalty term is non-differentiable at the origin.
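One common smoothing trick of the kind mentioned above is to replace the absolute-value penalty |w|, which is non-differentiable at the origin, with sqrt(w^2 + eps^2); this is a sketch of the general idea, not the specific approximation used in any one of the reviewed papers.

```python
import math

# Smoothed L1 penalty: sqrt(w^2 + eps^2) approximates |w| away from zero
# and, unlike |w|, has a well-defined derivative at w = 0.
def smooth_abs(w, eps=1e-3):
    return math.sqrt(w * w + eps * eps)

def smooth_abs_grad(w, eps=1e-3):
    return w / math.sqrt(w * w + eps * eps)

approx_err = abs(smooth_abs(0.5) - 0.5)  # tiny: close to |0.5| away from zero
grad_at_zero = smooth_abs_grad(0.0)      # equals 0; d|w|/dw is undefined here
```

As eps shrinks, the approximation tightens everywhere while keeping the gradient defined at the origin, which is what makes standard gradient-based training applicable to the penalized objective.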


2018 ◽  
Author(s):  
Kazunori D Yamada

ABSTRACT In the deep learning era, stochastic gradient descent is the most common method for optimizing neural network parameters. Among the various mathematical optimization methods, plain gradient descent is the most naive: the learning rate must normally be adjusted manually to obtain quick convergence. Many optimizers have been developed to control the learning rate and increase convergence speed. Generally, these optimizers adjust the learning rate automatically in response to the learning status, and they have been gradually improved by incorporating the effective aspects of earlier methods. In this study, we developed a new optimizer, YamAdam. Our optimizer is based on Adam, which utilizes the first and second moments of previous gradients. In addition to the moment-estimation system, we incorporated an advantageous part of AdaDelta, namely its unit-correction system, into YamAdam. In benchmark tests on several common datasets, our optimizer showed convergence similar to or faster than existing methods. YamAdam is an alternative optimizer option for deep learning.
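A minimal scalar sketch of Adam, the optimizer YamAdam builds on: it keeps exponential moving averages of the gradient (first moment) and of the squared gradient (second moment) and scales each step by their bias-corrected ratio. The hyperparameter values below are illustrative defaults, and this is not YamAdam itself.

```python
import math

def adam(grad, w0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=2000):
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g       # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g   # second-moment (uncentered variance) estimate
        m_hat = m / (1 - b1 ** t)       # bias correction for the warm-up phase
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Quadratic test loss L(w) = (w - 3)^2 with gradient 2*(w - 3).
w_opt = adam(lambda w: 2 * (w - 3), w0=0.0)
```

Because the step is normalized by the second-moment estimate, the effective learning rate adapts per step without manual tuning, which is the behavior the abstract contrasts with naive gradient descent.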


2021 ◽  
Author(s):  
Fangyuan Yan ◽  
Juanli Li ◽  
Dong Miao ◽  
Qi Cao

Abstract A reliable braking system is an important guarantee of the safe operation of a mine hoist. To make full use of the monitoring data collected during hoist operation, identify the operating status of the hoist, and further carry out fault diagnosis on it, deep learning was introduced into hoist fault diagnosis, and a fault diagnosis method for the hoist braking system based on a convolutional neural network (CNN) is proposed. First, the working principles and fault mechanisms of the disc brake and its hydraulic station in the hoist braking system are analyzed, and the monitoring parameters of this study are determined. Then, based on massive monitoring data, a CNN is established: the one-dimensional signals collected by the sensors are encoded as two-dimensional images, the neural network is trained by the gradient descent method, and the network structure parameters are modified according to the training results. Finally, the fault diagnosis model is compared and verified on the sample set against a traditional back-propagation neural network (BP) and the CNN. The results show that the accuracy of the CNN is higher than that of BP, reaching 99.375% after reducing the overlap between samples. This method makes full use of the monitoring data for diagnosis without subjective expert intervention and improves diagnostic accuracy.
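The one-dimensional-to-two-dimensional encoding step described above can be sketched as a plain row-major reshape of the sensor signal into a square grid; the paper's exact coding scheme may differ, so this is only an illustration of the mechanics.

```python
# Reshape a 1-D sensor signal into a 2-D "image" that a CNN can consume.
def signal_to_image(signal, side):
    if len(signal) != side * side:
        raise ValueError("signal length must equal side * side")
    # Row-major layout: consecutive samples fill each image row in turn.
    return [signal[r * side:(r + 1) * side] for r in range(side)]

image = signal_to_image(list(range(16)), side=4)  # 16 samples -> 4x4 grid
```

In practice the signal would be windowed and normalized before reshaping; the point is simply that a fixed-length slice of the monitoring stream becomes a 2-D array suitable for convolutional layers.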


2012 ◽  
Vol 09 ◽  
pp. 432-439 ◽  
Author(s):  
MUHAMMAD ZUBAIR REHMAN ◽  
NAZRI MOHD. NAWI

Despite being widely used for practical problems around the world, the gradient descent back-propagation algorithm suffers from slow convergence and convergence to local minima. Previous researchers have suggested modifications to improve convergence in the gradient descent back-propagation algorithm, such as careful selection of initial weights and biases, learning rate, momentum, network topology, activation function, and the 'gain' value in the activation function. This research proposes an algorithm for improving the performance of back-propagation, 'Gradient Descent with Adaptive Momentum (GDAM)', which keeps the gain value fixed during all network trials. The performance of GDAM is compared with 'Gradient Descent with Fixed Momentum (GDM)' and 'Gradient Descent Method with Adaptive Gain (GDM-AG)'. The learning rate is fixed at 0.4, the maximum number of epochs is set to 3000, and the sigmoid activation function is used in the experiments. The results show that GDAM is a better approach than the previous methods, with an accuracy ratio of 1.0 on classification problems such as Wine Quality, Mushroom, and Thyroid disease.
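Both GDM and GDAM belong to the momentum family of gradient descent sketched below. A fixed coefficient mu gives the GDM behavior; GDAM instead adapts the momentum during training (its exact adaptation rule is not reproduced here). The learning rate of 0.4 matches the experimental setting above; the toy loss is an assumption.

```python
# Gradient descent with a momentum term: the velocity accumulates past
# gradients, damping oscillations and speeding up convergence.
def gd_momentum(grad, w0, lr=0.4, mu=0.9, steps=200):
    w, velocity = w0, 0.0
    for _ in range(steps):
        velocity = mu * velocity - lr * grad(w)  # blend old velocity with new gradient
        w += velocity
    return w

# Quadratic test loss L(w) = (w - 1)^2 with gradient 2*(w - 1).
w_m = gd_momentum(lambda w: 2 * (w - 1), w0=0.0)
```

With mu = 0 this reduces to plain gradient descent; a well-chosen (or adaptively chosen) mu is what distinguishes the momentum variants compared in the abstract.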


2021 ◽  
Vol 3 (1) ◽  
pp. 243-262
Author(s):  
Antoine Pirovano ◽  
Hippolyte Heuberger ◽  
Sylvain Berlemont ◽  
Saïd Ladjal ◽  
Isabelle Bloch

Deep learning methods are widely used in medical applications to assist doctors in their daily routine. While their performance reaches expert level, interpretability (highlighting how and what a trained model learned, and why it makes a specific decision) is the next important challenge that deep learning must answer to be fully integrated into the medical field. In this paper, we address interpretability in the context of whole slide image (WSI) classification by formalizing the design of WSI classification architectures and proposing a piece-wise interpretability approach relying on gradient-based methods, feature visualization, and the multiple-instance-learning context. After training two WSI classification architectures on the Camelyon-16 WSI dataset, highlighting the discriminative features learned, and validating our approach with pathologists, we propose a novel way of computing interpretability slide-level heat-maps, based on the extracted features, that improves tile-level classification performance. We measure the improvement using the tile-level AUC, which we call the localization AUC, and show an improvement of more than 0.2. We also validate our results with a RemOve And Retrain (ROAR) measure. Then, after studying the impact of the number of features used for heat-map computation, we propose a corrective approach, relying on the activation colocalization of selected features, that improves the performance and stability of our proposed method.


Electronics ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 154
Author(s):  
Yuxin Ding ◽  
Miaomiao Shao ◽  
Cai Nie ◽  
Kunyang Fu

Deep learning methods have been applied to malware detection. However, deep learning algorithms are not safe: they can easily be fooled by adversarial samples. In this paper, we study how to generate malware adversarial samples against deep learning models. Gradient-based methods are usually used to generate adversarial samples, but they generate samples case by case, which makes producing a large number of adversarial samples very time-consuming. To address this issue, we propose a novel method for generating adversarial malware samples. Unlike gradient-based methods, we extract feature byte sequences from benign samples. Feature byte sequences represent the characteristics of benign samples and can affect the classification decision. We directly inject feature byte sequences into malware samples to generate adversarial samples. The feature byte sequences can be shared to produce different adversarial samples, which makes generating a large number of adversarial samples efficient. We compare the proposed method with random injection and gradient-based methods. The experimental results show that adversarial samples generated using our proposed method have a high success rate.
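The injection idea described above can be sketched minimally, assuming the feature byte sequences are simply appended to the binary; a real attack must keep the executable format valid, so this only illustrates why sharing the sequences across samples is cheap.

```python
# Inject benign-looking feature byte sequences into a malware binary to
# produce an adversarial sample; the same sequences can be reused for
# every sample, avoiding per-sample gradient computation.
def inject_features(malware_bytes, feature_sequences):
    adversarial = bytearray(malware_bytes)
    for seq in feature_sequences:
        adversarial.extend(seq)  # append each shared benign feature sequence
    return bytes(adversarial)

# Hypothetical byte sequences extracted from benign samples (illustrative only).
benign_features = [b"\x90\x90\x4d\x5a", b"\x55\x8b\xec"]
sample = inject_features(b"\x4d\x5a\x00\x01", benign_features)
```

Because no gradient is needed per sample, generating thousands of adversarial samples is just repeated byte concatenation, which is the efficiency argument the abstract makes against gradient-based generation.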


2013 ◽  
Vol 442 ◽  
pp. 298-303
Author(s):  
He Xi Li ◽  
Tie Niu Yang ◽  
Xiao Xi Zheng

A robust method based on dual back-propagation neural networks (BPNNs) is proposed to recognize light-emitting diode (LED) dies on wafer planes in a vision-based LED sorting system. To improve the recognition accuracy of LED dies, particularly under die-image degradation, a dual BPNN with a three-layer architecture is first constructed, in which the two BPNNs are trained respectively on the region features and edge features of die images using the batch gradient descent method; the outputs of the dual BPNN are then fused according to Dempster-Shafer (D-S) theory to generate a final decision. Applying the trained dual BPNN as a classifier, a recognition experiment on a large-diameter LED wafer was completed. Experimental results show that the die-recognition accuracy of the dual BPNN is higher than that of a single BPNN based on the region feature or edge feature alone, and that it is robust to die-image degradation, so it can be used for LED die-array location in vision-based LED sorting machines.
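The D-S fusion step can be sketched for the two hypotheses {die, background}, assuming each BPNN's output has already been normalized into a mass function on the singleton hypotheses (the paper may also assign mass to the full frame; this is the simplest case).

```python
# Dempster's rule of combination for two mass functions over the same
# singleton hypotheses: multiply agreeing masses and renormalize by the
# non-conflicting mass.
def ds_combine(m1, m2):
    hypotheses = m1.keys()
    conflict = sum(m1[a] * m2[b] for a in hypotheses for b in hypotheses if a != b)
    return {h: (m1[h] * m2[h]) / (1.0 - conflict) for h in hypotheses}

# Hypothetical outputs: one BPNN trained on region features, one on edges.
region_net = {"die": 0.7, "background": 0.3}
edge_net = {"die": 0.6, "background": 0.4}
fused = ds_combine(region_net, edge_net)
```

With these masses the conflict is 0.7*0.4 + 0.3*0.6 = 0.46, so the fused belief in "die" is 0.42/0.54, sharper than either network alone, which is the benefit the abstract attributes to the fusion.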


2021 ◽  
Author(s):  
Zhang Jian ◽  
Wanjuan Song

Abstract Image dehazing has always been a challenging topic in image processing. The development of deep learning methods, especially Generative Adversarial Networks (GANs), provides a new way to dehaze images, and in recent years many GAN-based deep learning methods have been applied to image dehazing. However, GANs have two problems in this setting. First, haze not only reduces image quality but also blurs image detail, and it is difficult for the generator to restore the details of the whole image while removing the haze. Second, the GAN model is defined as a minimax problem, which weakens the loss function and makes it difficult to tell whether the GAN is making progress during training. Therefore, we propose a Guided Generative Adversarial Dehazing Network (GGADN). Unlike other generative adversarial networks, GGADN adds a guided module to the generator; the guided module verifies each layer of the generator while strengthening the details of the map generated by each layer. Network training is based on a pre-trained VGG feature model and an L1-regularized gradient prior introduced through new loss-function parameters. On dehazing results for both synthetic and real images, the proposed method outperforms state-of-the-art dehazing methods.
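An L1 gradient prior of the kind mentioned above penalizes the sum of absolute differences between neighboring pixels; this is a simplified stand-in for the paper's exact regularizer, shown here on tiny hand-made grids.

```python
# Sum of absolute horizontal and vertical pixel differences: small for
# smooth images, large for noisy or high-frequency ones, so adding it to
# the loss discourages spurious detail in the generated dehazed image.
def l1_gradient_prior(image):
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                total += abs(image[r][c + 1] - image[r][c])  # horizontal gradient
            if r + 1 < rows:
                total += abs(image[r + 1][c] - image[r][c])  # vertical gradient
    return total

flat = [[0.5, 0.5], [0.5, 0.5]]    # constant image: zero penalty
noisy = [[0.0, 1.0], [1.0, 0.0]]   # checkerboard: maximal penalty
```

The prior's value is zero on the constant image and 4.0 on the checkerboard, illustrating how the term steers the generator toward piecewise-smooth outputs.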

