Deep Learning Theory and Software

In the past decade, deep learning has achieved significant breakthroughs. Beyond the introduction of convolution, the most important advance is the self-learning capability of deep neural networks: adaptive kernel weights and built-in parameters (interconnections) are modified automatically so that the error rate decreases over the course of learning and the recognition rate improves. By emulating mechanisms of the brain, such a network can attain accurate recognition ability after learning. One of the most important self-learning methods is back-propagation (BP). The modern BP method is a systematic way of computing the gradient of the loss with respect to the adaptive interconnections. The core of the gradient descent method is to modify the weights in proportion to the negative of the gradient of the loss function, thereby reducing the error of the network's response relative to the target answer. The basic assumption of this type of gradient-based self-learning is that the loss function is first-order differentiable.
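The weight-update rule described above can be sketched on a toy one-parameter model; the learning rate and loss here are illustrative assumptions, not anything from a specific network.

```python
# Minimal sketch of the gradient-descent update: the weight is moved
# negatively proportional to the gradient of the loss at each step.
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step the weight against the gradient of the loss."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)  # update proportional to the negative gradient
    return w

# Loss L(w) = (w - 3)^2 has gradient dL/dw = 2*(w - 3); its minimum is at w = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

Each step multiplies the distance to the minimum by a constant factor below one, so the iterate converges geometrically toward w = 3.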

2020 ◽  
Vol 44 (2) ◽  
pp. 282-289
Author(s):  
I.M. Kulikovskikh

Previous research in deep learning indicates that iterations of gradient descent over separable data converge toward the L2 maximum-margin solution. Even in the absence of explicit regularization, the decision boundary keeps changing even after the classification error on the training set reaches zero. This feature of so-called "implicit regularization" allows gradient methods to use more aggressive learning rates, resulting in substantial computational savings. However, even though gradient descent generalizes well as it approaches the optimal solution, its rate of convergence to that solution is much slower than the convergence rate of the loss function itself with a fixed step size. The present study puts forward a generalized logistic loss function whose hyperparameters are optimized, yielding a faster convergence rate while keeping the same regret bound as the gradient descent method. Computational experiments on the MNIST and Fashion-MNIST image classification benchmarks demonstrated the viability of the proposed approach for reducing computational costs and outlined directions for future research.
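The two phenomena the abstract relies on can be seen on a toy 1-D separable dataset (an illustrative assumption, not the paper's setup): gradient descent on the standard logistic loss drives the training error to zero early, yet the weight keeps growing afterwards, which is the implicit drift toward the max-margin direction.

```python
import math

# Separable 1-D data: negative points on the left, positive on the right.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-1, -1, 1, 1]

def grad(w):
    # d/dw of mean_i log(1 + exp(-y_i * w * x_i))
    g = 0.0
    for x, y in zip(xs, ys):
        g += -y * x / (1.0 + math.exp(y * w * x))
    return g / len(xs)

w = 0.0
snapshots = []
for step in range(2000):
    w -= 0.5 * grad(w)
    if step in (99, 1999):
        snapshots.append(w)  # record the weight early and late in training

# Training error is zero as soon as w > 0, yet w keeps increasing after that.
errors = sum(1 for x, y in zip(xs, ys) if y * w * x <= 0)
```

Here `snapshots[1] > snapshots[0]` shows the weight (hence the margin scale) still moving long after the error hit zero, which is exactly why the loss converges much faster than the iterate does.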


The Electronic Voting Machine (EVM) is an interconnected network of discrete components that record and count voters' votes. The EVM system consists of four main subsystems, namely the computer motherboard, the voting keys, the database storage system, and the power supply (AC and DC), each with various operating conditions as well as failure modes. Deficiencies or failures of the system arise from its hardware components, software, and human mismanagement. It is essential to reduce the complexity of the interconnected components and increase system reliability. Reliability analysis helps identify technical situations that may affect the system and predict its future life. The aim of this research paper is to analyze the reliability parameters of an EVM system using a computational-intelligence approach, the neural network (NN). Probabilistic equations of the system states and other reliability parameters are established for the proposed EVM model using the neural network approach, which is useful for predicting various reliability parameters and improves their accuracy and consistency. To guarantee system reliability, a Back-Propagation Neural Network (BPNN) architecture is used to learn weight updates that produce optimal parameter values. Numerical examples are considered to validate the results for reliability, unreliability, and the profit function. To minimize the error and optimize the reliability output with the gradient descent method, the authors iterate in MATLAB code until an error precision of 0.0001 is reached. These parameters are of immense help in real-time applications of Electronic Voting Machines during elections.
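The tolerance-based stopping rule described above (iterate until the error falls below 0.0001) can be sketched as follows, assuming a toy scalar model rather than the authors' full BPNN reliability model:

```python
# Gradient descent with a precision-based stopping criterion: iterate
# repeatedly until the error drops below the tolerance of 0.0001.
def train_until_tolerance(grad, loss, w0, lr=0.2, tol=1e-4, max_iter=10000):
    w = w0
    for i in range(max_iter):
        if loss(w) < tol:  # stop once the required precision is reached
            return w, i
        w -= lr * grad(w)
    return w, max_iter

# Toy error surface: loss L(w) = (w - 1)^2, gradient dL/dw = 2*(w - 1).
loss = lambda w: (w - 1.0) ** 2
grad = lambda w: 2.0 * (w - 1.0)
w_fit, n_iter = train_until_tolerance(grad, loss, w0=0.0)
```

The loop terminates early, long before the iteration cap, once the error satisfies the 0.0001 precision.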


2015 ◽  
Vol 7 (2) ◽  
pp. 89-103 ◽  
Author(s):  
Jian Wang ◽  
Guoling Yang ◽  
Shan Liu ◽  
Jacek M. Zurada

Abstract The gradient descent method is one of the most popular methods for training feedforward neural networks. Batch and incremental modes are the two most common ways to implement gradient-based training for such networks in practice. Furthermore, since generalization is an important property and quality criterion of a trained network, pruning algorithms with added regularization terms have been widely used as an efficient way to achieve good generalization. In this paper, we review the convergence properties and other performance aspects of recently studied training approaches based on different penalty terms. In addition, we present the smoothing approximation tricks used when the penalty term is non-differentiable at the origin.
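One common smoothing trick of the kind mentioned above is to replace the absolute-value penalty |w|, which is non-differentiable at the origin, with sqrt(w^2 + eps^2); this is a sketch of the general idea, not the specific approximation used in any one of the reviewed papers.

```python
import math

# Smoothed L1 penalty: sqrt(w^2 + eps^2) approximates |w| away from zero
# and, unlike |w|, has a well-defined derivative at w = 0.
def smooth_abs(w, eps=1e-3):
    return math.sqrt(w * w + eps * eps)

def smooth_abs_grad(w, eps=1e-3):
    return w / math.sqrt(w * w + eps * eps)

approx_err = abs(smooth_abs(0.5) - 0.5)  # tiny: close to |0.5| away from zero
grad_at_zero = smooth_abs_grad(0.0)      # equals 0; d|w|/dw is undefined here
```

As eps shrinks, the approximation tightens everywhere while keeping the gradient defined at the origin, which is what makes standard gradient-based training applicable to the penalized objective.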


2018 ◽  
Author(s):  
Kazunori D Yamada

ABSTRACT In the deep learning era, stochastic gradient descent is the most common method for optimizing neural network parameters. Among the various mathematical optimization methods, plain gradient descent is the most naive: the learning rate must normally be adjusted manually to obtain quick convergence. Many optimizers have been developed to control the learning rate and increase convergence speed. Generally, these optimizers adjust the learning rate automatically in response to the learning status, and they have been gradually improved by incorporating the effective aspects of earlier methods. In this study, we developed a new optimizer, YamAdam. Our optimizer is based on Adam, which utilizes the first and second moments of previous gradients. In addition to the moment-estimation system, we incorporated an advantageous part of AdaDelta, namely its unit-correction system, into YamAdam. In benchmark tests on several common datasets, our optimizer showed convergence similar to or faster than existing methods. YamAdam is an alternative optimizer option for deep learning.
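A minimal scalar sketch of Adam, the optimizer YamAdam builds on: it keeps exponential moving averages of the gradient (first moment) and of the squared gradient (second moment) and scales each step by their bias-corrected ratio. The hyperparameter values below are illustrative defaults, and this is not YamAdam itself.

```python
import math

def adam(grad, w0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=2000):
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g       # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g   # second-moment (uncentered variance) estimate
        m_hat = m / (1 - b1 ** t)       # bias correction for the warm-up phase
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Quadratic test loss L(w) = (w - 3)^2 with gradient 2*(w - 3).
w_opt = adam(lambda w: 2 * (w - 3), w0=0.0)
```

Because the step is normalized by the second-moment estimate, the effective learning rate adapts per step without manual tuning, which is the behavior the abstract contrasts with naive gradient descent.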


2021 ◽  
Author(s):  
Fangyuan Yan ◽  
Juanli Li ◽  
Dong Miao ◽  
Qi Cao

Abstract A reliable braking system is an important guarantee of the safe operation of a mine hoist. To make full use of the monitoring data collected during hoist operation, identify the operating status of the hoist, and further carry out fault diagnosis on it, deep learning was introduced into hoist fault diagnosis, and a fault diagnosis method for the hoist braking system based on a convolutional neural network (CNN) is proposed. First, the working principles and fault mechanisms of the disc brake and its hydraulic station in the hoist braking system are analyzed, and the monitoring parameters of this study are determined. Then, based on massive monitoring data, a CNN is established: the one-dimensional signals collected by the sensors are encoded as two-dimensional images, the neural network is trained by the gradient descent method, and the network structure parameters are modified according to the training results. Finally, the fault diagnosis model is compared and verified on the sample set against a traditional back-propagation neural network (BP) and the CNN. The results show that the accuracy of the CNN is higher than that of BP, reaching 99.375% after reducing the overlap between samples. This method makes full use of the monitoring data for diagnosis without subjective expert intervention and improves diagnostic accuracy.
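The one-dimensional-to-two-dimensional encoding step described above can be sketched as a plain row-major reshape of the sensor signal into a square grid; the paper's exact coding scheme may differ, so this is only an illustration of the mechanics.

```python
# Reshape a 1-D sensor signal into a 2-D "image" that a CNN can consume.
def signal_to_image(signal, side):
    if len(signal) != side * side:
        raise ValueError("signal length must equal side * side")
    # Row-major layout: consecutive samples fill each image row in turn.
    return [signal[r * side:(r + 1) * side] for r in range(side)]

image = signal_to_image(list(range(16)), side=4)  # 16 samples -> 4x4 grid
```

In practice the signal would be windowed and normalized before reshaping; the point is simply that a fixed-length slice of the monitoring stream becomes a 2-D array suitable for convolutional layers.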


2012 ◽  
Vol 09 ◽  
pp. 432-439 ◽  
Author(s):  
MUHAMMAD ZUBAIR REHMAN ◽  
NAZRI MOHD. NAWI

Despite being widely used for practical problems around the world, the gradient descent back-propagation algorithm suffers from slow convergence and convergence to local minima. Previous researchers have suggested modifications to improve convergence in the gradient descent back-propagation algorithm, such as careful selection of initial weights and biases, learning rate, momentum, network topology, activation function, and the 'gain' value in the activation function. This research proposes an algorithm for improving the performance of back-propagation, 'Gradient Descent with Adaptive Momentum (GDAM)', which keeps the gain value fixed during all network trials. The performance of GDAM is compared with 'Gradient Descent with Fixed Momentum (GDM)' and 'Gradient Descent Method with Adaptive Gain (GDM-AG)'. The learning rate is fixed at 0.4, the maximum number of epochs is set to 3000, and the sigmoid activation function is used in the experiments. The results show that GDAM is a better approach than the previous methods, with an accuracy ratio of 1.0 on classification problems such as Wine Quality, Mushroom, and Thyroid disease.
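Both GDM and GDAM belong to the momentum family of gradient descent sketched below. A fixed coefficient mu gives the GDM behavior; GDAM instead adapts the momentum during training (its exact adaptation rule is not reproduced here). The learning rate of 0.4 matches the experimental setting above; the toy loss is an assumption.

```python
# Gradient descent with a momentum term: the velocity accumulates past
# gradients, damping oscillations and speeding up convergence.
def gd_momentum(grad, w0, lr=0.4, mu=0.9, steps=200):
    w, velocity = w0, 0.0
    for _ in range(steps):
        velocity = mu * velocity - lr * grad(w)  # blend old velocity with new gradient
        w += velocity
    return w

# Quadratic test loss L(w) = (w - 1)^2 with gradient 2*(w - 1).
w_m = gd_momentum(lambda w: 2 * (w - 1), w0=0.0)
```

With mu = 0 this reduces to plain gradient descent; a well-chosen (or adaptively chosen) mu is what distinguishes the momentum variants compared in the abstract.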


2021 ◽  
Vol 3 (1) ◽  
pp. 243-262
Author(s):  
Antoine Pirovano ◽  
Hippolyte Heuberger ◽  
Sylvain Berlemont ◽  
Saïd Ladjal ◽  
Isabelle Bloch

Deep learning methods are widely used in medical applications to assist doctors in their daily routine. While their performance reaches expert level, interpretability (highlighting how and what a trained model learned, and why it makes a specific decision) is the next important challenge that deep learning must answer to be fully integrated into the medical field. In this paper, we address interpretability in the context of whole slide image (WSI) classification by formalizing the design of WSI classification architectures and proposing a piece-wise interpretability approach relying on gradient-based methods, feature visualization, and the multiple-instance-learning context. After training two WSI classification architectures on the Camelyon-16 WSI dataset, highlighting the discriminative features learned, and validating our approach with pathologists, we propose a novel way of computing interpretability slide-level heat-maps, based on the extracted features, that improves tile-level classification performance. We measure the improvement using the tile-level AUC, which we call the localization AUC, and show an improvement of more than 0.2. We also validate our results with a RemOve And Retrain (ROAR) measure. Then, after studying the impact of the number of features used for heat-map computation, we propose a corrective approach, relying on the activation colocalization of selected features, that improves the performance and stability of our proposed method.


Electronics ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 154
Author(s):  
Yuxin Ding ◽  
Miaomiao Shao ◽  
Cai Nie ◽  
Kunyang Fu

Deep learning methods have been applied to malware detection. However, deep learning algorithms are not safe: they can easily be fooled by adversarial samples. In this paper, we study how to generate malware adversarial samples against deep learning models. Gradient-based methods are usually used to generate adversarial samples, but they generate samples case by case, which makes producing a large number of adversarial samples very time-consuming. To address this issue, we propose a novel method for generating adversarial malware samples. Unlike gradient-based methods, we extract feature byte sequences from benign samples. Feature byte sequences represent the characteristics of benign samples and can affect the classification decision. We directly inject feature byte sequences into malware samples to generate adversarial samples. The feature byte sequences can be shared to produce different adversarial samples, which makes generating a large number of adversarial samples efficient. We compare the proposed method with random injection and gradient-based methods. The experimental results show that adversarial samples generated using our proposed method have a high success rate.
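The injection idea described above can be sketched minimally, assuming the feature byte sequences are simply appended to the binary; a real attack must keep the executable format valid, so this only illustrates why sharing the sequences across samples is cheap.

```python
# Inject benign-looking feature byte sequences into a malware binary to
# produce an adversarial sample; the same sequences can be reused for
# every sample, avoiding per-sample gradient computation.
def inject_features(malware_bytes, feature_sequences):
    adversarial = bytearray(malware_bytes)
    for seq in feature_sequences:
        adversarial.extend(seq)  # append each shared benign feature sequence
    return bytes(adversarial)

# Hypothetical byte sequences extracted from benign samples (illustrative only).
benign_features = [b"\x90\x90\x4d\x5a", b"\x55\x8b\xec"]
sample = inject_features(b"\x4d\x5a\x00\x01", benign_features)
```

Because no gradient is needed per sample, generating thousands of adversarial samples is just repeated byte concatenation, which is the efficiency argument the abstract makes against gradient-based generation.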


2013 ◽  
Vol 442 ◽  
pp. 298-303
Author(s):  
He Xi Li ◽  
Tie Niu Yang ◽  
Xiao Xi Zheng

A robust method based on dual back-propagation neural networks (BPNNs) is proposed to recognize light-emitting diode (LED) dies on wafer planes in a vision-based LED sorting system. To improve the recognition accuracy of LED dies, particularly under die-image degradation, a dual BPNN with a three-layer architecture is first constructed, in which the two BPNNs are trained respectively on the region features and edge features of die images using the batch gradient descent method; the outputs of the dual BPNN are then fused according to Dempster-Shafer (D-S) theory to generate a final decision. Applying the trained dual BPNN as a classifier, a recognition experiment on a large-diameter LED wafer was completed. Experimental results show that the die-recognition accuracy of the dual BPNN is higher than that of a single BPNN based on the region feature or edge feature alone, and that it is robust to die-image degradation, so it can be used for LED die-array location in vision-based LED sorting machines.
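The D-S fusion step can be sketched for the two hypotheses {die, background}, assuming each BPNN's output has already been normalized into a mass function on the singleton hypotheses (the paper may also assign mass to the full frame; this is the simplest case).

```python
# Dempster's rule of combination for two mass functions over the same
# singleton hypotheses: multiply agreeing masses and renormalize by the
# non-conflicting mass.
def ds_combine(m1, m2):
    hypotheses = m1.keys()
    conflict = sum(m1[a] * m2[b] for a in hypotheses for b in hypotheses if a != b)
    return {h: (m1[h] * m2[h]) / (1.0 - conflict) for h in hypotheses}

# Hypothetical outputs: one BPNN trained on region features, one on edges.
region_net = {"die": 0.7, "background": 0.3}
edge_net = {"die": 0.6, "background": 0.4}
fused = ds_combine(region_net, edge_net)
```

With these masses the conflict is 0.7*0.4 + 0.3*0.6 = 0.46, so the fused belief in "die" is 0.42/0.54, sharper than either network alone, which is the benefit the abstract attributes to the fusion.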


2021 ◽  
Author(s):  
Zhang Jian ◽  
Wanjuan Song

Abstract Image dehazing has always been a challenging topic in image processing. The development of deep learning methods, especially Generative Adversarial Networks (GANs), provides a new way to dehaze images, and in recent years many GAN-based deep learning methods have been applied to image dehazing. However, GANs have two problems in this setting. First, haze not only reduces image quality but also blurs image detail, and it is difficult for the generator to restore the details of the whole image while removing the haze. Second, the GAN model is defined as a minimax problem, which weakens the loss function and makes it difficult to tell whether the GAN is making progress during training. Therefore, we propose a Guided Generative Adversarial Dehazing Network (GGADN). Unlike other generative adversarial networks, GGADN adds a guided module to the generator; the guided module verifies each layer of the generator while strengthening the details of the map generated by each layer. Network training is based on a pre-trained VGG feature model and an L1-regularized gradient prior introduced through new loss-function parameters. On dehazing results for both synthetic and real images, the proposed method outperforms state-of-the-art dehazing methods.
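An L1 gradient prior of the kind mentioned above penalizes the sum of absolute differences between neighboring pixels; this is a simplified stand-in for the paper's exact regularizer, shown here on tiny hand-made grids.

```python
# Sum of absolute horizontal and vertical pixel differences: small for
# smooth images, large for noisy or high-frequency ones, so adding it to
# the loss discourages spurious detail in the generated dehazed image.
def l1_gradient_prior(image):
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                total += abs(image[r][c + 1] - image[r][c])  # horizontal gradient
            if r + 1 < rows:
                total += abs(image[r + 1][c] - image[r][c])  # vertical gradient
    return total

flat = [[0.5, 0.5], [0.5, 0.5]]    # constant image: zero penalty
noisy = [[0.0, 1.0], [1.0, 0.0]]   # checkerboard: maximal penalty
```

The prior's value is zero on the constant image and 4.0 on the checkerboard, illustrating how the term steers the generator toward piecewise-smooth outputs.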

