Enabling Training of Neural Networks on Noisy Hardware

2021 ◽  
Vol 4 ◽  
Author(s):  
Tayfun Gokmen

Deep neural networks (DNNs) are typically trained using the conventional stochastic gradient descent (SGD) algorithm. However, SGD performs poorly when applied to train networks on non-ideal analog hardware composed of resistive device arrays with non-symmetric conductance modulation characteristics. Recently we proposed a new algorithm, the Tiki-Taka algorithm, that overcomes this stringent symmetry requirement. Here we build on top of Tiki-Taka and describe a more robust algorithm that further relaxes other stringent hardware requirements. This more robust second version of the Tiki-Taka algorithm (referred to as TTv2) (1) decreases the required number of device conductance states from 1000s of states to only 10s of states, (2) increases the noise tolerance to the device conductance modulations by about 100x, and (3) increases the noise tolerance to the matrix-vector multiplication performed by the analog arrays by about 10x. Empirical simulation results show that TTv2 can train various neural networks close to their ideal accuracy even at extremely noisy hardware settings. TTv2 achieves these capabilities by complementing the original Tiki-Taka algorithm with lightweight, low-complexity digital filtering operations performed outside the analog arrays. Therefore, the implementation cost of TTv2 compared to SGD and Tiki-Taka is minimal, and it maintains the usual power and speed benefits of using analog hardware for training workloads. Here we also show how to extract the neural network from the analog hardware once the training is complete for further model deployment. Similar to Bayesian model averaging, we form analog-hardware-compatible averages over the neural network weights derived from TTv2 iterates. This model average can then be transferred to other analog or digital hardware with notable improvements in test accuracy, transcending the trained model itself.
In short, we describe an end-to-end training and model extraction technique for extremely noisy crossbar-based analog hardware that can be used to accelerate DNN training workloads and match the performance of full-precision SGD.
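
The weight-averaging step mentioned in the abstract can be pictured with a minimal sketch. It assumes a plain arithmetic running mean over weight iterates; the analog-hardware-compatible form used in the paper is not reproduced here, and the function name `update_running_average` and the sample iterates are hypothetical.

```python
def update_running_average(average, weights, count):
    """Fold one more weight iterate into the mean of `count` earlier iterates."""
    return [a + (w - a) / (count + 1) for a, w in zip(average, weights)]


# Hypothetical weight iterates sampled at successive points in training.
iterates = [[0.9, -0.2], [1.1, 0.0], [1.0, -0.1]]

average = [0.0] * len(iterates[0])
for count, weights in enumerate(iterates):
    average = update_running_average(average, weights, count)
```

The incremental form avoids storing every iterate, which is one reason a running mean is a common choice when memory is limited.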

2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

EEG analysis aims to help scientists better understand the brain, help physicians with diagnosis and treatment choices, and support the brain-computer interface. Artificial neural networks are among the most effective learning algorithms for computing tasks similar to those performed by biological neurons in the human brain. In some problems, the neural network model may overfit and its performance may degrade significantly because of irrelevant features. Swarm optimization algorithms are robust techniques that can be used to find optimal solutions to such problems. In this paper, the Grey Wolf Optimizer (GWO) and Particle Swarm Optimization (PSO) algorithms are applied to feature selection and to the training of a Feed-forward Neural Network (FFNN). The performance of the FFNN is investigated in terms of test accuracy, precision, recall, and F1 score. Furthermore, five other machine learning algorithms are implemented for comparison. Experimental results show that the neural network model trained via GWO outperforms all other algorithms.


Author(s):  
A.P. Karpenko ◽  
V.A. Ovchinnikov

The study aims to develop an algorithm, and then software, to synthesise noise that could be used to attack deep learning neural networks designed to classify images. We present the results of our analysis of methods for conducting this type of attack. The synthesis of attack noise is stated as a problem of multidimensional constrained optimisation. The main features of the proposed attack noise synthesis algorithm are as follows: we employ the clip function to take constraints on noise into account; we use the top-1 and top-5 classification error ratings as attack noise efficiency criteria; we train our neural networks using backpropagation and the Adam gradient descent algorithm; stochastic gradient descent is employed to solve the optimisation problem indicated above; neural network training also makes use of the augmentation technique. The software was developed in Python using the PyTorch framework to dynamically differentiate the calculation graph and runs under Ubuntu 18.04 and CentOS 7. Our IDE was Visual Studio Code. We accelerated the computation via CUDA executed on an NVIDIA Titan XP GPU. The paper presents the results of a broad computational experiment in synthesising non-universal and universal attack noise types for eight deep neural networks. We show that the proposed attack algorithm is able to increase the neural network error by eight times.
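
The role of the clip function in the constrained optimisation can be illustrated with a minimal sketch. It assumes a toy linear classifier and plain gradient steps instead of the deep networks and stochastic gradient descent used in the study; the names `synthesise_noise`, `eps`, and `step`, and all numeric values, are hypothetical.

```python
def clip(value, low, high):
    """Clamp value into the interval [low, high]."""
    return max(low, min(high, value))


def score(weights, x):
    """Toy linear classifier: a larger score favours the correct class."""
    return sum(w * xi for w, xi in zip(weights, x))


def synthesise_noise(weights, x, eps=0.1, step=0.02, iters=20):
    """Lower the correct-class score while keeping each noise value in [-eps, eps]."""
    noise = [0.0] * len(x)
    for _ in range(iters):
        # For a linear score, the gradient with respect to the input is `weights`.
        noise = [clip(n - step * w, -eps, eps) for n, w in zip(noise, weights)]
    return noise


weights = [1.0, -2.0, 0.5]
x = [0.2, -0.1, 0.3]
noise = synthesise_noise(weights, x)
attacked = [xi + n for xi, n in zip(x, noise)]
```

Clipping after every step keeps the noise inside the allowed bound at all times, so the constraint never has to be handled by a separate penalty term.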


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV ◽  

The problem of applying neural networks to calculate ratings used in banking in the decision-making process on granting or not granting loans to borrowers is considered. The task is to determine the rating function of the borrower based on a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, it is necessary to know its general form. If so, the task is to calculate the parameters that are included in the expression for the rating function. In contrast to this approach, in the case of using neural networks, there is no need to specify the general form of the rating function. Instead, a certain neural network architecture is chosen and its parameters are calculated on the basis of statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters. There is also no universal algorithm that would determine the optimal neural network architecture. As an example of the use of neural networks to determine the borrower's rating, a model system is considered in which the borrower's rating is determined by a known non-analytical rating function. A neural network with two inner layers, which contain three and two neurons respectively and have a sigmoid activation function, is used for modeling. It is shown that the neural network makes it possible to restore the borrower's rating function with quite acceptable accuracy.
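
The architecture described above (two inner layers with three and two sigmoid neurons) can be sketched as a forward pass. This is an illustration under stated assumptions, not the author's implementation: the number of input features (four), the single sigmoid output neuron, the random untrained weights, and the names `init_layer`, `rating`, and `parameter_count` are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Random weights and biases for one fully connected layer (untrained)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def rating(features, layers):
    """Forward pass: every layer applies a weighted sum followed by a sigmoid."""
    h = features
    for weights, biases in layers:
        h = [sigmoid(sum(w * x for w, x in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return h[0]


def parameter_count(layers):
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


rng = random.Random(0)
n_features = 4  # assumption: the source does not state the number of inputs
layers = [init_layer(n_features, 3, rng),  # first inner layer, three neurons
          init_layer(3, 2, rng),           # second inner layer, two neurons
          init_layer(2, 1, rng)]           # assumed single output neuron
r = rating([0.5, 0.1, 0.9, 0.3], layers)
n_params = parameter_count(layers)
```

Even this small network has 26 parameters under the assumed input size, which illustrates the point about the number of parameters growing quickly with the architecture.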


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Idris Kharroubi ◽  
Thomas Lim ◽  
Xavier Warin

Abstract We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the grid mesh converges to zero. We then focus on the approximation of the discretely constrained BSDE. For that we adopt a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks under constraints on the neural network and its derivative. We then derive an algorithm converging to the discretely constrained BSDE as the number of neurons goes to infinity. We end with numerical experiments.


Author(s):  
Saša Vasiljević ◽  
Jasna Glišović ◽  
Nadica Stojanović ◽  
Ivan Grujić

According to the World Health Organization, air pollution with PM10 and PM2.5 (PM: particulate matter) is a significant problem that can have serious consequences for human health. Vehicles, as one of the main sources of PM10 and PM2.5 emissions, pollute the air and the environment both by creating particles when burning fuel in the engine and through the wear of various elements in some vehicle systems. In this paper, the authors predicted the formation of PM10 and PM2.5 particles generated by the wear of the braking system using an artificial neural network (ANN). The neural network model was created based on experimentally measured particle data, while its validity was checked by comparing the experimentally measured amount of particles with the prediction results. The experimental results were obtained by testing on an inertial braking dynamometer, where braking was performed in several modes, that is, under different braking parameters (simulated vehicle speed, brake system pressure, temperature, braking time, braking torque). During braking, the concentrations of PM10 and PM2.5 particles were measured simultaneously. A total of 196 measurements were performed, and these data were used for training, validation, and verification of the neural network. In the simulation, two types of neural networks were compared: one with a single output and one with two outputs. For each type, network training was conducted using three different backpropagation algorithms. For each neural network, the experimental and simulation results were compared. More accurate predictions were obtained with the single-output neural network for both particulate sizes, while the smallest error was found for the network trained with the Levenberg-Marquardt backpropagation algorithm.
The aim of creating such a prediction model is to show that neural networks can predict the emission of particles generated by brake wear, which can be further used in modern traffic systems such as traffic control. In addition, this wear algorithm could be applied to other vehicle systems, such as a clutch or tires.


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1526 ◽  
Author(s):  
Choongmin Kim ◽  
Jacob A. Abraham ◽  
Woochul Kang ◽  
Jaeyong Chung

Crossbar-based neuromorphic computing to accelerate neural networks is a popular alternative to conventional von Neumann computing systems. It is also referred to as processing-in-memory and in-situ analog computing. The crossbars have a fixed number of synapses per neuron, and it is necessary to decompose neurons to map networks onto the crossbars. This paper proposes the k-spare decomposition algorithm, which can trade off predictive performance against neuron usage during the mapping. The proposed algorithm performs a two-level hierarchical decomposition. In the first, global decomposition, it decomposes the neural network such that each crossbar has k spare neurons. These neurons are used to improve the accuracy of the partially mapped network in the subsequent local decomposition. Our experimental results using modern convolutional neural networks show that the proposed method can improve the accuracy substantially with about 10% extra neurons.
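
Why neurons must be decomposed when a crossbar limits the synapses per neuron can be shown with a minimal sketch. It only splits one neuron's weighted sum into partial sums of bounded fan-in and does not reproduce the k-spare algorithm; the names `decompose_neuron` and `partial_sums`, the fan-in of 2, and the choice to add the partial sums exactly are assumptions for illustration.

```python
def decompose_neuron(weights, fan_in):
    """Split one neuron's weights into chunks of at most `fan_in` synapses."""
    return [weights[i:i + fan_in] for i in range(0, len(weights), fan_in)]


def partial_sums(chunks, inputs):
    """Weighted sum computed by each sub-neuron over its own slice of the inputs."""
    sums, start = [], 0
    for chunk in chunks:
        segment = inputs[start:start + len(chunk)]
        sums.append(sum(w * x for w, x in zip(chunk, segment)))
        start += len(chunk)
    return sums


weights = [1, -2, 3, 4, -1]   # hypothetical neuron with five synapses
inputs = [2, 1, 0, 1, 3]
chunks = decompose_neuron(weights, fan_in=2)
total = sum(partial_sums(chunks, inputs))
direct = sum(w * x for w, x in zip(weights, inputs))
```

Here a five-synapse neuron occupies three sub-neurons on a crossbar with a fan-in of two, which is the kind of extra neuron usage that a decomposition method has to budget for.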


1991 ◽  
Vol 45 (10) ◽  
pp. 1706-1716 ◽  
Author(s):  
Mark Glick ◽  
Gary M. Hieftje

Artificial neural networks were constructed for the classification of metal alloys based on their elemental constituents. Glow discharge-atomic emission spectra obtained with a photodiode array spectrometer were used in multivariate calibrations for 7 elements in 37 Ni-based alloys (different types) and 15 Fe-based alloys. Subsets of the two major classes formed calibration sets for stepwise multiple linear regression. The remaining samples were used to validate the calibration models. Reference data from the calibration sets were then pooled into a single set to train neural networks with different architectures and different training parameters. After the neural networks learned to discriminate correctly among alloy classes in the training set, their ability to classify samples in the testing set was measured. In general, the neural network approach performed slightly better than the K-nearest neighbor method, but it suffered from a hidden classification mechanism and nonunique solutions. The neural network methodology is discussed and compared with conventional sample-classification techniques, and multivariate calibration of glow discharge spectra is compared with conventional univariate calibration.


2016 ◽  
Vol 38 (2) ◽  
pp. 37-46 ◽  
Author(s):  
Mateusz Kaczmarek ◽  
Agnieszka Szymańska

Abstract Nonlinear structural mechanics should be taken into account in the practical design of reinforced concrete structures. Cracking is one of the major sources of nonlinearity. Describing the deflection of reinforced concrete elements is a computational problem, mainly because of the difficulties in modelling the nonlinear stress-strain relationship of concrete and steel. In design practice, in accordance with technical rules (e.g., Eurocode 2), a simplified approach for reinforced concrete is used, but the results of simplified calculations differ from the results of experimental studies. An artificial neural network is a versatile modelling tool capable of predicting values that are difficult to obtain in numerical analysis. This paper describes the creation and operation of a neural network for predicting deflections of reinforced concrete beams at different load levels. To obtain the database of results necessary for training and testing the neural network, the authors conducted research on the measurement of deflections in reinforced concrete beams in the Certified Research Laboratory of the Building Engineering Institute at Wrocław University of Science and Technology. The use of artificial neural networks is an innovation and an alternative to traditional methods of calculating the deflections of reinforced concrete elements. The results show the effectiveness of using an artificial neural network for predicting the deflection of reinforced concrete beams, compared with the results of calculations conducted in accordance with Eurocode 2. The neural network model presented in this paper can acquire new data and be used for further analysis as more research results become available.


2014 ◽  
Vol 38 (6) ◽  
pp. 1681-1693 ◽  
Author(s):  
Braz Calderano Filho ◽  
Helena Polivanov ◽  
César da Silva Chagas ◽  
Waldir de Carvalho Júnior ◽  
Emílio Velloso Barroso ◽  
...  

Soil information is needed for managing the agricultural environment. The aim of this study was to apply artificial neural networks (ANNs) for the prediction of soil classes using orbital remote sensing products, terrain attributes derived from a digital elevation model, and local geology information as data sources. This approach to digital soil mapping was evaluated in an area with a high degree of lithologic diversity in the Serra do Mar. The neural network simulator used in this study was JavaNNS, with the backpropagation learning algorithm. For soil class prediction, different combinations of the selected discriminant variables were tested: elevation, declivity, aspect, curvature, plan curvature, profile curvature, topographic index, solar radiation, LS topographic factor, local geology information, and clay mineral indices, iron oxides, and the normalized difference vegetation index (NDVI) derived from an image of a Landsat-7 Enhanced Thematic Mapper Plus (ETM+) sensor. With the tested sets, the best results were obtained when all discriminant variables were associated with geological information (overall accuracy 93.2 - 95.6 %, Kappa index 0.924 - 0.951, for set 13). Excluding the variable profile curvature (set 12), overall accuracy ranged from 93.9 to 95.4 % and the Kappa index from 0.932 to 0.948. The maps based on the neural network classifier were consistent with and similar to conventional soil maps drawn for the study area, although with more spatial detail. The results show the potential of ANNs for soil class prediction in mountainous areas with lithological diversity.
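
The two quality measures reported above, overall accuracy and the Kappa index, can both be computed from a confusion matrix. The sketch below uses the standard Cohen's kappa formula; the 2x2 matrix is made-up illustrative data, not taken from the study, and the function names are hypothetical.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa_index(cm):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(len(cm))
    ) / total ** 2
    return (observed - expected) / (1 - expected)


# Illustrative confusion matrix (rows: reference class, columns: predicted class).
cm = [[45, 5],
      [10, 40]]
accuracy = overall_accuracy(cm)
kappa = kappa_index(cm)
```

Kappa is lower than overall accuracy for the same matrix because it discounts the agreement that would occur by chance, which is why both values are usually reported together.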


Author(s):  
Daniel Roten ◽  
Kim B. Olsen

ABSTRACT We use deep learning to predict surface-to-borehole Fourier amplification functions (AFs) from discretized shear-wave velocity profiles. Specifically, we train a fully connected neural network and a convolutional neural network using mean AFs observed at ∼600 KiK-net vertical array sites. Compared with predictions based on theoretical SH 1D amplifications, the neural network (NN) results in up to 50% reduction of the mean squared log error between predictions and observations at sites not used for training. In the future, NNs may lead to a purely data-driven prediction of site response that is independent of proxies or simplifying assumptions.
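
The error measure named above can be sketched as follows. This assumes the mean squared log error is the mean of squared differences between natural logarithms of predicted and observed amplification values; the exact definition used in the study (log base, frequency weighting) is not given in the abstract, and the function name and sample values are hypothetical.

```python
import math


def mean_squared_log_error(predicted, observed):
    """Mean of squared differences between log-predicted and log-observed values."""
    return sum(
        (math.log(p) - math.log(o)) ** 2 for p, o in zip(predicted, observed)
    ) / len(predicted)


# Hypothetical amplification values at three frequencies for one site.
observed = [2.0, 4.0, 1.5]
baseline = [4.0, 2.0, 1.5]   # stand-in for a theoretical prediction
improved = [2.0, 3.0, 1.5]   # stand-in for a neural network prediction

baseline_error = mean_squared_log_error(baseline, observed)
improved_error = mean_squared_log_error(improved, observed)
reduction = 1.0 - improved_error / baseline_error
```

Working in log space treats over- and under-prediction by the same factor symmetrically, which suits amplification ratios that span a wide range.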

