An adiabatic method to train binarized artificial neural networks

Abstract: An artificial neural network consists of neurons and synapses. Each neuron produces an output from its input according to a non-linear activation function such as the Sigmoid, Hyperbolic Tangent (Tanh), or Rectified Linear Unit (ReLU) function. Synapses connect neuron outputs to neuron inputs with tunable real-valued weights. The most resource-demanding operations in realizing such neural networks are the multiply-accumulate (MAC) operations that compute the dot product between the real-valued neuron outputs and the synaptic weights. The efficiency of neural networks can be drastically enhanced if the neuron outputs and/or the weights can be trained to take only the binary values ±1, for which the MAC can be replaced by simple XNOR operations. In this paper, we demonstrate an adiabatic training method that can binarize fully-connected neural networks and convolutional neural networks without modifying the network structure or size. This adiabatic training method requires only minimal changes to the training algorithm, and is tested on the following four tasks: recognition of handwritten digits using a standard fully-connected network, cat-dog recognition and audio recognition using convolutional neural networks, and image recognition with 10 classes (CIFAR-10) using ResNet-20 and VGG-Small networks. In all tasks, the performance of the binary neural networks trained by the adiabatic method is almost identical to that of networks trained using conventional ReLU or Sigmoid activations with real-valued activations and weights. This adiabatic method can be easily applied to binarize different types of networks, and will considerably increase computational efficiency and greatly simplify the deployment of neural networks.
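The claim in the abstract above that a ±1 dot product reduces to XNOR operations can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names `pack_bits` and `xnor_dot` and the encoding (+1 as bit 1, -1 as bit 0) are assumptions chosen for illustration. The dot product equals the number of matching signs minus the number of mismatching signs, which is what XNOR followed by a bit count computes.

```python
def pack_bits(values):
    """Encode a sequence of +1/-1 values as an integer bit mask (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two length-n vectors of +1/-1 values, given their bit encodings."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: bit is 1 where the two signs match
    matches = bin(agree).count("1")     # population count of matching positions
    return 2 * matches - n              # matches minus mismatches


# Hypothetical example vectors: binarized activations and weights.
a = [1, -1, 1, 1, -1, -1, 1, -1]
w = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(x * y for x, y in zip(a, w))        # ordinary multiply-accumulate
fast = xnor_dot(pack_bits(a), pack_bits(w), len(a))  # XNOR + popcount
print(reference, fast)
```

Both values agree, so the multiplications in the MAC can be dropped once activations and weights are restricted to ±1.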

An Adiabatic Method to Train Binarized Artificial Neural Networks

10.21203/rs.3.rs-515926/v1 ◽

2021 ◽

Author(s):

Jiang Xiao ◽

Yuansheng Zhao

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Training Algorithms ◽

Training Method ◽

Dense Network ◽

Activation Functions ◽

Hyperbolic Tangent ◽

Different Types ◽

Artificial Neural ◽

Hand Writing

Abstract: An artificial neural network consists of neurons and synapses. Each neuron produces an output from its input according to a non-linear activation function such as the Sigmoid, Hyperbolic Tangent (Tanh), or Rectified Linear Unit (ReLU) function. Synapses connect neuron outputs to neuron inputs with tunable real-valued weights. The most resource-demanding operations in realizing such neural networks are the multiply-accumulate (MAC) operations that compute the dot product between the real-valued neuron outputs and the synaptic weights. The efficiency of neural networks can be drastically enhanced if the neuron outputs and/or the weights can be trained to take only the binary values ±1, for which the MAC can be replaced by simple XOR operations. In this paper, we demonstrate an adiabatic training method that can successfully binarize dense neural networks and convolutional neural networks without modifying the network structure and with only minimal changes to the training algorithm. This adiabatic training method is tested on the following four tasks: recognition of handwritten digits using a standard dense network, cat-dog recognition and audio recognition using convolutional neural networks, and image recognition with 10 classes (CIFAR-10) using ResNet20 and VGG-Small networks. In all tasks, the performance of the binary neural networks trained by the adiabatic method is almost identical to that of networks trained using conventional ReLU or Sigmoid activations with real-valued activations and weights. This adiabatic method can be easily applied to binarize different types of networks, and will considerably increase computational efficiency and greatly simplify the deployment of neural networks.

Klasifikasi Batik Riau dengan Menggunakan Convolutional Neural Networks (CNN) [Classification of Riau Batik Using Convolutional Neural Networks (CNN)]

Jurnal Ilmu Komputer ◽

10.33060/jik/2020/vol9.iss1.144 ◽

2020 ◽

Vol 9 (1) ◽

pp. 7-10

Author(s):

Hendry Fonda

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

18th Century ◽

The Public ◽

Artificial Neural ◽

The Difference ◽

Fully Connected

Abstract: Riau batik has been known since the 18th century, when it was worn by royalty. It is made with a stamp that is coated with dye and then pressed onto fabric, usually silk. Compared with Javanese batik, Riau batik has been accepted by the public very slowly. Convolutional neural networks (CNNs) combine artificial neural networks with deep learning methods. A CNN consists of one or more convolutional layers, often with a subsampling layer, followed by one or more fully connected layers as in a standard neural network. In this work, a CNN is trained and tested on Riau batik images to obtain a set of batik models classified by the distinguishing characteristics of Riau batik, from which it can be determined whether an image shows Riau batik or non-Riau batik. The CNN classifies images as Riau batik or non-Riau batik with an accuracy of 65%. This accuracy is attributed to the fact that Riau batik shares many motifs with other batik; the difference lies mainly in the color absorption of Riau batik. Keywords: Batik; Batik Riau; CNN; Image; Deep Learning

Performance Analysis of Artificial Neural Networks Training Algorithms and Activation Functions in Day-Ahead Base, Intermediate, and Peak Load Forecasting

Lecture Notes in Networks and Systems - Advances in Information and Communication ◽

10.1007/978-3-030-12385-7_23 ◽

2019 ◽

pp. 284-298

Author(s):

Lemuel Clark P. Velasco ◽

Noel R. Estoperez ◽

Renbert Jay R. Jayson ◽

Caezar Johnlery T. Sabijon ◽

Verlyn C. Sayles

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Performance Analysis ◽

Peak Load ◽

Load Forecasting ◽

Training Algorithms ◽

Activation Functions ◽

Artificial Neural

Squeak and rattle noise classification using radial basis function neural networks

Noise Control Engineering Journal ◽

10.3397/1/376824 ◽

2020 ◽

Vol 68 (4) ◽

pp. 283-293

Author(s):

Oleksandr Pogorilyi ◽

Mohammad Fard ◽

John Davy ◽

Mechanical and Automotive Engineering, School ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

High Accuracy ◽

Training Method ◽

Vehicle Interior ◽

Trained Classifier ◽

Different Types ◽

Noise Classification ◽

Automatic Tool ◽

Multi Class Classification

In this article, an artificial neural network is proposed to classify short audio sequences of squeak and rattle (S&R) noises. The aim of the classification is to see how accurately the trained classifier can recognize different types of S&R sounds. A high-accuracy model that can recognize audible S&R noises could help to build an automatic tool able to identify unpleasant vehicle interior sounds in a matter of seconds from a short audio recording. The article proposes a training method for the classifier, and the results show that the trained model can identify various classes of S&R noises in both simple (binary classification) and complex (multi-class classification) settings.

Trigonometric Inference Providing Learning in Deep Neural Networks

Applied Sciences ◽

10.3390/app11156704 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6704

Author(s):

Jingyong Cai ◽

Masashi Takemoto ◽

Yuming Qiu ◽

Hironori Nakajo

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Trigonometric Approximation ◽

Model Parameters ◽

Training Algorithms ◽

Activation Functions ◽

Classical Training ◽

Sum Formula

Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and insufficient in many different scenarios. Previous work has shown the benefit of computing activation functions, such as the sigmoid, with shift-and-add operations, although it fails to remove multiplications from training altogether. In this paper, we propose an innovative approach that can convert all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are transferred to multiplications of their sine values, which are replaceable with simpler operations with the help of the product-to-sum formula. In addition, a rectified sine activation function is utilized for further converting layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple shift-and-add operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that this method is able to obtain a performance close to that of classical training algorithms. The approach we propose sheds new light on future hardware customization research for machine learning.
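The product-to-sum step described in this abstract can be checked numerically with a short Python sketch. It only illustrates the identity sin(w) sin(e) = ½[cos(w − e) − cos(w + e)] together with the small-angle approximation w·e ≈ sin(w) sin(e); it is not the paper's hardware scheme. The function name `sine_product` and the sample values are assumptions, and `math.cos` stands in for whatever shift-and-add cosine evaluation the hardware would use.

```python
import math


def sine_product(w, e):
    """Approximate w * e as sin(w) * sin(e), evaluated with the product-to-sum formula.

    sin(w) * sin(e) = 0.5 * (cos(w - e) - cos(w + e)); the factor 0.5 is a
    one-bit right shift in fixed-point hardware, so no multiplier is needed.
    """
    return 0.5 * (math.cos(w - e) - math.cos(w + e))


# Hypothetical small weight and error signal, both close to zero.
w, e = 0.03, -0.02

exact = w * e
approx = sine_product(w, e)
print(exact, approx, abs(exact - approx))
```

The approximation error grows as the operands move away from zero, which is why the method relies on parameters and errors being clustered around zero.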

Efficiently inaccurate approximation of hyperbolic tangent used as transfer function in artificial neural networks

Neural Computing and Applications ◽

10.1007/s00521-021-05787-0 ◽

2021 ◽

Cited By ~ 1

Author(s):

T. E. Simos ◽

Ch. Tsitouras

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Transfer Function ◽

Hyperbolic Tangent ◽

Artificial Neural

Optimal Artificial Neural Network Type Selection Method for Usage in Smart House Systems

Sensors ◽

10.3390/s21010047 ◽

2020 ◽

Vol 21 (1) ◽

pp. 47

Author(s):

Vasyl Teslyuk ◽

Artem Kazarian ◽

Natalia Kryvinska ◽

Ivan Tsmots

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Network ◽

Artificial Neural Networks ◽

Input Data ◽

Optimization Criterion ◽

Fuzzy Input ◽

Different Types ◽

Smart House ◽

Artificial Neural

“Smart” house systems need to process fuzzy input data during operation, and models based on artificial neural networks are used to process such data from the sensors. However, each type of artificial neural network has its own advantages and processes different types of data, and generates control signals, with a different accuracy. To address this, a method for choosing the optimal type of artificial neural network is proposed. It is based on solving an optimization problem in which the optimization criterion is the error of a given type of artificial neural network when controlling the corresponding subsystem of a “smart” house. The same historical input data are used to train the different types of artificial neural networks. The research presents the dependencies between the type of neural network, the number of inner layers of the artificial neural network, the number of neurons in each inner layer, and the error of the calculated settings parameters relative to the expected results.

A novel structured sparse fully connected layer in convolutional neural networks

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.6213 ◽

2021 ◽

Author(s):

Naoki Matsumura ◽

Yasuaki Ito ◽

Koji Nakano ◽

Akihiko Kasagi ◽

Tsuguchika Tabaru

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Fully Connected

Structural Damage Identification of Composite Rotors Based on Fully Connected Neural Networks and Convolutional Neural Networks

Sensors ◽

10.3390/s21062005 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2005

Author(s):

Veronika Scholz ◽

Peter Winkler ◽

Andreas Hornig ◽

Maik Gude ◽

Angelos Filippatos

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Composite Structures ◽

Damage Identification ◽

Vibration Response ◽

Damage Initiation ◽

Matrix Cracks ◽

Operational Life ◽

Multiple Data Sets ◽

Fully Connected

Damage identification of composite structures is a major ongoing challenge for a secure operational life-cycle due to the complex, gradual damage behaviour of composite materials. Especially for composite rotors in aero-engines and wind-turbines, a cost-intensive maintenance service has to be performed in order to avoid critical failure. A major advantage of composite structures is that they are able to safely operate after damage initiation and under ongoing damage propagation. Therefore, a robust, efficient diagnostic damage identification method would allow monitoring the damage process with intervention occurring only when necessary. This study investigates the structural vibration response of composite rotors by applying machine learning methods and the ability to identify, localise and quantify the present damage. To this end, multiple fully connected neural networks and convolutional neural networks were trained on vibration response spectra from damaged composite rotors with barely visible damage, mostly matrix cracks and local delaminations using dimensionality reduction and data augmentation. A databank containing 720 simulated test cases with different damage states is used as a basis for the generation of multiple data sets. The trained models are tested using k-fold cross validation and they are evaluated based on the sensitivity, specificity and accuracy. Convolutional neural networks perform slightly better providing a performance accuracy of up to 99.3% for the damage localisation and quantification.
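The three evaluation measures named in this abstract (sensitivity, specificity, accuracy) follow from the counts of a binary confusion matrix. The Python sketch below uses the standard definitions; the function name `binary_metrics` and the label vectors are made up for illustration and are not data from the study.

```python
def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for 0/1 labels.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # damaged, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # damaged, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # intact, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # intact, flagged
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical labels: 1 = damaged, 0 = undamaged.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

sensitivity, specificity, accuracy = binary_metrics(y_true, y_pred)
print(sensitivity, specificity, accuracy)
```

Under k-fold cross validation, these values would be computed on each held-out fold and then averaged.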

Improved Effort and Cost Estimation Model Using Artificial Neural Networks and Taguchi Method with Different Activation Functions

Entropy ◽

10.3390/e23070854 ◽

2021 ◽

Vol 23 (7) ◽

pp. 854

Author(s):

Nevena Rankovic ◽

Dragica Rankovic ◽

Mirjana Ivanovic ◽

Ljubomir Lazic

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Cost Estimation ◽

Time Estimation ◽

Effort Estimation ◽

Activation Functions ◽

Estimation Model ◽

Wide Range ◽

Software Product ◽

Artificial Neural

Software estimation involves meeting a huge number of different requirements, such as resource allocation, cost estimation, effort estimation, time estimation, and the changing demands of software product customers. Numerous estimation models try to solve these problems. In our experiment, a clustering method of input values to mitigate the heterogeneous nature of selected projects was used. Additionally, homogeneity of the data was achieved with the fuzzification method, and we proposed two different activation functions inside a hidden layer, during the construction of artificial neural networks (ANNs). In this research, we present an experiment that uses two different architectures of ANNs, based on Taguchi’s orthogonal vector plans, to satisfy the set conditions, with additional methods and criteria for validation of the proposed model, in this approach. The aim of this paper is the comparative analysis of the obtained results of mean magnitude relative error (MMRE) values. At the same time, our goal is also to find a relatively simple architecture that minimizes the error value while covering a wide range of different software projects. For this purpose, six different datasets are divided into four chosen clusters. The obtained results show that the estimation of diverse projects by dividing them into clusters can contribute to an efficient, reliable, and accurate software product assessment. The contribution of this paper is in the discovered solution that enables the execution of a small number of iterations, which reduces the execution time and achieves the minimum error.
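The mean magnitude relative error (MMRE) compared in this abstract is commonly defined as the average of |actual − estimated| / actual over all projects. A minimal Python sketch under that standard definition follows; the function name `mmre` and the effort values are hypothetical and do not come from the paper's datasets.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error: mean of |actual - predicted| / actual.

    Assumes every actual value is non-zero and both lists have equal length.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Hypothetical actual and estimated efforts for three projects.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

score = mmre(actual, predicted)
print(score)
```

A lower MMRE means the estimates are closer to the actual values on average, which is the sense in which the abstract speaks of minimizing the error.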
