A High-Precision Implementation of the Sigmoid Activation Function for Computing-in-Memory Architecture

Micromachines ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 1183
Author(s):  
Siqiu Xu ◽  
Xi Li ◽  
Chenchen Xie ◽  
Houpeng Chen ◽  
Cheng Chen ◽  
...  

Computing-In-Memory (CIM), based on non-von Neumann architecture, has lately received significant attention for its lower delay overhead and higher energy efficiency in convolutional and fully-connected neural network computing. A growing body of work has prioritized the memory array and peripheral circuits that realize the multiply-and-accumulate (MAC) operation, but comparatively little attention has been paid to high-precision hardware implementations of the non-linear layers, which still incur time overhead and power consumption. Sigmoid is a widely used non-linear activation function, and most existing implementations approximate its expression rather than matching it exactly, inevitably introducing considerable error. To address this issue, we propose a high-precision circuit implementation of the sigmoid that, for the first time, matches the expression exactly. Simulation results with the SMIC 40 nm process show that the proposed circuit faithfully reproduces the properties of the ideal sigmoid, with a maximum error of 2.74% and an average error of 0.21% between the simulated and ideal functions. In addition, a multi-layer convolutional neural network based on the CIM architecture and employing the simulated high-precision sigmoid achieves recognition accuracy on a handwritten-digit test set comparable to the ideal software sigmoid, reaching 97.06% with online training and 97.74% with offline training.
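For concreteness, error figures of this kind can be reproduced in a few lines of Python. This is a minimal sketch, not the authors' circuit-simulation flow; the `simulated` array is a placeholder standing in for the measured transfer curve of the circuit:

```python
import numpy as np

def sigmoid(x):
    """Ideal sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder "circuit" response: the ideal curve plus small noise.
# In the paper this would be the SPICE-simulated transfer curve.
rng = np.random.default_rng(0)
x = np.linspace(-8, 8, 1001)
ideal = sigmoid(x)
simulated = ideal + rng.normal(0.0, 0.005, x.shape)

# Maximum and average absolute error, as reported in the abstract.
err = np.abs(simulated - ideal)
print(f"max error: {err.max():.2%}, average error: {err.mean():.2%}")
```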

2019 ◽  
Vol 8 (4) ◽  
pp. 216
Author(s):  
Renas Rajab Asaad ◽  
Rasan I. Ali

Back-propagation neural networks are known for handling problems that cannot easily be computed otherwise, such as analyzing or training on large datasets. The main idea of this paper is to implement the XOR logic gate with an ANN, using back propagation of errors and the sigmoid activation function; the resulting network realizes a non-linear threshold gate. It classifies binary inputs (x1, x2) by passing them through a hidden layer and computing coefficient errors and gradient errors (Cerrors, Gerrors). After the error is computed as ei = Output_desired − Output_actual, the weights and thetas are updated accordingly: ΔWji = (α)(Xj)(gi) and Δθj = (α)(−1)(gi). The sigmoid activation function is sig(x) = 1/(1 + e^(−x)) and its derivative is dsig(x) = sig(x)(1 − sig(x)); both sig(x) and dsig(x) take values between 0 and 1.
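The described network is small enough to sketch directly. Below is a minimal NumPy implementation of a 2-2-1 sigmoid network trained on XOR by per-sample back propagation; the thetas are folded in as bias terms, and the update rule follows the ΔW = (α)(X)(g) form given above:

```python
import numpy as np

def sig(x):
    """Sigmoid activation: sig(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# XOR truth table: binary inputs (x1, x2) and desired outputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0.0, 1.0, 1.0, 0.0])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), rng.normal(size=2)  # input -> hidden
W2, b2 = rng.normal(size=2), rng.normal()             # hidden -> output
alpha = 0.5                                           # learning rate

for _ in range(20000):
    for x, t in zip(X, T):
        h = sig(W1 @ x + b1)               # hidden activations
        o = sig(W2 @ h + b2)               # network output
        e = t - o                          # e = Output_desired - Output_actual
        g_out = e * o * (1 - o)            # gradient via dsig = sig * (1 - sig)
        g_hid = g_out * W2 * h * (1 - h)   # back-propagated hidden gradients
        W2 += alpha * g_out * h            # Delta W = alpha * input * gradient
        b2 += alpha * g_out
        W1 += alpha * np.outer(g_hid, x)
        b1 += alpha * g_hid

for x in X:
    print(x, round(float(sig(W2 @ sig(W1 @ x + b1) + b2)), 3))
```

For most random seeds the outputs converge to approximately 0, 1, 1, 0, i.e., the XOR truth table.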


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV ◽  

The problem of applying neural networks to calculate the ratings used in banking when deciding whether to grant loans to borrowers is considered. The task is to determine the borrower's rating function from statistical data on the effectiveness of loans issued by the bank. When constructing a regression model to calculate the rating function, its general form must be known in advance, and the task reduces to calculating the parameters that enter that expression. In contrast, when neural networks are used there is no need to specify the general form of the rating function: a particular neural network architecture is chosen, and its parameters are calculated from the statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of neural networks include the need to calculate a large number of parameters, and there is no universal algorithm for determining the optimal architecture. As an example of using neural networks to determine the borrower's rating, a model system is considered in which the rating is given by a known non-analytical rating function. A neural network with two inner layers, containing three and two neurons respectively and using a sigmoid activation function, is employed for modeling. It is shown that the neural network restores the borrower's rating function with quite acceptable accuracy.
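As a hedged illustration of such a model system, the sketch below fits a network with two inner layers of three and two sigmoid neurons to a made-up stand-in for the rating function; the features, target function, and scikit-learn solver are assumptions for the example, not details from the paper:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical stand-in for the non-analytical rating function:
# any fixed mapping from borrower features to a rating will do here.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(500, 2))          # illustrative borrower features
y = np.sin(3 * X[:, 0]) * np.exp(-X[:, 1])    # "unknown" rating to recover

# Two inner layers with 3 and 2 neurons and a sigmoid (logistic)
# activation, matching the architecture described above.
net = MLPRegressor(hidden_layer_sizes=(3, 2), activation="logistic",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(X, y)
print("training R^2:", net.score(X, y))
```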


Author(s):  
George Mourgias-Alexandris ◽  
George Dabos ◽  
Nikolaos Passalis ◽  
Anastasios Tefas ◽  
Angelina Totovic ◽  
...  

2021 ◽  
Vol 15 ◽  
Author(s):  
Lixing Huang ◽  
Jietao Diao ◽  
Hongshan Nie ◽  
Wei Wang ◽  
Zhiwei Li ◽  
...  

The memristor-based convolutional neural network (CNN) makes full use of the advantages of memristive devices, such as low power consumption, high integration density, and strong network recognition capability. It is therefore well suited to building wearable embedded application systems and has broad application prospects in image classification, speech recognition, and other fields. However, limited by the manufacturing process of memristive devices, high-precision weight devices are currently difficult to apply at large scale. At the same time, high-precision neuron activation functions further increase the complexity of the network's hardware implementation. In response, this paper proposes a configurable full-binary convolutional neural network (CFB-CNN) architecture whose inputs, weights, and neurons are all binary values. The neurons can be configured to one of two modes to suit different non-ideal situations. The architecture's performance is verified on the MNIST data set, and the influence of device yield and resistance fluctuations under the different neuron configurations is analyzed. The results show that the recognition accuracy of the 2-layer network is about 98.2%. When the yield rate is about 64% and the hidden neurons are configured to output −1 and +1 (the ±1 MD mode), the CFB-CNN architecture achieves about 91.28% recognition accuracy; when the resistance variation is about 26% and the hidden neurons are configured to output 0 and 1 (the 01 MD mode), it reaches about 93.43%. Furthermore, memristors have been demonstrated as among the most promising devices in neuromorphic computing owing to their synaptic plasticity. The memristor-based CFB-CNN architecture is therefore SNN-compatible, which we verify in this paper by encoding pixel values with numbers of pulses.
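To make the two neuron configurations concrete, here is a toy sketch of a binary neuron switchable between the ±1 MD and 01 MD modes; the zero threshold and floating-point arithmetic are simplifying assumptions, since the paper realizes the comparison in memristive circuitry:

```python
import numpy as np

def binary_neuron(pre_activation, mode="pm1"):
    """Binary neuron with two configurable output modes.

    'pm1': outputs -1 / +1 (the +/-1 MD configuration)
    '01' : outputs  0 / 1  (the 01 MD configuration)
    The threshold at zero is illustrative; in hardware the decision
    maps onto memristor conductance states and a sense comparator.
    """
    if mode == "pm1":
        return np.where(pre_activation >= 0, 1, -1)
    return np.where(pre_activation >= 0, 1, 0)

# Binary inputs and binary (+1/-1) weights; the MAC itself is
# performed in-array in the CFB-CNN hardware.
x = np.array([1, 0, 1, 1])
w = np.array([1, -1, 1, -1])
print(binary_neuron(w @ x, mode="pm1"), binary_neuron(w @ x, mode="01"))
```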


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5336
Author(s):  
Jie Liu ◽  
Kai Zhang ◽  
Jingyan Ma ◽  
Qiang Wu ◽  
Zhenlin Sun ◽  
...  

To achieve high-precision non-contact temperature measurement, the hardware structure of a broadband correlative microwave radiometer, its calibration algorithm, and a temperature inversion algorithm are designed in this paper. A correlative radiometer is much more sensitive than a full-power radiometer, but its accuracy is hard to improve because of its relatively large phase error. In this study, an error correction algorithm is designed that reduces the phase error from 69.08° to 4.02°. Through integral calibration of the microwave temperature-measuring system against a known radiation source, the linear relationship between the output voltage and the brightness temperature of the object is obtained. Since the aluminum plate, antenna, and transmission line have a non-linear influence on the receiver system, their temperature characteristics, together with the brightness temperature of the object, are used as inputs to a neural network to obtain a more accurate inversion temperature. The temperature-prediction mean square error of a plain back propagation (BP) neural network is 0.629 °C, with a maximum error of 3.351 °C. This paper therefore proposes a high-precision PSO-LM-BP temperature inversion algorithm: the global search ability of the particle swarm optimization (PSO) algorithm determines effective initial weights for the network, while the Levenberg–Marquardt (LM) algorithm exploits second-derivative information for higher convergence accuracy and iteration efficiency. The mean square error of the PSO-LM-BP temperature inversion algorithm is 0.002 °C, and its maximum error is 0.209 °C.
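A schematic sketch of the two-stage idea follows, assuming a toy 1-5-1 sigmoid network and placeholder hyperparameters rather than the paper's radiometer data: PSO performs the global search for initial weights, after which SciPy's Levenberg-Marquardt solver refines them.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy inversion problem: recover a temperature curve from one input.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
t = 20 + 15 * x + 2 * np.sin(6 * x)      # placeholder "true" temperatures

def forward(p, x):
    # 1-5-1 network with a sigmoid hidden layer; p packs all 16 weights.
    w1, b1, w2, b2 = p[:5], p[5:10], p[10:15], p[15]
    h = 1.0 / (1.0 + np.exp(-(np.outer(x, w1) + b1)))
    return h @ w2 + b2

def residuals(p):
    return forward(p, x) - t

# Stage 1: bare-bones PSO for good initial weights (global search).
n_particles, dim = 30, 16
pos = rng.normal(size=(n_particles, dim))
vel = np.zeros_like(pos)
pbest, pcost = pos.copy(), np.array([np.sum(residuals(p)**2) for p in pos])
gbest = pbest[np.argmin(pcost)].copy()
for _ in range(200):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    cost = np.array([np.sum(residuals(p)**2) for p in pos])
    better = cost < pcost
    pbest[better], pcost[better] = pos[better], cost[better]
    gbest = pbest[np.argmin(pcost)].copy()

# Stage 2: Levenberg-Marquardt refinement from the PSO optimum,
# exploiting (Gauss-Newton) second-order curvature information.
fit = least_squares(residuals, gbest, method="lm")
print("max abs error:", np.max(np.abs(residuals(fit.x))))
```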


1999 ◽  
Author(s):  
Arturo Pacheco-Vega ◽  
Mihir Sen ◽  
K. T. Yang ◽  
Rodney L. McClain

Abstract: In the present study we apply an artificial neural network to predict the operation of a humid air-water fin-tube compact heat exchanger. The network configuration is of the feedforward type, with a sigmoid activation function and a backpropagation algorithm. Published experimental data, corresponding to humid air flowing over the heat exchanger tubes and water flowing inside them, are used to train the neural network. After training on known experimental values of the humid-air flow rate and the dry-bulb and wet-bulb inlet temperatures for various geometrical configurations, the network's j-factor and heat-transfer-rate predictions were tested against the experimental values. Comparisons were made with published predictions of power-law correlations obtained from the same data. The results demonstrate that the neural network predicts the performance of this heat exchanger much better than the correlations.
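The style of comparison can be mimicked on synthetic data; the sketch below pits a small sigmoid feedforward network against a power-law correlation fit, with the data-generating function and network width chosen purely for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for the published data: a j-factor vs. Reynolds
# number relation with a mild deviation from a pure power law.
rng = np.random.default_rng(3)
Re = rng.uniform(500, 5000, 200)
j = 0.3 * Re**-0.4 * (1 + 0.05 * np.sin(Re / 300))

# Power-law correlation j = a * Re^b: linear least squares in log-log space.
b, log_a = np.polyfit(np.log(Re), np.log(j), 1)
j_corr = np.exp(log_a) * Re**b

# Feedforward network with sigmoid (logistic) activation.
net = MLPRegressor(hidden_layer_sizes=(8,), activation="logistic",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(np.log(Re).reshape(-1, 1), np.log(j))
j_net = np.exp(net.predict(np.log(Re).reshape(-1, 1)))

for name, pred in [("power law", j_corr), ("network", j_net)]:
    print(name, "RMS rel. error:", np.sqrt(np.mean(((pred - j) / j) ** 2)))
```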


2019 ◽  
Author(s):  
Vladimír Kunc ◽  
Jiří Kléma

Abstract
Motivation: Gene expression profiling was made cheaper by the NIH LINCS program, which profiles only ~1,000 selected landmark genes and uses them to reconstruct the whole profile. The D-GEX method employs neural networks to infer the whole profile. However, the original D-GEX can be further significantly improved.
Results: We have analyzed the D-GEX method and determined that the inference can be improved by using a logistic sigmoid activation function instead of the hyperbolic tangent. Moreover, we propose a novel transformative adaptive activation function that improves the gene expression inference even further and generalizes several existing adaptive activation functions. Our improved neural network achieves an average mean absolute error of 0.1340, a significant improvement over our reimplementation of the original D-GEX, which achieves an average mean absolute error of 0.1637.
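A minimal sketch of a transformative adaptive activation in PyTorch, assuming the general form α·sigmoid(β·x + γ) + δ with all four parameters learned; the paper's exact parameterization and layer sizes may differ:

```python
import torch
import torch.nn as nn

class AdaptiveSigmoid(nn.Module):
    """Transformative adaptive activation (sketch): a logistic sigmoid
    with learnable vertical scale/shift and horizontal slope/shift,
    f(x) = alpha * sigmoid(beta * x + gamma) + delta."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.ones(1))
        self.gamma = nn.Parameter(torch.zeros(1))
        self.delta = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return self.alpha * torch.sigmoid(self.beta * x + self.gamma) + self.delta

# Illustrative use in a one-hidden-layer inference network; the layer
# widths here are assumptions for the sketch, not the paper's values.
net = nn.Sequential(nn.Linear(943, 3000), AdaptiveSigmoid(), nn.Linear(3000, 9520))
```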


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2226
Author(s):  
Hashim Yasin ◽  
Mazhar Hussain ◽  
Andreas Weber

In this paper, we propose a novel and efficient framework for 3D action recognition using a deep learning architecture. First, we develop a 3D normalized pose space that consists of only 3D normalized poses, which are generated by discarding translation and orientation information. From these poses, we extract joint features and feed them to a Deep Neural Network (DNN) in order to learn the action model. The architecture of our DNN consists of two hidden layers with the sigmoid activation function and an output layer with the softmax function. Furthermore, we propose a keyframe extraction methodology through which, from a motion sequence of 3D frames, we efficiently extract the keyframes that contribute substantially to the performance of the action. In this way, we eliminate redundant frames and reduce the length of the motion. More precisely, we ultimately summarize the motion sequence while preserving the original motion semantics. Only the remaining essential informative frames are considered in the process of action recognition, and the proposed pipeline is sufficiently fast and robust as a result. Finally, we extensively evaluate our proposed framework on publicly available benchmark Motion Capture (MoCap) datasets, namely HDM05 and CMU. Our experiments reveal that the proposed scheme significantly outperforms other state-of-the-art approaches.
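The classifier itself is compact; a minimal PyTorch sketch with assumed layer widths, feature dimensionality, and class count (the paper's joint-feature dimension and dataset-specific class counts differ) looks like this:

```python
import torch
import torch.nn as nn

# Assumed sizes for illustration: e.g., 31 joints x 3 coordinates of
# features per pose, and an HDM05-like number of action classes.
n_features, n_actions = 93, 65

# Two sigmoid hidden layers and a softmax output, as described above.
model = nn.Sequential(
    nn.Linear(n_features, 256), nn.Sigmoid(),
    nn.Linear(256, 128), nn.Sigmoid(),
    nn.Linear(128, n_actions),      # logits; softmax is applied in the loss
)
criterion = nn.CrossEntropyLoss()   # combines log-softmax with NLL

x = torch.randn(8, n_features)      # dummy batch of pose features
loss = criterion(model(x), torch.randint(0, n_actions, (8,)))
loss.backward()
```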


2021 ◽  
Author(s):  
Eduardo Reis ◽  
Rachid Benlamri

All experiments are implemented in Python, using the PyTorch and Torch-DCT libraries under the Google Colab environment. An Intel(R) Xeon(R) CPU @ 2.00GHz and a Tesla V100-SXM2-16GB GPU were assigned to the Google Colab runtime when profiling the DOT models. It should be noted that the current stable version of the PyTorch library, version 1.8.1, offers only an implementation of the FFT algorithm; the implementations of the Hartley and Cosine transforms listed in Table 1 therefore do not benefit from the same optimizations (algorithmic and code-level) as the FFT. We benchmark the DOT methods using the LeNet-5 network shown in Figure 10, with the ReLU activation function adopted as the non-linear operation across the entire architecture. In this network, the convolutional operations have a kernel of size K = 5. The convolution is of type "valid", i.e., no padding is applied to the input, so the output size M of each layer is smaller than its input size N: M = N − K + 1. The optimizers used in our experiments are Adam, SGD, SGD with momentum of 0.9, and RMSProp with α = 0.99. The StepLR scheduler is used with a step size of 20 epochs and γ = 0.5. We train our model for 40 epochs using a mini-batch size of 128 and a learning rate of 0.001. Five datasets are used to benchmark the proposed DOT methods: the MNIST dataset and the MNIST variants EMNIST, KMNIST, and Fashion-MNIST, plus the more complex CIFAR-10 dataset.
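For reference, the quoted training configuration translates directly into PyTorch; the model body below is a stand-in rather than the full LeNet-5 of Figure 10:

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import StepLR

# Stand-in model: a single "valid" 5x5 convolution on 28x28 input
# (24x24 output maps, M = N - K + 1), not the full LeNet-5.
model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(),
    nn.Flatten(), nn.Linear(6 * 24 * 24, 10),
)

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# The other optimizers benchmarked above would be configured as:
#   torch.optim.SGD(model.parameters(), lr=0.001)
#   torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
#   torch.optim.RMSprop(model.parameters(), lr=0.001, alpha=0.99)
scheduler = StepLR(optimizer, step_size=20, gamma=0.5)  # halve LR every 20 epochs

for epoch in range(40):   # 40 epochs, mini-batches of size 128
    # ... iterate over a DataLoader(batch_size=128): forward, loss, backward, step ...
    scheduler.step()
```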

