A comparison of activation functions in multilayer neural network for predicting the production and consumption of electricity power

Author(s):  
Abdulwahed Salam ◽  
Abdelaaziz El Hibaoui ◽  
Abdulgabbar Saif

Predicting electricity power is an important task that helps power utilities improve their systems' performance in terms of effectiveness, productivity, management and control. Several studies have addressed this task using three main types of models: engineering, statistical and artificial intelligence. Among the experiments that used artificial intelligence models, the multilayer neural network model has proven successful on many evaluation datasets. However, the performance of this model depends mainly on the type of activation function. Therefore, this paper presents an experimental study investigating the performance of the multilayer neural network model with respect to different activation functions and different numbers of hidden layers. The experiments compare eleven activation functions on four benchmark electricity datasets. The activation functions under examination are sigmoid, hyperbolic tangent, SoftSign, SoftPlus, ReLU, Leaky ReLU, Gaussian, ELU, SELU, Swish and Adjust-Swish. Experimental results show that the ReLU and Leaky ReLU activation functions outperform their counterparts on all datasets.
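For reference, the sketch below gives plain NumPy definitions of most of the activation functions compared in this study. The exact parameterizations (e.g., the Leaky ReLU slope) and the Adjust-Swish variant are not specified in the abstract, so the constants used here are common defaults and Adjust-Swish is omitted.

    import numpy as np

    # Common definitions of the activation functions listed above.
    def sigmoid(x):            return 1.0 / (1.0 + np.exp(-x))
    def tanh(x):               return np.tanh(x)
    def softsign(x):           return x / (1.0 + np.abs(x))
    def softplus(x):           return np.log1p(np.exp(x))
    def relu(x):               return np.maximum(0.0, x)
    def leaky_relu(x, a=0.01): return np.where(x > 0, x, a * x)
    def gaussian(x):           return np.exp(-x ** 2)
    def elu(x, a=1.0):         return np.where(x > 0, x, a * (np.exp(x) - 1.0))
    def selu(x, a=1.67326, s=1.0507):
        return s * np.where(x > 0, x, a * (np.exp(x) - 1.0))
    def swish(x):              return x * sigmoid(x)

    x = np.linspace(-3.0, 3.0, 7)
    print(relu(x))
    print(leaky_relu(x))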

Author(s):  
Joshua Bensemann ◽  
Qiming Bao ◽  
Gaël Gendron ◽  
Tim Hartill ◽  
Michael Witbrock

Processes occurring in brains, a.k.a. biological neural networks, can be and have been modeled within artificial neural network architectures. For this reason, we have conducted a review of research on the phenomenon of blindsight in an attempt to generate ideas for artificial intelligence models. Blindsight can be considered a diminished form of visual experience. If we assume that artificial networks have no form of visual experience, then deficits caused by blindsight give us insights into the processes occurring within visual experience that we can incorporate into artificial neural networks. This paper is structured in three parts. Section 2 is a review of blindsight research, looking specifically at the errors occurring during this condition compared with normal vision. Section 3 identifies overall patterns from Sec. 2 to generate insights for computational models of vision. Section 4 demonstrates the utility of examining biological research to inform artificial intelligence research by examining computational models of visual attention relevant to one of the insights generated in Sec. 3. The research covered in Sec. 4 shows that incorporating one of our insights into computational vision does benefit those models. Future research will be required to determine whether our other insights are as valuable.


2019 ◽  
Vol 12 (3) ◽  
pp. 156-161 ◽  
Author(s):  
Aman Dureja ◽  
Payal Pahwa

Background: Activation functions play an important role in building deep neural networks, and the choice of activation function affects the network's optimization and the quality of its results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used at the hidden layers of deep neural networks has not been clearly identified. Objective: The primary objective of this analysis was to determine which activation function should be used at the hidden layers of deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured on a two-class (Cat/Dog) dataset. The network used three convolutional layers, with a pooling layer introduced after each convolutional layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the remaining 2000 images were used for testing it. Results: The experimental comparison was carried out by analyzing the network with different activation functions at each layer of the CNN. The validation error and accuracy on the Cat/Dog dataset were analyzed for the activation functions ReLU, Tanh, SELU, PReLU and ELU at the hidden layers. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: A CNN model with ReLU at the hidden layers (three hidden layers here) gives the best results and improves overall performance in terms of both accuracy and speed. These advantages of ReLU at the hidden layers of a CNN help with the effective and fast retrieval of images from databases, as illustrated by the sketch below.
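The following is a minimal Keras sketch of the kind of network described in the Methods: three convolutional layers, each followed by a pooling layer, ReLU in the hidden layers, and a sigmoid output for the two-class (Cat/Dog) problem. The filter counts, kernel sizes and 64x64 input size are assumptions rather than values reported in the abstract.

    from tensorflow.keras import layers, models

    # Three conv layers, each followed by max pooling; ReLU hidden activations.
    model = models.Sequential([
        layers.Input(shape=(64, 64, 3)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # binary Cat/Dog output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=25,
    #           validation_data=(val_images, val_labels))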


Author(s):  
Volodymyr Shymkovych ◽  
Sergii Telenyk ◽  
Petro Kravets

This article introduces a method for realizing the Gaussian activation function of radial-basis function (RBF) neural networks in hardware on field-programmable gate arrays (FPGAs). Results of modeling the Gaussian function on FPGA chips of different families are presented, and RBF neural networks of various topologies have been synthesized and investigated. The hardware component implemented by this method is an RBF neural network with four hidden-layer neurons and one output neuron with a sigmoid activation function, realized on an FPGA using 16-bit fixed-point numbers and occupying 1193 lookup tables (LUTs). Each hidden-layer neuron of the RBF network is implemented on the FPGA as a separate computing unit. The total delay of the combinational circuit of the RBF network block was 101.579 ns. The implementation of the Gaussian activation functions of the hidden layer occupies 106 LUTs, their delay is 29.33 ns, and the absolute error is ±0.005. These results were obtained with the Spartan-3 family of chips; modeling on chips of other series is also presented in the article. Hardware implementation of RBF neural networks at such speeds allows them to be used in real-time control systems for high-speed objects.
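A floating-point reference of the reported topology (four Gaussian hidden neurons, one sigmoid output neuron) is sketched below. The 16-bit fixed-point FPGA arithmetic is not modeled, and the centers, widths and weights are placeholder values rather than parameters from the article.

    import numpy as np

    # Placeholder parameters: one center and width per hidden neuron.
    centers = np.array([[-1.0], [-0.3], [0.3], [1.0]])
    widths  = np.array([0.5, 0.5, 0.5, 0.5])
    w_out   = np.array([0.4, -0.7, 0.9, 0.2])   # hidden-to-output weights
    b_out   = 0.1

    def rbf_forward(x):
        # Gaussian activation of the four hidden neurons for a scalar input x.
        hidden = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2.0 * widths ** 2))
        # Single output neuron with a sigmoid activation.
        return 1.0 / (1.0 + np.exp(-(hidden @ w_out + b_out)))

    print(rbf_forward(0.25))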


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Albert T. Young ◽  
Kristen Fernandez ◽  
Jacob Pfau ◽  
Rasika Reddy ◽  
Nhat Anh Cao ◽  
...  

Artificial intelligence models match or exceed dermatologists in melanoma image classification. Less is known about their robustness against real-world variations, and clinicians may incorrectly assume that a model with an acceptable area under the receiver operating characteristic curve or related performance metric is ready for clinical use. Here, we systematically assessed the performance of dermatologist-level convolutional neural networks (CNNs) on real-world non-curated images by applying computational “stress tests”. Our goal was to create a proxy environment in which to comprehensively test the generalizability of off-the-shelf CNNs developed without training or evaluation protocols specific to individual clinics. We found inconsistent predictions on images captured repeatedly in the same setting or subjected to simple transformations (e.g., rotation). Such transformations resulted in false positive or negative predictions for 6.5–22% of skin lesions across test datasets. Our findings indicate that models meeting conventionally reported metrics need further validation with computational stress tests to assess clinic readiness.
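The following sketch illustrates the rotation variant of such a computational stress test: the same lesion image is rotated and the model's output is checked for flips across the decision threshold. The predict_malignant_prob callable is a placeholder for whatever CNN is under test; it is not part of the original study code.

    import numpy as np
    from PIL import Image

    def rotation_stress_test(image_path, predict_malignant_prob, threshold=0.5):
        """Return per-rotation probabilities and whether the label stays consistent."""
        image = Image.open(image_path)
        probs = []
        for angle in (0, 90, 180, 270):
            rotated = image.rotate(angle, expand=True)
            probs.append(predict_malignant_prob(np.asarray(rotated)))
        labels = [p >= threshold for p in probs]
        consistent = len(set(labels)) == 1   # did any rotation flip the prediction?
        return probs, consistent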


2021 ◽  
Vol 11 (15) ◽  
pp. 6704
Author(s):  
Jingyong Cai ◽  
Masashi Takemoto ◽  
Yuming Qiu ◽  
Hironori Nakajo

Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and in short supply in many scenarios. Previous work has shown the benefit of computing activation functions such as the sigmoid with shift-and-add operations, although those approaches fail to remove multiplications from training altogether. In this paper, we propose an approach that converts all multiplications in the forward and backward passes of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between weights and error signals are converted into multiplications of their sine values, which can be replaced with simpler operations via the product-to-sum formula. In addition, a rectified sine activation function is used to convert layer inputs into sine values as well. In this way, the original multiplication-intensive operations can be computed through simple shift-and-add operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that the method achieves performance close to that of classical training algorithms. The proposed approach sheds new light on future hardware-customization research for machine learning.
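A small numerical illustration of the underlying identity: for values near zero, x ≈ sin(x), so a weight-error product w·e can be approximated by sin(w)·sin(e) = ½[cos(w − e) − cos(w + e)], where the halving is a bit shift in fixed point and the cosines can be generated by shift-and-add schemes such as CORDIC. The NumPy check below uses floating point and illustrates only the approximation, not the hardware mapping described in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.05, size=10_000)   # small weights clustered near zero
    e = rng.normal(scale=0.05, size=10_000)   # small backpropagated errors

    exact  = w * e
    # Product-to-sum form of sin(w) * sin(e); the 0.5 factor is a shift in hardware.
    approx = 0.5 * (np.cos(w - e) - np.cos(w + e))

    print("max abs error:", np.max(np.abs(exact - approx)))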


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Xinran Wang ◽  
Liang Wang ◽  
Hong Bu ◽  
Ningning Zhang ◽  
Meng Yue ◽  
...  

Programmed death ligand-1 (PD-L1) expression is a key biomarker for screening patients for PD-1/PD-L1-targeted immunotherapy. However, scoring of PD-L1 expression on tumor-infiltrating immune cells (IC) currently relies on a subjective assessment guideline in clinical practice, with low concordance. Therefore, a repeatable and quantifiable PD-L1 IC scoring method for breast cancer is desirable. In this study, we propose a deep learning-based artificial intelligence-assisted (AI-assisted) model for PD-L1 IC scoring. Three rounds of ring studies (RSs) involving 31 pathologists from 10 hospitals were carried out, using the current guideline in the first two rounds (RS1, RS2) and our AI scoring model in the last round (RS3). A total of 109 PD-L1 (Ventana SP142) immunohistochemistry (IHC) stained images were assessed and the role of the AI-assisted model was evaluated. With the assistance of AI, the scoring concordance across pathologists was boosted to excellent in RS3 (0.950, 95% confidence interval (CI): 0.936–0.962) from moderate in RS1 (0.674, 95% CI: 0.614–0.735) and RS2 (0.736, 95% CI: 0.683–0.789). The 2- and 4-category scoring accuracies were improved by 4.2% (0.959, 95% CI: 0.953–0.964) and 13% (0.815, 95% CI: 0.803–0.827) (p < 0.001). The AI results were generally accepted by pathologists, with 61% rated “fully accepted” and 91% “almost accepted”. The proposed AI-assisted method can help pathologists at all levels improve the PD-L1 assay (SP-142) IC assessment in breast cancer in terms of both accuracy and concordance. The AI tool provides a scheme to standardize PD-L1 IC scoring in clinical practice.

