Evolution of Activation Functions: An Empirical Investigation

2021
Vol 1 (2)
pp. 1-36
Author(s):
Andrew Nader
Danielle Azar

The hyper-parameters of a neural network are traditionally designed through a time-consuming process of trial and error that requires substantial expert knowledge. Neural Architecture Search algorithms aim to take the human out of the loop by automatically finding a good set of hyper-parameters for the problem at hand. These algorithms have mostly focused on hyper-parameters such as the architectural configuration of the hidden layers and the connectivity of the hidden neurons, but there has been relatively little work on automating the search for completely new activation functions, which are among the most crucial hyper-parameters to choose. Several widely used activation functions are simple and work well, but there is nonetheless interest in finding better ones. The literature has mostly focused on designing new activation functions by hand or choosing from a set of predefined functions, whereas this work presents an evolutionary algorithm to automate the search for completely new activation functions. We compare these newly evolved activation functions to other existing and commonly used activation functions. The results are favorable and are obtained by averaging the performance of the activation functions found over 30 runs, with experiments conducted on 10 different datasets and architectures to ensure the statistical robustness of the study.
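The evolutionary search described above can be illustrated with a minimal sketch: candidate activations are compositions of primitive functions, fitness is the loss of a small network using the candidate, and a simple elitist loop mutates the best candidates. All names, the primitive set, and the toy task are illustrative assumptions, not the authors' actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unary primitives that candidate activations are composed from (assumed set).
UNARY = {
    "identity": lambda x: x,
    "tanh": np.tanh,
    "relu": lambda x: np.maximum(x, 0.0),
    "sin": np.sin,
    "square": lambda x: np.clip(x * x, -50.0, 50.0),
}

def make_candidate():
    """A candidate activation = composition of two random primitives."""
    outer, inner = rng.choice(list(UNARY), size=2)
    return (outer, inner)

def apply_candidate(cand, x):
    outer, inner = cand
    return UNARY[outer](UNARY[inner](x))

def fitness(cand):
    """Fit a tiny random-feature network (hidden layer frozen, output layer
    solved by least squares) on a toy regression task; lower loss is fitter."""
    X = np.linspace(-2, 2, 200).reshape(-1, 1)
    y = np.sin(3 * X) + 0.5 * X                  # toy target
    W = rng.normal(size=(1, 32))
    b = rng.normal(size=32)
    H = apply_candidate(cand, X @ W + b)         # hidden activations
    w_out, *_ = np.linalg.lstsq(H, y, rcond=None)
    return -float(np.mean((H @ w_out - y) ** 2))

# Simple elitist evolution: keep the best, mutate them, add random immigrants.
pop = [make_candidate() for _ in range(12)]
for gen in range(10):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:4]
    mutants = [(e[0], rng.choice(list(UNARY))) for e in elite]
    pop = elite + mutants + [make_candidate() for _ in range(4)]

best = max(pop, key=fitness)
print("best composed activation:", best)
```

The real study evolves richer function expressions and averages over many runs and datasets; this sketch only shows the shape of the search loop.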

2019
Vol 1 (1)
pp. p8
Author(s):
Jamilu Auwalu Adamu

One of the objectives of this paper is to incorporate fat-tail effects into, for instance, the Sigmoid in order to introduce transparency and stability into the existing stochastic activation functions. Secondly, according to the literature reviewed, the existing set of activation functions was introduced into deep learning artificial neural networks through the “window” rather than through the “legitimate door”, since they rest on “trial and error” and “arbitrary assumptions”; thus, the author proposes “scientific facts”, “definite rules: Jameel’s Stochastic ANNAF Criterion”, and a “lemma” to substitute for (though not necessarily replace) the existing set of stochastic activation functions, for instance the Sigmoid among others. This research is expected to open the “black box” of deep learning artificial neural networks. The author proposes a new set of advanced optimized fat-tailed stochastic activation functions emanating from AI-ML-purified stocks data, namely: the Log-Logistic (3P) probability distribution (1st), Cauchy probability distribution (2nd), Pearson 5 (3P) probability distribution (3rd), Burr (4P) probability distribution (4th), Fatigue Life (3P) probability distribution (5th), Inv. Gaussian (3P) probability distribution (6th), Dagum (4P) probability distribution (7th), and Lognormal (3P) probability distribution (8th), for the successful conduct of both forward and backward propagation in deep learning artificial neural networks. However, this paper did not check the monotone differentiability of the proposed distributions. Appendices A, B, and C present and test the performance of the stressed Sigmoid and the optimized activation functions using stocks data (1991-2014) of Microsoft Corporation (MSFT), Exxon Mobil (XOM), Chevron Corporation (CVX), Honda Motor Corporation (HMC), General Electric (GE), and U.S. fundamental macroeconomic parameters; the results were found fascinating.
Thus, the first three distributions are deemed excellent activation functions for successfully conducting any stock deep learning artificial neural network, and distributions 4 to 8 are also good advanced optimized activation functions. Generally, this research revealed that whether the advanced optimized activation functions satisfy Jameel’s ANNAF Stochastic Criterion depends on the referenced purified AI data set, the time change, and the area of application, in contrast to the existing “trial and error” and “arbitrary assumptions” behind Sigmoid, Tanh, Softmax, ReLU, and Leaky ReLU.
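The core idea of a fat-tailed stochastic activation can be sketched by using a heavy-tailed distribution's CDF as a sigmoid-like squashing function. The Cauchy CDF (the paper's 2nd distribution) is shown below; the location and scale parameters here are assumed illustrative hyper-parameters, not values from the paper.

```python
import numpy as np

def cauchy_cdf_activation(x, x0=0.0, gamma=1.0):
    """Cauchy CDF: F(x) = 1/2 + arctan((x - x0)/gamma)/pi, bounded in (0, 1)."""
    return 0.5 + np.arctan((x - x0) / gamma) / np.pi

def cauchy_cdf_grad(x, x0=0.0, gamma=1.0):
    """Derivative of the activation (the Cauchy PDF), as needed for backprop."""
    return 1.0 / (np.pi * gamma * (1.0 + ((x - x0) / gamma) ** 2))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print("cauchy :", cauchy_cdf_activation(x))
print("sigmoid:", sigmoid(x))
# The Cauchy CDF approaches 0 and 1 polynomially rather than exponentially,
# so its gradient in the tails decays far more slowly than the sigmoid's.
```

This slow tail decay is the "fat-tail effect" motivating the proposal: gradients remain non-negligible for extreme pre-activations, where the standard sigmoid saturates.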


Author(s):
Eduardo Masato Iyoda
Kaoru Hirota
Fernando J. Von Zuben
A nonparametric neural architecture called the Sigma-Pi Cascade extended Hybrid Neural Network (σπ-CHNN) is proposed to extend the approximation capabilities of neural architectures such as Projection Pursuit Learning (PPL) and Hybrid Neural Networks (HNN). Like PPL and HNN, σπ-CHNN uses distinct activation functions in its neurons but, unlike these previous neural architectures, it may employ multiplicative operators in its hidden neurons, enabling it to extract higher-order information from the given data. σπ-CHNN uses arbitrary connectivity patterns among neurons. An evolutionary learning algorithm combined with a conjugate gradient algorithm is proposed to automatically design the topology and weights of σπ-CHNN. σπ-CHNN performance is evaluated on five benchmark regression problems. Results show that σπ-CHNN provides competitive performance compared to PPL and HNN in most problems, either in the computational requirements to implement the architecture or in approximation accuracy. In some problems, σπ-CHNN reduces the approximation error by an order of magnitude (10^-1) compared to PPL and HNN, whereas in other cases it achieves the same approximation error while using fewer hidden neurons (usually one fewer than PPL and HNN).
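The multiplicative (sigma-pi) hidden unit mentioned above can be sketched as follows: the unit multiplies selected inputs together before the weighted sum, so it can represent input interactions a purely additive neuron cannot. The index sets, weights, and activation below are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def sigma_pi_unit(x, terms, weights, activation=np.tanh):
    """y = activation( sum_k w_k * prod_{i in terms[k]} x_i ).

    x       : 1-D input vector
    terms   : list of index tuples; each tuple is one product term
    weights : one weight per product term
    """
    products = np.array([np.prod(x[list(t)]) for t in terms])
    return activation(weights @ products)

x = np.array([0.5, -1.0, 2.0])
# Two first-order terms plus one second-order (multiplicative) term x0*x2.
terms = [(0,), (1,), (0, 2)]
weights = np.array([1.0, 0.5, -2.0])
y = sigma_pi_unit(x, terms, weights)
print(y)
```

With only first-order terms this reduces to an ordinary weighted-sum neuron; the higher-order terms are what let σπ-style units capture interactions directly in a single hidden unit.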


2021
Vol 2 (1)
pp. 1-25
Author(s):
Yongsen Ma
Sheheryar Arshad
Swetha Muniraju
Eric Torkildson
Enrico Rantala
...

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from history classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. The proposed design achieves 97% average accuracy when the testing devices and persons are not seen during training. It is also evaluated on two public datasets, with accuracies of 80% and 83%. The proposed design requires very little human effort for ground-truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.
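The role of the state machine above, learning temporal dependency from history classification results, can be illustrated with a toy stand-in. The paper's state machine is a trained 1D CNN; the sketch below substitutes a simple sliding-window majority vote, which captures the same intuition that temporal context corrects isolated per-frame misclassifications. The window size and labels are assumptions for illustration.

```python
import numpy as np

def smooth_predictions(frame_preds, window=5):
    """Replace each per-frame label with the majority label in a centered
    window over the classification history."""
    preds = np.asarray(frame_preds)
    half = window // 2
    out = []
    for i in range(len(preds)):
        lo, hi = max(0, i - half), min(len(preds), i + half + 1)
        vals, counts = np.unique(preds[lo:hi], return_counts=True)
        out.append(vals[np.argmax(counts)])
    return np.array(out)

# A walking bout with two spurious "sitting" frames (0 = sit, 1 = walk):
raw = [1, 1, 0, 1, 1, 1, 0, 1, 1]
print(smooth_predictions(raw))  # the isolated 0s are voted away
```

A learned 1D CNN over the same history can additionally weight recent frames and model activity transitions, which a fixed majority vote cannot.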


Author(s):
Jonathan George
Armin Mehrabian
Rubab Amin
Paul R. Prucnal
Tarek El-Ghazawi
...

2014
Vol 667
pp. 60-63
Author(s):
Wei Guo
Zhen Ji Zhang

A performance evaluation system for finance-invested transportation projects is researched, incorporating sub-modules for highway project evaluation, waterway project evaluation, passenger station project evaluation, and energy-saving project evaluation. In addition, expert knowledge is embedded in the system; a multi-layer neural network and fuzzy-set theory are used to implement the performance evaluation system for finance-invested transportation projects, and the feasibility and effectiveness of the evaluation system are finally verified in practice.
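The fuzzy-set side of such a system can be sketched as follows: each sub-module score is mapped to linguistic grades via triangular membership functions and the sub-module scores are combined with expert-assigned weights. All breakpoints, weights, and scores below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_grade(score):
    """Membership degrees of a 0-100 score in three linguistic grades."""
    return {
        "poor": triangular(score, -1, 0, 50),
        "fair": triangular(score, 25, 50, 75),
        "good": triangular(score, 50, 100, 101),
    }

# Sub-module scores (highway, waterway, passenger stations, energy saving)
# and expert-assigned importance weights (illustrative only).
scores = np.array([80.0, 65.0, 70.0, 55.0])
weights = np.array([0.4, 0.2, 0.2, 0.2])
overall = float(weights @ scores)
print("overall:", overall, fuzzy_grade(overall))
```

In the full system a multi-layer neural network would learn the mapping from sub-module indicators to scores, with the fuzzy grades providing the expert-interpretable output layer.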

