On Transformative Adaptive Activation Functions in Neural Networks for Gene Expression Inference

Abstract

Motivation: Gene expression profiling was made cheaper by the NIH LINCS program that profiles only ~1,000 selected landmark genes and uses them to reconstruct the whole profile. The D–GEX method employs neural networks to infer the whole profile. However, the original D–GEX can be further significantly improved.

Results: We have analyzed the D–GEX method and determined that the inference can be improved using a logistic sigmoid activation function instead of the hyperbolic tangent. Moreover, we propose a novel transformative adaptive activation function that improves the gene expression inference even further and which generalizes several existing adaptive activation functions. Our improved neural network achieves average mean absolute error of 0.1340 which is a significant improvement over our reimplementation of the original D–GEX which achieves average mean absolute error 0.1637.

On transformative adaptive activation functions in neural networks for gene expression inference

PLoS ONE ◽

10.1371/journal.pone.0243915 ◽

2021 ◽

Vol 16 (1) ◽

pp. e0243915

Author(s):

Vladimír Kunc ◽

Jiří Kléma

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Mean Absolute Error ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Cost Effective ◽

Absolute Error ◽

Activation Function ◽

Training Procedure ◽

Activation Functions

Gene expression profiling was made more cost-effective by the NIH LINCS program that profiles only ∼1,000 selected landmark genes and uses them to reconstruct the whole profile. The D–GEX method employs neural networks to infer the entire profile. However, the original D–GEX can be significantly improved. We propose a novel transformative adaptive activation function that improves the gene expression inference even further and which generalizes several existing adaptive activation functions. Our improved neural network achieves an average mean absolute error of 0.1340, which is a significant improvement over our reimplementation of the original D–GEX, which achieves an average mean absolute error of 0.1637. The proposed transformative adaptive function enables a significantly more accurate reconstruction of the full gene expression profiles with only a small increase in the complexity of the model and its training procedure compared to other methods.
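
The abstract does not spell out the form of the adaptive function, so the following is only a minimal sketch under a stated assumption: an inner activation is wrapped with four trainable scalars per neuron (output scale `alpha`, input scale `beta`, input shift `gamma`, output shift `delta`). The parameter names and the helper functions are hypothetical and chosen for illustration. The last line computes the relative error reduction implied by the two reported figures.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, used here as an example inner activation."""
    return 1.0 / (1.0 + math.exp(-z))


def transformative_activation(x, alpha=1.0, beta=1.0, gamma=0.0, delta=0.0,
                              inner=sigmoid):
    """Assumed form: alpha * inner(beta * x + gamma) + delta.

    In a trained network the four scalars would be learned per neuron;
    here they are plain arguments.
    """
    return alpha * inner(beta * x + gamma) + delta


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# With the default parameters the wrapper reduces to the inner activation.
plain = transformative_activation(0.0)

# Scaling and shifting the output turns the (0, 1) sigmoid range into (-1, 1).
rescaled = transformative_activation(0.0, alpha=2.0, delta=-1.0)

# Relative reduction of the average MAE reported in the abstract.
relative_reduction = (0.1637 - 0.1340) / 0.1637
```

Under this assumption each neuron gains only four extra parameters, which is consistent with the abstract's remark about a small increase in model complexity. The reported figures correspond to a relative error reduction of roughly 18%.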

On tower and checkerboard neural network architectures for gene expression inference

BMC Genomics ◽

10.1186/s12864-020-06821-6 ◽

2020 ◽

Vol 21 (S5) ◽

Author(s):

Vladimír Kunc ◽

Jiří Kléma

Keyword(s):

Neural Network ◽

Gene Expression ◽

Expression Profiling ◽

Mean Absolute Error ◽

Absolute Error ◽

Computational Method ◽

Network Architectures ◽

Memory Footprint ◽

Training Protocol

Abstract

Background: One way to make gene expression profiling more economical is to use the L1000 platform, which measures the expression of ∼1,000 landmark genes and uses a computational method to infer the expression of another ∼10,000 genes. One such method for gene expression inference is D–GEX, which employs neural networks.

Results: We propose two novel D–GEX architectures that significantly improve the quality of the inference by increasing the capacity of a network without any increase in the number of trained parameters. The architectures partition the network into individual towers. Our best proposed architecture (a checkerboard architecture with a skip connection and five towers), together with minor changes in the training protocol, improves the average mean absolute error of the inference from 0.134 to 0.128.

Conclusions: Our proposed approach increases the gene expression inference accuracy without increasing the number of weights of the model, and thus without increasing the model's memory footprint, which limits its usage.
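
The claim that towers add capacity without adding parameters can be illustrated with simple weight counting. The sketch below assumes towers are independent fully connected blocks of equal width; the abstract does not describe the exact connectivity of the checkerboard architecture, and the layer width of 3,000 is a hypothetical figure chosen only for the arithmetic.

```python
import math


def dense_weights(n_in: int, n_out: int) -> int:
    """Weights of one fully connected layer (biases ignored)."""
    return n_in * n_out


def tower_weights(n_towers: int, width_in: int, width_out: int) -> int:
    """Weights of a layer split into independent towers of equal width."""
    return n_towers * width_in * width_out


def tower_width_for_budget(budget: int, n_towers: int) -> int:
    """Largest per-tower width w such that n_towers * w * w <= budget."""
    return math.isqrt(budget // n_towers)


# Hypothetical dense hidden-to-hidden layer with 3,000 units on each side.
budget = dense_weights(3000, 3000)

# Five towers that fit into the same weight budget.
width = tower_width_for_budget(budget, 5)
total_neurons = 5 * width
used = tower_weights(5, width, width)
```

With these assumed numbers, five towers of 1,341 units each stay within the 9,000,000-weight budget while providing 6,705 hidden units per layer instead of 3,000.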

SCORING MODELING BASED ON NEURAL NETWORKS FOR DETERMINING A BANK BORROWER'S RATING

Economy of Ukraine ◽

10.15407/economyukr.2020.10.054 ◽

2020 ◽

Vol 2020 (10) ◽

pp. 54-62

Author(s):

Oleksii VASYLIEV ◽

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Statistical Data ◽

Activation Function ◽

Decision Making Process ◽

Neural Network Architecture ◽

Acceptable Accuracy ◽

The Neural Network ◽

Sigmoid Activation Function

The problem of applying neural networks to calculate ratings used in banking in the decision-making process on granting or not granting loans to borrowers is considered. The task is to determine the rating function of the borrower based on a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, it is necessary to know its general form. If so, the task is to calculate the parameters that are included in the expression for the rating function. In contrast to this approach, when neural networks are used, there is no need to specify the general form of the rating function. Instead, a certain neural network architecture is chosen and its parameters are calculated on the basis of statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters. There is also no universal algorithm that would determine the optimal neural network architecture. As an example of the use of neural networks to determine the borrower's rating, a model system is considered in which the borrower's rating is determined by a known non-analytical rating function. A neural network with two inner layers, which contain three and two neurons respectively and have a sigmoid activation function, is used for modeling. It is shown that the use of the neural network allows restoring the borrower's rating function with quite acceptable accuracy.
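
A network of the size described above is small enough to write out directly. The sketch below is illustrative only: the number of inputs (four hypothetical borrower features), the single linear output neuron, and the random placeholder weights are assumptions, since the abstract specifies only the two inner layers and their sigmoid activation.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Placeholder weights in [-1, 1] and zero biases (not trained values)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_forward(x, layer, activation):
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def build_network(n_inputs, rng):
    """Inner layers of 3 and 2 neurons, followed by one assumed output neuron."""
    sizes = [n_inputs, 3, 2, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict_rating(x, network):
    h = x
    for layer in network[:-1]:
        h = layer_forward(h, layer, sigmoid)       # sigmoid inner layers
    return layer_forward(h, network[-1], lambda z: z)[0]  # linear output


def count_parameters(network):
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)


rng = random.Random(0)
network = build_network(4, rng)                    # 4 inputs is an assumption
rating = predict_rating([0.2, 0.5, 0.1, 0.9], network)
n_params = count_parameters(network)
```

With four assumed inputs the model has 26 parameters in total, which shows how quickly the parameter count grows even for a very small architecture.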

Deep neural network based on generalized neo-fuzzy neurons and its learning based on backpropagation

Artificial Intelligence ◽

10.15407/jai2021.01.032 ◽

2021 ◽

Vol 26 (jai2021.26(1)) ◽

pp. 32-41

Author(s):

Bodyanskiy Y ◽

◽

Antonenko T ◽

Keyword(s):

Neural Network ◽

Neural Networks ◽

Learning Process ◽

Activation Function ◽

Point Of View ◽

Basic Unit ◽

Theoretical Point ◽

Activation Functions ◽

Approximation Properties ◽

Network Training

Modern approaches to deep neural networks have a number of issues related to the learning process and computational costs. This article considers an architecture grounded in an alternative approach to the basic unit of the neural network. This approach optimizes the calculations and offers an alternative way to address the vanishing and exploding gradient problems. The main subject of the article is the deep stacked neo-fuzzy system, which uses a generalized neo-fuzzy neuron to optimize the learning process. This approach is non-standard from a theoretical point of view, so the paper presents the necessary mathematical calculations and describes the intricacies of using this architecture in practice. On the theoretical side, the network learning process is fully described, and all calculations needed to apply the backpropagation algorithm for network training are derived. A feature of the network is the rapid calculation of the derivative of the neurons' activation functions, which is achieved through the use of fuzzy membership functions. The paper shows that the derivative of such a function is a constant, which is the basis for the claim of an increased optimization rate compared with neural networks that use neurons with more common activation functions (ReLU, sigmoid). The paper highlights the main points that can be improved in further theoretical developments on this topic. In general, these issues are related to the calculation of the activation function. The proposed methods cope with these points and allow approximation using the network, but the authors already have theoretical justifications for improving the speed and approximation properties of the network. The results of the comparison of the proposed network with standard neural network architectures are shown.
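
The point about cheap derivatives can be illustrated with triangular membership functions, whose slope is constant on each side of the peak. This is a sketch under assumptions: the abstract does not name the membership function shape, and the uniform grid of centers and the `synapse_output` helper are hypothetical.

```python
def triangular_membership(x, left, center, right):
    """Triangle that is 0 outside (left, right) and 1 at center."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


def triangular_derivative(x, left, center, right):
    """Piecewise-constant slope; no exponentials are needed.

    At x == center the derivative is undefined; the right-hand slope is returned.
    """
    if x <= left or x >= right:
        return 0.0
    if x < center:
        return 1.0 / (center - left)
    return -1.0 / (right - center)


def synapse_output(x, centers, weights):
    """Weighted sum of memberships over uniformly spaced centers."""
    step = centers[1] - centers[0]
    return sum(w * triangular_membership(x, c - step, c, c + step)
               for c, w in zip(centers, weights))


centers = [0.0, 0.5, 1.0]
weights = [1.0, 2.0, 4.0]
y = synapse_output(0.25, centers, weights)          # halfway between two centers
slope_up = triangular_derivative(0.25, 0.0, 0.5, 1.0)
slope_down = triangular_derivative(0.75, 0.0, 0.5, 1.0)
```

For an input between two neighboring centers only two memberships are non-zero, so the output is a linear interpolation of the two corresponding weights and the gradient needs only a constant slope.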

Interval universal approximation for neural networks

Proceedings of the ACM on Programming Languages ◽

10.1145/3498675 ◽

2022 ◽

Vol 6 (POPL) ◽

pp. 1-29

Author(s):

Zi Wang ◽

Aws Albarghouthi ◽

Gautam Prakriya ◽

Somesh Jha

Keyword(s):

Neural Network ◽

Neural Networks ◽

Approximation Problem ◽

Activation Function ◽

Constructive Proof ◽

Universal Approximation ◽

Activation Functions ◽

Complete Problems ◽

Robust Network ◽

Interval Approximation

To verify safety and robustness of neural networks, researchers have successfully applied abstract interpretation, primarily using the interval abstract domain. In this paper, we study the theoretical power and limits of the interval domain for neural-network verification. First, we introduce the interval universal approximation (IUA) theorem. IUA shows that neural networks not only can approximate any continuous function f (universal approximation) as we have known for decades, but we can find a neural network, using any well-behaved activation function, whose interval bounds are an arbitrarily close approximation of the set semantics of f (the result of applying f to a set of inputs). We call this notion of approximation interval approximation. Our theorem generalizes the recent result of Baader et al. from ReLUs to a rich class of activation functions that we call squashable functions. Additionally, the IUA theorem implies that we can always construct provably robust neural networks under ℓ∞-norm using almost any practical activation function. Second, we study the computational complexity of constructing neural networks that are amenable to precise interval analysis. This is a crucial question, as our constructive proof of IUA is exponential in the size of the approximation domain. We boil this question down to the problem of approximating the range of a neural network with squashable activation functions. We show that the range approximation problem (RA) is a Δ2-intermediate problem, which is strictly harder than NP-complete problems, assuming coNP ⊄ NP. As a result, IUA is an inherently hard problem: No matter what abstract domain or computational tools we consider to achieve interval approximation, there is no efficient construction of such a universal approximator. This implies that it is hard to construct a provably robust network, even if we have a robust network to start with.
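
To make the interval abstract domain concrete, the sketch below propagates lower and upper bounds through an affine layer and a ReLU. It is a generic illustration of interval analysis, not the construction from the paper; the tiny network, which computes |x| as relu(x) + relu(-x), is a hypothetical example chosen because its exact range is easy to see.

```python
def affine_interval(lower, upper, weights, biases):
    """Sound bounds for W x + b when each x_i lies in [lower_i, upper_i]."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, biases):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                # A negative weight swaps which endpoint gives the minimum.
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_interval(lower, upper):
    """ReLU is monotone, so it can be applied to both endpoints."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Input x in [-1, 1]; hidden layer computes (x, -x); output sums the two ReLUs.
lo, hi = affine_interval([-1.0], [1.0], [[1.0], [-1.0]], [0.0, 0.0])
lo, hi = relu_interval(lo, hi)
lo, hi = affine_interval(lo, hi, [[1.0, 1.0]], [0.0])
```

The network computes |x|, whose true range on [-1, 1] is [0, 1], yet the interval analysis returns [0, 2]. The bounds are sound but imprecise because the two hidden units are treated as independent, and this gap between interval bounds and set semantics is what interval approximation is concerned with.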

Global Convergence of Delayed Neural Network Systems

International Journal of Neural Systems ◽

10.1142/s0129065703001534 ◽

2003 ◽

Vol 13 (03) ◽

pp. 193-204 ◽

Cited By ~ 50

Author(s):

Wenlian Lu ◽

Libin Rong ◽

Tianping Chen

Keyword(s):

Neural Network ◽

Neural Networks ◽

Global Convergence ◽

Time Delays ◽

Global Exponential Stability ◽

Sufficient Condition ◽

Strict Monotonicity ◽

Activation Functions ◽

Hyperbolic Tangent ◽

Network Systems

In this paper, without assuming the boundedness, strict monotonicity and differentiability of the activation functions, we utilize a new Lyapunov function to analyze the global convergence of a class of neural network models with time delays. A new sufficient condition guaranteeing the existence, uniqueness and global exponential stability of the equilibrium point is derived. This stability criterion imposes constraints on the feedback matrices independently of the delay parameters. The result is compared with some previous works. Furthermore, the condition may be less restrictive in the case that the activation functions are hyperbolic tangent.

Overview of Configuring Adaptive Activation Functions for Deep Neural Networks - A Comparative Study

Journal of Ubiquitous Computing and Communication Technologies - December 2019 ◽

10.36548/jucct.2021.1.002 ◽

2021 ◽

Vol 3 (1) ◽

pp. 10-22

Author(s):

Wang Haoxiang ◽

Smys S

Keyword(s):

Neural Network ◽

Neural Networks ◽

Learning Process ◽

Deep Neural Networks ◽

Research Work ◽

Nonlinear Function ◽

Activation Function ◽

Classification Error ◽

Activation Functions ◽

Research Article

Recently, deep neural networks (DNNs) have demonstrated strong performance in pattern recognition. Research on DNNs covers network depth, filters, and training and testing datasets. Deep neural networks also provide solutions for nonlinear partial differential equations (PDEs). This research article considers multiple activation functions for each neuron. Besides, these activation networks allow many neurons within the neural network. In this network, the function is selected node by node from this set to minimize the classification error. This is the reason for selecting an adaptive activation function for deep neural networks. The activation functions are therefore adapted for every neuron in the network, which reduces the classification error during the process. This research article discusses a scaling factor for the activation function that provides better optimization under dynamic changes of the procedure. The proposed adaptive activation function has better learning capability than a fixed activation function in any neural network. The research article compares the convergence rate, early training function, and accuracy of existing methods. Besides, this research work improves the in-depth understanding of the learning process of various neural networks. This learning process works and tests the solution available in the domain of various frequency bands. In addition, both forward and inverse problems of the parameters in the overriding equation are identified. The proposed method has a very simple architecture, and its efficiency, robustness, and accuracy are high for nonlinear functions. The overall classification performance is improved in the resulting networks, which have been trained with common datasets. The proposed work is compared with recent findings in neuroscience research and shows better performance.

Maxout Networks for Visual Recognition

International Journal of Multimedia Data Engineering and Management ◽

10.4018/ijmdem.2019100101 ◽

2019 ◽

Vol 10 (4) ◽

pp. 1-25

Author(s):

Gabriel Castaneda ◽

Paul Morris ◽

Taghi M Khoshgoftaar

Keyword(s):

Neural Networks ◽

Visual Recognition ◽

Activation Function ◽

Entity Recognition ◽

Activation Functions ◽

Hyperbolic Tangent ◽

Rectified Linear Unit ◽

Facial Identification ◽

Leaky Rectified Linear Unit ◽

Better Than

This study investigates the effectiveness of multiple maxout activation variants on image classification, facial identification and verification tasks using convolutional neural networks. A network with maxout activation has a higher number of trainable parameters compared to networks with traditional activation functions. However, it is not clear if the activation function itself or the increase in the number of trainable parameters is responsible for yielding the best performance on different entity recognition tasks. This article investigates if an increase in the number of convolutional filters on the rectified linear unit activation performs equal-to or better-than maxout networks. Our experiments compare rectified linear unit, leaky rectified linear unit, scaled exponential linear unit, and hyperbolic tangent to four maxout variants. Throughout the experiments, we found that on average, across all datasets, the rectified linear unit networks perform better than any maxout activation when the number of convolutional filters is increased six times.
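
The parameter-count question raised above follows directly from how a maxout unit is defined: it takes the maximum over several affine pieces, so each unit carries one weight vector per piece. The sketch below uses dense units for brevity, whereas the study works with convolutional filters; the layer sizes are hypothetical.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def relu_unit(x, w, b):
    """Standard unit: one affine map followed by max(0, .)."""
    return max(0.0, dot(w, x) + b)


def maxout_unit(x, pieces):
    """Maxout unit: maximum over k affine maps given as (weights, bias) pairs."""
    return max(dot(w, x) + b for w, b in pieces)


def relu_layer_params(n_in, n_out):
    return n_out * (n_in + 1)


def maxout_layer_params(n_in, n_out, k):
    return k * n_out * (n_in + 1)


# ReLU is the special case of maxout with two pieces, one of them fixed at zero.
w, b = [1.0, -1.0], 0.0
pieces = [(w, b), ([0.0, 0.0], 0.0)]
same_positive = maxout_unit([2.0, 0.5], pieces) == relu_unit([2.0, 0.5], w, b)
same_negative = maxout_unit([0.5, 2.0], pieces) == relu_unit([0.5, 2.0], w, b)

# Hypothetical layer sizes: 64 inputs, 32 units, 4 maxout pieces.
relu_params = relu_layer_params(64, 32)
maxout_params = maxout_layer_params(64, 32, 4)
```

A maxout layer with k pieces has k times the parameters of a ReLU layer of the same width, which is why the comparison in the study widens the ReLU networks before comparing accuracy.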

Efficient Quantization for Neural Networks with Binary Weights and Low Bitwidth Activations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013854 ◽

2019 ◽

Vol 33 ◽

pp. 3854-3861

Author(s):

Kun Huang ◽

Bingbing Ni ◽

Xiaokang Yang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Neural Network ◽

Activation Function ◽

Limited Resources ◽

Portable Devices ◽

Training Procedure ◽

Activation Functions ◽

First Time ◽

And Training

Quantization has shown stunning efficiency on deep neural networks, especially for portable devices with limited resources. Most existing works uncritically extend weight quantization methods to activations. However, we take the view that the best performance can be obtained by applying different quantization methods to weights and activations respectively. In this paper, we design a new activation function dubbed CReLU from the quantization perspective and further complement this design with an appropriate initialization method and training procedure. Moreover, we develop a specific quantization strategy in which we formulate the forward and backward approximation of weights with binary values and quantize the activations to low bitwidth using a linear or logarithmic quantizer. We show, for the first time, that our final quantized model with binary weights and ultra-low bitwidth activations outperforms the previous best models by large margins on ImageNet, while achieving nearly a 10.85× theoretical speedup with ResNet-18. Furthermore, ablation experiments and theoretical analysis demonstrate the effectiveness and robustness of CReLU in comparison with other activation functions.
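
The two kinds of quantization mentioned above can be sketched generically. This is not the paper's method: the mean-absolute-value scale for binary weights and the clipping range of the linear quantizer are common conventions assumed here for illustration, and CReLU itself is not reproduced because the abstract does not define it.

```python
def binarize_weights(weights):
    """Replace each weight by +scale or -scale (assumed scale: mean |w|)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]


def quantize_linear(x, bits, max_value=1.0):
    """Uniform quantizer: clip to [0, max_value], snap to 2**bits levels."""
    levels = 2 ** bits - 1
    clipped = min(max(x, 0.0), max_value)
    return round(clipped / max_value * levels) / levels * max_value


binary = binarize_weights([0.5, -1.5, 1.0, -1.0])

# A 2-bit activation can take only four distinct values.
q_mid = quantize_linear(0.4, bits=2)
q_high = quantize_linear(1.7, bits=2)    # clipped to the top of the range
q_low = quantize_linear(-0.2, bits=2)    # clipped to zero
distinct = {quantize_linear(i / 100, bits=2) for i in range(101)}
```

Binary weights need a single bit each plus one shared scale, and a 2-bit activation uses four levels, which is where the theoretical speedup and memory savings of such models come from.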

Stacked Heterogeneous Neural Networks for Time Series Forecasting

Mathematical Problems in Engineering ◽

10.1155/2010/373648 ◽

2010 ◽

Vol 2010 ◽

pp. 1-20 ◽

Cited By ~ 6

Author(s):

Florin Leon ◽

Mihai Horia Zaharia

Keyword(s):

Neural Network ◽

Neural Networks ◽

Time Series ◽

Case Studies ◽

Multilayer Perceptron ◽

Neural Model ◽

Activation Function ◽

Time Series Forecasting ◽

The Other ◽

Activation Functions

A hybrid model for time series forecasting is proposed. It is a stacked neural network, containing one normal multilayer perceptron with bipolar sigmoid activation functions, and the other with an exponential activation function in the output layer. As shown by the case studies, the proposed stacked hybrid neural model performs well on a variety of benchmark time series. The combination of weights of the two stack components that leads to optimal performance is also studied.
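
The stacking idea can be sketched as a weighted combination of the two components' forecasts, with the weight chosen by a simple search. This is a minimal sketch under assumptions: the convex combination, the grid search, and the toy forecasts are hypothetical, since the abstract does not say how the combination weights are determined.

```python
import math


def bipolar_sigmoid(x: float) -> float:
    """Sigmoid rescaled to (-1, 1); mathematically equal to tanh(x / 2)."""
    return (1.0 - math.exp(-x)) / (1.0 + math.exp(-x))


def combine(pred_a, pred_b, weight_a):
    """Convex combination of the two stack components' forecasts."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_weight(y_true, pred_a, pred_b, steps=10):
    """Grid search over weight_a in {0, 1/steps, ..., 1}."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates,
               key=lambda w: mean_squared_error(y_true, combine(pred_a, pred_b, w)))


# Toy forecasts: one component overshoots by 0.5, the other undershoots by 0.5.
y_true = [1.0, 2.0, 3.0]
pred_a = [1.5, 2.5, 3.5]
pred_b = [0.5, 1.5, 2.5]
weight = best_weight(y_true, pred_a, pred_b)
tanh_gap = abs(bipolar_sigmoid(1.0) - math.tanh(0.5))
```

In this toy case the errors of the two components cancel at equal weights, which illustrates why a combination can outperform either component on its own.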
