Trigonometric Inference Providing Learning in Deep Neural Networks

Despite being heavily used in the training of deep neural networks (DNNs), hardware multipliers are resource-intensive and in short supply in many scenarios. Previous work has shown the benefit of computing activation functions, such as the sigmoid, with shift-and-add operations, although these approaches do not remove multiplications from training altogether. In this paper, we propose an approach that converts all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are thereby turned into multiplications of their sine values, which can be replaced with simpler operations with the help of the product-to-sum formula. In addition, a rectified sine activation function is used to convert layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple shift-and-add operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that the method achieves performance close to that of classical training algorithms. The proposed approach sheds new light on future hardware customization research for machine learning.
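The product-to-sum step described in this abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes both operands are small enough that x ≈ sin(x), and it uses floating-point `math.cos` where real hardware would presumably use a shift-and-add scheme that the abstract does not specify. The function name `approx_product` is hypothetical.

```python
import math


def approx_product(x: float, y: float) -> float:
    """Approximate x * y for small x and y without multiplying x by y.

    Uses x ~ sin(x), y ~ sin(y) and the product-to-sum identity
        sin(x) * sin(y) = (cos(x - y) - cos(x + y)) / 2.
    The final halving corresponds to a one-bit shift in fixed-point hardware.
    """
    return 0.5 * (math.cos(x - y) - math.cos(x + y))


if __name__ == "__main__":
    w, e = 0.05, -0.03  # hypothetical weight and error signal near zero
    print("exact  :", w * e)
    print("approx :", approx_product(w, e))
```

The approximation degrades as the operands move away from zero, which is why the abstract's observation that weights and errors cluster around zero matters.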

Analysis of Non-Linear Activation Functions for Classification Tasks Using Convolutional Neural Networks

Recent Patents on Computer Science ◽

10.2174/2213275911666181025143029 ◽

2019 ◽

Vol 12 (3) ◽

pp. 156-161 ◽

Cited By ~ 3

Author(s):

Aman Dureja ◽

Payal Pahwa

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Primary Objective ◽

Experimental Comparison ◽

Activation Functions ◽

Practical Applications ◽

Network Activation ◽

Non Linear ◽

Hidden Layer

Background: Activation functions play an important role in building deep neural networks, and the choice of activation function affects both optimization and the quality of the results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used at the hidden layers of deep neural networks has not been identified. Objective: The primary objective of this analysis was to describe which activation function should be used at the hidden layers of deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured using a two-class dataset (Cat/Dog). The network used 3 convolutional layers, with a pooling layer after each convolutional layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the next 2000 images for testing it. Results: The experimental comparison was carried out by using different activation functions on each layer of the CNN. The validation error and accuracy on the Cat/Dog dataset were analyzed using the activation functions ReLU, Tanh, SELU, PReLU, and ELU at the hidden layers. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: A CNN model with ReLU hidden layers (3 hidden layers here) gives the best results and improves overall performance in terms of accuracy and speed. These advantages of ReLU in the hidden layers of a CNN are helpful for effective and fast retrieval of images from databases.

Differential Equation Units: Learning Functional Forms of Activation Functions from Data

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6065 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6030-6037

Author(s):

MohamadAli Torkamani ◽

Shiv Shankar ◽

Amirmohammad Rooshenas ◽

Phillip Wallis

Keyword(s):

Differential Equation ◽

Neural Networks ◽

Deep Neural Networks ◽

Functional Form ◽

Activation Function ◽

The Other ◽

Superior Performance ◽

Activation Functions ◽

Functional Forms ◽

Nonlinear Activation Function

Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure. We introduce differential equation units (DEUs), an improvement to modern neural networks, which enables each neuron to learn a particular nonlinear activation function from a family of solutions to an ordinary differential equation. Specifically, each neuron may change its functional form during training based on the behavior of the other parts of the network. We show that using neurons with DEU activation functions results in a more compact network capable of achieving comparable, if not superior, performance when compared to much larger networks.

Overview of Configuring Adaptive Activation Functions for Deep Neural Networks - A Comparative Study

Journal of Ubiquitous Computing and Communication Technologies - December 2019 ◽

10.36548/jucct.2021.1.002 ◽

2021 ◽

Vol 3 (1) ◽

pp. 10-22

Author(s):

Wang Haoxiang ◽

Smys S

Keyword(s):

Neural Network ◽

Neural Networks ◽

Learning Process ◽

Deep Neural Networks ◽

Research Work ◽

Nonlinear Function ◽

Activation Function ◽

Classification Error ◽

Activation Functions ◽

Research Article

Recently, deep neural networks (DNNs) have demonstrated strong performance in pattern recognition. Research studies on DNNs cover network depth, filters, and training and testing datasets. Deep neural networks also provide solutions for nonlinear partial differential equations (PDEs). This research article considers multiple activation functions for each neuron, so that many neurons with different activations can coexist within the network. In this network, the activation function is selected node by node from this multitude to minimize the classification error. This is the reason for selecting adaptive activation functions for deep neural networks: the activation functions are adapted for every neuron in the network, which reduces the classification error during training. The article discusses a scaling factor for the activation function that provides better optimization as the training process changes dynamically. The proposed adaptive activation function has better learning capability than a fixed activation function in any neural network. The article compares the convergence rate, early training function, and accuracy against existing methods. In addition, this work provides deeper insight into the learning process of various neural networks by examining the solution across various frequency bands. Both forward problems and inverse problems, in which the parameters of the governing equation are identified, are considered. The proposed method has a very simple architecture, and its efficiency, robustness, and accuracy are high when considering nonlinear functions. The overall classification performance is improved in the resulting networks, which have been trained on common datasets. The proposed work is compared with recent findings in neuroscience research and shows better performance.

Broad Autoencoder Features Learning for Classification Problem

International Journal of Cognitive Informatics and Natural Intelligence ◽

10.4018/ijcini.20211001oa10 ◽

2021 ◽

Vol 15 (4) ◽

pp. 0-0

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Classification Problem ◽

Activation Function ◽

Activation Functions ◽

Classification Problems ◽

Stacked Autoencoders ◽

Learned Features ◽

Sigmoid Functions ◽

Nonlinear Mappings

Activation functions such as the Tanh and Sigmoid functions are widely used in Deep Neural Networks (DNNs) and pattern classification problems. To take advantage of different activation functions, the Broad Autoencoder Features (BAF) method is proposed in this work. The BAF consists of four parallel-connected Stacked Autoencoders (SAEs), each of which uses a different activation function: Sigmoid, Tanh, ReLU, and Softplus. With such a broad setting, the final learned features merge the features obtained through various nonlinear mappings of the original input features. This helps to extract more information from the original input features. Experimental results show that the BAF yields better learned features and classification performance.
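The parallel-branch idea in this abstract can be sketched in a few lines. This is only an illustration under strong assumptions: each stacked autoencoder is replaced by a single untrained encoder layer with hand-written weights, and the merge is assumed to be a plain concatenation, which the abstract does not confirm. The names `encode` and `broad_features` are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    return max(0.0, z)


def softplus(z: float) -> float:
    return math.log1p(math.exp(z))


# One activation per parallel branch, in the order listed in the abstract.
ACTIVATIONS: List[Callable[[float], float]] = [sigmoid, math.tanh, relu, softplus]


def encode(x: Sequence[float], weights: Sequence[Sequence[float]],
           act: Callable[[float], float]) -> List[float]:
    """Single encoder layer: act(W x). Stands in for a whole stacked autoencoder."""
    return [act(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def broad_features(x: Sequence[float],
                   weights_per_branch: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Run the input through every branch and concatenate the branch outputs."""
    features: List[float] = []
    for weights, act in zip(weights_per_branch, ACTIVATIONS):
        features.extend(encode(x, weights, act))
    return features


if __name__ == "__main__":
    x = [1.0, -2.0]
    branch_weights = [[[0.5, 0.25]]] * 4  # hypothetical weights, one unit per branch
    print(broad_features(x, branch_weights))
```

In a trained system each branch would have its own learned weights; sharing one weight matrix here only keeps the example short.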

Stochastic Selection of Activation Layers for Convolutional Neural Networks

Sensors ◽

10.3390/s20061626 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1626 ◽

Cited By ~ 1

Author(s):

Loris Nanni ◽

Alessandra Lumini ◽

Stefano Ghidoni ◽

Gianluca Maguolo

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Activation Function ◽

Computational Time ◽

Activation Functions ◽

Practical Applications ◽

Functional Blocks ◽

Stochastic Selection ◽

Selection Of

In recent years, the field of deep learning has achieved considerable success in pattern recognition, image segmentation, and many other classification fields. There are many studies and practical applications of deep learning on images, video, or text classification. Activation functions play a crucial role in discriminative capabilities of the deep neural networks and the design of new “static” or “dynamic” activation functions is an active area of research. The main difference between “static” and “dynamic” functions is that the first class of activations considers all the neurons and layers as identical, while the second class learns parameters of the activation function independently for each layer or even each neuron. Although the “dynamic” activation functions perform better in some applications, the increased number of trainable parameters requires more computational time and can lead to overfitting. In this work, we propose a mixture of “static” and “dynamic” activation functions, which are stochastically selected at each layer. Our idea for model design is based on a method for changing some layers along the lines of different functional blocks of the best performing CNN models, with the aim of designing new models to be used as stand-alone networks or as a component of an ensemble. We propose to replace each activation layer of a CNN (usually a ReLU layer) by a different activation function stochastically drawn from a set of activation functions: in this way, the resulting CNN has a different set of activation function layers.
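The stochastic replacement of activation layers described in this abstract can be sketched as a random draw per layer. This is a minimal sketch under stated assumptions: the pool of activation names below is illustrative (the abstract does not list the actual set), and the function name `draw_activation_layers` is hypothetical.

```python
import random
from typing import List, Sequence

# Illustrative pool mixing fixed ("static") and learnable ("dynamic") activations.
# The actual set used in the paper is not given in the abstract.
ACTIVATION_POOL = ("relu", "leaky_relu", "elu", "prelu")


def draw_activation_layers(num_layers: int,
                           pool: Sequence[str] = ACTIVATION_POOL,
                           seed: int = 0) -> List[str]:
    """Pick one activation per activation layer, independently and at random."""
    rng = random.Random(seed)  # seeded so that each drawn network is reproducible
    return [rng.choice(pool) for _ in range(num_layers)]


if __name__ == "__main__":
    # Different seeds give differently configured networks, e.g. for an ensemble.
    for seed in range(3):
        print(seed, draw_activation_layers(5, seed=seed))
```

Each returned list would then be used to replace the ReLU layers of a base CNN, one entry per activation layer.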

Ti3C2Tx MXene Enabled All-Optical Nonlinear Activation Function for On-Chip Photonic Deep Neural Networks

10.21203/rs.3.rs-919901/v1 ◽

2021 ◽

Author(s):

Adir Hazan ◽

Barak Ratzker ◽

Danzhen Zhang ◽

Aviad Katiyi ◽

Nachum Frage ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Physical Structure ◽

Major Step ◽

Promising Alternative ◽

Silicon Waveguide ◽

All Optical ◽

Nonlinear Activation Function

Abstract: Neural networks are one of the first major milestones in developing artificial intelligence systems. The utilisation of integrated photonics in neural networks offers a promising alternative approach to microelectronic and hybrid optical-electronic implementations due to improvements in computational speed and low energy consumption in machine-learning tasks. However, at present, most neural network hardware systems are still electronic-based due to a lack of optical realisation of the nonlinear activation function. Here, we experimentally demonstrate two novel approaches for implementing an all-optical neural nonlinear activation function based on utilising unique light-matter interactions in 2D Ti3C2Tx (MXene) in the infrared (IR) range in two configurations: 1) a saturable absorber made of MXene thin film, and 2) a silicon waveguide with an MXene flake overlayer. These configurations may serve as nonlinear units in photonic neural networks, while their nonlinear transfer function can be flexibly designed to optimise the performance of different neuromorphic tasks, depending on the operating wavelength. The proposed configurations are reconfigurable and can therefore be adjusted for various applications without the need to modify the physical structure. We confirm the capability and feasibility of the obtained results in machine-learning applications via a Modified National Institute of Standards and Technology (MNIST) handwritten digit classification task, with near 99% accuracy. Our developed concept for an all-optical neuron is expected to constitute a major step towards the realisation of all-optically implemented deep neural networks.

Broad Autoencoder Features Learning for Classification Problem

International Journal of Cognitive Informatics and Natural Intelligence ◽

10.4018/ijcini.20211001.oa23 ◽

2021 ◽

Vol 15 (4) ◽

pp. 1-15

Author(s):

Ting Wang ◽

Wing W. Y. Ng ◽

Wendi Li ◽

Sam Kwong

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Classification Problem ◽

Activation Function ◽

Activation Functions ◽

Classification Problems ◽

Stacked Autoencoders ◽

Learned Features ◽

Sigmoid Functions ◽

Nonlinear Mappings

Hardware implementation of radial-basis neural networks with Gaussian activation functions on FPGA

Neural Computing and Applications ◽

10.1007/s00521-021-05706-3 ◽

2021 ◽

Author(s):

Volodymyr Shymkovych ◽

Sergii Telenyk ◽

Petro Kravets

Keyword(s):

Neural Networks ◽

Hardware Implementation ◽

Gaussian Function ◽

Activation Function ◽

Rbf Neural Networks ◽

Activation Functions ◽

Rbf Network ◽

Combination Scheme ◽

Radial Basis ◽

Hidden Layer

Abstract: This article introduces a method for realizing the Gaussian activation function of radial-basis function (RBF) neural networks in a hardware implementation on field-programmable gate arrays (FPGAs). The results of modeling the Gaussian function on FPGA chips of different families are presented. RBF neural networks of various topologies have been synthesized and investigated. The hardware component implemented by this algorithm is an RBF neural network with four hidden-layer neurons and one neuron with a sigmoid activation function, built on an FPGA using 16-bit fixed-point numbers, which took 1193 lookup tables (LUTs). Each hidden-layer neuron of the RBF network is designed on the FPGA as a separate computing unit. The total delay of the combinational circuit of the RBF network block was 101.579 ns. The implementation of the Gaussian activation functions of the hidden layer of the RBF network occupies 106 LUTs, and the delay of the Gaussian activation functions is 29.33 ns. The absolute error is ±0.005. These results were obtained by modeling on the Spartan 3 family of chips. Modeling on chips of other series is also presented in the article. Hardware implementation of RBF neural networks with such speed allows them to be used in real-time control systems for high-speed objects.
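A 16-bit fixed-point Gaussian activation can be emulated in software to get a feel for the accuracy trade-off. The sketch below is not the article's design: it assumes the Gaussian form exp(-x^2), a Q4.12 number layout, and a 256-entry lookup table with truncating index, all of which are illustrative choices. It does not attempt to reproduce the reported ±0.005 error or the LUT counts.

```python
import math

FRAC_BITS = 12            # assumed Q4.12 layout inside a 16-bit word
SCALE = 1 << FRAC_BITS    # 4096 raw units represent 1.0
LUT_SHIFT = 6             # one table entry per 2**6 raw units, i.e. a step of 1/64
LUT_SIZE = 256            # table covers |x| in [0, 4)

# Precomputed table of exp(-x^2), stored as fixed-point integers.
LUT = [round(math.exp(-((i << LUT_SHIFT) / SCALE) ** 2) * SCALE)
       for i in range(LUT_SIZE)]


def to_fixed(x: float) -> int:
    """Quantize a real value to the assumed Q4.12 format."""
    return round(x * SCALE)


def gaussian_fixed(x_fixed: int) -> int:
    """Gaussian activation using only abs, shift, compare, and a table read."""
    index = abs(x_fixed) >> LUT_SHIFT   # the Gaussian is even, so use |x|
    if index >= LUT_SIZE:
        return 0                        # tail beyond the table is treated as zero
    return LUT[index]


def max_abs_error(samples: int = 4000) -> float:
    """Largest deviation from the floating-point Gaussian over [-5, 5]."""
    worst = 0.0
    for k in range(samples):
        x = -5.0 + 10.0 * k / (samples - 1)
        approx = gaussian_fixed(to_fixed(x)) / SCALE
        worst = max(worst, abs(approx - math.exp(-x * x)))
    return worst


if __name__ == "__main__":
    print("max abs error:", max_abs_error())
```

Increasing the table size or adding linear interpolation between entries would reduce the error at the cost of more logic resources, which is the kind of trade-off the abstract reports in LUTs and nanoseconds.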

414 Deep Neural Networks: A Survey Tool for Obstructive Sleep Apnea Prediction

SLEEP ◽

10.1093/sleep/zsab072.413 ◽

2021 ◽

Vol 44 (Supplement_2) ◽

pp. A164-A164

Author(s):

Pahnwat Taweesedt ◽

JungYoon Kim ◽

Jaehyun Park ◽

Jangwoon Park ◽

Munish Sharma ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Obstructive Sleep Apnea ◽

Sleep Apnea ◽

Deep Neural Networks ◽

Support Vector ◽

Learning Models ◽

Obstructive Sleep ◽

Screening Questionnaires ◽

Machine Learning Models

Abstract. Introduction: Obstructive sleep apnea (OSA) is a common sleep-related breathing disorder estimated to affect one billion people. Full-night polysomnography is considered the gold standard for OSA diagnosis. However, it is time-consuming and expensive, and it is not readily available in many parts of the world. Many screening questionnaires and scores have been proposed for OSA prediction, with high sensitivity and low specificity. The present study is intended to develop models with various machine learning techniques to predict the severity of OSA by incorporating features from multiple questionnaires. Methods: Subjects who underwent full-night polysomnography at the Torr sleep center, Texas, and completed 5 OSA screening questionnaires/scores were included. OSA was diagnosed using an Apnea-Hypopnea Index ≥ 5. We trained five different machine learning models: Deep Neural Networks with scaled principal component analysis (DNN-PCA), Random Forest (RF), Adaptive Boosting classifier (ABC), K-Nearest Neighbors classifier (KNC), and Support Vector Machine classifier (SVMC). A training-to-testing subject ratio of 65:35 was used. All features, including demographic data, body measurements, and snoring and sleepiness history, were obtained from the 5 OSA screening questionnaires/scores (STOP-BANG questionnaire, Berlin questionnaire, NoSAS score, NAMES score, and No-Apnea score). Performance metrics were used to compare the machine learning models. Results: Of 180 subjects, 51.5% were male, with a mean (SD) age of 53.6 (15.1). One hundred and nineteen subjects were diagnosed with OSA. The Area Under the Receiver Operating Characteristic Curve (AUROC) values of DNN-PCA, RF, ABC, KNC, SVMC, the STOP-BANG questionnaire, the Berlin questionnaire, the NoSAS score, the NAMES score, and the No-Apnea score were 0.85, 0.68, 0.52, 0.74, 0.75, 0.61, 0.63, 0.61, 0.58, and 0.58, respectively. DNN-PCA showed the highest AUROC, with a sensitivity of 0.79, specificity of 0.67, positive predictive value of 0.93, F1 score of 0.86, and accuracy of 0.77. Conclusion: Our results showed that DNN-PCA outperforms OSA screening questionnaires, scores, and the other machine learning models.
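The performance metrics named in this abstract all derive from a binary confusion matrix. The following sketch shows the standard definitions; the counts in the usage example are hypothetical, since the abstract does not report the underlying confusion matrix, and the function name `screening_metrics` is an assumption.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # true positive rate (recall)
    specificity = tn / (tn + fp)              # true negative rate
    ppv = tp / (tp + fp)                      # positive predictive value (precision)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    for name, value in screening_metrics(tp=8, fp=4, tn=6, fn=2).items():
        print(f"{name}: {value:.3f}")
```

AUROC is not included because it requires the model's continuous scores rather than a single confusion matrix.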

A Survey on Bias in Deep NLP

Applied Sciences ◽

10.3390/app11073184 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3184

Author(s):

Ismael Garrido-Muñoz ◽

Arturo Montejo-Ráez ◽

Fernando Martínez-Santiago ◽

L. Alfonso Ureña-López

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Natural Language Processing ◽

Probability Distribution ◽

Natural Language ◽

Network Design ◽

Language Processing ◽

Deep Neural Networks ◽

Learning Processes ◽

Relevant Issue

Deep neural networks are the dominant approach in many machine learning areas, including natural language processing (NLP). Thanks to the availability of large corpora and the capability of deep architectures to shape internal language mechanisms in self-supervised learning processes (also known as “pre-training”), versatile, high-performing models are released continuously for every new network design. These networks learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies, and biases contained in that collection. As pre-trained models have been found to be very useful for transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.
