All-optical nonlinear activation function for photonic neural networks [Invited]

2018 ◽  
Vol 8 (12) ◽  
pp. 3851 ◽  
Author(s):  
Mario Miscuglio ◽  
Armin Mehrabian ◽  
Zibo Hu ◽  
Shaimaa I. Azzam ◽  
Jonathan George ◽  
...  
2021 ◽  
Author(s):  
Adir Hazan ◽  
Barak Ratzker ◽  
Danzhen Zhang ◽  
Aviad Katiyi ◽  
Nachum Frage ◽  
...  

Abstract Neural networks are one of the first major milestones in developing artificial intelligence systems. The utilisation of integrated photonics in neural networks offers a promising alternative to microelectronic and hybrid optical-electronic implementations, owing to improvements in computational speed and low energy consumption in machine-learning tasks. However, at present, most neural network hardware systems are still electronic, owing to the lack of an optical realisation of the nonlinear activation function. Here, we experimentally demonstrate two novel approaches to implementing an all-optical neural nonlinear activation function, based on unique light-matter interactions in 2D Ti3C2Tx (MXene) in the infrared (IR) range, in two configurations: 1) a saturable absorber made of an MXene thin film, and 2) a silicon waveguide with an MXene flake overlayer. These configurations may serve as nonlinear units in photonic neural networks, while their nonlinear transfer function can be flexibly designed to optimise the performance of different neuromorphic tasks, depending on the operating wavelength. The proposed configurations are reconfigurable and can therefore be adjusted for various applications without modifying the physical structure. We confirm the capability and feasibility of the obtained results in machine-learning applications via a Modified National Institute of Standards and Technology (MNIST) handwritten digit classification task, with near 99% accuracy. Our concept for an all-optical neuron is expected to constitute a major step towards the realisation of all-optically implemented deep neural networks.
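As a rough illustration of how such a saturable-absorber nonlinearity could act as an activation function in a software model of a photonic network, the sketch below uses the standard saturable-absorption transmission formula T(I) = 1 - α0/(1 + I/I_sat) - α_ns. The formula is a generic textbook model and the parameter values are invented for illustration; they are not the measured MXene response from the paper.

```python
import numpy as np

def saturable_absorber(intensity, alpha0=0.6, alpha_ns=0.1, i_sat=1.0):
    """Generic saturable-absorption transmission model (illustrative only).

    T(I) = 1 - alpha0 / (1 + I / I_sat) - alpha_ns
    alpha0:   modulation depth (saturable loss)
    alpha_ns: non-saturable loss
    i_sat:    saturation intensity
    All parameter values here are placeholders, not measured MXene data.
    """
    transmission = 1.0 - alpha0 / (1.0 + intensity / i_sat) - alpha_ns
    return intensity * transmission  # output power after the absorber

# The transfer curve is strongly sub-linear at low power and approaches
# linear (slope ~ 1 - alpha_ns) once the absorber saturates.
powers = np.linspace(0.0, 5.0, 6)
print(np.round(saturable_absorber(powers), 3))
```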


1999 ◽  
Vol 11 (5) ◽  
pp. 1069-1077 ◽  
Author(s):  
Danilo P. Mandic ◽  
Jonathon A. Chambers

A relationship is provided between the learning rate η in the learning algorithm and the slope β in the nonlinear activation function, for a class of recurrent neural networks (RNNs) trained by the real-time recurrent learning algorithm. It is shown that an arbitrary RNN can be obtained from a referent RNN, with deterministic rules imposed on its weights and learning rate. Such relationships reduce the number of degrees of freedom when solving the nonlinear optimization task of finding the optimal RNN parameters.
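The static half of such a slope-weight relationship is easy to check numerically: a neuron with activation gain β and weights w produces the same output as a referent neuron with gain 1 and weights βw. (The classic gain-interchange result for gradient-trained networks additionally rescales the learning rate, roughly η → ηβ²; the exact rule for RTRL-trained RNNs is the subject of the paper and is not reproduced here.) A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def act(x, beta=1.0):
    # Logistic activation with slope (gain) beta.
    return 1.0 / (1.0 + np.exp(-beta * x))

x = rng.normal(size=5)   # input vector
w = rng.normal(size=5)   # weights of the "arbitrary" neuron
beta = 2.5               # slope of its activation

# Referent neuron: slope 1, weights scaled by beta.
y_arbitrary = act(w @ x, beta=beta)
y_referent = act((beta * w) @ x, beta=1.0)
assert np.isclose(y_arbitrary, y_referent)
print(y_arbitrary, y_referent)
```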


2021 ◽  
Author(s):  
Ting Yu ◽  
Xiaoxuan Ma ◽  
Ernest Pastor ◽  
Jonathan George ◽  
Simon Wall ◽  
...  

Abstract Deep learning algorithms are revolutionising many aspects of modern life. Typically, they are implemented in CMOS-based hardware, where performance is severely limited by memory-access times and inefficient data routing. All-optical neural networks without any electro-optic conversions could alleviate these shortcomings. However, an efficient on-chip all-optical nonlinear activation function, a vital building block for optical neural networks, has yet to be developed. Here, we introduce and demonstrate both optical synapse weighting and all-optical nonlinear thresholding using two different effects in a single chalcogenide material. We show how the structural phase transition in a wide-bandgap phase-change material enables storing the neural network weights in non-volatile photonic memory, whilst resonant bond destabilisation is used as a nonlinear activation threshold without changing the material. These two distinct transitions within chalcogenides enable programmable neural networks with near-zero static power consumption once trained, as well as picosecond-scale inference delays that are not limited by the wire-charging times that constrain electrical circuits; for instance, we show that nanosecond-order weight programming and near-instantaneous weight updates enable accurate inference within 20 picoseconds in a 3-layer all-optical neural network. Optical neural networks that bypass electro-optic conversion altogether hold promise for network-edge machine-learning applications where real-time decision-making is critical, such as autonomous vehicles or navigation systems, e.g. the signal pre-processing of LIDAR systems.
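A behavioural sketch of the inference path described above might model the non-volatile weights as quantised transmission matrices and the activation as an optical power threshold. Everything below is a placeholder abstraction, assuming a soft threshold shape and 16 storable transmission levels; it is not the measured chalcogenide response or the authors' network.

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize(w, levels=16):
    # Non-volatile phase-change cells store a small number of
    # transmission levels; weights are snapped to a uniform grid in [0, 1].
    # The number of levels is an assumption for illustration.
    return np.round(np.clip(w, 0.0, 1.0) * (levels - 1)) / (levels - 1)

def optical_threshold(p, p_th=0.5, steepness=10.0):
    # Placeholder nonlinear activation: soft threshold on optical power.
    return p / (1.0 + np.exp(-steepness * (p - p_th)))

# Three layers of fixed (trained, then frozen) transmission weights.
weights = [quantize(rng.random((8, 8))) for _ in range(3)]

signal = rng.random(8)  # input optical powers
for w in weights:
    # Divide by fan-in to keep powers of order one between layers.
    signal = optical_threshold(w @ signal / w.shape[1])
print(np.round(signal, 3))
```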


2020 ◽  
Vol 34 (04) ◽  
pp. 6030-6037
Author(s):  
MohamadAli Torkamani ◽  
Shiv Shankar ◽  
Amirmohammad Rooshenas ◽  
Phillip Wallis

Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure. We introduce differential equation units (DEUs), an improvement to modern neural networks, which enable each neuron to learn a particular nonlinear activation function from a family of solutions to an ordinary differential equation. Specifically, each neuron may change its functional form during training based on the behavior of the other parts of the network. We show that using neurons with DEU activation functions results in a more compact network capable of achieving comparable, if not superior, performance when compared to much larger networks.
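To make the idea concrete, here is a hedged toy version of a learnable per-neuron activation drawn from one ODE's solution family: every solution of the Riccati-type equation y' = k(1 - y²) has the form y(x) = tanh(kx + φ), so learning (k, φ) per neuron lets each neuron select its own member of the family. This simplified family is chosen for illustration and is not the DEU parameterisation from the paper.

```python
import torch
import torch.nn as nn

class ODESolutionActivation(nn.Module):
    """Per-neuron activation from the solution family of y' = k(1 - y^2).

    Every solution of that ODE has the form y(x) = tanh(k*x + phi),
    so learning (k, phi) per neuron selects one member of the family.
    This is a simplified stand-in for a DEU, not the paper's construction.
    """
    def __init__(self, width):
        super().__init__()
        self.k = nn.Parameter(torch.ones(width))
        self.phi = nn.Parameter(torch.zeros(width))

    def forward(self, x):
        return torch.tanh(self.k * x + self.phi)

# A small network whose hidden neurons each learn their own activation shape.
model = nn.Sequential(
    nn.Linear(4, 16),
    ODESolutionActivation(16),
    nn.Linear(16, 1),
)
print(model(torch.randn(2, 4)).shape)  # torch.Size([2, 1])
```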


Filomat ◽  
2020 ◽  
Vol 34 (15) ◽  
pp. 5009-5018
Author(s):  
Lei Ding ◽  
Lin Xiao ◽  
Kaiqing Zhou ◽  
Yonghong Lan ◽  
Yongsheng Zhang

Compared to a linear activation function, a suitable nonlinear activation function can accelerate convergence. Based on this finding, we propose two modified Zhang neural network (ZNN) models using different nonlinear activation functions to tackle complex-valued systems of linear equations (CVSLE) in this paper. To this end, we first propose a novel neural network, the NRNN-SBP model, by introducing the sign-bi-power activation function. We then propose another novel neural network, the NRNN-IRN model, by introducing a tunable activation function. Finally, simulation results demonstrate that the convergence speed of NRNN-SBP and NRNN-IRN is faster than that of the FTRNN model. These results also reveal that different nonlinear activation functions have different effects on the convergence rate for different CVSLE problems.
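For reference, the sign-bi-power function used in finite-time ZNN work is commonly written Φ(e) = ½(|e|^γ + |e|^{1/γ})·sgn(e) with 0 < γ < 1, and the ZNN design formula imposes ė(t) = -λΦ(e(t)) on the error e(t) = A x(t) - b. The Euler-integrated sketch below follows those standard forms, applying Φ to the real and imaginary parts separately; it is a generic illustration, not the paper's NRNN-SBP model.

```python
import numpy as np

def sbp(e, gamma=0.5):
    # Sign-bi-power function, applied to Re and Im parts separately
    # (one common convention for complex-valued errors; an assumption here).
    def f(u):
        return 0.5 * (np.abs(u)**gamma + np.abs(u)**(1.0 / gamma)) * np.sign(u)
    return f(e.real) + 1j * f(e.imag)

rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
b = rng.normal(size=n) + 1j * rng.normal(size=n)

x = np.zeros(n, dtype=complex)  # initial state
lam, dt = 10.0, 1e-3            # ZNN gain and Euler step
for _ in range(5000):
    e = A @ x - b               # error function e(t) = A x(t) - b
    # Impose de/dt = -lam * Phi(e), i.e. A dx/dt = -lam * Phi(e).
    x = x + dt * np.linalg.solve(A, -lam * sbp(e))

print(np.linalg.norm(A @ x - b))  # residual should be near zero
```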


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV

The problem of applying neural networks to calculate the ratings used in banking when deciding whether to grant loans to borrowers is considered. The task is to determine the borrower's rating function from a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, its general form must be known in advance; the task then reduces to calculating the parameters that enter the expression for the rating function. In contrast, when neural networks are used, there is no need to specify a general form for the rating function. Instead, a particular neural network architecture is chosen and its parameters are calculated from the statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters, and the absence of a universal algorithm for determining the optimal neural network architecture. As an example of using neural networks to determine a borrower's rating, a model system is considered in which the borrower's rating is given by a known non-analytical rating function. A neural network with two inner layers, containing three and two neurons respectively with sigmoid activation functions, is used for modelling. It is shown that the neural network restores the borrower's rating function with acceptable accuracy.
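A minimal sketch of such a model system, assuming an invented two-feature borrower dataset and an invented piecewise (non-analytical) rating function: a network with two hidden layers of three and two sigmoid neurons, as in the article, fitted with scikit-learn. The data and target function below are placeholders, not the article's.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)

# Synthetic stand-in for a "known non-analytical" rating function of
# two borrower features (both invented for this illustration).
X = rng.random((500, 2))
y = np.where(X[:, 0] > 0.5, X[:, 1], 0.2 * X[:, 1])  # piecewise target

# Two hidden layers with three and two sigmoid neurons, as in the article.
model = MLPRegressor(hidden_layer_sizes=(3, 2), activation="logistic",
                     max_iter=5000, random_state=0)
model.fit(X, y)
print("train R^2:", round(model.score(X, y), 3))
```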

