scholarly journals Integrated photonic FFT for photonic tensor operations towards efficient and high-speed neural networks

Nanophotonics ◽  
2020 ◽  
Vol 9 (13) ◽  
pp. 4097-4108 ◽  
Author(s):  
Moustafa Ahmed ◽  
Yas Al-Hadeethi ◽  
Ahmed Bakry ◽  
Hamed Dalir ◽  
Volker J. Sorger

AbstractThe technologically-relevant task of feature extraction from data performed in deep-learning systems is routinely accomplished as repeated fast Fourier transforms (FFT) electronically in prevalent domain-specific architectures such as in graphics processing units (GPU). However, electronics systems are limited with respect to power dissipation and delay, due to wire-charging challenges related to interconnect capacitance. Here we present a silicon photonics-based architecture for convolutional neural networks that harnesses the phase property of light to perform FFTs efficiently by executing the convolution as a multiplication in the Fourier-domain. The algorithmic executing time is determined by the time-of-flight of the signal through this photonic reconfigurable passive FFT ‘filter’ circuit and is on the order of 10’s of picosecond short. A sensitivity analysis shows that this optical processor must be thermally phase stabilized corresponding to a few degrees. Furthermore, we find that for a small sample number, the obtainable number of convolutions per {time, power, and chip area) outperforms GPUs by about two orders of magnitude. Lastly, we show that, conceptually, the optical FFT and convolution-processing performance is indeed directly linked to optoelectronic device-level, and improvements in plasmonics, metamaterials or nanophotonics are fueling next generation densely interconnected intelligent photonic circuits with relevance for edge-computing 5G networks by processing tensor operations optically.

2018 ◽  
Vol 11 (11) ◽  
pp. 4621-4635 ◽  
Author(s):  
Istvan Z. Reguly ◽  
Daniel Giles ◽  
Devaraj Gopinathan ◽  
Laure Quivy ◽  
Joakim H. Beck ◽  
...  

Abstract. In this paper, we present the VOLNA-OP2 tsunami model and implementation; a finite-volume non-linear shallow-water equation (NSWE) solver built on the OP2 domain-specific language (DSL) for unstructured mesh computations. VOLNA-OP2 is unique among tsunami solvers in its support for several high-performance computing platforms: central processing units (CPUs), the Intel Xeon Phi, and graphics processing units (GPUs). This is achieved in a way that the scientific code is kept separate from various parallel implementations, enabling easy maintainability. It has already been used in production for several years; here we discuss how it can be integrated into various workflows, such as a statistical emulator. The scalability of the code is demonstrated on three supercomputers, built with classical Xeon CPUs, the Intel Xeon Phi, and NVIDIA P100 GPUs. VOLNA-OP2 shows an ability to deliver productivity as well as performance and portability to its users across a number of platforms.


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8237
Author(s):  
Jan Matuszewski ◽  
Dymitr Pietrow

With the increasing complexity of the electromagnetic environment and continuous development of radar technology we can expect a large number of modern radars using agile waveforms to appear on the battlefield in the near future. Effectively identifying these radar signals in electronic warfare systems only by relying on traditional recognition models poses a serious challenge. In response to the above problem, this paper proposes a recognition method of emitted radar signals with agile waveforms based on the convolutional neural network (CNN). These signals are measured in the electronic recognition receivers and processed into digital data, after which they undergo recognition. The implementation of this system is presented in a simulation environment with the help of a signal generator that has the ability to make changes in signal signatures earlier recognized and written in the emitter database. This article contains a description of the software’s components, learning subsystem and signal generator. The problem of teaching neural networks with the use of the graphics processing units and the way of choosing the learning coefficients are also outlined. The correctness of the CNN operation was tested using a simulation environment that verified the operation’s effectiveness in a noisy environment and in conditions where many radar signals that interfere with each other are present. The effectiveness results of the applied solutions and the possibilities of developing the method of learning and processing algorithms are presented by means of tables and appropriate figures. The experimental results demonstrate that the proposed method can effectively solve the problem of recognizing raw radar signals with agile time waveforms, and achieve correct probability of recognition at the level of 92–99%.


Author(s):  
Antonio Seoane ◽  
Alberto Jaspe

Graphics Processing Units (GPUs) have been evolving very fast, turning into high performance programmable processors. Though GPUs have been designed to compute graphics algorithms, their power and flexibility makes them a very attractive platform for generalpurpose computing. In the last years they have been used to accelerate calculations in physics, computer vision, artificial intelligence, database operations, etc. (Owens, 2007). In this paper an approach to general purpose computing with GPUs is made, followed by a description of artificial intelligence algorithms based on Artificial Neural Networks (ANN) and Evolutionary Computation (EC) accelerated using GPU.


Author(s):  
Anmol Chaudhary ◽  
Kuldeep Singh Chouhan ◽  
Jyoti Gajrani ◽  
Bhavna Sharma

In the last decade, deep learning has seen exponential growth due to rise in computational power as a result of graphics processing units (GPUs) and a large amount of data due to the democratization of the internet and smartphones. This chapter aims to throw light on both the theoretical aspects of deep learning and its practical aspects using PyTorch. The chapter primarily discusses new technologies using deep learning and PyTorch in detail. The chapter discusses the advantages of using PyTorch compared to other deep learning libraries. The chapter discusses some of the practical applications like image classification and machine translation. The chapter also discusses the various frameworks built with the help of PyTorch. PyTorch consists of various models that increases its flexibility and accessibility to a greater extent. As a result, many frameworks built on top of PyTorch are discussed in this chapter. The authors believe that this chapter will help readers in getting a better understanding of deep learning making neural networks using PyTorch.


Author(s):  
Masafumi Niwano ◽  
Katsuhiro L Murata ◽  
Ryo Adachi ◽  
Sili Wang ◽  
Yutaro Tachibana ◽  
...  

Abstract We developed a high-speed image reduction pipeline using Graphics Processing Units (GPUs) as hardware accelerators. Astronomers desire to detect the emission measure counterpart of gravitational-wave sources as soon as possible and to share in the systematic follow-up observation. Therefore, high-speed image processing is important. We developed a new image-reduction pipeline for our robotic telescope system, which uses a GPU via the Python package CuPy for high-speed image processing. As a result, the new pipeline has increased in processing speed by more than 40 times compared with the current one, while maintaining the same functions.


Sign in / Sign up

Export Citation Format

Share Document