Data Flow Oriented Hardware Design of RNS-based Polynomial Multiplication for SHE Acceleration

Author(s):  
Joël Cathébras ◽  
Alexandre Carbon ◽  
Peter Milder ◽  
Renaud Sirdey ◽  
Nicolas Ventroux

This paper presents a hardware implementation of a Residue Polynomial Multiplier (RPM), designed to accelerate the full Residue Number System (RNS) variant of the Fan-Vercauteren scheme proposed by Bajard et al. [BEHZ16]. Our design speeds up polynomial multiplication via a Negative Wrapped Convolution (NWC) that locally computes the required RNS channel-dependent twiddle factors. Compared to related works, this design is more versatile with respect to the addressable parameter sets of the BFV scheme. This versatility mainly stems from our proposed twiddle factor generator, which makes the design's BRAM utilization independent of the RNS basis size, with negligible communication bandwidth spent on non-payload data. Furthermore, we explore the generalization of a DFT hardware generator in order to produce RNS-friendly NTT architectures. This approach lets us validate our RPM design over parameter sets from the work of Halevi et al. [HPS18]. For the depth-20 setting, we achieve an estimated speed-up of the residue polynomial multiplications greater than 76× during ciphertext multiplication and greater than 16× during relinearization. This results in a single-threaded Mult&Relin ciphertext operation in 109.4 ms (3.19× faster than [HPS18]), with the RPM accounting for less than 15% of the new computation time. Our RPM design scales up with reasonable hardware resource usage and realistic bandwidth requirements, and it can also be exploited for other RNS-based implementations of RLWE cryptosystems.
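
For readers less familiar with the NWC, the following is a minimal Python sketch of negative wrapped convolution in a single RNS channel: inputs are pre-scaled by powers of a 2n-th root of unity psi so that a cyclic NTT-based convolution yields the negacyclic product modulo x^n + 1. The toy parameters (n = 4, q = 17, psi = 2) and the quadratic-time transform are illustrative assumptions and do not reflect the paper's hardware pipeline or its twiddle factor generator.

```python
# Minimal negative wrapped convolution (NWC) model for one RNS channel.
# Toy parameters only; a hardware RPM would use a fast NTT and large moduli.

def ntt(a, root, q):
    """Naive O(n^2) number-theoretic transform of a modulo q."""
    n = len(a)
    return [sum(a[j] * pow(root, i * j, q) for j in range(n)) % q
            for i in range(n)]

def nwc_multiply(a, b, q, psi):
    """Product of a and b in Z_q[x]/(x^n + 1), psi a primitive 2n-th root of unity."""
    n = len(a)
    w = pow(psi, 2, q)                       # n-th root of unity for the NTT
    # Pre-scaling by psi^i turns the cyclic convolution into the negacyclic one.
    a_t = ntt([a[i] * pow(psi, i, q) % q for i in range(n)], w, q)
    b_t = ntt([b[i] * pow(psi, i, q) % q for i in range(n)], w, q)
    c_t = [(x * y) % q for x, y in zip(a_t, b_t)]
    c = ntt(c_t, pow(w, -1, q), q)           # unnormalised inverse transform
    n_inv, psi_inv = pow(n, -1, q), pow(psi, -1, q)
    return [c[i] * n_inv % q * pow(psi_inv, i, q) % q for i in range(n)]

# Toy channel: q = 17, n = 4, psi = 2 (2^4 ≡ -1 mod 17, so 2 is a 2n-th root of unity).
print(nwc_multiply([1, 2, 3, 4], [5, 6, 7, 8], 17, 2))   # -> [12, 15, 2, 9]
```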

2020 ◽  
Vol 29 (11) ◽  
pp. 2030008
Author(s):  
Raj Kumar ◽  
Ritesh Kumar Jaiswal ◽  
Ram Awadh Mishra

Modulo multipliers have been attracting considerable attention as one of the essential components of residue number system (RNS)-based computational circuits. This paper contributes the first comprehensive review of the design of modulo [Formula: see text] multipliers. Modulo multipliers can be implemented using ROM (look-up tables) as well as VLSI components (memoryless); the former is preferable for lower word-lengths and the latter for larger word-lengths. The modular and parallelism properties of RNS are used to improve the performance of memoryless multipliers. Moreover, a Booth-encoding algorithm is used to speed up the multipliers. An advanced modulo [Formula: see text] multiplier based on redundant RNS (RRNS) can further be chosen for very high dynamic ranges. These perspectives on modulo [Formula: see text] multipliers are studied extensively against the recent state of the art and analyzed using the Synopsys Design Compiler tool.
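
The modular and parallelism properties mentioned above can be illustrated with a short Python model, in which a multiplication is carried out independently in each residue channel and the result is recombined with the Chinese Remainder Theorem. The toy moduli and function names below are our own illustrative choices and are not tied to any of the reviewed designs.

```python
# Illustrative RNS multiplication: independent channel products + CRT recombination.
from math import prod

def to_rns(x, moduli):
    """Forward conversion: the residue of x in each channel."""
    return [x % m for m in moduli]

def rns_mul(xs, ys, moduli):
    """Channel-wise modular multiplication; every channel is independent."""
    return [(x * y) % m for x, y, m in zip(xs, ys, moduli)]

def from_rns(residues, moduli):
    """Reverse conversion via the Chinese Remainder Theorem."""
    M = prod(moduli)
    return sum(r * (M // m) * pow(M // m, -1, m)
               for r, m in zip(residues, moduli)) % M

moduli = [3, 5, 7, 11]             # pairwise coprime, dynamic range M = 1155
a, b = 29, 37
c = from_rns(rns_mul(to_rns(a, moduli), to_rns(b, moduli), moduli), moduli)
assert c == a * b                  # 1073 < 1155, so no wrap-around occurs
print(c)
```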


2021 ◽  
pp. 15-21
Author(s):  
Pavel Alekseyevich Lyakhov ◽  
Andrey Sergeevich Ionisyan ◽  
Violetta Vladimirovna Masaeva ◽  
Maria Vasilevna Valueva

Author(s):  
Carlos Arturo Gayoso ◽  
Claudio Gonzalez ◽  
Leonardo Arnone ◽  
Miguel Rabini ◽  
Jorge Castineira Moreira

Author(s):  
Shamim Akhter ◽  
Divya Bareja ◽  
Satyendra Kumar

Notice of Retraction: After careful and considered review of the content of this paper by a duly constituted expert committee, this paper has been found to be in violation of IAES's Publication Principles. We hereby retract the content of this paper. Reasonable effort should be made to remove all past references to this paper. The presenting author of this paper has the option to appeal this decision by contacting [email protected].

The Residue Number System (RNS) is a very old number system, proposed around 1500 AD. The parallel nature of mathematical operations in RNS results in faster computation. This paper deals with the design of modulo multiplication in RNS. Direct computation of |AB|_m requires a multiplier to obtain A·B first and then a mod-m calculator to produce the final result. We use the Vedic multiplication technique along with RNS to improve the computation time of modulo multiplication. This paper is aimed at the design and analysis of modulo multipliers for the special moduli set {3, 5, 7}. A comparative analysis in terms of area and delay between the proposed technique and direct computation is performed for input data sizes N = 8, 16 and 32 bits using Xilinx ISE 14.1. The design has also been compared using the Synopsys Design Compiler with a 32 nm Std_Cell library. The proposed technique is found to be more efficient in terms of speed as the input data size increases.
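
As a behavioral reference for the direct computation |AB|_m described above, the Python sketch below multiplies first and then reduces modulo m; the chunk-summing reduction exploits the standard identity 2^k ≡ 1 (mod m) for m in {3, 5, 7} and is not the authors' Vedic design. Function names and the exhaustive test range are illustrative assumptions.

```python
# Behavioral reference: direct |A*B|_m (multiply, then reduce) for m in {3, 5, 7}.
CHUNK_BITS = {3: 2, 5: 4, 7: 3}    # smallest k with 2**k ≡ 1 (mod m)

def reduce_by_chunks(x, m):
    """Reduce x mod m by summing k-bit chunks, valid because 2**k ≡ 1 (mod m)."""
    k = CHUNK_BITS[m]
    mask = (1 << k) - 1
    while x > mask:                # fold until the value fits in one chunk
        s = 0
        while x:
            s += x & mask
            x >>= k
        x = s
    while x >= m:                  # final correction within a single chunk
        x -= m
    return x

def modmul_direct(a, b, m):
    """Direct |A*B|_m: full multiplication first, then the mod-m reduction."""
    return reduce_by_chunks(a * b, m)

# Exhaustive check of the reference model against Python's % operator.
for m in (3, 5, 7):
    assert all(modmul_direct(a, b, m) == (a * b) % m
               for a in range(64) for b in range(64))
print("direct |A*B|_m reference matches % for the moduli set {3, 5, 7}")
```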


2019 ◽  
Vol 43 (5) ◽  
pp. 857-868 ◽  
Author(s):  
N.I. Chervyakov ◽  
P.A. Lyakhov ◽  
N.N. Nagornov ◽  
M.V. Valueva ◽  
G.V. Valuev

Modern convolutional neural network architectures are very resource-intensive, which limits the possibilities for their wide practical application. We propose a convolutional neural network architecture in which the neural network is divided into hardware and software parts to increase performance and reduce the cost of implementation resources. We also propose to use the residue number system in the hardware part to implement the convolutional layer of the neural network and thereby reduce resource costs. A numerical method for quantizing the filter coefficients of a convolutional layer is proposed to minimize the influence of quantization noise on the calculation result in the residue number system and to determine the bit-width of the filter coefficients. This method is based on scaling the coefficients by a fixed number of bits and rounding up and down. The operations used make it possible to reduce hardware implementation resources by simplifying their execution. All calculations in the convolutional layer are performed on numbers in a fixed-point format. Software simulations using Matlab 2017b showed that a convolutional neural network with a minimum number of layers can be trained quickly and successfully. Hardware implementation using the Kintex7 xc7k70tfbg484-2 field-programmable gate array showed that the use of the residue number system in the convolutional layer of the neural network reduces hardware costs by 32.6% compared with the traditional approach based on two's complement representation. The research results can be applied to create effective video surveillance systems and to recognize handwriting, individuals, objects and terrain.
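
The quantization step described above can be sketched in a few lines of Python: filter coefficients are scaled by a fixed number of fractional bits and rounded up or down so that the convolutional layer operates on fixed-point integers. The bit-widths, rounding rule and helper names below are illustrative assumptions; the paper's numerical method selects them so as to bound the quantization noise in the residue number system.

```python
# Sketch of fixed-point quantization of convolution filter coefficients.
import math

def quantize_filter(coeffs, frac_bits, round_up=True):
    """Scale real coefficients by 2**frac_bits and round up or down to integers."""
    scale = 1 << frac_bits
    rounding = math.ceil if round_up else math.floor
    return [rounding(c * scale) for c in coeffs]

def dequantize(q_coeffs, frac_bits):
    """Map the fixed-point integers back to reals (only to inspect the error)."""
    return [q / (1 << frac_bits) for q in q_coeffs]

coeffs = [0.1275, -0.0432, 0.5001]             # example filter coefficients
for bits in (4, 8):
    q = quantize_filter(coeffs, bits)
    err = max(abs(c - d) for c, d in zip(coeffs, dequantize(q, bits)))
    print(f"{bits} fractional bits: {q}, worst-case error {err:.5f}")
```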

