Efficient architectures for implementing Montgomery modular multiplication and RSA modular exponentiation on reconfigurable logic

This paper presents an FPGA implementation of the most critical operations of Public Key Cryptography (PKC), namely Modular Exponentiation (ME) and Modular Multiplication (MM). Both operations are integrated as a Programmable System on Chip (PSoC), in which the Xilinx MicroBlaze processor is used for flexibility. Our objective is to achieve the best trade-off between execution time, occupied area and flexibility. Implementing these operations in such an environment requires taking several criteria into account: the data bus of the Hardware (HW) architectures should be narrower than the input data length, the design must be scalable to support different security levels, and the implementation should minimize both execution time and HW resource usage. To satisfy these constraints, the Montgomery Power Ladder (MPL) and Montgomery Modular Multiplication (MMM) algorithms are used as HW accelerators for the ME and MM implementations, respectively. Our implementation approach is based on the digit-serial method for performing the basic arithmetic operations. Efficient parallel and pipeline strategies are developed at the digit level to optimize execution times. For a 1024-bit data length, the MMM runs in 6.24 µs and requires 647 slices, and the ME is executed in 6.75 ms using 2881 slices.
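As a purely functional illustration of how a Montgomery ladder drives Montgomery multiplications, the following Python sketch may help. It is an integer reference model under stated assumptions, not the paper's hardware architecture: the function names `mont_mul` and `mont_ladder_pow` and the parameter `k` (the bit width, with R = 2^k) are hypothetical, and the modulus is assumed to be odd and smaller than 2^k.

```python
def mont_mul(a, b, n, k):
    """Return a*b*2^(-k) mod n for odd n < 2^k and 0 <= a, b < n."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r          # n' = -n^(-1) mod 2^k (Python 3.8+)
    t = a * b
    m = ((t & (r - 1)) * n_prime) & (r - 1)  # makes t + m*n divisible by 2^k
    u = (t + m * n) >> k                     # exact division, result < 2n
    return u - n if u >= n else u            # final conditional subtraction


def mont_ladder_pow(base, exp, n, k):
    """Compute base^exp mod n with a Montgomery ladder (assumed model)."""
    r2 = pow(1 << k, 2, n)                   # 2^(2k) mod n, used to enter Montgomery form
    r0 = mont_mul(1, r2, n, k)               # 1 in Montgomery form
    r1 = mont_mul(base % n, r2, n, k)        # base in Montgomery form
    for i in reversed(range(exp.bit_length())):
        # Both branches perform one multiplication and one squaring,
        # and the invariant r1 = r0 * base is preserved.
        if (exp >> i) & 1:
            r0 = mont_mul(r0, r1, n, k)
            r1 = mont_mul(r1, r1, n, k)
        else:
            r1 = mont_mul(r0, r1, n, k)
            r0 = mont_mul(r0, r0, n, k)
    return mont_mul(r0, 1, n, k)             # leave Montgomery form
```

The result can be checked against Python's built-in `pow(base, exp, n)`. The two multiplications in each ladder step are independent of each other, which is the kind of structure a parallel hardware design could exploit; this sketch does not model timing or resource usage.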


HIGH PERFORMANCE MONTGOMERY MODULAR MULTIPLIER WITH A NEW RECODING METHOD

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126611007438 ◽

2011 ◽

Vol 20 (03) ◽

pp. 531-548 ◽

Cited By ~ 1

Author(s):

KOOROUSH MANOCHEHRI ◽

BABAK SADEGHIYAN ◽

SAADAT POURMOZAFARI

Keyword(s):

High Performance ◽

Hardware Implementation ◽

Computer Arithmetic ◽

Public Key Cryptography ◽

Modular Exponentiation ◽

Modular Multiplication ◽

Area Reduction ◽

Montgomery Modular Multiplication ◽

Modular Multiplier ◽

Reduction In Area

Modular calculations are widely used in many applications, especially in public key cryptography. Such operations are very time consuming because of their long operands. To improve the performance of these calculations, many methods have been introduced. Montgomery modular multiplication is one such method for enhancing the performance of modular multiplication and modular exponentiation. The radix-2 version of this method is simple and fast in hardware, but it requires multi-operand adders. So far, the Carry-Save Adder (CSA) has given the best performance for multi-operand addition. In this paper, we propose a new recoding method for the Montgomery modular multiplier to enhance its performance. This is done by replacing the CSA blocks with new blocks that perform better than CSA in multi-operand addition. With this replacement, a reduction of up to 40% in gate area is theoretically possible. In our experiments, we obtained a 5.8% area reduction and a 3% speed improvement in a hardware implementation. The idea behind the proposed method is the use of a bitwise subtraction operator, for which no carry propagation is needed. This operand recoding method can also be used in many other areas of computer arithmetic, algorithms and computational hardware, such as multiplication and exponentiation, to enhance their performance.
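For readers unfamiliar with the radix-2 Montgomery method mentioned above, a minimal bit-serial sketch in Python follows. It shows only the textbook radix-2 iteration, not the recoding method proposed in the paper; the function name `mont_mul_radix2` and the parameter `k` (number of iterations, with n < 2^k) are hypothetical. Each iteration adds up to three operands (the running sum, `b` and `n`), which is where a hardware design would typically place its multi-operand adder.

```python
def mont_mul_radix2(a, b, n, k):
    """Bit-serial Montgomery product a*b*2^(-k) mod n.

    Assumes n is odd, n < 2^k, and 0 <= a, b < n.
    """
    s = 0
    for i in range(k):
        if (a >> i) & 1:   # add b when bit i of a is set
            s += b
        if s & 1:          # add n to make the sum even (n is odd)
            s += n
        s >>= 1            # exact division by 2; s stays below 2n
    return s - n if s >= n else s   # final conditional subtraction
```

A quick check is to compare the result with `(a * b * pow(2, -k, n)) % n`, which Python 3.8 and later can evaluate directly.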


Montgomery modular-multiplication method and systolic arrays suitable for modular exponentiation

Electronics and Communications in Japan (Part III Fundamental Electronic Science) ◽

10.1002/ecjc.4430770304 ◽

1994 ◽

Vol 77 (3) ◽

pp. 40-51 ◽

Cited By ~ 9

Author(s):

Keiichi Iwamura ◽

Tsutomu Matsumoto ◽

Hideki Imai

Keyword(s):

Systolic Arrays ◽

Modular Exponentiation ◽

Modular Multiplication ◽

Montgomery Modular Multiplication


Montgomery Modular Multiplication on Reconfigurable Hardware: Systolic versus Multiplexed Implementation

International Journal of Reconfigurable Computing ◽

10.1155/2011/127147 ◽

2011 ◽

Vol 2011 ◽

pp. 1-10 ◽

Cited By ~ 12

Author(s):

Guilherme Perin ◽

Daniel Gomes Mesquita ◽

João Baptista Martins

Keyword(s):

State Of The Art ◽

Operation Mode ◽

Modular Exponentiation ◽

Modular Multiplication ◽

Systolic Architecture ◽

One Dimensional ◽

Processing Elements ◽

Montgomery Modular Multiplication ◽

Cryptographic Algorithms ◽

High Radix

This paper compares two Montgomery modular multiplication architectures: a systolic one and a multiplexed one. Both implementations target FPGA devices. Modular multiplication is employed in modular exponentiation, which is the most important operation of some public-key cryptographic algorithms, including the most popular of them, RSA. The proposed systolic architecture is a high-radix implementation with a one-dimensional array of Processing Elements. The multiplexed implementation is a new alternative composed of multiplier blocks in parallel with new, simplified Processing Elements, and it provides a pipelined operation mode. We compare the time × area efficiency of both architectures as well as an RSA application. The systolic implementation can run the 1024-bit RSA decryption process in just 3.23 ms, and the multiplexed architecture executes the same operation in 4.36 ms, but the second approach saves up to 28% of logical resources. These results are competitive with state-of-the-art performance.


Efficient FPGA Implementation of Modular Multiplication and Exponentiation

Malaysian Journal of Computing and Applied Mathematics ◽

10.37231/myjcam.2020.3.1.37 ◽

2020 ◽

Vol 3 (1) ◽

pp. 1-13

Author(s):

M Issad ◽

M Anane ◽

B Boudraa ◽

A M Bellemou ◽

N Anane

Keyword(s):

Execution Time ◽

Public Key Cryptography ◽

FPGA Implementation ◽

Public Key ◽

Modular Exponentiation ◽

Modular Multiplication ◽

Montgomery Modular Multiplication ◽

Implementation Approach ◽

On Chip ◽

Data Length

This paper presents an FPGA implementation of the most critical operations of Public Key Cryptography (PKC), namely Modular Exponentiation (ME) and Modular Multiplication (MM). Both operations are integrated in Hardware (HW) as a Programmable System on Chip (PSoC). The Xilinx MicroBlaze processor is used for flexibility. Our objective is to achieve the best trade-off between execution time, occupied area and flexibility. To satisfy this constraint, the Montgomery Power Ladder and Montgomery Modular Multiplication (MMM) algorithms are used as HW accelerators for the ME and MM implementations, respectively. Our implementation approach is based on the digit-serial method for performing the basic arithmetic operations. Efficient parallel and pipeline strategies are developed at the digit level to optimize the execution time. For a 1024-bit data length, the MMM runs in 6.24 µs and requires 647 slices. The ME is executed in 6.75 ms using 2881 slices.
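The digit-serial idea can be sketched in software as follows. This is a generic word-serial Montgomery multiplication, offered as an assumption about what "digit-serial" means here and not as the paper's architecture; the name `mont_mul_digit_serial`, the digit width `w` and the digit count `d` are hypothetical parameters. One operand is consumed one `w`-bit digit per iteration, so the datapath only ever needs the low digit of the running sum to choose the quotient digit.

```python
def mont_mul_digit_serial(a, b, n, w, d):
    """Return a*b*2^(-w*d) mod n, scanning a in d digits of w bits.

    Assumes n is odd, n < 2^(w*d), and 0 <= a, b < n.
    """
    mask = (1 << w) - 1
    n0_inv = (-pow(n, -1, 1 << w)) & mask    # -n^(-1) mod 2^w (Python 3.8+)
    s = 0
    for i in range(d):
        a_i = (a >> (w * i)) & mask          # i-th digit of a
        s += a_i * b                         # partial product accumulation
        q = ((s & mask) * n0_inv) & mask     # quotient digit from the low digit of s
        s = (s + q * n) >> w                 # low w bits are now zero; drop them
    return s - n if s >= n else s            # final conditional subtraction
```

For a 1024-bit operand one might call it with, for example, `w=32` and `d=32`; those values are illustrative only, since the digit size used in the paper is not stated in the abstract.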


Fast Montgomery Modular Multiplication and Squaring on Embedded Processors

IEICE Transactions on Communications ◽

10.1587/transcom.2016ebp3189 ◽

2017 ◽

Vol E100.B (5) ◽

pp. 680-690 ◽

Cited By ~ 1

Author(s):

Yang LI ◽

Jinlin WANG ◽

Xuewen ZENG ◽

Xiaozhou YE

Keyword(s):

Embedded Processors ◽

Modular Multiplication ◽

Montgomery Modular Multiplication


An Efficient Signed Digit Montgomery Modular Multiplication Algorithm

Microelectronics Journal ◽

10.1016/j.mejo.2021.105099 ◽

2021 ◽

pp. 105099

Author(s):

ShiLei Zhao ◽

Hai Huang ◽

ZhiWei Liu ◽

Bin Yu ◽

Bo Yu

Keyword(s):

Modular Multiplication ◽

Multiplication Algorithm ◽

Montgomery Modular Multiplication


Timing attacks and local timing attacks against Barrett’s modular multiplication algorithm

Journal of Cryptographic Engineering ◽

10.1007/s13389-020-00254-3 ◽

2021 ◽

Author(s):

Johannes Mittmann ◽

Werner Schindler

Keyword(s):

Side Channel ◽

Modular Exponentiation ◽

Modular Multiplication ◽

Multiplication Algorithm ◽

Diffie Hellman ◽

Mathematical Difficulties ◽

Execution Times ◽

Stochastic Properties ◽

Timing Attacks ◽

Theoretical Results

Montgomery’s and Barrett’s modular multiplication algorithms are widely used in modular exponentiation algorithms, e.g. to compute RSA or ECC operations. While Montgomery’s multiplication algorithm has been studied extensively in the literature and many side-channel attacks have been detected, to the best of our knowledge no thorough analysis exists for Barrett’s multiplication algorithm. This article closes this gap. For both Montgomery’s and Barrett’s multiplication algorithms, differences in the execution times are caused by conditional integer subtractions, so-called extra reductions. Barrett’s multiplication algorithm allows as many as two extra reductions, and this feature increases the mathematical difficulties significantly. We formulate and analyse a two-dimensional Markov process, from which we deduce relevant stochastic properties of Barrett’s multiplication algorithm within modular exponentiation algorithms. This allows us to transfer the timing attacks and local timing attacks (where a second side-channel attack reveals the execution times of the particular modular squarings and multiplications) on Montgomery’s multiplication algorithm to attacks on Barrett’s algorithm. However, there are also differences. Barrett’s multiplication algorithm requires additional attack substeps, and the attack efficiency is much more sensitive to variations of the parameters. We treat timing attacks on RSA with CRT, on RSA without CRT, and on Diffie–Hellman, as well as local timing attacks against these algorithms in the presence of basis blinding. Experiments confirm our theoretical results.
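To make the notion of extra reductions concrete, here is a small Python sketch of a textbook base-2 Barrett multiplication that reports how many conditional subtractions it performed. The parameter choices are an assumption and may differ from the variant analysed in the article; the names `barrett_mul` and `extra_reduction_histogram` are hypothetical. With these parameters the quotient estimate is never too large and is short by at most two, so the count is 0, 1 or 2 and depends on the operands, which is the data dependence a timing attack exploits.

```python
import random


def barrett_mul(a, b, n):
    """Return (a*b mod n, extra_reductions) for n >= 2 and 0 <= a, b < n."""
    k = n.bit_length()
    mu = (1 << (2 * k)) // n               # precomputed once per modulus in practice
    x = a * b
    q = ((x >> (k - 1)) * mu) >> (k + 1)   # estimate of x // n, never an overestimate
    r = x - q * n
    extra = 0
    while r >= n:                          # the operand-dependent extra reductions
        r -= n
        extra += 1
    return r, extra


def extra_reduction_histogram(n, samples, seed=0):
    """Count how often 0, 1 or 2 extra reductions occur for random operands."""
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(samples):
        a, b = rng.randrange(n), rng.randrange(n)
        _, extra = barrett_mul(a, b, n)
        counts[extra] += 1
    return counts
```

Calling `extra_reduction_histogram(n, 10000)` for a chosen modulus gives an empirical distribution of the extra reductions. It is only an empirical tally and says nothing about the Markov-process analysis the article develops.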
