High Performance Modular Multiplication for SIDH

Modular arithmetic is widely used in many applications, especially in public key cryptography. These operations are time consuming because of their long operands, and many methods have been introduced to speed them up. Montgomery modular multiplication is one such method; it improves the performance of both modular multiplication and modular exponentiation. Its radix-2 version is simple and fast in hardware, but it requires multi-operand adders. So far, the carry-save adder (CSA) has given the best performance for multi-operand addition. In this paper, we propose a new recoding method for the Montgomery modular multiplier that improves its performance by replacing the CSA blocks with new blocks that perform better than CSA in multi-operand addition. With this replacement, a reduction of up to 40% in gate area is theoretically possible. In our experiments, a hardware implementation achieved a 5.8% area reduction and a 3% speed improvement. The idea behind the proposed method is the use of a bitwise subtraction operator, which needs no carry propagation. This operand recoding can also be applied elsewhere in computer arithmetic, algorithms and computational hardware, for example in multiplication and exponentiation, to improve performance.
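For orientation, the textbook radix-2 Montgomery multiplication that the abstract takes as its starting point can be sketched in software as follows. This is a minimal illustration, not the recoding method proposed in the paper: the function names `mont_mul_radix2` and `mod_mul` are hypothetical, the modulus is assumed to be odd, and Python's arbitrary-precision integers stand in for the carry-save accumulators a hardware design would use.

```python
def mont_mul_radix2(x, y, m, n_bits):
    """Return x * y * 2^(-n_bits) mod m.

    Assumes m is odd and 0 <= x, y < m < 2**n_bits.
    """
    s = 0
    for i in range(n_bits):
        # Add y when bit i of x is set (one partial product per iteration).
        s += ((x >> i) & 1) * y
        # If s is odd, adding the odd modulus makes it even
        # without changing its value mod m.
        if s & 1:
            s += m
        # s is now even, so the division by 2 is exact.
        s >>= 1
    # s stays below 2*m throughout, so one conditional subtraction suffices.
    if s >= m:
        s -= m
    return s


def mod_mul(x, y, m):
    """Compute x * y mod m by going through the Montgomery domain."""
    n_bits = m.bit_length()
    r2 = pow(2, 2 * n_bits, m)               # R^2 mod m, with R = 2**n_bits
    xm = mont_mul_radix2(x, r2, m, n_bits)   # x * R mod m
    ym = mont_mul_radix2(y, r2, m, n_bits)   # y * R mod m
    zm = mont_mul_radix2(xm, ym, m, n_bits)  # x * y * R mod m
    return mont_mul_radix2(zm, 1, m, n_bits) # strip the factor R


if __name__ == "__main__":
    m = 1_000_000_007
    print(mod_mul(123_456_789, 987_654_321, m))
```

Each loop iteration adds up to two operands (`y` and `m`) to the running sum, which is where a hardware implementation needs the multi-operand addition that the abstract discusses.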

Design of N-Term Scalable High-Performance Modular Multiplication Operator on GF(2^m)

2021 IEEE 4th International Conference on Electronics Technology (ICET) ◽ 10.1109/icet51757.2021.9451142 ◽ 2021

Author(s): Benjun Zhang ◽ Ning Wu ◽ Fang Zhou ◽ Fen Ge ◽ Caixian Fei

Keyword(s): High Performance ◽ Multiplication Operator ◽ Modular Multiplication

A Hardware-Accelerated ECDLP with High-Performance Modular Multiplication

International Journal of Reconfigurable Computing ◽ 10.1155/2012/439021 ◽ 2012 ◽ Vol 2012 ◽ pp. 1-14 ◽ Cited By ~ 4

Author(s): Lyndon Judge ◽ Suvarna Mane ◽ Patrick Schaumont

Keyword(s): Elliptic Curve ◽ Elliptic Curve Cryptography ◽ High Performance ◽ Design Space ◽ Discrete Logarithm ◽ Public Key Cryptography ◽ Modular Multiplication ◽ Polynomial Representation ◽ Prime Field ◽ Modular Multiplier

Elliptic curve cryptography (ECC) has become a popular public key cryptography standard. The security of ECC rests on the difficulty of solving the elliptic curve discrete logarithm problem (ECDLP). In this paper, we demonstrate a successful attack on ECC over a prime field using the Pollard rho algorithm implemented on a hardware-software cointegrated platform. We propose a high-performance architecture for multiplication over a prime field using specialized DSP blocks in the FPGA. We characterize this architecture by exploring the design space to determine the optimal integer basis for polynomial representation, and we demonstrate an efficient mapping of this design to multiple standard prime field elliptic curves. We use the resulting modular multiplier to demonstrate low-latency multiplications for the curves secp112r1 and P-192. We apply our modular multiplier to implement a complete attack on secp112r1 using a Nallatech FSB-Compute platform with a Virtex-5 FPGA. The measured performance of the resulting design is 114 cycles per Pollard rho step at 100 MHz, which gives 878 K iterations per second per ECC core. We extend this design to a multicore ECDLP implementation that achieves 14.05 M iterations per second with 16 parallel point addition cores.
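To make the Pollard rho step mentioned in the abstract concrete, here is a toy software sketch on a textbook-sized curve. It is not the authors' hardware design: the curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1) of order 19, the three-way partition of the walk, and all function names are assumptions chosen for illustration, and Floyd cycle detection is used in place of whatever collision search the paper employs.

```python
import random

P_MOD, A = 17, 2        # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumed example)
G, N = (5, 1), 19       # base point and its prime order


def ec_add(p, q):
    """Add two affine points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                   # p + (-p)
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def ec_mul(k, p):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = ec_add(acc, p)
        p = ec_add(p, p)
        k >>= 1
    return acc


def step(r, a, b, q):
    """One rho step, keeping the invariant r = a*G + b*q."""
    s = 0 if r is None else r[0] % 3
    if s == 0:
        return ec_add(r, q), a, (b + 1) % N
    if s == 1:
        return ec_add(r, r), (2 * a) % N, (2 * b) % N
    return ec_add(r, G), (a + 1) % N, b


def pollard_rho(q, rng, max_tries=100):
    """Return k with k*G == q, or None if every attempt degenerates."""
    for _ in range(max_tries):
        a0, b0 = rng.randrange(N), rng.randrange(N)
        start = (ec_add(ec_mul(a0, G), ec_mul(b0, q)), a0, b0)
        tort = hare = start
        while True:                                   # Floyd cycle detection
            tort = step(*tort, q)
            hare = step(*step(*hare, q), q)
            if tort[0] == hare[0]:
                break
        db = (hare[2] - tort[2]) % N
        if db:                                        # useful collision
            return (tort[1] - hare[1]) * pow(db, -1, N) % N
    return None


if __name__ == "__main__":
    secret = 7
    print(pollard_rho(ec_mul(secret, G), random.Random(1)))
```

Every step costs one point addition or doubling, which in turn is a handful of modular multiplications and one inversion. That is why the throughput of the modular multiplier largely determines the iterations-per-second figures quoted in the abstract.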
