High radix Montgomery modular multiplication on FPGA

Author(s):  
Anane Mohamed ◽  
Anane Nadjia
2011 ◽  
Vol 2011 ◽  
pp. 1-10 ◽  
Author(s):  
Guilherme Perin ◽  
Daniel Gomes Mesquita ◽  
João Baptista Martins

This paper compares two Montgomery modular multiplication architectures, one systolic and one multiplexed. Both implementations target FPGA devices. Modular multiplication is the core of modular exponentiation, which is the most important operation in several public-key cryptographic algorithms, including the most popular of them, RSA. The proposed systolic architecture is a high-radix implementation with a one-dimensional array of Processing Elements. The multiplexed implementation is a new alternative composed of multiplier blocks in parallel with new, simplified Processing Elements, and it provides a pipelined operation mode. We compare the time × area efficiency of both architectures as well as an RSA application. The systolic implementation runs the 1024-bit RSA decryption process in just 3.23 ms, and the multiplexed architecture executes the same operation in 4.36 ms, but the second approach saves up to 28% of logic resources. These results are competitive with state-of-the-art performance.
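As a point of reference for the abstract above, the following is a minimal software sketch of word-serial (high-radix) Montgomery multiplication and its use in modular exponentiation. It is not the systolic or multiplexed hardware design from the paper. The function names, the radix parameter `k`, and the word count `s` are assumptions introduced for illustration only, and the code is not constant-time.

```python
def montgomery_multiply(a, b, n, k, s):
    """Return a * b * R^-1 mod n, with R = 2^(k*s).

    Assumes n is odd, a < R and b < n. One radix-2^k digit of `a`
    is consumed per loop iteration (illustrative sketch only).
    """
    radix = 1 << k
    mask = radix - 1
    # n' = -n^-1 mod 2^k (three-argument pow with -1 needs Python 3.8+)
    n_prime = (-pow(n, -1, radix)) % radix
    t = 0
    for i in range(s):
        a_i = (a >> (k * i)) & mask        # i-th radix-2^k digit of a
        t += a_i * b
        q = ((t & mask) * n_prime) & mask  # makes t + q*n divisible by 2^k
        t = (t + q * n) >> k
    if t >= n:                             # single final correction
        t -= n
    return t


def montgomery_modexp(base, exponent, n, k=16):
    """Left-to-right square-and-multiply built on montgomery_multiply."""
    s = -(-n.bit_length() // k)            # number of radix-2^k digits
    r = 1 << (k * s)
    r2 = (r * r) % n                       # R^2 mod n, used for domain conversion
    x = montgomery_multiply(base % n, r2, n, k, s)  # base * R mod n
    acc = montgomery_multiply(1, r2, n, k, s)       # 1 * R mod n
    for bit in bin(exponent)[2:]:
        acc = montgomery_multiply(acc, acc, n, k, s)
        if bit == "1":
            acc = montgomery_multiply(acc, x, n, k, s)
    return montgomery_multiply(acc, 1, n, k, s)     # leave Montgomery domain
```

A larger `k` means fewer loop iterations but wider partial products per iteration, which is the basic trade-off a high-radix hardware design has to balance.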


2017 ◽  
Vol E100.B (5) ◽  
pp. 680-690 ◽  
Author(s):  
Yang LI ◽  
Jinlin WANG ◽  
Xuewen ZENG ◽  
Xiaozhou YE

2019 ◽  
Vol 28 (03) ◽  
pp. 1950037 ◽  
Author(s):  
A. Bellemou ◽  
N. Benblidia ◽  
M. Anane ◽  
M. Issad

In this paper, we present MicroBlaze-based parallel architectures for Elliptic Curve Scalar Multiplication (ECSM) computation in embedded Elliptic Curve Cryptosystems (ECC) on Xilinx FPGAs. The proposed implementations support arbitrary Elliptic Curve (EC) forms defined over a large prime field ([Formula: see text]) with different security-level sizes. ECSM is performed using the Montgomery Power Ladder (MPL) algorithm in the Chudnovsky projective coordinate system. At the low abstraction level, Montgomery Modular Multiplication (MMM) is considered the critical operation. It is implemented within a hardware Accelerator MMM (AccMMM) core based on the modified high-radix, [Formula: see text] MMM algorithm. The efficiency of our parallel implementations is achieved by combining the mixed SW/HW approach with a Multi Processor System on Programmable Chip (MPSoPC) design. Integrating multiple MicroBlaze processors in a single architecture provides not only flexibility for the overall system but also allows the parallelism in ECSM computation to be exploited to several degrees. The Virtex-5 parallel implementations of 256-bit and 521-bit ECSM computations run at a frequency of 100 MHz and consume between 2,739 and 6,533 slices, between 22 and 72 RAMs, and between 16 and 48 DSP48E cores. For the considered security-level sizes, the delays to perform a single ECSM are between 115 ms and 14.72 ms.
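To illustrate the Montgomery Power Ladder mentioned in the abstract above, here is a minimal sketch on a toy curve. It assumes the textbook curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1) and uses affine coordinates for brevity, not the Chudnovsky projective coordinates or the hardware MMM core described in the paper. All names are hypothetical, and the code is not constant-time.

```python
P_MOD = 17   # toy prime field, assumed for illustration only
A = 2        # curve coefficient a in y^2 = x^3 + a*x + b (b = 2 is implied)
INF = None   # point at infinity


def ec_add(p, q):
    """Add two affine points on the toy curve (handles doubling and infinity)."""
    if p is INF:
        return q
    if q is INF:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                       # p + (-p), also covers y = 0 doubling
    if p == q:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def montgomery_ladder(k, point):
    """Compute k * point; every scalar bit costs one addition and one doubling."""
    r0, r1 = INF, point                  # invariant: r1 - r0 == point
    for bit in bin(k)[2:]:
        if bit == "1":
            r0, r1 = ec_add(r0, r1), ec_add(r1, r1)
        else:
            r0, r1 = ec_add(r0, r0), ec_add(r0, r1)
    return r0
```

Because the addition and the doubling in each iteration are independent of each other, they can in principle be dispatched to separate processing units, which is the kind of parallelism a multi-processor design can exploit.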

