scholarly journals High Performance Reconfigurable Elliptic Curve Cipher Processor Implementation

2021 ◽  
Vol 2136 (1) ◽  
pp. 012043
Author(s):  
Jian Zhang ◽  
Liting Niu

Abstract Elliptic Curve Encryption (ECC) has been widely used in the field of digital signatures in communication security. ECC standards and the diversification of application scenarios put forward higher requirements for the flexibility of ECC processors. Therefore, it is necessary to design a flexible and reconfigurable processor to adapt to changing standards. The cryptographic processor chip designed in this paper supports the choice of prime and binary fields, supports the maximum key length of 576 bits, uses microcode programming to achieve reconfigurable function, and significantly improves the flexibility of the dedicated cryptographic processor. At the same time, the speed of modular multiplication and modular division can be greatly improved under the condition of keeping the low level of hardware resources through a carefully designed modular unit of operation. After using FPGA for hardware implementation, it is configured into a 256-bit key length. The highest clock frequency of this design can reach 55.7MHz, occupying 12425LUTS. Compared with a similar design, the performance is also greatly improved. After MALU module optimization design, modular multiplication module division also has significant advantages in computing time consumption.

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Yong Xiao ◽  
Weibin Lin ◽  
Yun Zhao ◽  
Chao Cui ◽  
Ziwen Cai

Teleoperated robotic systems are those in which human operators control remote robots through a communication network. The deployment and integration of teleoperated robot’s systems in the medical operation have been hampered by many issues, such as safety concerns. Elliptic curve cryptography (ECC), an asymmetric cryptographic algorithm, is widely applied to practical applications because its far significantly reduced key length has the same level of security as RSA. The efficiency of ECC on GF (p) is dictated by two critical factors, namely, modular multiplication (MM) and point multiplication (PM) scheduling. In this paper, the high-performance ECC architecture of SM2 is presented. MM is composed of multiplication and modular reduction (MR) in the prime field. A two-stage modular reduction (TSMR) algorithm in the SCA-256 prime field is introduced to achieve low latency, which avoids more iterative subtraction operations than traditional algorithms. To cut down the run time, a schedule is put forward when exploiting the parallelism of multiplication and MR inside PM. Synthesized with a 0.13 um CMOS standard cell library, the proposed processor consumes 341.98k gate areas, and each PM takes 0.092 ms.


2012 ◽  
Vol 2012 ◽  
pp. 1-14 ◽  
Author(s):  
Lyndon Judge ◽  
Suvarna Mane ◽  
Patrick Schaumont

Elliptic curve cryptography (ECC) has become a popular public key cryptography standard. The security of ECC is due to the difficulty of solving the elliptic curve discrete logarithm problem (ECDLP). In this paper, we demonstrate a successful attack on ECC over prime field using the Pollard rho algorithm implemented on a hardware-software cointegrated platform. We propose a high-performance architecture for multiplication over prime field using specialized DSP blocks in the FPGA. We characterize this architecture by exploring the design space to determine the optimal integer basis for polynomial representation and we demonstrate an efficient mapping of this design to multiple standard prime field elliptic curves. We use the resulting modular multiplier to demonstrate low-latency multiplications for curves secp112r1 and P-192. We apply our modular multiplier to implement a complete attack on secp112r1 using a Nallatech FSB-Compute platform with Virtex-5 FPGA. The measured performance of the resulting design is 114 cycles per Pollard rho step at 100 MHz, which gives 878 K iterations per second per ECC core. We extend this design to a multicore ECDLP implementation that achieves 14.05 M iterations per second with 16 parallel point addition cores.


Author(s):  
Amine Mrabet ◽  
Nadia El-Mrabet ◽  
Ronan Lashermes ◽  
Jean-Baptiste Rigaud ◽  
Belgacem Bouallegue ◽  
...  

Author(s):  
Deepika Bansal ◽  
Bal Chand Nagar ◽  
Brahamdeo Prasad Singh ◽  
Ajay Kumar

Background & Objective: In this paper, a modified pseudo domino configuration has been proposed to improve the leakage power consumption and Power Delay Product (PDP) of dynamic logic using Carbon Nanotube MOSFETs (CN-MOSFETs). The simulations for proposed and published domino circuits are verified by using Synopsys HSPICE simulator with 32nm CN-MOSFET technology which is provided by Stanford. Methods: The simulation results of the proposed technique are validated for improvement of wide fan-in domino OR gate as a benchmark circuit at 500 MHz clock frequency. Results: The proposed configuration is suitable for cascading of the high performance wide fan-in circuits without any charge sharing. Conclusion: The performance analysis of 8-input OR gate demonstrate that the proposed circuit provides lower static and dynamic power consumption up to 62 and 40% respectively, and PDP improvement is 60% as compared to standard domino circuit.


2015 ◽  
Vol 2015 ◽  
pp. 1-20
Author(s):  
Gongyu Wang ◽  
Greg Stitt ◽  
Herman Lam ◽  
Alan George

Field-programmable gate arrays (FPGAs) provide a promising technology that can improve performance of many high-performance computing and embedded applications. However, unlike software design tools, the relatively immature state of FPGA tools significantly limits productivity and consequently prevents widespread adoption of the technology. For example, the lengthy design-translate-execute (DTE) process often must be iterated to meet the application requirements. Previous works have enabled model-based, design-space exploration to reduce DTE iterations but are limited by a lack of accurate model-based prediction of key design parameters, the most important of which is clock frequency. In this paper, we present a core-level modeling and design (CMD) methodology that enables modeling of FPGA applications at an abstract level and yet produces accurate predictions of parameters such as clock frequency, resource utilization (i.e., area), and latency. We evaluate CMD’s prediction methods using several high-performance DSP applications on various families of FPGAs and show an average clock-frequency prediction error of 3.6%, with a worst-case error of 20.4%, compared to the best of existing high-level prediction methods, 13.9% average error with 48.2% worst-case error. We also demonstrate how such prediction enables accurate design-space exploration without coding in a hardware-description language (HDL), significantly reducing the total design time.


Sign in / Sign up

Export Citation Format

Share Document