CFNTT: Scalable Radix-2/4 NTT Multiplication Architecture with an Efficient Conflict-free Memory Mapping Scheme

Author(s):  
Xiangren Chen ◽  
Bohan Yang ◽  
Shouyi Yin ◽  
Shaojun Wei ◽  
Leibo Liu

The number theoretic transform (NTT) is widely used to speed up polynomial multiplication, the critical computational bottleneck in many cryptographic algorithms such as lattice-based post-quantum cryptography (PQC) and homomorphic encryption (HE). One trend in NTT hardware architecture is to support diverse security parameters and meet the resource constraints of different computing platforms, so flexibility and area-time product (ATP) have become two crucial metrics in NTT hardware design. Flexibility with respect to vector size and modulus can be obtained directly, whereas the varying strides of memory access in in-place NTT make it difficult to design for different radices and numbers of parallel butterfly units. This paper proposes an efficient conflict-free memory mapping scheme that supports configuration for both multiple parallel butterfly units and arbitrary NTT radix. Compared with other approaches, this scheme has broader applicability and facilitates the parallelization of non-radix-2 NTT hardware designs. Based on this scheme, we propose a scalable radix-2 and radix-4 NTT multiplication architecture through algorithm-hardware co-design. A dedicated scheduling method reduces the number of modular additions/subtractions and modular multiplications in the radix-4 butterfly unit by 20% and 33%, respectively. To avoid the bit-reversal cost and reduce the memory footprint in arbitrary-radix NTT/INTT, we put forward a general method that rearranges the loop structure and reuses the twiddle factors. Hardware-level optimization is achieved by exploiting the symmetric operators in the radix-4 butterfly unit, which saves almost 50% of the hardware resources compared with a straightforward implementation. Through experimental results and theoretical analysis, we show that, in an interleaved memory system, the radix-4 NTT outperforms the radix-2 NTT with the same number of parallel butterfly units in terms of area-time performance. This advantage grows as the number of parallel butterfly units increases. For example, when processing a 1024-point, 14-bit NTT with 8 parallel butterfly units, the LUT/FF/DSP/BRAM ATP of the radix-4 NTT core is approximately 2.2×/1.2×/1.1×/1.9× lower than that of the radix-2 NTT core on a similar FPGA platform.
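As a point of reference for the abstract above, the following is a minimal software sketch of NTT-based polynomial multiplication. It is a textbook in-place radix-2 transform with a bit-reversal step, not the paper's radix-4 architecture or its memory mapping scheme. The modulus `Q = 12289` (a 14-bit prime) and all function names are assumptions chosen for illustration, and the product is a cyclic convolution rather than the negacyclic variant common in lattice schemes.

```python
Q = 12289  # assumed 14-bit prime; Q - 1 = 3 * 2**12, so power-of-two sizes up to 4096 work


def find_root_of_unity(n, q=Q):
    """Return a primitive n-th root of unity mod q (n a power of two dividing q - 1)."""
    assert (q - 1) % n == 0
    for g in range(2, q):
        w = pow(g, (q - 1) // n, q)
        # For power-of-two n, w has order exactly n iff w^(n/2) == -1 mod q.
        if n == 1 or pow(w, n // 2, q) == q - 1:
            return w
    raise ValueError("no primitive root of unity found")


def ntt(a, w, q=Q):
    """Iterative radix-2 NTT of a (length a power of two) using root w."""
    n = len(a)
    a = list(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(w, n // length, q)  # twiddle step for this stage
        half = length // 2
        for start in range(0, n, length):
            t = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * t % q
                a[start + k] = (u + v) % q          # modular addition
                a[start + k + half] = (u - v) % q   # modular subtraction
                t = t * w_len % q
        length *= 2
    return a


def intt(a, w, q=Q):
    """Inverse NTT: forward transform with w^-1, then scale by n^-1 (Python 3.8+)."""
    n_inv = pow(len(a), -1, q)
    return [x * n_inv % q for x in ntt(a, pow(w, -1, q), q)]


def poly_mul_cyclic(f, g, q=Q):
    """Multiply f and g modulo (x^n - 1, q) via pointwise products in the NTT domain."""
    w = find_root_of_unity(len(f), q)
    fa, ga = ntt(f, w, q), ntt(g, w, q)
    return intt([x * y % q for x, y in zip(fa, ga)], w, q)


if __name__ == "__main__":
    # (1 + 2x + 3x^2 + 4x^3) * (5 + 6x + 7x^2), zero-padded to length 8
    print(poly_mul_cyclic([1, 2, 3, 4, 0, 0, 0, 0], [5, 6, 7, 0, 0, 0, 0, 0]))
    # -> [5, 16, 34, 52, 45, 28, 0, 0]
```

The inner loop is the radix-2 butterfly: one modular multiplication, one addition and one subtraction per pair. The stride between paired elements (`half`) changes at every stage, which is the access pattern that makes conflict-free memory mapping nontrivial once several butterfly units run in parallel.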

2021 ◽  
Vol 13 (4) ◽  
pp. 94
Author(s):  
Haokun Fang ◽  
Quan Qian

Privacy protection has become an important concern alongside the great success of machine learning. This paper proposes a multi-party privacy-preserving machine learning framework, named PFMLP, based on partially homomorphic encryption and federated learning. The core idea is that all learning parties transmit only gradients encrypted with homomorphic encryption. In experiments, the model trained by PFMLP achieves almost the same accuracy, with a deviation of less than 1%. To address the computational overhead of homomorphic encryption, we use an improved Paillier algorithm that speeds up training by 25–28%. Comparisons of encryption key length, learning network structure, number of learning clients, and other factors are also discussed in detail.
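The idea of exchanging only encrypted gradients can be illustrated with textbook Paillier encryption, which is additively homomorphic: multiplying ciphertexts adds the underlying plaintexts. The sketch below is not the improved Paillier variant or the PFMLP protocol from the abstract. The tiny primes, the fixed-point `SCALE`, and every function name are assumptions for illustration only, and the key size is far too small to be secure.

```python
import math
import random

SCALE = 1000  # hypothetical fixed-point scale for encoding real-valued gradients


def keygen(p, q):
    """Textbook Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because g = n + 1; needs Python 3.8+
    return (n, n + 1), (lam, mu)


def encrypt(pub, m):
    """c = g^m * r^n mod n^2 for a random r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)  # not a cryptographic RNG; toy example only
        if math.gcd(r, n) == 1:
            break
    return pow(g, m, n2) * pow(r, n, n2) % n2


def decrypt(pub, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def add(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = pub
    return c1 * c2 % (n * n)


def encode(x, n):
    """Map a real value to Z_n; negatives wrap around modulo n."""
    return round(x * SCALE) % n


def decode(m, n):
    """Inverse of encode: values above n/2 are read as negative."""
    if m > n // 2:
        m -= n
    return m / SCALE


if __name__ == "__main__":
    pub, priv = keygen(2**17 - 1, 2**19 - 1)  # two small Mersenne primes
    client_grads = [0.125, -0.5, 0.25]        # one gradient value per client
    ciphertexts = [encrypt(pub, encode(g, pub[0])) for g in client_grads]
    total = ciphertexts[0]
    for c in ciphertexts[1:]:
        total = add(pub, total, c)            # aggregator never sees plaintexts
    print(decode(decrypt(pub, priv, total), pub[0]))
```

In this sketch the aggregating party only multiplies ciphertexts; only the holder of the private key can recover the summed gradient.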


2018 ◽  
Vol 16 (06) ◽  
pp. 1850052
Author(s):  
Y. H. Lee ◽  
M. Khalil-Hani ◽  
M. N. Marsono

While the physical realization of practical large-scale quantum computers is still ongoing, theoretical research on quantum computing applications is facilitated on classical computing platforms through simulation and emulation methods. Nevertheless, the exponential growth in resource requirements as the number of qubits increases is an inherent issue in the classical modeling of quantum systems. To alleviate this critical scalability issue in existing FPGA emulation works, this paper proposes a novel FPGA-based quantum circuit emulation framework based on the Heisenberg representation. Unlike previous works, which are restricted to emulating quantum circuits with small numbers of qubits, the proposed framework can scale up to 120 qubits on an Altera Stratix IV FPGA for the stabilizer circuit case study while providing a notable speed-up over the equivalent simulation model.
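To show why the Heisenberg representation sidesteps the exponential blow-up for stabilizer circuits, here is a minimal software sketch that tracks only the Pauli generators stabilizing the state, using the standard update rules for the H, S and CNOT gates. It is not the paper's FPGA framework; the class and method names are hypothetical, and measurement is left out.

```python
class StabilizerState:
    """n-qubit stabilizer state, starting from |0...0>.

    The state is stored as n Pauli generators. For generator i, the Pauli on
    qubit j is I, X, Z or Y for (x, z) = (0,0), (1,0), (0,1), (1,1), and r[i]
    is its sign bit. Storage is O(n^2) bits instead of 2^n amplitudes.
    """

    def __init__(self, n):
        self.n = n
        self.x = [[0] * n for _ in range(n)]
        self.z = [[int(i == j) for j in range(n)] for i in range(n)]  # Z_i stabilizes |0...0>
        self.r = [0] * n

    def h(self, a):
        """Hadamard on qubit a: swaps X and Z, flips the sign of Y."""
        for i in range(self.n):
            self.r[i] ^= self.x[i][a] & self.z[i][a]
            self.x[i][a], self.z[i][a] = self.z[i][a], self.x[i][a]

    def s(self, a):
        """Phase gate on qubit a: X -> Y, Y -> -X, Z unchanged."""
        for i in range(self.n):
            self.r[i] ^= self.x[i][a] & self.z[i][a]
            self.z[i][a] ^= self.x[i][a]

    def cnot(self, a, b):
        """CNOT with control a and target b."""
        for i in range(self.n):
            self.r[i] ^= self.x[i][a] & self.z[i][b] & (self.x[i][b] ^ self.z[i][a] ^ 1)
            self.x[i][b] ^= self.x[i][a]
            self.z[i][a] ^= self.z[i][b]

    def generators(self):
        """Return the generators as signed Pauli strings, e.g. '+XX'."""
        names = {(0, 0): "I", (1, 0): "X", (0, 1): "Z", (1, 1): "Y"}
        out = []
        for i in range(self.n):
            body = "".join(names[(self.x[i][j], self.z[i][j])] for j in range(self.n))
            out.append(("-" if self.r[i] else "+") + body)
        return out


if __name__ == "__main__":
    bell = StabilizerState(2)
    bell.h(0)
    bell.cnot(0, 1)
    print(bell.generators())  # -> ['+XX', '+ZZ']
```

Each gate touches one or two columns of the bit table, so a 120-qubit circuit needs only a 120 × 240 bit table plus sign bits, which is the kind of structure that maps naturally onto hardware.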


Author(s):  
Mohamed Elhoseny ◽  
Ahmed Farouk ◽  
Josep Batle ◽  
Abdulaziz Shehab ◽  
Aboul Ella Hassanien

Wireless sensor networks (WSNs), a new category of computing platforms and network structures, are finding applications in areas such as environmental monitoring, health care, and the military. Although many secure image processing schemes have been designed for image transmission over a network, the limited resources and dynamic environment of WSNs make them infeasible to use there. In addition, current secure data transmission schemes in WSNs concentrate on text data and are not applicable to image transmission. Furthermore, secure image transmission is a major challenge in WSNs, especially for applications that use images as their main data, such as military applications, because sensor nodes have limited resources and are usually deployed in unattended environments. This chapter introduces a secure image processing and transmission scheme for WSNs using Elliptic Curve Cryptography (ECC) and Homomorphic Encryption (HE).


Author(s):  
Scott Ames ◽  
Muthuramakrishnan Venkitasubramaniam ◽  
Alex Page ◽  
Ovunc Kocabas ◽  
Tolga Soyata

Extending cloud computing to medical software, where hospitals rent the software from a provider, sounds like a natural evolution for cloud computing. One problem with cloud computing, though, is ensuring the privacy of medical data in applications such as long-term health monitoring. Previously proposed solutions based on Fully Homomorphic Encryption (FHE) completely eliminate privacy concerns but are too slow to be practical. Our key proposition in this paper is a new approach to applying FHE to data stored in the cloud. Instead of using the existing circuit-based programming models, we propose a solution based on Branching Programs. While this restricts the type of data elements that FHE can be applied to, it achieves a dramatic speed-up compared with traditional circuit-based methods. Our claims are supported by simulations on real ECG data.


2015 ◽  
Vol 809-810 ◽  
pp. 1462-1467
Author(s):  
Krzysztof Kalinowski ◽  
Iwona Paprocka

This paper discusses searching the state space when scheduling real manufacturing systems with discrete, multi-assortment production. The production load is represented by a directed AND/OR graph called “the aggregated graph of operations planning of the set of orders”. It determines the order in which operations are inserted into a schedule. This order must always comply with all assumed precedence and resource constraints and with the given scheduling strategy of a production order. The elaborated representation also covers complex product structures and alternative routes for their realization. The most important issues related to searching this space are discussed, including a general method for searching the graph, the sequencing of parallel processes and operations using schedule generation schemes, and the selection of route variants.
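The notion of inserting operations into a schedule in a precedence-feasible order can be sketched with a simple serial schedule generation scheme. This is a generic illustration, not the authors' method: it assumes one machine per operation with unit capacity, appends each operation at the end of its machine's queue without gap filling, and does not model the alternative routes of the AND/OR graph. All names and the sample data are hypothetical.

```python
def serial_sgs(order, duration, preds, machine):
    """Place operations one by one, each at its earliest feasible start.

    order    -- operations in a precedence-feasible insertion sequence
    duration -- dict: operation -> processing time
    preds    -- dict: operation -> iterable of predecessor operations
    machine  -- dict: operation -> the single machine it needs
    Returns a dict: operation -> (start, finish).
    """
    finish = {}
    machine_free = {}  # machine -> time at which it next becomes free
    schedule = {}
    for op in order:
        # Precedence constraint: wait for all predecessors to finish.
        ready = max((finish[p] for p in preds.get(op, ())), default=0)
        # Resource constraint: wait for the machine to become free.
        start = max(ready, machine_free.get(machine[op], 0))
        finish[op] = start + duration[op]
        machine_free[machine[op]] = finish[op]
        schedule[op] = (start, finish[op])
    return schedule


if __name__ == "__main__":
    # Two hypothetical orders, each with two operations on machines M1 then M2.
    duration = {"A": 3, "B": 2, "C": 2, "D": 1}
    preds = {"B": ["A"], "D": ["C"]}
    machine = {"A": "M1", "B": "M2", "C": "M1", "D": "M2"}
    for op, span in serial_sgs(["A", "C", "B", "D"], duration, preds, machine).items():
        print(op, span)
```

Different insertion orders that respect the precedence graph generally yield different schedules, which is why the choice of order is itself a search problem.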


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
P. Nannipieri ◽  
S. Di Matteo ◽  
L. Zulberti ◽  
F. Albicocchi ◽  
S. Saponara ◽  
...  

2021 ◽  
Vol 11 (3) ◽  
pp. 256-261
Author(s):  
Binh Kieu-Do-Nguyen ◽  
Cuong Pham-Quoc ◽  
Cong-Kha Pham

There is no denying that bioinformatics is one of the most important fields for our future development. As evidence of this, a plethora of new algorithms have been published over the last decade. These significantly speed up biological analysis, especially DNA alignment. Despite their undeniable contributions, it is still premature to claim that DNA alignment has achieved ideal performance. In this work, we focus on a DNA alignment system based on our previously published improved BWA-MEM algorithm. We also propose several optimization methods, which were applied to improve the performance and stability of the entire system. The system offers a speed-up of 46.52x compared with the other computing platforms.


2017 ◽  
Vol 1 (1) ◽  
pp. 1
Author(s):  
D. Chandravathi ◽  
P.V. Lakshmi

This paper aims to secure data in the cloud using a multiplicative homomorphic approach. Encryption is performed with the RSA algorithm. Within this RSA scheme, Shor’s algorithm is used to generate the public key component, which enhances security. The plaintext message is encrypted with the public key to generate the ciphertext, and the Chinese Remainder Theorem (CRT) is used to speed up decryption. The paper shows how the CRT representation of numbers in Zn can be used to perform modular exponentiation much more efficiently using three extra values pre-computed from the prime factors of n. Hence, security is enhanced at the cloud provider.
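The CRT speed-up and the multiplicative homomorphism mentioned above can be sketched with textbook RSA. The three pre-computed values are taken here to be the usual `dP`, `dQ` and `qInv`, which is an assumption about what the abstract means. The tiny primes and the function names are illustrative only, and unpadded RSA like this is not secure in practice.

```python
def keygen(p, q, e=17):
    """Textbook RSA key with the three CRT helper values dP, dQ, qInv."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # private exponent; needs Python 3.8+
    crt = (d % (p - 1), d % (q - 1), pow(q, -1, p))  # dP, dQ, qInv
    return (n, e), (p, q, d, crt)


def encrypt(pub, m):
    """c = m^e mod n (no padding; toy example only)."""
    n, e = pub
    return pow(m, e, n)


def decrypt_plain(priv, c):
    """Direct decryption: one exponentiation modulo the full n."""
    p, q, d, _ = priv
    return pow(c, d, p * q)


def decrypt_crt(priv, c):
    """CRT decryption: two half-size exponentiations, then recombination."""
    p, q, _, (dp, dq, q_inv) = priv
    m1 = pow(c, dp, p)
    m2 = pow(c, dq, q)
    h = q_inv * (m1 - m2) % p
    return m2 + h * q


if __name__ == "__main__":
    pub, priv = keygen(61, 53)
    c = encrypt(pub, 65)
    print(decrypt_crt(priv, c) == decrypt_plain(priv, c) == 65)

    # Multiplicative homomorphism: E(a) * E(b) mod n decrypts to a * b mod n.
    n = pub[0]
    product_ct = encrypt(pub, 7) * encrypt(pub, 12) % n
    print(decrypt_crt(priv, product_ct))  # -> 84
```

The two exponentiations in `decrypt_crt` use moduli and exponents about half the size of those in `decrypt_plain`, which is where the speed-up comes from on realistic key sizes.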

