A survey on software implementation of lightweight block ciphers for IoT devices

Author(s):  
Abdullah Sevin ◽  
Abdu Ahmed Osman Mohammed
Mathematics ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 1894
Author(s):  
SangWoo An ◽  
YoungBeom Kim ◽  
Hyeokdong Kwon ◽  
Hwajeong Seo ◽  
Seog Chung Seo

With the development of information and communication technology, various types of Internet of Things (IoT) devices have come into wide use for convenient services. Many users request services from servers through their IoT devices, so the amount of personal information that servers must protect has increased dramatically. To protect this information quickly and safely, the encryption process must be optimized for speed. Because it is difficult for a server’s CPU to provide its basic services while also encrypting large amounts of data, several parallel optimization methods using Graphics Processing Units (GPUs) have been considered. In this paper, we propose several GPU optimization techniques for the efficient server-side implementation of lightweight block ciphers. As target algorithms, we select HIGHT (high security and light weight), LEA (Lightweight Encryption Algorithm), and revised CHAM, which are Add-Rotate-XOR (ARX)-based block ciphers, because they are widely used on IoT devices. We exploit the features of the counter (CTR) mode of operation to reduce unnecessary memory copying and computation in the GPU environment. In addition, we optimize memory usage by making full use of the GPU’s on-chip memory, such as registers and shared memory, and implement the core function of each target algorithm in inline PTX assembly to maximize performance. With these optimizations and handcrafted PTX code, we achieve encryption throughputs of 468, 2593, and 3063 Gbps for HIGHT, LEA, and revised CHAM, respectively, on an NVIDIA RTX 2070 GPU. In addition, we present optimized implementations of the Counter Mode Based Deterministic Random Bit Generator (CTR_DRBG), a widely used deterministic random bit generator, to provide large amounts of random data to connected IoT devices. 
We apply several optimization techniques to maximize the performance of CTR_DRBG and achieve speedups of 52.2, 24.8, and 34.2 times over a CPU implementation of CTR_DRBG when HIGHT-64/128, LEA-128/128, and CHAM-128/128, respectively, are used as the underlying block cipher.
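
The CTR-mode property this abstract relies on can be illustrated with a minimal sketch: each keystream block depends only on the key, the nonce, and the block index, so blocks can be computed independently of one another (for instance, one per GPU thread). The cipher `toy_arx_encrypt` below is a hypothetical toy ARX Feistel network written only for illustration. It is not HIGHT, LEA, or CHAM, it offers no security, and the function names and the 32-bit nonce / 32-bit counter layout are assumptions.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, r):
    """Rotate a 32-bit word left by r bits (0 < r < 32)."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def toy_arx_encrypt(block, round_keys):
    """Toy 64-bit ARX Feistel cipher (hypothetical, NOT HIGHT/LEA/CHAM)."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for rk in round_keys:
        # Add the round key, rotate, XOR into the other half, then swap.
        left, right = right, left ^ rol32((right + rk) & MASK32, 7)
    return (left << 32) | right


def ctr_keystream_block(nonce, counter, round_keys):
    """Encrypt one counter block laid out as nonce (32 bits) || counter (32 bits)."""
    block = ((nonce & MASK32) << 32) | (counter & MASK32)
    return toy_arx_encrypt(block, round_keys)


def ctr_xor(data, nonce, round_keys):
    """CTR mode: XOR data with the keystream. The same call encrypts and decrypts."""
    out = bytearray()
    for i in range(0, len(data), 8):
        # Block i // 8 needs no result from any other block, so this loop
        # body could run in parallel across blocks.
        ks = ctr_keystream_block(nonce, i // 8, round_keys).to_bytes(8, "big")
        out.extend(b ^ k for b, k in zip(data[i:i + 8], ks))
    return bytes(out)


if __name__ == "__main__":
    keys = [0x01234567, 0x89ABCDEF, 0xDEADBEEF, 0x0BADF00D]  # made-up round keys
    message = b"lightweight block ciphers"
    ciphertext = ctr_xor(message, nonce=7, round_keys=keys)
    print(ctr_xor(ciphertext, nonce=7, round_keys=keys) == message)
```

Because only the forward direction of the cipher is used, decryption is the same operation as encryption, and a partial final block is handled by truncating the keystream.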


2021 ◽  
Vol 11 (6) ◽  
pp. 2548
Author(s):  
JinGyo Song ◽  
Seog Chung Seo

With the advancement of 5G mobile telecommunication, various IoT (Internet of Things) devices exchange massive amounts of data over wireless networks. Because wireless communication is vulnerable to data leakage, transmitted data should be encrypted with block ciphers. In addition, to encrypt massive amounts of data securely, it is essential to apply a secure mode of operation. Among them, CTR (CounTeR) mode is the most widely used in industrial applications. However, IoT devices have limited computing and memory resources compared to typical computers, so it is challenging to run computation-intensive cryptographic algorithms on them at high speed. Cryptographic algorithms therefore need to be optimized for IoT devices; this optimization is an essential step in building secure IoT-based service systems. Even though several ARX (Add-Rotate-XOR)-based ciphers have been proposed for efficient encryption on IoT devices, it is still necessary to improve encryption performance for smooth and secure IoT services. In this article, we propose the first parallel implementations of CTR mode for the ARX-based ciphers LEA (Lightweight Encryption Algorithm), HIGHT (high security and light weight), and revised CHAM on the ARMv8 platform, which is popular in various IoT applications. For the parallel implementation, we propose an efficient data parallelism technique and register scheduling that maximize the usage of vector registers. With the proposed techniques, we process the maximum number of encryptions simultaneously by utilizing all vector registers. Namely, for HIGHT and revised CHAM-64/128 (resp. LEA, revised CHAM-128/128, and CHAM-128/256), we can execute 48 (resp. 24) encryptions simultaneously. 
In addition, we optimize CTR mode by pre-computing and reusing the intermediate values of some initial rounds, exploiting the property that the nonce part of the CTR mode input is fixed across encryptions. With the pre-computation table, CTR mode is optimized up to round 4 in LEA, round 5 in HIGHT, and round 7 in revised CHAM. With the proposed parallel processing technique, our software improves performance by 3.09%, 5.26%, and 9.52% for LEA, HIGHT, and revised CHAM-64/128, respectively, compared to existing parallel implementations on ARM-based MCUs. Furthermore, with the proposed CTR mode optimization technique, our software improves performance by 8.76%, 8.62%, and 15.87% for LEA-CTR, HIGHT-CTR, and revised CHAM-CTR, respectively. To the best of our knowledge, this work is the fastest implementation of CTR mode on the ARMv8 architecture.
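
The idea behind the CTR pre-computation can be sketched as follows. When only one input word carries the counter, every first-round term that depends solely on the fixed nonce words can be computed once and reused for all blocks. The round function below is a hypothetical toy ARX round on four 32-bit words, with made-up round keys and no key schedule; it is not the LEA, HIGHT, or CHAM specification. The sketch caches only the first round, whereas the abstract reports caching across several rounds, and all names are assumptions.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, r):
    """Rotate a 32-bit word left by r bits (0 < r < 32)."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def ror32(x, r):
    """Rotate a 32-bit word right by r bits (0 < r < 32)."""
    return ((x >> r) | (x << (32 - r))) & MASK32


def toy_round(state, rk):
    """One toy ARX round (hypothetical cipher); rk holds six 32-bit words."""
    x0, x1, x2, x3 = state
    return (
        rol32(((x0 ^ rk[0]) + (x1 ^ rk[1])) & MASK32, 9),
        ror32(((x1 ^ rk[2]) + (x2 ^ rk[3])) & MASK32, 5),
        ror32(((x2 ^ rk[4]) + (x3 ^ rk[5])) & MASK32, 3),
        x0,
    )


def encrypt_full(nonce, counter, round_keys):
    """Reference path: run every round on (nonce words, counter)."""
    state = (nonce[0], nonce[1], nonce[2], counter & MASK32)
    for rk in round_keys:
        state = toy_round(state, rk)
    return state


def precompute_round1(nonce, rk):
    """Cache the round-1 terms that depend only on the fixed nonce words."""
    n0, n1, n2 = nonce
    y0 = rol32(((n0 ^ rk[0]) + (n1 ^ rk[1])) & MASK32, 9)
    y1 = ror32(((n1 ^ rk[2]) + (n2 ^ rk[3])) & MASK32, 5)
    partial = n2 ^ rk[4]  # nonce-only operand of the counter-dependent word
    return y0, y1, partial, n0


def encrypt_cached(cache, counter, round_keys):
    """Fast path: finish round 1 from the cache, then run the remaining rounds."""
    y0, y1, partial, y3 = cache
    rk = round_keys[0]
    y2 = ror32((partial + ((counter & MASK32) ^ rk[5])) & MASK32, 3)
    state = (y0, y1, y2, y3)
    for rk in round_keys[1:]:
        state = toy_round(state, rk)
    return state
```

In this toy layout, three of the four first-round output words are computed once per nonce, so each block performs one add-rotate-XOR step in round 1 instead of three. Later rounds could be partially cached in the same way for as long as some words remain independent of the counter.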


2021 ◽  
Vol 3 (2) ◽  
pp. 58-65
Author(s):  
Ya. R. Sovyn ◽  
V. V. Khoma ◽  

The article addresses the security and efficiency of software implementations of symmetric block ciphers. When cryptographic algorithms are implemented on low-end CPUs (8/16/32-bit microcontrollers), it is important to increase resistance to power analysis attacks. When ciphers are implemented on high-end CPUs (x86, ARM Cortex-A), it is important to eliminate vulnerability primarily to timing and cache attacks. The authors use a bitslice approach to implement block ciphers securely; this approach has potential advantages such as high speed and low use of computing resources. However, known bitslicing methods have a significant limitation: they work with deterministic S-Boxes or with arbitrary S-Boxes of smaller sizes. The paper proposes a new heuristic method for the bitsliced representation of cryptographic 8×8 S-Boxes containing randomly generated values, which cannot be described by algebraic expressions. The method decomposes the truth table that describes the S-Box into two parts. One part of the table forms logical masks, and the other is split into bit vectors. An exhaustive search is used to find a logical description of these vectors. Once all vectors have been described, the two parts of the table are combined into one using logical operations. This method, oriented toward software implementation in the logical basis {AND, OR, XOR, NOT}, minimizes arbitrary 8×8 S-Boxes. It can be implemented with standard logical instructions on any 8/16/32/64-bit processor. Logical SIMD instructions from the SSE, AVX, and AVX-512 extensions for x86-64 processors can also be used, which provides high performance thanks to their long registers. 
Software has been developed that implements the search for bitsliced representations of a given S-Box and automatically generates C++ code for it based on SSE, AVX, and AVX-512 instructions. The effectiveness of the method has been investigated on the S-Boxes of known block ciphers, in particular the Ukrainian encryption standard "Kalyna". The developed algorithm was found to require almost half as many gates for the bitsliced description of an arbitrary S-Box as the best known algorithm (370 gates versus 680). For ciphers that use two or four S-Box tables, joint minimization can reduce the count to 330 or 300 gates per table, respectively. Keywords: bitslicing; S-Box; logical minimization; SIMD; x86-64 CPU; software implementation; block ciphers.
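
To show what a bitsliced S-Box evaluation looks like, here is a minimal sketch that evaluates a table on many inputs at once using only AND, OR, and NOT on whole machine words. It uses a naive sum-of-minterms construction, not the heuristic decomposition the abstract describes, and it performs no minimization, so it needs many more logical operations than a minimized description would. The helper names and the 4-bit example table are assumptions chosen for brevity; the same code accepts an 8×8 table.

```python
def to_planes(values, n_bits):
    """Pack values into bit planes: bit j of planes[i] is bit i of values[j]."""
    planes = [0] * n_bits
    for j, v in enumerate(values):
        for i in range(n_bits):
            planes[i] |= ((v >> i) & 1) << j
    return planes


def from_planes(planes, width):
    """Inverse of to_planes: recover `width` values from the bit planes."""
    return [
        sum(((planes[i] >> j) & 1) << i for i in range(len(planes)))
        for j in range(width)
    ]


def bitslice_sbox(sbox, n_bits, inputs, width):
    """Apply `sbox` to `width` packed inputs using only AND, OR, and NOT."""
    ones = (1 << width) - 1
    outputs = [0] * n_bits
    for x, y in enumerate(sbox):
        # Minterm for input value x: a mask of the lanes whose input equals x.
        match = ones
        for i in range(n_bits):
            plane = inputs[i] if (x >> i) & 1 else (~inputs[i] & ones)
            match &= plane
        # OR the mask into every output plane where sbox[x] has a 1 bit.
        for i in range(n_bits):
            if (y >> i) & 1:
                outputs[i] |= match
    return outputs


if __name__ == "__main__":
    # An arbitrary 4-bit permutation used as the example table.
    SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
    values = list(range(16))
    planes = to_planes(values, 4)
    result = from_planes(bitslice_sbox(SBOX, 4, planes, len(values)), len(values))
    print(result == SBOX)
```

No step indexes memory with a secret value; every lane goes through the same sequence of logical operations, which is why bitslicing is used against timing and cache attacks. With 128-, 256-, or 512-bit SIMD registers, each plane would hold that many lanes.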


Author(s):  
Guruh Fajar Shidik ◽  
Edi Jaya Kusuma ◽  
Safira Nuraisha ◽  
Pulung Nurtantio Andono

2017 ◽  
Author(s):  
JOSEPH YIU

The increasing need for security in microcontrollers
Security has long been a significant challenge in microcontroller (MCU) applications. Traditionally, many microcontroller systems did not have strong security measures against remote attacks, as most of them were not connected to the Internet and many microcontrollers were deemed cheap and simple. With the growth of the IoT (Internet of Things), security in low-cost microcontrollers has moved into the spotlight, and the security requirements of these IoT devices are now just as critical as those of high-end systems due to:


Thailand is becoming an aging society. This research aims to develop an intelligent walking stick for the elderly as part of a health care system by applying IoT devices and biometric sensors in a real-time system. Heart rate, blood pressure, oxygen saturation, and temperature are measured at the finger of the elderly person holding the walking stick. All data can be monitored and displayed on mobile devices. The system was evaluated by twenty users: five experts and fifteen elderly people in Ratchaburi province. The mean ratings were 4.88 from the experts and 4.85 from the elderly users. These results suggest that an IoT-based intelligent walking stick can help and improve the daily living of the elderly at the highest level.

