scholarly journals Heuristic method for bitsliced representation of randomly generated 8×8 cryptographic S-Box

2021 ◽  
Vol 3 (2) ◽  
pp. 58-65
Author(s):  
Ya. R. Sovyn ◽  
◽  
V. V. Khoma ◽  

The article is devoted to the issues of increasing the security and efficiency of software implementation for the symmetric block ciphers. For the implementation of cryptoalgorithms on low-end CPUs (8/16/32-bit microcontrollers), it is important to provide increased resistance to power consumption analysis attacks. With regard to the implementation of ciphers on high-end CPUs (x86, ARM Cortex-A), it is important to eliminate the vulnerability primarily to timing and cache attacks. The authors used a bitslice approach to securely implement block ciphers, which has potential advantages such as high speed and low computing resources. However, the known bitsliced methods have a significant limitation, since they work with deterministic S-Boxes or arbitrary S-Boxes of smaller sizes. The paper proposes a new heuristic method for bitsliced representation of cryptographic 8×8 S-Boxes containing randomly generated values. These values defy description using algebraic expressions. The method is based on the decomposition of the truth table, which describes the S-Box, into two parts. One part of the table forms logical masks, and the other is split into bit vectors. To find a logical description of these vectors an exhaustive search is used. After finding the description of all vectors, these two parts of the table are combined into one using logical operations. The use of this method oriented on software implementation in the logical basis {AND, OR, XOR, NOT} ensures the minimization of arbitrary 8×8 S-Boxes. The proposed method can be implemented using standard logical instructions on any 8/16/32/64-bit processors. It is also possible to use logical SIMD instructions from the SSE, AVX, AVX-512 extensions for x86-64 processors, which provides high performance due to the use of long registers. The corresponding software has been developed that implements the method of searching for bitsliced representations of a given S-Box, and also automatically generates C++ code for it based on SSE, AVX and AVX-512 instructions. The effectiveness of the method on the S-Box of known block ciphers, in particular the Ukrainian encryption standard "Kalyna", has been investigated. It was found that the developed algorithm requires almost half as many gates for the bitsliced description of an arbitrary S-Box than the best of known algorithm (370 gates versus 680, respectively). For ciphers that use two or four S-Box tables, joint minimization can yield up to 330 or 300 gates per table, respectively. Keywords: bitslicing; S-Box; logical minimization; SIMD; x86-64 CPU; software implementation; block ciphers.

2021 ◽  
Vol 12 (7) ◽  
pp. 350-357
Author(s):  
S. S. Shevelev ◽  

A device has been developed that performs logical and arithmetic operations, which can be used to create high-performance, high-speed computing systems. Specialized blocks perform logical operations: AND, OR, NOT, arith­metic operations: addition and subtraction of binary numbers. Arithmetic operations are performed in direct fixed-point codes. The device is presented in the form of a structural scheme, structural and functional schemes of blocks and an algorithm for the operation of the device.


Author(s):  
Hwajeong Seo ◽  
Zhe Liu ◽  
Patrick Longa ◽  
Zhi Hu

We present high-speed implementations of the post-quantum supersingular isogeny Diffie-Hellman key exchange (SIDH) and the supersingular isogeny key encapsulation (SIKE) protocols for 32-bit ARMv7-A processors with NEON support. The high performance of our implementations is mainly due to carefully optimized multiprecision and modular arithmetic that finely integrates both ARM and NEON instructions in order to reduce the number of pipeline stalls and memory accesses, and a new Montgomery reduction technique that combines the use of the UMAAL instruction with a variant of the hybrid-scanning approach. In addition, we present efficient implementations of SIDH and SIKE for 64-bit ARMv8-A processors, based on a high-speed Montgomery multiplication that leverages the power of 64-bit instructions. Our experimental results consolidate the practicality of supersingular isogeny-based protocols for many real-world applications. For example, a full key-exchange execution of SIDHp503 is performed in about 176 million cycles on an ARM Cortex-A15 from the ARMv7-A family (i.e., 88 milliseconds @2.0GHz). On an ARM Cortex-A72 from the ARMv8-A family, the same operation can be carried out in about 90 million cycles (i.e., 45 milliseconds @1.992GHz). All our software is protected against timing and cache attacks. The techniques for modular multiplication presented in this work have broad applications to other cryptographic schemes.


Author(s):  
Gregor Leander ◽  
Thorben Moos ◽  
Amir Moradi ◽  
Shahram Rasoolzadeh

We introduce SPEEDY, a family of ultra low-latency block ciphers. We mix engineering expertise into each step of the cipher’s design process in order to create a secure encryption primitive with an extremely low latency in CMOS hardware. The centerpiece of our constructions is a high-speed 6-bit substitution box whose coordinate functions are realized as two-level NAND trees. In contrast to other low-latency block ciphers such as PRINCE, PRINCEv2, MANTIS and QARMA, we neither constrain ourselves by demanding decryption at low overhead, nor by requiring a super low area or energy. This freedom together with our gate- and transistor-level considerations allows us to create an ultra low-latency cipher which outperforms all known solutions in single-cycle encryption speed. Our main result, SPEEDY-6-192, is a 6-round 192-bit block and 192-bit key cipher which can be executed faster in hardware than any other known encryption primitive (including Gimli in Even-Mansour scheme and the Orthros pseudorandom function) and offers 128-bit security. One round more, i.e., SPEEDY-7-192, provides full 192-bit security. SPEEDY primarily targets hardware security solutions embedded in high-end CPUs, where area and energy restrictions are secondary while high performance is the number one priority.


2012 ◽  
Vol 2 (2) ◽  
pp. 251-268 ◽  
Author(s):  
Naoki Nishikawa ◽  
Keisuke Iwai ◽  
Takakazu Kurokawa

2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Dan Li ◽  
Jiazhe Chen ◽  
An Wang ◽  
Xiaoyun Wang

Low Entropy Masking Schemes (LEMS) are countermeasure techniques to mitigate the high performance overhead of masked hardware and software implementations of symmetric block ciphers by reducing the entropy of the mask sets. The security of LEMS depends on the choice of the mask sets. Previous research mainly focused on searching balanced mask sets for hardware implementations. In this paper, we find that those balanced mask sets may have vulnerabilities in terms of absolute difference when applied in software implemented LEMS. The experiments verify that such vulnerabilities certainly make the software LEMS implementations insecure. To fix the vulnerabilities, we present a selection criterion to choose the mask sets. When some feasible mask sets are already picked out by certain searching algorithms, our selection criterion could be a reference factor to help decide on a more secure one for software LEMS.


Author(s):  
N. Yoshimura ◽  
K. Shirota ◽  
T. Etoh

One of the most important requirements for a high-performance EM, especially an analytical EM using a fine beam probe, is to prevent specimen contamination by providing a clean high vacuum in the vicinity of the specimen. However, in almost all commercial EMs, the pressure in the vicinity of the specimen under observation is usually more than ten times higher than the pressure measured at the punping line. The EM column inevitably requires the use of greased Viton O-rings for fine movement, and specimens and films need to be exchanged frequently and several attachments may also be exchanged. For these reasons, a high speed pumping system, as well as a clean vacuum system, is now required. A newly developed electron microscope, the JEM-100CX features clean high vacuum in the vicinity of the specimen, realized by the use of a CASCADE type diffusion pump system which has been essentially improved over its predeces- sorD employed on the JEM-100C.


Author(s):  
Marc H. Peeters ◽  
Max T. Otten

Over the past decades, the combination of energy-dispersive analysis of X-rays and scanning electron microscopy has proved to be a powerful tool for fast and reliable elemental characterization of a large variety of specimens. The technique has evolved rapidly from a purely qualitative characterization method to a reliable quantitative way of analysis. In the last 5 years, an increasing need for automation is observed, whereby energy-dispersive analysers control the beam and stage movement of the scanning electron microscope in order to collect digital X-ray images and perform unattended point analysis over multiple locations.The Philips High-speed Analysis of X-rays system (PHAX-Scan) makes use of the high performance dual-processor structure of the EDAX PV9900 analyser and the databus structure of the Philips series 500 scanning electron microscope to provide a highly automated, user-friendly and extremely fast microanalysis system. The software that runs on the hardware described above was specifically designed to provide the ultimate attainable speed on the system.


Author(s):  
M. T. Postek ◽  
A. E. Vladar

One of the major advancements applied to scanning electron microscopy (SEM) during the past 10 years has been the development and application of digital imaging technology. Advancements in technology, notably the availability of less expensive, high-density memory chips and the development of high speed analog-to-digital converters, mass storage and high performance central processing units have fostered this revolution. Today, most modern SEM instruments have digital electronics as a standard feature. These instruments, generally have 8 bit or 256 gray levels with, at least, 512 × 512 pixel density operating at TV rate. In addition, current slow-scan commercial frame-grabber cards, directly applicable to the SEM, can have upwards of 12-14 bit lateral resolution permitting image acquisition at 4096 × 4096 resolution or greater. The two major categories of SEM systems to which digital technology have been applied are:In the analog SEM system the scan generator is normally operated in an analog manner and the image is displayed in an analog or "slow scan" mode.


Sign in / Sign up

Export Citation Format

Share Document